AndrewRichardSmart edited this page Jan 12, 2019 · 31 revisions

Q: How do I create a subrepo from a folder in old/existing work? (i.e. split from git subtree)

A: This answer assumes your old work is in a git repository, and you want to keep the commit history of the commits relevant to <subdir>.

You have two options depending on where you want to keep the new subrepo. It must be maintained in a separate repo or branch from the old project. Either:

1. You may put the <subdir> into its own separate repository (<remote>), with optional unique branch name (<branch>):

$ cd <old-project>
$ git subrepo init <subdir> -r <remote> [-b <branch>]
$ git subrepo push <subdir>

This is likely what you want, as the shared code will then be in its own repository. You may then use this as a subrepo in your new project via the git subrepo clone command:

$ cd <new-project>
$ git subrepo clone <remote> [<subdir>] [-b <branch>]    # Use the <remote> and <branch> from above (where you put the shared code). <subdir> is where you want the subrepo in this new project.
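As a rough sketch, the init and push steps of option 1 can be wrapped in a small helper. The function name `split_to_remote` and the example path and URL are made up for illustration; only the `git subrepo init` and `git subrepo push` calls come from the steps above.

```shell
# Sketch: split <subdir> out of the current repository into its own remote.
# Run from the top level of <old-project>.
# split_to_remote is a hypothetical helper name, not part of git-subrepo.
split_to_remote() {
    sr_subdir=$1
    sr_remote=$2
    sr_branch=$3   # optional

    # Record <remote> (and <branch>, if given) for <subdir>.
    if [ -n "$sr_branch" ]; then
        git subrepo init "$sr_subdir" -r "$sr_remote" -b "$sr_branch" || return 1
    else
        git subrepo init "$sr_subdir" -r "$sr_remote" || return 1
    fi

    # Publish the history of <subdir> to <remote>.
    git subrepo push "$sr_subdir"
}

# Example call (hypothetical path and URL):
#   split_to_remote lib/shared git@example.com:me/shared.git
```

The helper stops after a failed init so that nothing is pushed from a half-configured subdir.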

2. Or you may keep the <subdir> in its own branch in the original repository:

$ cd <old-project>
$ git subrepo init <subdir>
$ git subrepo branch <subdir>
$ git branch
* master
  subrepo/<subdir>    <--This is the branch containing <subdir> and all its history

That branch can be pushed/pulled wherever you need. This might be a bit cleaner for you in your use case. You may then use this as a subrepo in your new project via the git subrepo clone command:

$ cd <new-project>
$ git subrepo clone <old-project> [<subdir>] [-b 'subrepo/<subdir>']    # Use the correct branch name shown above.
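If you want the subrepo/<subdir> branch to be available on a remote, a minimal sketch could look like the following. The helper name `publish_subrepo_branch` and the target branch argument are assumptions for this example; it relies on the branch being named subrepo/<subdir> as shown in the listing above, and on plain `git push <remote> <src>:<dst>`.

```shell
# Sketch: create the subrepo/<subdir> branch and publish it.
# Run from the top level of <old-project>, after `git subrepo init <subdir>`.
# publish_subrepo_branch is a hypothetical helper name.
publish_subrepo_branch() {
    pb_subdir=$1
    pb_remote=$2
    pb_target=$3

    # Create the local branch subrepo/<subdir> containing <subdir> and its history.
    git subrepo branch "$pb_subdir" || return 1

    # Push that branch to <remote> under the branch name <target>.
    git push "$pb_remote" "subrepo/$pb_subdir:refs/heads/$pb_target"
}

# Example call (hypothetical names):
#   publish_subrepo_branch lib origin shared-lib
```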

Synopsis

Do whichever makes the most sense for your shared code as it pertains to your use case. For one project it may make sense to keep the subrepo in the original repository, but for another it may make more sense to put it into a separate repository (e.g. for GitHub issues tracking, etc).

To migrate changes you've made back to that old project, you may use git push, git subrepo push, or git subrepo pull, depending on the context.

Q: How do I change the subrepo's tracking branch?

A: To switch the tracking branch of one of your subrepos (for example, to move to another tag), use:

git subrepo clone --force -b <new_remote_branch> <remote>

This will re-clone your subrepo and make it track the other branch. Note that if you do this, changes that you haven't pushed to the subrepo will remain only in the main repo.
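To confirm which branch a subrepo tracks after re-cloning, you can read it back from the subrepo's .gitrepo file. This is a sketch that assumes .gitrepo uses git-config syntax with a `branch` key in a `[subrepo]` section; the helper name `subrepo_tracked_branch` is made up for the example.

```shell
# Sketch: print the branch recorded in <subdir>/.gitrepo.
# Assumes .gitrepo is a git-config style file with a [subrepo] section.
# subrepo_tracked_branch is a hypothetical helper name.
subrepo_tracked_branch() {
    git config --file "$1/.gitrepo" subrepo.branch
}

# Example call (hypothetical subdir):
#   subrepo_tracked_branch lib/shared
```

`git subrepo status <subdir>` reports similar information if you prefer a built-in command.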

Q: Difference between git-subrepo, subtree, and submodule?

A: With subtree/submodule you may find the procedure to update and manage the common code more difficult (details) and the resulting history more cluttered (old critique of git-subrepo from 2014 no longer relevant). With git-subrepo you can make changes to the common code within a project, commit them, and push them upstream with one git subrepo push command.

git-stree has been deprecated in favor of git-subrepo #116.

There are still unsolved problems in git-subrepo; it is essentially still in beta and not ready for a mainstream release. However, you may find it satisfies your needs.

| Task | git-subrepo | subtree | submodule |
|---|---|---|---|
| initialize from existing repo/subdir | git subrepo init | git subtree split | git clone, git filter-branch, many more commands |
| put separated common code into new project | git subrepo clone | git subtree add | git submodule add |

Q: I have a repository with commits on the order of 10^5 to 10^6 and I am concerned about performance. What is the performance/complexity of subrepo?

A: There are two facets to consider: 1) the setup with git subrepo init, and 2) routine pushing/pulling the subrepo (which requires a search for a common ancestor). For smaller repositories you will encounter hardly any delay.

Initialization performance with git subrepo init

git subrepo init with 34000 commits took 10.5 minutes (taken Feb 15th):

32,000 commit repo with almost 400,000 files. The Demo subdir is only referenced in 112 of those commits. #144

Subtree's split ran much slower than git subrepo init (quote fixed to reflect a performance increase):

rather than 10.5 minutes with subrepo and 6 hours with subtree. #145

Optional performance increases are in the works (#142). For example, an option to --squash history into a single commit may speed up git subrepo init from 10.5 minutes to ~3 minutes, as robe070 found above in #145. However, squashing the history will lose the commit messages, and may not be what you want.

Initialization only has to be done once.

Performance/complexity when pushing/pulling the subrepo

Thanks to grimmySwe the complexity of subrepo pushing/pulling is now excellent (#138 pushed Feb 9th). This follows a tree-hash common-ancestor algorithm (which may change with #142):

  • Algorithm to find common tree ancestor (done in merge-base):
  • (S=Subrepo, C=Clone of subrepo)
  • Best case S+C (We always go through all commits at least once to detect duplicates)
  • Worst case of 2S+C
  • As C and S grow, the time spent finding the merge base will increase. The average case should often be close to the best case, because if you pull/push regularly you will find matches sooner. A reverse search is made (from the most recent commit to the oldest).
  • But... the worst case is having a common clone commit and then a small change in your C that you never push.
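To get a feel for these bounds, the following sketch turns the S+C and 2S+C figures into commit counts. It only does the arithmetic quoted in the list above and does not measure anything; the helper name `merge_base_bounds` and the sample numbers are made up.

```shell
# Sketch: commit visits implied by the best case (S+C) and worst case (2S+C).
# S = commits in the subrepo, C = commits in the clone of the subrepo.
# merge_base_bounds is a hypothetical helper name.
merge_base_bounds() {
    mb_s=$1
    mb_c=$2
    echo "best=$((mb_s + mb_c)) worst=$((2 * mb_s + mb_c))"
}

# Example call:
#   merge_base_bounds 100000 500    # prints: best=100500 worst=200500
```

With a large S and a small C, the worst case costs roughly twice the best case, which is why regular pushing and pulling matters.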

Performance at 10^5 commits using git-subrepo v0.3.1 currently appears to be around 4 minutes, but is much faster using a different git-subrepo branch #265.

In git-subrepo v0.3.0-0.3.1, consider squashing the history in order to get optimal push/pull performance. In 0.4.0, squashing the history won't help much beyond reducing initialization time.

Q: What is currently going on in git-subrepo development? Status?

A: In master (v0.3.0-0.3.1) a new tree-hash-based implementation was used, which has been found to have several drawbacks. Try branch issue/216 if you encounter issues with master (v0.3.0-0.3.1). git-subrepo is moving back to storing values in .gitrepo, as was done in 0.2.3. More details.

These changes will be in the 0.4.0 master release after review #280. More details.

Q: What if I managed to push, but the local .gitrepo is not updated (version 0.4.0+)?

A: When does this happen? If the subrepo push failed and you then push manually, you will not get the updated .gitrepo file. The easiest way to update your local .gitrepo is to perform a new clone that overwrites the existing one. You will need to use the --force flag for this.
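One possible way to script this is sketched below: it re-clones with --force only when the commit recorded in .gitrepo differs from the head of the remote branch. It assumes .gitrepo is a git-config style file with `remote`, `branch` and `commit` keys in a `[subrepo]` section; the helper name `refresh_gitrepo_if_stale` is made up. Make sure the content of <subdir> has really reached the remote first, since the forced clone overwrites the local copy.

```shell
# Sketch: refresh a stale .gitrepo by force re-cloning the subrepo.
# Run from the top level of the main repository.
# refresh_gitrepo_if_stale is a hypothetical helper name.
refresh_gitrepo_if_stale() {
    rg_subdir=$1
    rg_file="$rg_subdir/.gitrepo"

    # Values recorded by git-subrepo (assumed key names, see lead-in).
    rg_remote=$(git config --file "$rg_file" subrepo.remote) || return 1
    rg_branch=$(git config --file "$rg_file" subrepo.branch) || return 1
    rg_recorded=$(git config --file "$rg_file" subrepo.commit) || return 1

    # The first field of ls-remote output is the commit the remote branch points to.
    rg_head=$(git ls-remote "$rg_remote" "refs/heads/$rg_branch" | cut -f1)
    [ -n "$rg_head" ] || return 1

    if [ "$rg_recorded" = "$rg_head" ]; then
        echo "up to date"
    else
        git subrepo clone "$rg_remote" "$rg_subdir" -b "$rg_branch" --force
    fi
}

# Example call (hypothetical subdir):
#   refresh_gitrepo_if_stale lib/shared
```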