AndrewRichardSmart edited this page Aug 25, 2017 · 31 revisions

Q: How do I create a subrepo from a folder in old/existing work? (i.e. split from git subtree)

A: First cd into your local copy of that work. You have two options depending on where you want to keep the new subrepo, either:

1. You may put the <subdir> into its own separate (remote) repository (with optional unique branch name):

$ cd <old-project>
$ git subrepo init <subdir> -r <remote> [-b <branch>]
$ git subrepo push <subdir>

This is likely what you want, as the shared code will then be in its own repository. You may then use this as a subrepo in your new project via the git subrepo clone command:

$ cd <new-project>
$ git subrepo clone <remote> [<subdir>] [-b <branch>]    # Use the <remote> and <branch> from above, where you pushed the shared code. <subdir> is where the subrepo should live in this new project.

2. Or you may keep the <subdir> in its own branch in the original repository:

$ cd <old-project>
$ git subrepo init <subdir>
$ git subrepo branch <subdir>
$ git branch
* master
  subrepo/<subdir>    <--This is the branch containing <subdir> and all its history

That branch can be pushed/pulled wherever you need. This might be a bit cleaner for you in your use case. You may then use this as a subrepo in your new project via the git subrepo clone command:

$ cd <new-project>
$ git subrepo clone <old-project> [<subdir>] [-b 'subrepo/<subdir>']    # Use the correct branch name shown above.

Synopsis

Do whichever makes the most sense for your shared code as it pertains to your use case. For one project it may make sense to keep the subrepo in the original repository, but for another it may make more sense to put it into a separate repository (e.g. for GitHub issues tracking, etc).

To migrate changes you've made back to that old project, you may either use git push, git subrepo push, or git subrepo pull depending on the context.

Q: How do I change the subrepo's tracking branch?

A: To switch the tracking branch of one of your subrepos (for example, to move to another tag), use:

$ git subrepo clone --force -b <new_remote_branch> <remote>

This will re-clone your subrepo and make it track another branch. Note that if you do this, any changes that you haven't pushed to the subrepo will remain only in the main repo.
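To check which branch a subrepo tracks before and after the re-clone, one option is to read the subrepo's .gitrepo file with plain git config. This is a minimal sketch: it assumes .gitrepo uses git-config syntax with a [subrepo] section and a branch key, and the helper name subrepo_tracking_branch is hypothetical (it is not part of git-subrepo).

```shell
#!/bin/sh
# Print the branch recorded in <subdir>/.gitrepo.
# Assumption: .gitrepo is git-config formatted, with a [subrepo] section
# that holds a "branch" key.
# subrepo_tracking_branch is a hypothetical helper, not a git-subrepo command.
subrepo_tracking_branch() {
    subdir=$1
    git config --file "$subdir/.gitrepo" subrepo.branch
}

# Usage (run from the top level of the main repo):
#   subrepo_tracking_branch <subdir>
```

Running it once before and once after the forced clone should show the old and new branch names, provided the file layout matches the assumption above.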

Q: I have a repository with commits on the order of 10^5 to 10^6 and I am concerned about performance. What is the performance/complexity of subrepo?

A: There are two facets to consider: 1) the setup with git subrepo init, and 2) pushing/pulling the subrepo (which requires a search for a common ancestor). For smaller repositories you will encounter hardly any delay.

Initialization performance with git subrepo init

git subrepo init with 34000 commits took 10.5 minutes (taken Feb 15th):

32,000 commit repo with almost 400,000 files. The Demo subdir is only referenced in 112 of those commits. #144

Subtree's split ran much slower than git subrepo init (quote fixed to reflect a performance increase):

rather than 10.5 minutes with subrepo and 6 hours with subtree. #145

Optional performance improvements are in the works (#142). For example, an option to --squash history into a single commit may speed up git subrepo init from 10.5 minutes to ~3 minutes, as robe070 found in #145. However, squashing the history will lose the commit messages, which may not be what you want.

Initialization only has to be done once.
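To get a feel for how your own repository compares with the figures quoted above, plain git can count the total number of commits and the number that touch the subdirectory. This is only a rough sketch; the function name subrepo_commit_stats is hypothetical, and the counts do not predict the actual run time of git subrepo init.

```shell
#!/bin/sh
# Report how many commits are reachable from HEAD and how many of them
# touch the given subdirectory. Run from inside the repository.
# subrepo_commit_stats is a hypothetical helper, not a git-subrepo command.
subrepo_commit_stats() {
    subdir=$1
    total=$(git rev-list --count HEAD) || return 1
    touching=$(git rev-list --count HEAD -- "$subdir") || return 1
    printf 'total=%s touching=%s\n' "$total" "$touching"
}

# Usage:
#   cd <old-project>
#   subrepo_commit_stats <subdir>
```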

Performance/complexity when pushing/pulling the subrepo

Thanks to grimmySwe the complexity of subrepo pushing/pulling is now excellent (#138 pushed Feb 9th). This follows a tree-hash common-ancestor algorithm (which may change with #142):

  • Algorithm to find common tree ancestor (done in merge-base):
  • (S=Subrepo, C=Clone of subrepo)
  • Best case S+C (We always go through all commits at least once to detect duplicates)
  • Worst case of 2S+C
  • As C and S grow, the time spent finding the merge base will increase. The average case should often be close to the best case, because if you pull/push regularly you will find matches sooner. The search runs in reverse (from the most recent commit to the oldest).
  • But... the worst case is having a common clone commit and then a small change in your C that you never push.
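The bounds in the list above can be turned into a back-of-the-envelope estimate. The sketch below simply evaluates S+C and 2S+C for given commit counts; the function names are hypothetical, and the result is a count of commits visited, not a time measurement.

```shell
#!/bin/sh
# Rough commit-visit estimates taken from the bounds listed above.
# $1 = S (commits in the subrepo), $2 = C (commits in the clone of the subrepo).
# Both function names are hypothetical, for illustration only.
best_case_visits()  { echo $(( $1 + $2 )); }
worst_case_visits() { echo $(( 2 * $1 + $2 )); }

# Example with S=100000 and C=500:
best_case_visits 100000 500     # prints 100500
worst_case_visits 100000 500    # prints 200500
```

The gap between the two results grows with S, which is why pushing and pulling regularly (so that matches are found early) matters more on large histories.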

Performance at 10^5 commits using git-subrepo:master currently appears to be around 4 minutes, but is much faster using a different git-subrepo branch #265.

In git-subrepo versions 0.3.0-0.3.1, consider squashing the history for optimal push/pull performance. In 0.4.0, squashing the history won't help much beyond initialization time.

Q: Difference between git-subrepo, submodule, and subtree #43?

| | git-subrepo | subtree | submodule |
|---|---|---|---|
| initialize from existing repo/subdir | git subrepo init | git subtree split | git clone, git filter-branch, many more commands (ref) |
| put separated common code into new project | git subrepo clone | git subtree add | git submodule add |

Q: What is currently going on in git-subrepo development? Status?

A: In master (0.3.0-0.3.1) a new tree-hash-based implementation was used. It has been found to have several drawbacks. Try branch issue/216 if you encounter issues with master. git-subrepo is moving back to storing values in .gitrepo, as was done in 0.2.3.

These changes will be in the 0.4.0 master release after review #280.