Git + `pod install` is slow #1077

Closed
devknoll opened this Issue · 27 comments

7 participants

@devknoll
$ time pod install
Analyzing dependencies
Downloading dependencies
Installing cocos2d (2.1-rc2)
Generating Pods project
Integrating client project

[!] From now on use `App.xcworkspace`.

real    1m3.338s
user    0m44.623s
sys     0m8.015s

I ran with --verbose and the problem seems to be in the git commands. It's doing

$ git init
$ git remote add origin [cached path]
$ git fetch origin tags/release-2.1-rc2 2>&1 # This takes up the majority of the time.

Is there any reason why it should do that instead of:

$ git clone [cached path] cocos2d
$ cd cocos2d
$ git checkout release-2.1-rc2

Which only takes a couple seconds?
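For anyone who wants to reproduce the comparison without the cocos2d repo, here is a minimal sketch of both routes against a throwaway repository. It assumes a reasonably recent `git` on the PATH; the temporary directory, the `cache`, `via-fetch` and `via-clone` directory names, and the tag `v1` are made up for illustration and only stand in for the CocoaPods cache.

```shell
set -e
work=$(mktemp -d)

# Stand-in for the cached repo: one commit, one tag (names are hypothetical).
git init -q "$work/cache"
echo hello > "$work/cache/file.txt"
git -C "$work/cache" add file.txt
git -C "$work/cache" -c user.name=demo -c user.email=demo@example.com commit -q -m "first"
git -C "$work/cache" tag v1

# Route 1: init + remote add + fetch of a single tag, as seen with --verbose.
git init -q "$work/via-fetch"
git -C "$work/via-fetch" remote add origin "$work/cache"
git -C "$work/via-fetch" fetch -q origin tags/v1
git -C "$work/via-fetch" checkout -q FETCH_HEAD
fetch_head=$(git -C "$work/via-fetch" rev-parse "FETCH_HEAD^{commit}")

# Route 2: clone + checkout of the tag.
git clone -q "$work/cache" "$work/via-clone"
git -C "$work/via-clone" checkout -q v1
clone_head=$(git -C "$work/via-clone" rev-parse HEAD)

# Both routes should end up on the same commit.
echo "init+fetch:     $fetch_head"
echo "clone+checkout: $clone_head"
```

On a repo this small both routes finish instantly; prefixing the fetch and clone commands with `time` and pointing them at a large cached repo is what shows the gap.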

@fabiopelosin

The cache needs to fetch the tags from the origin, otherwise it might miss any new tags. It would also fail to detect tags that were updated to point to another commit.

In the past, the fetch was done only if the tag was missing, but this led to confusion for spec contributors who were updating a tag after the linting process. There is still an option for that in the config if you would like to enable it by default.

@devknoll

It's completely possible that I'm misunderstanding git or what needs to be done here, so bear with me...

> The cache needs to fetch the tags from the origin, otherwise it might miss any new tags. It would also fail to detect tags that were updated to point to another commit.

But that happens in the update_cache method, right? After all, that is inside the aggressive_cache block.

The slowness is with this fetch, which has its origin as the cache (not the remote). The cache has already been updated with update_cache, so what would the fetch do here that a clone wouldn't do?

@alloy
Owner

I have not tried with the cocos2d repo, but on a small scale test (35 MB repo), I can already notice a slight difference.

How large is the cocos2d repo?

Clone
~/tmp » time git clone ~/Library/Caches/CocoaPods/Git/757602ae2b4913f1d030b85f6d9b0c3980fff0a1 git-test
Cloning into 'git-test'...
done.
        0.55 real         0.23 user         0.28 sys
Fetch
~/tmp » mkdir git-test
~/tmp » cd git-test/
~/t/git-test » git init
Initialized empty Git repository in /Users/eloy/tmp/git-test/.git/
~/t/git-test [HEAD] » git remote add origin ~/Library/Caches/CocoaPods/Git/757602ae2b4913f1d030b85f6d9b0c3980fff0a1
~/t/git-test [HEAD] » time git fetch origin tags/2.12.7 2>&1
remote: Counting objects: 38668, done.
remote: Compressing objects: 100% (7684/7684), done.
remote: Total 38668 (delta 29917), reused 38621 (delta 29885)
Receiving objects: 100% (38668/38668), 30.48 MiB | 16.25 MiB/s, done.
Resolving deltas: 100% (29917/29917), done.
From /Users/eloy/Library/Caches/CocoaPods/Git/757602ae2b4913f1d030b85f6d9b0c3980fff0a1
 * tag               2.12.7     -> FETCH_HEAD
        4.63 real         5.62 user         1.13 sys
@alloy alloy reopened this
@alloy
Owner

It looks like git-clone might recognise that it's a local repo and simply copies over the repo as-is, whereas the git-fetch route does an actual git fetch, which is probably slower than a file-level copy. This is pure speculation, though. It would be great if you could look into whether this is the case or what might otherwise be the issue.

@devknoll

Looks like cocos2d is ~426 MB.

> It looks like git-clone might recognise that it's a local repo and simply copies over the repo as-is, whereas the git-fetch route does an actual git fetch which is probably slower than a file level copy.

That was my guess as well.

@alloy
Owner

If I apply some naive math, that would indeed mean a fetch would take around a minute for me.
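Spelled out, the naive math is a linear extrapolation from the 35 MB test above (4.63 s for the fetch) to the ~426 MB cocos2d repo. A minimal sketch, assuming fetch time scales linearly with repository size, which is a rough assumption and not something measured here; the function name `estimate_fetch_seconds` is made up for illustration:

```shell
# estimate_fetch_seconds MEASURED_SECONDS MEASURED_MB TARGET_MB
# Scales a measured fetch time linearly by repository size.
estimate_fetch_seconds() {
  LC_ALL=C awk -v s="$1" -v from="$2" -v to="$3" \
    'BEGIN { printf "%.1f\n", s * to / from }'
}

# 4.63 s for 35 MB, scaled up to 426 MB.
estimate_fetch_seconds 4.63 35 426   # prints 56.4
```

That lands close to the 1m3s reported for the whole `pod install` run.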

It’s quite reasonable what git does. In the case of git-clone no destination repo even exists, so it can safely optimise. Our git-fetch route, however, could mean that the destination repo already has content, or other reasons for granularity.

I don’t remember off the top of my head, but I assume we use git-fetch so that it will update a destination repo if it already exists. If so, then we should probably use git-clone when the destination repo does not exist yet and the origin is the cache. (Ideally it would always use git-clone if the repo is local.)

Is this something you will want to work on?

@fabiopelosin

> It looks like git-clone might recognize that it's a local repo and simply copies over the repo as-is, whereas the git-fetch route does an actual git fetch which is probably slower than a file level copy.

Git fetch is used because git clone will only download the history of the default branch (some info about the default branch of a bare repo is available here). As we can't know whether a given tag will be on the default branch or not, we need to fetch the whole repo (the cache) in our clone (the Pod checkout used for the installation).

I agree that the cocos2d repo is insanely slow in CocoaPods. This is even more evident in the creation of the mirror repo used as a cache.

@devknoll
git clone [cache] cocos2d
cd cocos2d
git remote update
git pull --all

Would this then be sufficient for that case?

@fabiopelosin

Interesting discussion from http://git.661346.n2.nabble.com/Cloning-a-remote-tag-without-using-git-fetch-pack-directly-td6288868.html.

To clone a subset of a repository, you have to do the init+fetch trick,
as you did above. If you want the configuration set up by clone, you
can do that, too, with "git config". So the equivalent commands to the
clone you want are:

  git init linux-2.6 
  cd linux-2.6 
  git config remote.origin.url /home/josh/src/linux-2.6 
  git config remote.origin.fetch refs/tags/v2.6.12 
  git fetch origin 
@fabiopelosin

I don't think that git pull --all would be any better than git fetch origin tags/release-2.1-rc2. Did you test it?

from the git-pull manual:

--all
Fetch all remotes.

@devknoll

I see what you mean. I'm not sure if it would be sufficient (I don't have a good test case).

> Interesting discussion from http://git.661346.n2.nabble.com/Cloning-a-remote-tag-without-using-git-fetch-pack-directly-td6288868.html.

This appears to be just as slow :frowning:

@alloy
Owner

> Git fetch is used because git clone will only download the history of the default branch (some info about the default branch of a bare repo is available here). As we can't know whether a given tag will be on the default branch or not, we need to fetch the whole repo (the cache) in our clone (the Pod checkout used for the installation).

@irrationalfab Gotcha. However, I guesstimate that if you fetch the default branch by using git clone, then in most cases this will already contain most objects. Doing a git fetch afterwards should then only fetch any objects that aren’t already fetched, which in the ideal case might be 0.

@devknoll Can you try how fast this is?

git clone [cache] cocos2d
cd cocos2d
git fetch origin tags/release-2.1-rc2 2>&1
git checkout release-2.1-rc2
@alloy
Owner

Hrmm, I have an email notification from this thread where @irrationalfab says “Nice PR @devknoll - thanks.”, but I have no idea where that comment is… Can anyone point me to it?

@fabiopelosin

> @irrationalfab Gotcha. However, I guesstimate that if you fetch the default branch by using git clone, then in most cases this will already contain most objects.

I think that in the case of the Cocos2d repo there are some important branches which haven't been merged into master for various reasons. At least this is the only explanation that I can find.

Btw, about your proposed test: I fail to see how it is different from the current implementation, which already passes the tag to the fetch command.

@alloy The message was posted by accident while toying around with @orta's extension. There is a green button whose purpose was unclear to me :smile:

@orta
Owner

yeah that's "say something nice and merge". By the weekend I'll have a version that's less magic.

@fabiopelosin

@orta The extension is :+1:!

@alloy
Owner

> I think that in the case of the Cocos2d repo there are some important branches which haven't been merged in master for various reasons. At least this is the only explanation that I can find.

Sure, but I think it’s fairly common that the default branch will contain many objects that are also present on the other branches. So the fetch should only need to fetch objects that are not on the default branch.

> Btw, about your proposed test I fail to see how is it different from the current implementation which already passes the tag to the fetch command.

I first do a clone before the fetch, instead of an init.

This means that, assuming the default branch contains many objects that are present in the tag, the git clone action will be fast and copy the majority of the objects, and the fetch then only has to apply the differences.
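The clone-then-fetch sequence described above can be sketched as a small shell function. This is only an illustration under stated assumptions, not the CocoaPods downloader code: the helper name `checkout_tag_from_cache` and its arguments are hypothetical, and it assumes the destination directory does not exist yet. The git-clone documentation describes the local optimisation (`--local`, the default for local paths), where objects are copied or hard-linked instead of going through the normal transport.

```shell
# checkout_tag_from_cache CACHE DEST TAG   (hypothetical helper)
checkout_tag_from_cache() {
  cache=$1
  dest=$2
  tag=$3
  # Clone first: for a local path git can copy or hard-link the object store.
  git clone -q "$cache" "$dest" || return 1
  # Then fetch the tag; only objects the clone did not bring over are transferred.
  git -C "$dest" fetch -q origin "tags/$tag" || return 1
  # Check out the tag by name, as in the commands above (detached HEAD).
  git -C "$dest" checkout -q "$tag"
}
```

Called as `checkout_tag_from_cache [cache] cocos2d release-2.1-rc2`, it would run the same three commands as the test proposed above.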

@alloy
Owner

@irrationalfab @orta lol

@devknoll

@alloy Looks faster.

$ git clone ~/Library/Caches/CocoaPods/GitHub/acf17302384c833b0f1c71211e010148c7a7068e cocos2d
Cloning into 'cocos2d'...
done.
Checking out files: 100% (1581/1581), done.
$ cd cocos2d
$ git fetch origin tags/release-2.1-rc2
From ~/Library/Caches/CocoaPods/GitHub/acf17302384c833b0f1c71211e010148c7a7068e
 * tag               release-2.1-rc2 -> FETCH_HEAD
$ git checkout release-2.1-rc2
Note: checking out 'release-2.1-rc2'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b new_branch_name

HEAD is now at 11af159... cocos2d v2.1-rc2

real    0m4.406s
user    0m0.799s
sys     0m0.515s
@alloy
Owner

@devknoll Awesome, thanks. Then this is how it should work. Will you create a patch for it?

@devknoll devknoll referenced this issue in CocoaPods/cocoapods-downloader
Merged

Git clone + fetch instead of init + fetch. #3

@fabiopelosin

Finally :beers:

@katgironpe

For me, it's the "analyzing dependencies" part that's incredibly slow.

@fabiopelosin

@katgironpe Can you point out which step is being incredibly slow by using the pod install --verbose command?

@katgironpe

@irrationalfab I figured that out right after posting. I fixed the missing statement in the Podfile and it worked.

@yorkie

I have the same problem:

Analyzing dependencies

Updating spec repositories
  $ /usr/bin/git rev-parse  >/dev/null 2>&1
  $ /usr/bin/git ls-remote

but running `git ls-remote` on its own returns its results after 5s, so I think this is a bug in CocoaPods.

@pkotesh-systango

I have the same problem. It stalls for a long time at this point:

   Updating spec repo `master`
      $ /usr/bin/git pull --ff-only
@kylef kylef locked and limited conversation to collaborators