Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace shelling out git in src/git.rs with usage of git2 crate. #86

Open
tomprince opened this issue Jun 4, 2017 · 9 comments
Open
Labels
C-refactoring Category: refactoring of existing code

Comments

@tomprince
Copy link
Member

No description provided.

@tomprince tomprince added help wanted C-refactoring Category: refactoring of existing code labels Jun 4, 2017
@Mark-Simulacrum
Copy link
Member

Note that while this may be better in the long run, the code usually gets more complex and is sometimes harder to write/test.

@balajisivaraman
Copy link

balajisivaraman commented Mar 19, 2018

@Mark-Simulacrum, If we still want to go ahead and make this change, I can work on it and send a PR. I've already begun making changes locally using git2-rs, and it seems relatively straightforward. Please let me know your thoughts.

EDIT: Unfortunately, got stuck with the issue where libgit2 (See: libgit2/libgit2#3058) - and thereby git2rs - does not support shallow clones yet. So this most definitely is not straightforward, sadly.

@Mark-Simulacrum
Copy link
Member

Hm, I think shallow clones might actually be hurting us in the long run as we don't particularly care about disk space and we presumably end up making them deep in many cases anyway while re-pulling, though I'm not certain. It'd be good to figure out how much we're gaining in performance and/or disk space because of shallow clones.

@aidanhs
Copy link
Member

aidanhs commented Mar 28, 2018

The original command used to clone effectively becomes irrelevant after the first run because all the git repos are cached so it just pulls new commits after that (which doesn't deepen by default). Shallow clones are a net benefit but aren't too much of a big deal.

@pietroalbini
Copy link
Member

Uhm, there are some issues with swallow clones on GitHub making them more expensive than normal clones. Sure, we're not at that scale, but I think that's something worth considering.

@aidanhs
Copy link
Member

aidanhs commented Mar 28, 2018

I'm aware of that issue, but there are many things different about it that I don't think it's very informative for us:

  1. in the two years since that comment was posted, github submitted a patch to git to improve fetch performance for shallow clones
  2. rather than one huge repo with a pathological directory structure that basically always has many new commits, our pattern is many tiny repos with mostly 0 new commits since the last run
  3. we have two machines which update the repos once every 4 days, rather than ~100,000 users updating all the time

Point 2 is probably the most notable - if there are no changes, there's nothing for github to do.

To be honest, I'd be 👍 on normal clones just because they're less exotic and less likely to cause surprises. Other than that its pretty much a wash - disk savings aren't too interesting when just doing a crater run takes about 1TB.

@balajisivaraman
Copy link

balajisivaraman commented Mar 29, 2018

Hmmm. Just following the conversation, I get the feeling that we're OK with switching to git2rs and using regular clones. If that is the case, I'd be interested in taking this issue to completion and closing it. 👍

@aidanhs
Copy link
Member

aidanhs commented Mar 29, 2018

@balajisivaraman that sounds great - I'm certainly willing to review and merge if you make a PR :)

@balajisivaraman
Copy link

@aidanhs, Thanks for the confirmation! I'll get started on it. :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-refactoring Category: refactoring of existing code
Projects
None yet
Development

No branches or pull requests

5 participants