New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Git repository rule cache #5928

Open
wants to merge 2 commits into
base: master
from

Conversation

Projects
None yet
10 participants
@ittaiz
Member

ittaiz commented Aug 17, 2018

fixes #5116 still WIP
Main open question which we'd love @aehlig's thoughts about is how to decide where to clone this?
Ideally it should be controlled by --repository_cache's value and just add a new key type (git?) which this code clones under it. WDYT?
If this sounds like a good idea we'd really appreciate pointers on how to access the command line flag's value from the repository rule.

thank you @aoded and @eyalmalron for your work on this

aoded added some commits Aug 15, 2018

#BS-466 - First phase of adding cache folder to git_repository rule. …
…Cachce path is currently harcoded. Cache isn't supported when rule has the init_submodules attribute
@googlebot

This comment has been minimized.

googlebot commented Aug 17, 2018

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and have the pull request author add another comment and the bot will run again. If the bot doesn't comment, it means it doesn't think anything has changed.

@googlebot googlebot added the cla: no label Aug 17, 2018

@aiuto aiuto requested a review from aehlig Aug 17, 2018

@ittaiz

This comment has been minimized.

Member

ittaiz commented Aug 17, 2018

@ittaiz ittaiz changed the title from [WIP] Git report rule cache to [WIP] Git repository rule cache Aug 19, 2018

@Globegitter

This comment has been minimized.

Globegitter commented Aug 21, 2018

Another interesting flag to support would be --experimental_repository_cache_hardlinks but I guess that could not be achieved with git and would really require a proper skylark api.

@edbaunton

This comment has been minimized.

Contributor

edbaunton commented Aug 21, 2018

If you clone from the cache (on the same filesystem) I believe git will automatically create hardlinks for you https://git-scm.com/docs/git-clone

@ittaiz

This comment has been minimized.

Member

ittaiz commented Aug 21, 2018

@aehlig can we get your feedback about this? Also, how do you recommend we handle the path?
I have a bit of hope of getting this before 0.17.0 RC1 is cut (probably next few days)

@Globegitter

This comment has been minimized.

Globegitter commented Aug 24, 2018

Ah nice @edbaunton the docs do talk about hard linking but only when the --local flag is provided. So should this flag maybe be passed in also?

@ittaiz

This comment has been minimized.

Member

ittaiz commented Aug 26, 2018

@aehlig ping

@aehlig

This comment has been minimized.

Member

aehlig commented Sep 13, 2018

@VolodymyrPolishchuk

This comment has been minimized.

VolodymyrPolishchuk commented Sep 18, 2018

@aehlig ping
I didn't realize you already wanted a review for a pull request marked "Work in progress". In any case, the concern I already mentioned in direct communication is still valid: sharding the git cache by name(!) of the repository contradicts the general idea of our repository caches being conent-addressable. This also is unnecessary, as git already stores everything in a content-adressable way. Also, hard-coding cache paths is not a ready-to-merge solution. Using an appropriate environement variable (and falling back to not caching, if unset) is a solution for the time being, till #6016 is done.

-- Klaus Aehlig Google Germany GmbH, Erika-Mann-Str. 33, 80636 Muenchen Registergericht und -nummer: Hamburg, HRB 86891 Sitz der Gesellschaft: Hamburg Geschaeftsfuehrer: Paul Terence Manicle, Halimah DeLaine Prado

Hi Klaus!
we are working on improvement of this PR to align with bazel's vision of local cache.
Afaiu, if we clone multiple repositories into temp directory, we may encounter naming collisions, since repositories end up being folders of the same name as repository, am i correct?

@ittaiz

This comment has been minimized.

Member

ittaiz commented Sep 19, 2018

@siedentop

This comment has been minimized.

Contributor

siedentop commented Oct 11, 2018

Thank you @aehlig for the nice collaboration today at the hackathon. My branch can be found here.

I've written what is definitely still missing inside the commit.

Globegitter pushed a commit to ecosia/bazel that referenced this pull request Dec 18, 2018

Git_repository: WIP: cache git repositories using 'git worktree'
Inspired by the following pull request and done in collaboration
with @aehlig (Klaus Aehlig) at the Bazel Hackathon 2018.

bazelbuild#5928

Lots of open issues remaining:
* Unit tests needs to be adapted to the fact that there is now a cache.
* Klaus would like it to be optional only (I agree).
* 'git worktree add ...' does not work on second run. --> tests failing
because of that.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment