Support caching repositories #235

Open
japborst opened this issue Feb 21, 2022 · 6 comments
Labels
enhancement New feature or request

Comments

@japborst

Hello!

When using multi-gitter I noticed that on every run the respective repos are always pulled.

It would be great if this could be cached, to avoid long wait times to pull many repositories (especially when the entire org is specified).

@lindell
Owner

lindell commented Feb 23, 2022

I think it would be useful, as long as it doesn't create too many problems for the user.
How do you imagine this working? 😄
Should users set a cache timeout themselves and, if they enable caching, expect errors such as merge conflicts that they have to deal with manually?
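
To make the cache-timeout idea concrete, here is a minimal sketch of an age check. It is an assumption about how such an option might behave, not existing multi-gitter behaviour: `cache_is_stale` is a hypothetical helper, and the caller is assumed to have recorded the time of the last fetch as a Unix timestamp.

```shell
#!/bin/sh
# Hypothetical helper; not part of multi-gitter.
# Usage: cache_is_stale <last-fetch-epoch> <now-epoch> <ttl-seconds>
# Succeeds (exit status 0) when the cached copy is at least <ttl-seconds> old.
cache_is_stale() {
    [ $(( $2 - $1 )) -ge "$3" ]
}
```

A caller could pass the stored timestamp and the current time and refresh the repository only when the check succeeds.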

@lindell lindell added the enhancement New feature or request label Feb 23, 2022
@lindell lindell changed the title Enhancement: support caching repositories Support caching repositories Feb 23, 2022
@Stephan202

@lindell admittedly I haven't thought deeply about this yet, but a first version could implement an algorithm such as the following, given a $CACHE_ROOT directory (I'm assuming GitHub terminology):

  • If $CACHE_ROOT/$org/$repo doesn't exist: check out as usual.
  • If $CACHE_ROOT/$org/$repo does exist, execute a number of commands to get it into a pristine state:
    git clean -fdx
    git fetch --depth=[the-configured-fetch-depth]
    git remote prune origin
    git remote set-head origin -a
    git checkout [the-configured-base-branch]
    git reset --hard origin/[the-configured-base-branch]
    # ^ If not configured, could run the semantic equivalent of e.g.:
    #     git symbolic-ref refs/remotes/origin/HEAD \
    #       | sed "s,^refs/remotes/origin/,," \
    #       | xargs git checkout
    #     git reset --hard refs/remotes/origin/HEAD 
    git submodule update --recursive # If `multi-gitter` currently handles submodules; didn't check.
    (I'm no Git guru, so perhaps there's a more straightforward way to reset the repository into a pristine state containing the n most recent commits on the configured target branch, but the overall gist would be the same: (a) reuse already-downloaded data, (b) update to match the most recent state, (c) clear any local modifications.)
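
The steps above could be wrapped in a small shell function, roughly as follows. This is only a sketch under assumptions: `sync_cached_repo`, `run`, `DRY_RUN`, and the `CACHE_ROOT` default are hypothetical names rather than anything multi-gitter provides, a plain shallow `git clone` stands in for "check out as usual", and the cached clone is assumed to track the configured base branch. Submodule handling and the "base branch not configured" case are left out. With `DRY_RUN=1` the function only prints the commands it would run.

```shell
#!/bin/sh
# Hypothetical sketch; none of these names exist in multi-gitter.
CACHE_ROOT="${CACHE_ROOT:-$HOME/.cache/multi-gitter}"

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Usage: sync_cached_repo <clone-url> <org> <repo> <base-branch> <fetch-depth>
# Note: plain POSIX sh has no local variables, so these names are global.
sync_cached_repo() {
    url=$1 org=$2 repo=$3 branch=$4 depth=$5
    dir="$CACHE_ROOT/$org/$repo"

    if [ ! -d "$dir/.git" ]; then
        # Cache miss: create the parent directory and clone as usual.
        run mkdir -p "$CACHE_ROOT/$org" &&
        run git clone --depth="$depth" --branch "$branch" "$url" "$dir"
    else
        # Cache hit: (a) reuse the downloaded data, (b) update to the latest
        # remote state, (c) clear any local modifications.
        run git -C "$dir" clean -fdx &&
        run git -C "$dir" fetch --depth="$depth" &&
        run git -C "$dir" remote prune origin &&
        run git -C "$dir" remote set-head origin -a &&
        run git -C "$dir" checkout "$branch" &&
        run git -C "$dir" reset --hard "origin/$branch"
    fi
}
```

Setting `DRY_RUN=1` before calling the function is a cheap way to inspect the command sequence for a given repository without touching the network or the cache.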

I suppose there should also be a --trust-cached-repositories flag (better name TBD), so that during rapid prototyping the user can iterate on the script passed to multi-gitter run without incurring any IO overhead.
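
A rough sketch of how such a flag might change the decision per repository is shown below. Everything here is an assumption for illustration: `cache_action` is a hypothetical helper, and the three action names are made up. The idea is simply that a trusted cache skips the fetch entirely, while the default still refreshes from the remote.

```shell
#!/bin/sh
# Hypothetical helper; not part of multi-gitter.
# Usage: cache_action <cached-repo-dir> [trust-cache: 0|1]
# Prints one of: clone | reuse | refresh
cache_action() {
    dir=$1 trust=${2:-0}
    if [ ! -d "$dir/.git" ]; then
        echo clone      # nothing cached yet, so the network is unavoidable
    elif [ "$trust" = "1" ]; then
        echo reuse      # trusted cache: no fetch, work on what is on disk
    else
        echo refresh    # default: fetch and reset to the remote state
    fi
}
```

Whether the `reuse` case should still discard local modifications left behind by a previous script run is an open design question that this sketch does not settle.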

@lindell
Owner

lindell commented Feb 23, 2022

@Stephan202 So in that case, multi-gitter would still need to fetch from the remote. I guess this could speed up the process in some cases, such as very big repos with small changes 🤔 For those use cases it would indeed be useful.

@Stephan202

Indeed, we have a number of large repos that would benefit from this.

(Currently we have a repository containing all our other repositories as submodules, with various operations performed using git submodule foreach. This can be a bit unwieldy, but does have the benefit of repository state updates being decoupled from modification operations, which avoids extensive waiting between trials, even when on a slow network.)

@japborst
Author

To give a little more flavour to the size of the problem: in our case (and, I imagine, at many other companies) running multi-gitter against the entire GitHub org means cloning hundreds of repos. Even with the default depth of 1, that still means fetching anywhere from a few MB up to, in the worst case, a GB.

@lindell
Owner

lindell commented Mar 1, 2022

I do agree that this is something that should be added! I will not have the time to look at this any time soon, but if you add it and create a PR, I'm happy to merge it 🙂
