inrepoconfig: optimize secondary clones #30400

Merged
merged 2 commits on Aug 18, 2023

Commits on Aug 17, 2023

  1. TestInteractor_Clone: rename source/dest dirs for readability

    No functional change.
    Linus Arver committed Aug 17, 2023
    21f8136
  2. inrepoconfig: optimize secondary clones

    The only data we need from a cloned inrepoconfig repo is the .prow
    contents (aside: we clone it in the first place so that we can merge to
    the base branch).
    
    The underlying Git client actually uses 2 clones --- first a mirror
    clone (git clone --mirror ...), and then a secondary local clone into a
    temporary folder (git clone ...). This secondary clone is the one we are
    optimizing here.
    
    This adds 2 optimizations:
    
    (1) Avoid populating object files by using the "--shared" flag. This
    way, the .git folder of the secondary clone can just have a pointer to
    the mirror clone. This saves a lot of disk space for large repos (not to
    mention that both the CPU time and disk usage for creating this clone
    are vastly reduced, especially when combined with the next optimization).
    
    (2) Avoid a full checkout with sparse-checkout (supported since Git
    2.25). This way, we can avoid deflating Git blob objects into file paths
    on disk for everything except (a) files at the toplevel and (b) files
    specified by the "git sparse-checkout set <PATH> [...<PATH>]" command.
    The toplevel files are included as part of "cone mode" which is the
    recommended mode for sparse checkouts. In our case we specify the
    ".prow" folder (inRepoConfigDirName), which is the only path needed to
    get the inrepoconfig files checked out.
    
    Sadly, for Git version 2.32 (which we currently use in our components),
    the toplevel files are *not* included by default. So we have to specify
    the toplevel file separately, which is why inRepoConfigFileName is also
    in SparseCheckoutDirs. In Git 2.41 (probably since 2.36), specifying
    inRepoConfigFileName makes Git exit with an error because Git sees that
    `.prow.yaml` is a file, not a folder [1]. So before we upgrade our
    components to Git 2.36+, we have to delete inRepoConfigFileName from
    this list.
    
    [1] git/git@4ce5043
    Linus Arver committed Aug 17, 2023
    8e5361f