git v2 client: keep the primary clone up to date
The primary clone can get "stale" if it doesn't keep up with the changes
going into the upstream remote. Staleness can result in a performance
penalty for EnsureFreshSecondary(), because it would need to fetch more
objects to get to the commit SHAs specified in repoOpts.FetchCommits.

Git clients that do not use `repoOpts.FetchCommits` cannot get stale,
because they invoke the (expensive) RemoteUpdate() call on the primary
clone, which fetches all refs. The primary clones for inrepoconfigs,
however, can get stale: since bdd601c (git v2 client for inrepoconfig:
allow targeted fetches (2nd attempt), 2023-08-22), inrepoconfig uses
`repoOpts.FetchCommits` to do targeted fetches for the secondary clone
only.

Add a new PrimaryCloneUpdateCommits field that lets callers periodically
update the primary clone with new commits. For inrepoconfigs, we only
specify the baseSHA because the headSHAs may never get merged into the
base (perhaps the PR gets rejected or abandoned).
Linus Arver committed Aug 26, 2023
1 parent df1280a commit e46df0a
Showing 3 changed files with 16 additions and 0 deletions.
4 changes: 4 additions & 0 deletions prow/config/inrepoconfig.go
@@ -146,6 +146,10 @@ func prowYAMLGetter(
 	// change was a fast-forward merge. So we need to dedupe it with sets.
 	repoOpts.FetchCommits = sets.New(baseSHA)
 	repoOpts.FetchCommits.Insert(headSHAs...)
+	// Only update the primary with the baseSHA, because it may be that the
+	// headSHAs never get merged into the base (perhaps the PR gets rejected or
+	// abandoned).
+	repoOpts.PrimaryCloneUpdateCommits = sets.New(baseSHA)
 	repo, err := gc.ClientForWithRepoOpts(orgRepo.Org, orgRepo.Repo, repoOpts)
 	inrepoconfigMetrics.gitCloneDuration.WithLabelValues(orgRepo.Org, orgRepo.Repo).Observe((float64(time.Since(timeBeforeClone).Seconds())))
 	if err != nil {
10 changes: 10 additions & 0 deletions prow/git/v2/client_factory.go
@@ -101,6 +101,12 @@ type RepoOpts struct {
 	// NoFetchTags determines whether we disable fetching tag objects). Defaults
 	// to false (tag objects are fetched).
 	NoFetchTags bool
+	// PrimaryCloneUpdateCommits are any additional SHAs we need to fetch into
+	// the primary clone to bring it up to speed. This should at least be the
+	// current base branch SHA. Needed when we're using shared objects because
+	// otherwise the primary will slowly get stale with no updates to it after
+	// its initial creation.
+	PrimaryCloneUpdateCommits sets.Set[string]
 }
 
 // Apply allows to use a ClientFactoryOpts as Opt
@@ -394,6 +400,10 @@ func (c *clientFactory) ensureFreshPrimary(cacheDir string, cacheClientCacher ca
 			if err := cacheClientCacher.RemoteUpdate(); err != nil {
 				return err
 			}
+		} else if repoOpts.PrimaryCloneUpdateCommits.Len() > 0 {
+			if err := cacheClientCacher.FetchCommits(repoOpts.NoFetchTags, repoOpts.PrimaryCloneUpdateCommits.UnsortedList()); err != nil {
+				return err
+			}
 		}
 	}
 
2 changes: 2 additions & 0 deletions prow/git/v2/interactor.go
@@ -80,6 +80,8 @@ type cacher interface {
 	MirrorClone() error
 	// RemoteUpdate fetches all updates from the remote.
 	RemoteUpdate() error
+	// FetchCommits fetches only the given commits.
+	FetchCommits(bool, []string) error
 }
 
 // cloner knows how to clone repositories from a central cache
