Shared repositories (git mirrors) between checkouts #936

lox · 2019-03-03T23:14:08Z

This is another take on #917 that should hopefully be less controversial. Rather than introducing a new phase, this creates a shared repository mirror that is then used for reference cloning the subsequent checkout. This should radically reduce network usage for cloning big repositories. Otherwise the semantics are the same as without a reference clone, no new checkout dirs or other complexities.

This introduces an experiment, git-mirrors, and several new configuration options:

git-mirrors-path: a directory that contains the git mirrors shared by agents on a give machine, unique to a given repository address
git-clone-mirrors-flags: the parameters passed to git clone for the mirroring portion of the checkout. Defaults to -v --mirror.
git-mirrors-lock-timeout: the number of seconds to wait for a lock to expire on a checkout that has crashed. Needs to be greater than the maximum time for a git clone on the agent.

It also introduces the idea of running the integration tests with an experiment enabled, which whilst potentially could end up with an explosion of permutations, for the minute it allows us to test checkouts with and without the git-mirrors experiment.

To test this out, you can run your agent with:

$  buildkite-agent start --experiment git-mirrors --git-mirrors-path /tmp/buildkite-git-mirrors

My plan is to merge this and release it in 3.10.0, with the intention of making it the default and removing the experiment wrapper in 4.0.0.

lox · 2019-03-03T23:15:28Z

bootstrap/bootstrap.go

+
+				// if we have a reference, add the submodule to it
+				if reference != "" {
+					name := fmt.Sprintf("submodule%d", idx+1)


This might require some more thought when submodules change.

DoomGerbil

This change look awesome!

We have some large, slow-to-clone repos in some of our pipelines, and we would love to have this. To the point where I was about two days away from building it myself before we spoke about this PR. :)

agent/job_runner.go

bootstrap/bootstrap.go

DoomGerbil · 2019-03-08T14:30:44Z

bootstrap/integration/bootstrap_tester.go

@@ -78,6 +84,7 @@ func NewBootstrapTester() (*BootstrapTester, error) {
 			"HOME=" + homeDir,
 			"BUILDKITE_BIN_PATH=" + pathDir,
 			"BUILDKITE_BUILD_PATH=" + buildDir,
+			"BUILDKITE_REPOS_PATH=" + reposDir,


Same comment as above about BUILDKITE_REPOS_PATH vs BUILDKITE_REPO_MIRROR_PATH.

clicommand/agent_start.go

gavinelder

Main NIT: I found variable naming somewhat ambiguous.

Should it be Repo vs Repos, repository vs repositories ?

petemounce

LGreatTM.

agent/agent_configuration.go

agent/job_runner.go

bootstrap/bootstrap.go

bootstrap/config.go

clicommand/bootstrap.go

petemounce · 2019-03-08T15:07:50Z

How does this interact with running multiple agent processes per host via

separate processes (e.g. systemd template-unit based vs unit based)
--spawn n

?

I would prefer that there be a single reference of each repo per host as opposed to per BK agent instance, to save disk space.

DoomGerbil · 2019-03-08T15:14:16Z

Another quick thing comes to mind - since it's probably not safe to assume that you can guarantee that the mirrored repos never get any unexpected mutations, then it's probably safer to also include the --dissociate flag to git clone as well as --reference. Otherwise, the reference clone will mostly just hard-link to the mirror clone, which can cause issues if any mutations happen to the mirror clone.

DoomGerbil · 2019-03-08T15:26:46Z

Also, though this doesn't necessarily reflect on everyone's experience (since this particular repo is very large), I did some git benchmarking of clones of our worst-performing repo using various different sets of git clone flags.

# Totally normal clone
time git clone git@github.com:improbableio/UnrealEngine.git \
std-clone
	Time: 9m21.282s
	Size: 3.3GB

# Mirror clone
time git clone git@github.com:improbableio/UnrealEngine.git \
--mirror \
std-clone
	Time: 9m1.757s
	Size: 2.1GB

# Standard clone with a local reference
time git clone git@github.com:improbableio/UnrealEngine.git \
--reference std-clone \
ref-clone
	Time: 0m30.103s
	Size: 1.2GB

# Shallow clone from remote, no reference
time git clone git@github.com:improbableio/UnrealEngine.git \
--depth 1 \
shallow-clone
	Time: 1m16.511s
	Size: 1.4GB

# Standard clone with a local reference, dissociated
time git clone git@github.com:improbableio/UnrealEngine.git \
--reference std-clone \
--dissociate \
ref-dis-clone
	Time: 1m1.195s
	Size: 3.3GB

# Clone with a local reference, dissociated, no post-clone checkout
time git clone git@github.com:improbableio/UnrealEngine.git \
--reference std-clone \
--dissociate \
--no-checkout \
ref-dis-nocheckout-clone
	Time: 0m26.485s
	Size: 2.1GB

# Shallow clone with a local reference, dissociated, no post-clone checkout
time git clone git@github.com:improbableio/UnrealEngine.git \
--reference std-clone \
--dissociate \
--no-checkout \
--depth 1 \
ref-dis-nocheckout-shallow-clone
	Time: 0m19.503s
	Size: 1.5GB

# Shallow clone with a local reference, dissociated
time git clone git@github.com:improbableio/UnrealEngine.git \
--reference std-clone \
--dissociate \
--depth 1 \
ref-dis-shallow-clone
	Time: 0m35.618s
	Size: 2.6GB

# Shallow clone with a local reference, no post-clone checkout
time git clone git@github.com:improbableio/UnrealEngine.git \
--reference std-clone \
--no-checkout \
--depth 1 \
ref-nocheckout-shallow-clone
	Time: 0m2.188s
	Size: 696K

Again, this is not intended to be a scientific study of the impacts of these flags, but in our case, this is the scale of change that we see, and highlights why we want this change so much. :)

lox · 2019-03-09T02:45:56Z

That data is super helpful @DoomGerbil!

lox · 2019-03-09T02:47:35Z

@DoomGerbil I think I agree that --dissociate is a sensible default.

lox · 2019-03-09T02:47:57Z

I would prefer that there be a single reference of each repo per host as opposed to per BK agent instance, to save disk space.

@petemounce, yup that's the design!

lox · 2019-03-10T05:51:58Z

After some discussion @keithpitt and I ended up on BUILDKITE_GIT_MIRRORS_PATH.

lox · 2019-03-10T07:39:55Z

@harrietgrace I might need some assistance from you with updating the docs for this at some point!

toolmantim · 2019-03-10T21:26:43Z

Is it an opt-in config option at the moment? If so, why don’t we remove it from experiments?

lox · 2019-03-10T22:46:07Z

@toolmantim we've done that in that past. I think the big downside is that we miss out on the really clear "this is super experimental" . This feature will likely be flakey for a month or two whilst we work through the edge cases, so I feel like having to strongly opt-in to the experimental nature helps manage expectations.

The alternative is releasing a beta agent, but I'd much prefer to keep iterating quickly on stable with new things behind experiment flags.

toolmantim · 2019-03-10T22:48:29Z

Okay! Probably shouldn’t be changelogged or added to docs yet then?

lox · 2019-03-10T22:55:31Z

@toolmantim any reason we can't mention an experimental feature in a changelog? It's something heaps of people are interested in and would be great to get feedback on. We mention other beta features there?

toolmantim · 2019-03-10T23:01:27Z

Oh, it sounded this was “get out a sneaky build to test if this actually works as designed” and I was assuming you had enough people to test it. And then after we’re confident it works, turn off the experimental flag in a new point release and changelog. I thought a changelog announcement would be a good push for making sure we move it out of experimental.

But perhaps we just do a changelog for every new agent release? And we can list the experimental changes in there, alongside other fixes.

toolmantim · 2019-03-10T23:14:19Z

Sorry, I should have made my question clearer!

My plan is to merge this and release it in 3.10.0, with the intention of making it the default and removing the experiment wrapper in 4.0.0.

I was wondering why it couldn’t be removed from experiments in a point release once we’ve tested it, and then enabled by default in v4?

Related q for making it the default: would we have a default calculated value for --git-mirrors-path, or it’d error if people didn’t update their config, or it’s just be disabled if people didn’t update their config?

lox · 2019-03-10T23:29:51Z

I was wondering why it couldn’t be removed from experiments in a point release once we’ve tested it, and then enabled by default in v4?

Sure, quite possibly! I guess it depends how it goes?

Related q: would we have a default calculated value for --git-mirrors-path, or it’d error if people didn’t update their config, or it’s just be disabled if people didn’t update their config?

Nope, I don't think we would calculate a default value. We don't for plugins-path for instance. In v4 we'll change the packaging to set a default in the configs and during installation.

lox · 2019-03-10T23:32:55Z

@DoomGerbil I'm going to remove --dissociate from the defaults (it's presently accidentally on the mirror clone anyway). It's pretty recently added to git and it also causes problems for git LFS. It's easy to add it to your git-clone-flags if you want it!

DoomGerbil · 2019-03-11T00:09:58Z

@DoomGerbil I'm going to remove --dissociate from the defaults (it's presently accidentally on the mirror clone anyway). It's pretty recently added to git and it also causes problems for git LFS. It's easy to add it to your git-clone-flags if you want it!

https://github.blog/2015-02-06-git-2-3-has-been-released/ says it's been there for four years now, but the point about LFS is well taken. We don't support LFS with the rest of our infrastructure, so we would probably add the flag to our configuration, since as you said, it's not hard to tack it on ourselves.

lox · 2019-03-11T10:52:33Z

The bit I'm still very blurry on is how the submodule support will work in practice.

lox · 2019-03-12T03:24:44Z

We've decided to drop reference clone support for submodules for the first pass.

keithpitt

🥇

silvamerica · 2019-03-22T19:11:51Z

Just an FYI, we've been doing this for a little while with a custom elastic_bootstrap script to create the mirror and git clone flags to --reference-if-able and we had immediate issues with the cloned repo missing refs before adding --disassociate.

lox · 2019-04-11T22:55:04Z

@silvamerica yeah, I suspect we might end up with it being a default, but it brings with it a quite recent git dependency, so going to see how it goes. It's only really an issue if you plan on deleting the git bare repositories, isn't it?

lox requested a review from a team March 3, 2019 23:14

lox commented Mar 3, 2019

View reviewed changes

DoomGerbil reviewed Mar 8, 2019

View reviewed changes

gavinelder reviewed Mar 8, 2019

View reviewed changes

petemounce approved these changes Mar 8, 2019

View reviewed changes

lox added 2 commits March 9, 2019 13:56

Add repos-path config and start param

6dc3720

Use a reference dir to share git repositories between checkouts

2fcfaa8

lox force-pushed the shared-repositories branch from 079045c to 2fcfaa8 Compare March 9, 2019 02:56

lox added 5 commits March 9, 2019 13:57

Fix bad merge

54fc4e2

Fix bug where cwd was the deleted checkout dir

b34e976

Rename BUILDKITE_REPOS_PATH to BUILDKITE_GIT_MIRRORS_PATH

832cfc1

Fix tests that don't expect mirror

c7f34bf

Wrap all git mirrors functionality in experiment

e5d9fbd

lox mentioned this pull request Mar 10, 2019

Allow checkout with git worktrees #917

Closed

lox added 3 commits March 10, 2019 16:53

Skip a flakey windows test

dcac1f1

Fix indenting

e423bc7

Re-arrange output when using git-mirrors

2488807

lox added the wip label Mar 10, 2019

lox added 4 commits March 10, 2019 17:21

Allow git clone flags to be configured independently for mirror

3443c25

Allow lock timeout to be configured for git mirror

23a13d7

Quote repository path in log output

345071a

Add tests for varying clone mirror flags

75ecd69

lox changed the title ~~Shared repositories between checkouts~~ Shared repositories (git mirrors) between checkouts Mar 10, 2019

lox added 3 commits March 11, 2019 13:25

Drop --dissociate from the default clone options

0953248

Split locking process for mirrors into clone and update

76d47bd

Remove --dissociate from test expectations

b140923

lox added 3 commits March 12, 2019 14:16

Only short-circuit update if commit is actually a commit

f297e95

For now, don't use reference clones for submodules

93b80d2

Update integration tests for changes

c457df3

Do keyscan before mirrors

9c27045

keithpitt approved these changes Mar 12, 2019

View reviewed changes

Add debug comments around locks for mirrors

7c5723d

lox merged commit 7d24aec into master Mar 12, 2019

lox deleted the shared-repositories branch March 12, 2019 05:59

lox mentioned this pull request Mar 28, 2019

Add EnableAgentGitMirrorsExperiment parameter buildkite/elastic-ci-stack-for-aws#555

Merged

lox mentioned this pull request Apr 11, 2019

TensorFlow: "Preparing working directory" timed out on mac (also slow on Windows) bazelbuild/continuous-integration#602

Closed

harrietgrace mentioned this pull request Oct 29, 2019

Update agent config options buildkite/docs#535

Merged

harrietgrace removed the needs doc update label Oct 29, 2019

jonahbull mentioned this pull request Jan 31, 2023

use reference clones for submodules #1935

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Shared repositories (git mirrors) between checkouts #936

Shared repositories (git mirrors) between checkouts #936

lox commented Mar 3, 2019 •

edited

Loading

lox Mar 3, 2019

DoomGerbil left a comment

DoomGerbil Mar 8, 2019

gavinelder left a comment

petemounce left a comment

petemounce commented Mar 8, 2019

DoomGerbil commented Mar 8, 2019

DoomGerbil commented Mar 8, 2019 •

edited

Loading

lox commented Mar 9, 2019

lox commented Mar 9, 2019

lox commented Mar 9, 2019

lox commented Mar 10, 2019

lox commented Mar 10, 2019

toolmantim commented Mar 10, 2019

lox commented Mar 10, 2019 •

edited

Loading

toolmantim commented Mar 10, 2019

lox commented Mar 10, 2019

toolmantim commented Mar 10, 2019

toolmantim commented Mar 10, 2019 •

edited

Loading

lox commented Mar 10, 2019

lox commented Mar 10, 2019

DoomGerbil commented Mar 11, 2019

lox commented Mar 11, 2019

lox commented Mar 12, 2019

keithpitt left a comment

silvamerica commented Mar 22, 2019

lox commented Apr 11, 2019

Shared repositories (git mirrors) between checkouts #936

Shared repositories (git mirrors) between checkouts #936

Conversation

lox commented Mar 3, 2019 • edited Loading

lox Mar 3, 2019

Choose a reason for hiding this comment

DoomGerbil left a comment

Choose a reason for hiding this comment

DoomGerbil Mar 8, 2019

Choose a reason for hiding this comment

gavinelder left a comment

Choose a reason for hiding this comment

petemounce left a comment

Choose a reason for hiding this comment

petemounce commented Mar 8, 2019

DoomGerbil commented Mar 8, 2019

DoomGerbil commented Mar 8, 2019 • edited Loading

lox commented Mar 9, 2019

lox commented Mar 9, 2019

lox commented Mar 9, 2019

lox commented Mar 10, 2019

lox commented Mar 10, 2019

toolmantim commented Mar 10, 2019

lox commented Mar 10, 2019 • edited Loading

toolmantim commented Mar 10, 2019

lox commented Mar 10, 2019

toolmantim commented Mar 10, 2019

toolmantim commented Mar 10, 2019 • edited Loading

lox commented Mar 10, 2019

lox commented Mar 10, 2019

DoomGerbil commented Mar 11, 2019

lox commented Mar 11, 2019

lox commented Mar 12, 2019

keithpitt left a comment

Choose a reason for hiding this comment

silvamerica commented Mar 22, 2019

lox commented Apr 11, 2019

lox commented Mar 3, 2019 •

edited

Loading

DoomGerbil commented Mar 8, 2019 •

edited

Loading

lox commented Mar 10, 2019 •

edited

Loading

toolmantim commented Mar 10, 2019 •

edited

Loading