campaigns: add and use volume mounts by default on Intel macOS #412
This mode is added by sourcegraph/src-cli#412.
Fantastic, thorough work - docs, tests, scripts, code. And things can get up to 4 times faster. Good start into the new year :)
211151e to 155e213
OK, so this has changed around a bit in response to @mrnugget's review. Most notably: there's now a […] PTAL!
Very nice work 🌟
Co-authored-by: Thorsten Ball <mrnugget@gmail.com>
Background
As we now know all too well, the default bind mount behaviour we use to manage workspaces when executing campaigns is slow on macOS: any file I/O in the workspace has to go through the loopback interface, and this causes problems with I/O heavy operations such as pretty much anything to do with `npm` or `yarn`.
The steps we perform within a workspace are extremely simple, however: we unzip the repository archive and run a variety of `git` commands to generate the diff. It's handy for this to be done on the host filesystem, since it allows the user to inspect what happens to their repository if a campaign step fails, but it's not a technical requirement.
What's in the ~~box~~ PR
This PR generalises our existing concept of a workspace, and expands it significantly. Workspace modes are now defined by implementing a `WorkspaceCreator` interface, which knows how to create `Workspace` implementations, which implement the common operations we need to set up, diff, and tear down workspaces. This is used to add a new workspace mode that uses a Docker volume to contain the workspace, rather than bind mounting the workspace from the host. This is controlled by the new `-workspace` flag, and the default for Intel macOS has been switched to the new volume mode.
For this to work, we have to perform the workspace management commands (the aforementioned unzipping and `git` malarkey) within a container, since that's the only way to access the volume. As such, part of this PR adds a Docker image that extends `alpine` to always have `git` available and configured appropriately, along with `unzip` and `curl`. A GitHub Action has been added that will rebuild and push this image on `src` release.
There's also a lot of new testing stuff: the volume workspace code is copiously unit tested. In order to support this, there's a (very) minimal implementation of a mock API client, and an expanded ~~ripoff~~ implementation of the method Go's `os/exec` test suite uses to mock external commands, which means the new tests don't need `dummydocker` and work on Windows.
Adam's guess at obvious questions
Why is CI failing?
The `os/exec` method for mocking external commands basically redirects `exec.Cmd` calls back into the test binary, which gains a little bit of logic in its `TestMain` to handle those correctly.
This works fine cross-platform, but something about AppVeyor specifically makes this not work on Windows. The errors are fairly inscrutable. However, the approach doesn't have any systemic issue on Windows: it works fine both for me locally, and in GitHub Actions, which now has Windows runner support.
I've opened #415, which is included in this PR as well, to remove our AppVeyor support and only use GitHub Actions for CI. If we decide to go ahead with that, then we can remove the AppVeyor integration and this PR will start passing.
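As a rough illustration of the re-exec approach described above, the sketch below shows the general pattern in a standalone program. It is not the code from this PR: the `fakeCommand` and `helperResponse` names, the `GO_WANT_HELPER_PROCESS` variable (borrowed from Go's own `os/exec` tests) and the canned outputs are all assumptions, and it hooks `init` rather than `TestMain` so that it can run outside a test binary.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// helperEnv marks a process as a stand-in for an external command.
// The name is an assumption, borrowed from Go's own os/exec tests.
const helperEnv = "GO_WANT_HELPER_PROCESS"

// fakeCommand mirrors exec.Command, but points the command back at the
// currently running binary instead of the real executable (hypothetical helper).
func fakeCommand(name string, args ...string) *exec.Cmd {
	argv := append([]string{"--", name}, args...)
	cmd := exec.Command(os.Args[0], argv...)
	cmd.Env = append(os.Environ(), helperEnv+"=1")
	return cmd
}

// helperResponse returns canned stdout and an exit code for a mocked
// command line. The commands and outputs here are purely illustrative.
func helperResponse(argv []string) (string, int) {
	switch strings.Join(argv, " ") {
	case "git diff --cached":
		return "diff --git a/README.md b/README.md\n", 0
	case "docker volume create":
		return "fake-volume-id\n", 0
	}
	return "", 1
}

// init plays the role that TestMain plays in a test binary: when the
// marker variable is set, the process behaves as the mocked command
// and exits before any normal logic runs.
func init() {
	if os.Getenv(helperEnv) != "1" {
		return
	}
	argv := os.Args[1:]
	if len(argv) > 0 && argv[0] == "--" {
		argv = argv[1:]
	}
	out, code := helperResponse(argv)
	fmt.Print(out)
	os.Exit(code)
}

func main() {
	// The caller believes it is running git; it is really re-running itself.
	out, err := fakeCommand("git", "diff", "--cached").Output()
	if err != nil {
		fmt.Fprintln(os.Stderr, "mocked command failed:", err)
		os.Exit(1)
	}
	fmt.Printf("%s", out)
}
```

Because nothing outside the Go toolchain is executed, a pattern like this has no dependency on shell scripts or on a real `docker` or `git` binary being installed, which is what makes it portable to Windows.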
How much faster is this on macOS?
It depends.
For campaigns that generate small-to-medium sized diffs and do little I/O, there isn't much difference: maybe a few percent here and there. For campaigns that generate large diffs and do little I/O, this can actually be slower. (That feels like a really weird case, though.)
For campaigns that perform lots of I/O, this is a big win. A test campaign that upgrades TypeScript in Sourcegraph completes in approximately a quarter of the time in volume mode compared to bind mode. (~11 minutes compared to ~42.)
Why change the default on macOS only?
Volume mode isn't universally better. On Linux (assuming nothing weird like a remote Docker server), there's no reason not to use bind mount mode: the performance is the same, and you have the advantage that you can inspect what happened in your workspace if a step fails.
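To make the platform-specific default concrete, here is a minimal sketch of how such a default could be chosen. The type, constant and function names are hypothetical, as are `bind` as a literal flag value and treating an empty flag value as "use the default"; only the bind and volume modes, the `-workspace volume` form and the Intel macOS default come from this PR.

```go
package main

import (
	"fmt"
	"runtime"
)

// WorkspaceMode names a way of providing a workspace to campaign steps
// (hypothetical type for this sketch).
type WorkspaceMode string

const (
	ModeBind   WorkspaceMode = "bind"   // bind mount from the host filesystem
	ModeVolume WorkspaceMode = "volume" // Docker volume managed from a container
)

// defaultMode picks volume mode only on Intel macOS, where bind mounts
// are slow, and bind mode everywhere else.
func defaultMode(goos, goarch string) WorkspaceMode {
	if goos == "darwin" && goarch == "amd64" {
		return ModeVolume
	}
	return ModeBind
}

// resolveMode applies an explicit flag value if one was given, and
// otherwise falls back to the platform default. Treating "" as "unset"
// is an assumption of this sketch.
func resolveMode(flagValue, goos, goarch string) (WorkspaceMode, error) {
	switch flagValue {
	case "":
		return defaultMode(goos, goarch), nil
	case string(ModeBind):
		return ModeBind, nil
	case string(ModeVolume):
		return ModeVolume, nil
	}
	return "", fmt.Errorf("unknown workspace mode %q", flagValue)
}

func main() {
	mode, err := resolveMode("", runtime.GOOS, runtime.GOARCH)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("workspace mode:", mode)
}
```

Passing the OS and architecture in as arguments, rather than reading `runtime.GOOS` inside the function, keeps the decision testable on any machine.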
Why change the default on Intel macOS only?
Conservatism. I've put in a bunch of effort to make multi-architecture builds of the Docker image needed by volume mode work, but I don't have an M1 Mac sitting around to test this. It should work, but I'd prefer to find that out by testing it using `-workspace volume` explicitly, rather than finding out that it doesn't work after making it the default.
What about Windows?
My suspicion is that this will provide similar performance improvements on Windows, but I don't have good numbers on this right now, don't have an environment ready to go to prove that out, and I'd prefer to focus on macOS for now. It's easy enough to change the default later.
That said, I think there's another reason we should consider this for Windows sooner rather than later: it effectively removes the requirement to have `git` in your PATH, and removes any potential for Windows-specific `git` weirdness.
What future work could we do to improve this further?
Glad you asked! I had more ideas, but this PR was more than big enough already.
- […] `sleep` to `exec` into the container to inspect the workspace if that was necessary. We should probably provide some sort of interactive debugging mode where you get dropped into a container with the workspace already set up and running on error. (I mean, this is probably a good idea for bind mode, too.)
- […] `stdout` than anything endemic to the approach. Another way of doing this would be to have the utility container running the whole time while executing a campaign on a repository, and provide some sort of RPC interface to perform setup and teardown and (most importantly) get diffs without having to go through Docker.
- […] `internal/exec` package instead of `os/exec` directly. If we migrate other modules to use this, then we could use that down the track to provide verbose logging of every command that's run, since command execution will go through a central place.
PR links
Skeletal end user documentation is provided by sourcegraph/sourcegraph#16979.
Fixes sourcegraph/sourcegraph#16809.