Implement more granular build process #57
Conversation
This is the first step towards a more granular build process where some of the build responsibility moves into the server component. Rather than assembling all layers inside Nix, the Nix build now only creates the symlink forest and returns information about the runtime paths required by the image. The server is then responsible for grouping these paths into layers and assembling the layers themselves. Relates to #50.
Simplifies the wrapper script used to invoke Nix builds from Nixery to just contain the essentials, since the layer grouping logic is moving into the server itself.
Refactors the layer grouping package (which previously compiled to a separate binary) to expose the layer grouping logic via a function instead. This is the next step towards creating layers inside of the server component instead of in Nix. Relates to #50.
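For illustration, the exposed entry point might look something like the sketch below. The function name and the naive chunking strategy are assumptions, not the actual implementation; the real grouping logic can apply smarter heuristics (e.g. weighing package popularity) behind the same signature.

```go
// Hypothetical sketch of a layer-grouping entry point, assuming the
// server receives the runtime store paths from the Nix build and must
// split them into at most maxLayers groups. Names are illustrative.
package layers

// GroupLayers splits the runtime closure of an image into groups of
// store paths, each of which becomes one image layer. This naive
// version fills layers in order; callers are insulated from the
// grouping strategy, so it can be replaced without API changes.
func GroupLayers(storePaths []string, maxLayers int) [][]string {
	if maxLayers < 1 {
		maxLayers = 1
	}
	perLayer := (len(storePaths) + maxLayers - 1) / maxLayers
	var groups [][]string
	for start := 0; start < len(storePaths); start += perLayer {
		end := start + perLayer
		if end > len(storePaths) {
			end = len(storePaths)
		}
		groups = append(groups, storePaths[start:end])
	}
	return groups
}
```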
To answer my own question: Yes!

```go
// Computes the MD5 and SHA256 hashes of a file in a single read by
// fanning the bytes out to both hashes through an io.MultiWriter.
package main

import (
	"crypto/md5"
	"crypto/sha256"
	"fmt"
	"io"
	"os"
)

func main() {
	f, err := os.Open(os.Args[1])
	if err != nil {
		fmt.Fprintf(os.Stderr, "failed to open file: %s\n", err)
		os.Exit(1)
	}

	md5hash := md5.New()
	sha256hash := sha256.New()
	multi := io.MultiWriter(md5hash, sha256hash)

	_, err = io.Copy(multi, f)
	if err != nil {
		fmt.Fprintf(os.Stderr, "failed to copy to multi hash: %s\n", err)
		os.Exit(1)
	}

	fmt.Printf("md5\t%x\n", md5hash.Sum([]byte{}))
	fmt.Printf("sha256\t%x\n", sha256hash.Sum([]byte{}))
}
```

Interestingly this ends up faster than a plain …
Minor change of plans: I will keep the construction of layer tarballs in Nix, but invoked as a separate derivation. The tarballs will then be read, hashed & uploaded by Nixery in one disk read.
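A minimal sketch of that single-read approach, assuming the cloud.google.com/go/storage client; the object naming and error handling are illustrative only:

```go
// Sketch of reading a layer tarball once while both hashing and
// uploading it, assuming the cloud.google.com/go/storage client.
package builder

import (
	"context"
	"crypto/sha256"
	"fmt"
	"io"
	"os"

	"cloud.google.com/go/storage"
)

func uploadLayer(ctx context.Context, bucket *storage.BucketHandle, path string) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	w := bucket.Object("staging" + path).NewWriter(ctx) // hypothetical object name
	hash := sha256.New()

	// One pass over the file feeds both the hash and the upload.
	if _, err := io.Copy(io.MultiWriter(w, hash), f); err != nil {
		return "", err
	}
	if err := w.Close(); err != nil {
		return "", err
	}
	return fmt.Sprintf("sha256:%x", hash.Sum(nil)), nil
}
```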
This creates a cache key which can be used to check if a layer has already been built.
This introduces a new Nix derivation that, given an attribute set of layer hashes mapped to store paths, will create a layer tarball for each of the store paths. This is going to be used by the builder to create layers that are not present in the cache. Relates to #50.
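From the server side, invoking such a derivation might look roughly like this; the expression file name and argument layout are assumptions, though the nix-build flags themselves are real:

```go
// Hypothetical sketch of how the server could invoke the separate
// layer-building derivation via nix-build --arg.
package builder

import (
	"fmt"
	"os/exec"
	"strings"
)

// buildLayerTarballs passes an attribute set mapping layer hashes to
// store paths into the derivation and returns the resulting output
// paths (nix-build prints one per line).
func buildLayerTarballs(layers map[string]string) ([]string, error) {
	var attrs []string
	for hash, path := range layers {
		attrs = append(attrs, fmt.Sprintf("%q = %q;", hash, path))
	}
	expr := "{ " + strings.Join(attrs, " ") + " }"

	cmd := exec.Command("nix-build", "--no-out-link",
		"--arg", "layers", expr, "build-layers.nix") // file name is hypothetical
	out, err := cmd.Output()
	if err != nil {
		return nil, err
	}
	return strings.Fields(string(out)), nil
}
```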
This cache is going to be used for looking up whether a layer build has taken place already (based on a hash of the layer contents). See the caching section in the updated documentation for details. Relates to #50.
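As a sketch of what such a cache check could look like — the key derivation and the bucket layout here are assumptions, not necessarily the actual scheme:

```go
// Illustrative sketch of a layer-build cache keyed by a hash of the
// layer contents, stored as objects in the GCS bucket.
package builder

import (
	"context"
	"crypto/sha256"
	"fmt"
	"io"
	"sort"

	"cloud.google.com/go/storage"
)

// layerCacheKey derives a deterministic key from the store paths that
// make up a layer; sorting makes the key independent of input order.
func layerCacheKey(storePaths []string) string {
	paths := append([]string(nil), storePaths...)
	sort.Strings(paths)
	h := sha256.New()
	for _, p := range paths {
		io.WriteString(h, p+"\n")
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

// layerSeen checks whether a cache entry exists for the key; Attrs
// returns an error (storage.ErrObjectNotExist) when it is missing.
// The "layer-cache/" prefix is illustrative.
func layerSeen(ctx context.Context, bucket *storage.BucketHandle, key string) bool {
	_, err := bucket.Object("layer-cache/" + key).Attrs(ctx)
	return err == nil
}
```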
The state type contains things such as the bucket handle and Nixery's configuration, which need to be passed around in the builder. This is only added for convenience.
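A minimal sketch of such a state type, with assumed field names:

```go
// Hypothetical sketch of a bundled state type; the actual field set
// and names may differ.
package builder

import "cloud.google.com/go/storage"

// Config stands in for Nixery's runtime configuration here.
type Config struct {
	Bucket string // name of the GCS bucket used for layers
	Port   string // port the server listens on
}

// State bundles the handles that would otherwise have to be threaded
// through every builder function individually.
type State struct {
	Bucket *storage.BucketHandle
	Cfg    Config
}
```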
The new build process can now call out to Nix to create layers and upload them to the bucket if necessary. The layer cache is populated, but not yet used.
The new manifest package creates image manifests and their configuration. This previously happened in Nix, but is now part of the server's workload. This relates to #50.
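The target format is the registry's v2 (schema 2) image manifest. Below is a trimmed sketch of the structures a manifest package has to emit; the media type strings are the registry's documented constants, while the Go types themselves are illustrative:

```go
// Trimmed, illustrative Go representation of a Docker v2 (schema 2)
// image manifest; not necessarily Nixery's actual types.
package manifest

type manifestEntry struct {
	MediaType string `json:"mediaType"`
	Size      int64  `json:"size"`
	Digest    string `json:"digest"` // e.g. "sha256:abc..."
}

type imageManifest struct {
	SchemaVersion int             `json:"schemaVersion"` // always 2
	MediaType     string          `json:"mediaType"`     // "application/vnd.docker.distribution.manifest.v2+json"
	Config        manifestEntry   `json:"config"`        // points at the image configuration blob
	Layers        []manifestEntry `json:"layers"`        // one entry per layer tarball
}
```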
Implements the new build process to the point where it can actually construct and serve image manifests. It is worth noting that this build process works even if the Nix sandbox is enabled! It is also worth noting that none of the caching functionality the new build process enables (such as per-layer build caching) is actually in use yet, hence running Nixery at this commit is prone to doing more work than before. This relates to #50.
This pull request is now missing two major things: using the new caching strategy, and cleaning up build-image.nix.
When retrieving tokens for service accounts, some methods of retrieval require a scope to be specified.
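For example, with golang.org/x/oauth2/google the scopes are supplied when constructing the token source; choosing the GCS read/write scope here is an assumption about what is needed:

```go
// Sketch of requesting a token with an explicit scope via the
// golang.org/x/oauth2/google package.
package main

import (
	"context"
	"log"

	"golang.org/x/oauth2/google"
)

func main() {
	ctx := context.Background()
	ts, err := google.DefaultTokenSource(ctx,
		"https://www.googleapis.com/auth/devstorage.read_write")
	if err != nil {
		log.Fatalf("failed to create token source: %s", err)
	}
	tok, err := ts.Token()
	if err != nil {
		log.Fatalf("failed to retrieve token: %s", err)
	}
	log.Printf("token expires at %s", tok.Expiry)
}
```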
This layer is needed in addition to those that are built in the second Nix build.
This cache is no longer required: it is made implicit by the layer cache (mapping store path hashes to layer hashes), which already implies that a layer has been seen.
A couple of minor fixes and improvements to the cache implementation.
The new builder now caches manifests in GCS and reads cached manifests back from there. The in-memory cache is disabled, as manifests are no longer written to local files and caching their file paths no longer works (unless we reintroduce reading/writing temp files as part of the local cache).
MD5 hash checking is no longer performed by Nixery (it does not seem to be necessary), hence the layer cache now only keeps the SHA256 hash and size in the form of the manifest entry. This makes it possible to restructure the builder code to perform cache-fetching and cache-populating for layers in the same place.
The functions used for layer creation are now easier to follow and have clear points at which the layer cache is checked and populated. This relates to #50.
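Schematically, each layer build now follows a single check-then-populate pattern. In this self-contained sketch an in-memory map stands in for the real GCS-backed cache, and all names are hypothetical:

```go
// Schematic sketch of the check-then-populate pattern the
// layer-creation functions follow.
package main

import "fmt"

type layerEntry struct {
	Digest string
	Size   int64
}

var layerCache = map[string]layerEntry{}

// buildTarball stands in for the expensive Nix invocation that
// produces and uploads a layer tarball.
func buildTarball(key string) layerEntry {
	return layerEntry{Digest: "sha256:" + key, Size: 0}
}

func buildLayer(key string) layerEntry {
	// Check the cache first: a hit means the layer already exists.
	if entry, ok := layerCache[key]; ok {
		return entry
	}
	// On a miss, do the expensive build, then populate the cache at
	// the same, single point in the code.
	entry := buildTarball(key)
	layerCache[key] = entry
	return entry
}

func main() {
	fmt.Println(buildLayer("abc")) // miss: builds and caches
	fmt.Println(buildLayer("abc")) // hit: served from cache
}
```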
Almost there. Local manifest caching is currently gone, because the manifests are no longer written to local files and the local manifest cache used file paths.
Implements a local manifest cache that uses the temporary directory to cache manifest builds. This is necessary due to the size of manifests: keeping them entirely in memory would quickly balloon the memory usage of Nixery, unless some mechanism for cache eviction is implemented.
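A minimal sketch of such a temp-directory cache; the key derivation and file naming are illustrative, not the actual scheme:

```go
// Minimal sketch of a temp-directory manifest cache, assuming
// manifests are keyed by a hash of the image name and tag.
package builder

import (
	"crypto/sha256"
	"fmt"
	"os"
	"path/filepath"
)

func manifestPath(image, tag string) string {
	key := sha256.Sum256([]byte(image + ":" + tag))
	return filepath.Join(os.TempDir(), fmt.Sprintf("nixery-manifest-%x", key))
}

// cachedManifest returns the cached manifest bytes, if present.
func cachedManifest(image, tag string) ([]byte, bool) {
	data, err := os.ReadFile(manifestPath(image, tag))
	return data, err == nil
}

// cacheManifest writes the manifest to the temporary directory,
// keeping large manifests out of the server's heap.
func cacheManifest(image, tag string, manifest []byte) error {
	return os.WriteFile(manifestPath(image, tag), manifest, 0644)
}
```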
The last missing puzzle piece for #50!
This previously invoked a Nix derivation that spent a few seconds on making an empty object in JSON ...
Alright, it's done. The layer construction, as mentioned, is still in Nix (but in a separate invocation). A subsequent version could optimise this further to create the tarball in the server and avoid the last round of extra reads/writes on disk.
This implements the concept outlined in #50 and is a followup to #52.
Build logic has been moved from the monolithic Nix expression into a multi-staged process in which layer creation is handled by the server.
For now the primary advantages of this are subtle.
In the future this might enable additional optimisations (such as additional caching steps, for example by requesting runtime closures of each package separately and applying caching at that level).
Note: This PR is currently WIP. There are many tasks left to do:
- Move group-layers into the server component
- Clean up build-image.nix to remove functionality that is moving into the server