Implement more granular build process #57

Merged · 25 commits merged into master from feat/granular-builder on Oct 3, 2019

Conversation

@tazjin (Owner) commented Sep 29, 2019

This implements the concept outlined in #50 and is a followup to #52.

Build logic has been moved from the monolithic Nix expression into a multi-staged process in which layer creation is handled by the server.

For now the primary advantages of this are subtle:

  • after creating a layer grouping, duplicate layer builds can be avoided by checking for previously built layers with the same contents (see the sketch below)
  • less time is spent in Nix (which is the primary concurrency bottleneck for Nixery)
  • the build process becomes easier to parallelise

In the future this might enable additional optimisations, such as further caching steps (e.g. requesting the runtime closure of each package separately and applying caching at that level).
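To illustrate the first point: a layer grouping can be given a deterministic cache key by hashing its sorted store paths, and a lookup on that key decides whether the layer needs to be built at all. A minimal sketch (all names are illustrative, not the actual builder API):

package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
	"strings"
)

// layerKey derives a deterministic cache key for a layer grouping by
// hashing its sorted store paths. Identical groupings always produce
// the same key, so a previously built layer can be reused.
func layerKey(storePaths []string) string {
	sorted := append([]string(nil), storePaths...)
	sort.Strings(sorted)
	h := sha256.Sum256([]byte(strings.Join(sorted, "\n")))
	return fmt.Sprintf("%x", h)
}

func main() {
	built := map[string]bool{} // keys of layers that already exist
	group := []string{
		"/nix/store/aaa-glibc-2.27",
		"/nix/store/bbb-bash-4.4",
	}

	key := layerKey(group)
	if built[key] {
		fmt.Println("layer already built, skipping:", key)
	} else {
		fmt.Println("building new layer:", key)
		built[key] = true
	}
}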


Note: This PR is currently WIP. There are many tasks left to do:

  • Move group-layers into the server component
  • Refactor build-image.nix to remove functionality that is moving into the server
  • Deserialise the new build output correctly (including runtime graph)
  • Add logic to check for missing layers (& define caching strategy)
  • Add Nix derivation to build layers (build only!)
  • Implement hashing & uploading of layers in one disk read
  • Implement assembly of config layer
  • Implement assembly of manifest
  • Reimplement local manifest caching
  • Load popularity data in server

This is the first step towards a more granular build process where
some of the build responsibility moves into the server component.

Rather than assembling all layers inside of Nix, it will only create
the symlink forest and return information about the runtime paths
required by the image.

The server is then responsible for grouping these paths into layers,
and assembling the layers themselves.

Relates to #50.
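As a rough illustration of the grouping step, the sketch below gives popular store paths their own layers and lumps the remainder together. This is a drastic simplification of the real algorithm (which works on popularity data and the runtime dependency graph); all names here are illustrative:

package main

import "fmt"

// groupLayers assigns each sufficiently popular store path its own
// layer and collects everything else into a single catch-all layer,
// staying under a fixed layer budget.
func groupLayers(paths []string, popularity map[string]int, maxLayers int) [][]string {
	var layers [][]string
	var rest []string

	for _, p := range paths {
		// Reserve one slot for the catch-all layer.
		if popularity[p] > 0 && len(layers) < maxLayers-1 {
			layers = append(layers, []string{p})
		} else {
			rest = append(rest, p)
		}
	}

	if len(rest) > 0 {
		layers = append(layers, rest)
	}
	return layers
}

func main() {
	paths := []string{
		"/nix/store/aaa-glibc-2.27",
		"/nix/store/bbb-bash-4.4",
		"/nix/store/ccc-some-app-1.0",
	}
	pop := map[string]int{"/nix/store/aaa-glibc-2.27": 9000}

	for i, layer := range groupLayers(paths, pop, 3) {
		fmt.Printf("layer %d: %v\n", i, layer)
	}
}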
Simplifies the wrapper script used to invoke Nix builds from Nixery to
just contain the essentials, since the layer grouping logic is moving
into the server itself.
Refactors the layer grouping package (which previously compiled to a
separate binary) to expose the layer grouping logic via a function
instead.

This is the next step towards creating layers inside of the server
component instead of in Nix.

Relates to #50.
@tazjin added the enhancement (New feature or request) label on Sep 29, 2019
@tazjin (Owner) commented Sep 29, 2019

To answer my own question:

> side note: is it possible to feed both hashes simultaneously from one disk read?

Yes!

package main

import (
	"crypto/md5"
	"crypto/sha256"
	"fmt"
	"io"
	"os"
)

func main() {
	f, err := os.Open(os.Args[1])
	if err != nil {
		fmt.Fprintf(os.Stderr, "failed to open file: %s\n", err)
		os.Exit(1)
	}
	defer f.Close()

	// Both hash implementations are io.Writers, so one MultiWriter
	// can feed both of them from a single read of the file.
	md5hash := md5.New()
	sha256hash := sha256.New()
	multi := io.MultiWriter(md5hash, sha256hash)

	if _, err = io.Copy(multi, f); err != nil {
		fmt.Fprintf(os.Stderr, "failed to copy to multi hash: %s\n", err)
		os.Exit(1)
	}

	fmt.Printf("md5\t%x\n", md5hash.Sum(nil))
	fmt.Printf("sha256\t%x\n", sha256hash.Sum(nil))
}

Interestingly, this ends up faster on average than a plain sha256sum call, despite computing both hashes.

@tazjin (Owner) commented Sep 30, 2019

Minor change of plans: I will keep the construction of layer tarballs in Nix, but invoked as a separate derivation. The tarballs will then be hashed & uploaded by Nixery in a single disk read.
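The single-read hashing and uploading can reuse the MultiWriter trick from the earlier comment: the tarball is streamed once, with the bytes going to both the upload target and the hash. A minimal sketch, using a bytes.Buffer as a stand-in for the real upload writer:

package main

import (
	"bytes"
	"crypto/sha256"
	"fmt"
	"io"
	"strings"
)

// uploadAndHash streams src to dst exactly once, computing the SHA256
// digest and size of the data on the way past.
func uploadAndHash(dst io.Writer, src io.Reader) (digest string, size int64, err error) {
	h := sha256.New()
	size, err = io.Copy(io.MultiWriter(dst, h), src)
	if err != nil {
		return "", 0, err
	}
	return fmt.Sprintf("sha256:%x", h.Sum(nil)), size, nil
}

func main() {
	// Stand-ins for the layer tarball and the bucket object writer.
	tarball := strings.NewReader("pretend this is a layer tarball")
	var bucket bytes.Buffer

	digest, size, err := uploadAndHash(&bucket, tarball)
	if err != nil {
		panic(err)
	}
	fmt.Printf("uploaded %d bytes, digest %s\n", size, digest)
}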

This creates a cache key which can be used to check if a layer has
already been built.
This introduces a new Nix derivation that, given an attribute set of
layer hashes mapped to store paths, will create a layer tarball for
each of the store paths.

This is going to be used by the builder to create layers that are not
present in the cache.

Relates to #50.
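For illustration, a server could hand such a mapping to the derivation roughly as below; layers.nix and the argument passing are hypothetical stand-ins for the real invocation:

package main

import (
	"encoding/json"
	"fmt"
	"os"
	"os/exec"
)

func main() {
	// Layer hashes mapped to the store paths whose contents make up
	// each layer, serialised for consumption by the Nix derivation.
	layers := map[string][]string{
		"f9a1b2": {"/nix/store/aaa-glibc-2.27"},
		"7c3bd4": {"/nix/store/bbb-bash-4.4"},
	}

	spec, err := json.Marshal(layers)
	if err != nil {
		panic(err)
	}

	// layers.nix is a hypothetical stand-in for the real derivation;
	// it would receive the mapping and emit one tarball per layer.
	cmd := exec.Command("nix-build", "--no-out-link",
		"--argstr", "layers", string(spec), "layers.nix")
	cmd.Stderr = os.Stderr

	out, err := cmd.Output()
	if err != nil {
		panic(err)
	}
	fmt.Printf("built layer tarballs at: %s", out)
}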
This cache is going to be used for looking up whether a layer build
has taken place already (based on a hash of the layer contents).

See the caching section in the updated documentation for details.

Relates to #50.
The state type contains things such as the bucket handle and Nixery's
configuration which need to be passed around in the builder.

This is only added for convenience.
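A sketch of what such a state type might look like, assuming the cloud.google.com/go/storage client for the bucket handle (field names are guesses, not the actual code):

package builder

import (
	"cloud.google.com/go/storage"
)

// Config would hold Nixery's runtime configuration (bucket name,
// port, Nix builder path and so on); details elided here.
type Config struct {
	BucketName string
}

// State groups the values that every build needs access to, so that
// a single argument can be threaded through the builder instead of
// several.
type State struct {
	Bucket *storage.BucketHandle
	Cfg    Config
}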
The new build process can now call out to Nix to create layers and
upload them to the bucket if necessary.

The layer cache is populated, but not yet used.
The new manifest package creates image manifests and their
configuration. This previously happened in Nix, but is now part of the
server's workload.

This relates to #50.
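The output of this work is the standard Docker image manifest (schema version 2). A sketch of its shape, with illustrative type names and placeholder digests:

package main

import (
	"encoding/json"
	"fmt"
)

// descriptor references a blob (the config or a layer) by digest.
type descriptor struct {
	MediaType string `json:"mediaType"`
	Size      int64  `json:"size"`
	Digest    string `json:"digest"`
}

// manifest is the Docker image manifest, schema version 2.
type manifest struct {
	SchemaVersion int          `json:"schemaVersion"`
	MediaType     string       `json:"mediaType"`
	Config        descriptor   `json:"config"`
	Layers        []descriptor `json:"layers"`
}

func main() {
	m := manifest{
		SchemaVersion: 2,
		MediaType:     "application/vnd.docker.distribution.manifest.v2+json",
		Config: descriptor{
			MediaType: "application/vnd.docker.container.image.v1+json",
			Size:      1024,
			Digest:    "sha256:0000000000000000000000000000000000000000000000000000000000000000",
		},
		Layers: []descriptor{{
			MediaType: "application/vnd.docker.image.rootfs.diff.tar.gzip",
			Size:      4096,
			Digest:    "sha256:1111111111111111111111111111111111111111111111111111111111111111",
		}},
	}

	out, _ := json.MarshalIndent(m, "", "  ")
	fmt.Println(string(out))
}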
Implements the new build process to the point where it can actually
construct and serve image manifests.

It is worth noting that this build process works even if the Nix
sandbox is enabled!

It is also worth noting that none of the caching functionality that
the new build process enables (such as per-layer build caching) is
actually in use yet, hence running Nixery at this commit is likely to
do more work than before.

This relates to #50.
@tazjin (Owner) commented Oct 1, 2019

This pull request is now missing two major things: Using the new caching strategy, and cleaning up builder.go.

When retrieving tokens for service accounts, some retrieval methods
require a scope to be specified.
This layer is needed in addition to those that are built in the second
Nix build.
This cache is no longer required: the layer cache (mapping store path
hashes to layer hashes) already implies that a layer has been seen.
A couple of minor fixes and improvements to the cache implementation.
The new builder now writes cached manifests to and reads them from
GCS. The in-memory cache is disabled, as manifests are no longer
written to local files and caching file paths no longer works (unless
reading/writing temp files is reintroduced as part of the local
cache).
MD5 hash checking is no longer performed by Nixery (it does not seem
to be necessary), hence the layer cache now only keeps the SHA256 hash
and size in the form of the manifest entry.

This makes it possible to restructure the builder code to perform
cache-fetching and cache-populating for layers in the same place.
The functions used for layer creation are now easier to follow and
have clear points at which the layer cache is checked and populated.

This relates to #50.
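The resulting structure can be paraphrased as a single function that checks the cache by content key, builds only on a miss, and populates the cache afterwards. A minimal sketch (not the actual code):

package main

import "fmt"

// entry is what the layer cache stores per content key: exactly the
// data needed for the manifest (digest and size), nothing more.
type entry struct {
	Digest string
	Size   int64
}

var cache = map[string]entry{}

// buildLayer stands in for the expensive Nix build and upload.
func buildLayer(key string) entry {
	fmt.Println("cache miss, building layer:", key)
	return entry{Digest: "sha256:deadbeef", Size: 4096}
}

// layerEntry is the single place where the cache is checked and
// populated, mirroring the structure described in the commit message.
func layerEntry(key string) entry {
	if e, ok := cache[key]; ok {
		fmt.Println("cache hit:", key)
		return e
	}
	e := buildLayer(key)
	cache[key] = e
	return e
}

func main() {
	layerEntry("abc123") // miss: builds and populates the cache
	layerEntry("abc123") // hit: served from the cache
}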
@tazjin (Owner) commented Oct 3, 2019

Almost there. Local manifest caching is currently gone, because the manifests are no longer written to local files and the local manifest cache used file paths.

Implements a local manifest cache that uses the temporary directory to
cache manifest builds.

This is necessary due to the size of manifests: Keeping them entirely
in-memory would quickly balloon the memory usage of Nixery, unless
some mechanism for cache eviction is implemented.
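A minimal sketch of such a temp-directory cache, keyed by a deterministic manifest identifier (paths and names are illustrative):

package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// manifestPath places cached manifests in the temporary directory,
// keyed by a deterministic identifier (e.g. a content hash).
func manifestPath(key string) string {
	return filepath.Join(os.TempDir(), "nixery-manifest-"+key)
}

func cacheManifest(key string, manifest []byte) error {
	return os.WriteFile(manifestPath(key), manifest, 0644)
}

func cachedManifest(key string) ([]byte, bool) {
	data, err := os.ReadFile(manifestPath(key))
	if err != nil {
		return nil, false // treat any read failure as a cache miss
	}
	return data, true
}

func main() {
	key := "abc123"
	if _, ok := cachedManifest(key); !ok {
		fmt.Println("miss; populating cache")
		_ = cacheManifest(key, []byte(`{"schemaVersion": 2}`))
	}
	if m, ok := cachedManifest(key); ok {
		fmt.Println("hit:", string(m))
	}
}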
The last missing puzzle piece for #50!
This previously invoked a Nix derivation that spent a few seconds on
making an empty object in JSON ...
@tazjin changed the title from "(WIP) Implement more granular build process" to "Implement more granular build process" on Oct 3, 2019
@tazjin (Owner) commented Oct 3, 2019

Alright, it's done.

The layer construction, as mentioned, is still in Nix (but in a separate invocation). A subsequent version could optimise this further to create the tarball in the server and avoid the last round of extra reads/writes on disk.
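For that follow-up, the server could build each tarball itself with archive/tar while hashing it in the same pass, so no intermediate file touches the disk. A sketch under those assumptions (the upload writer is a stand-in):

package main

import (
	"archive/tar"
	"bytes"
	"crypto/sha256"
	"fmt"
	"io"
)

func main() {
	// The tar stream is fed to the upload target and the hash at the
	// same time, avoiding a separate on-disk tarball entirely.
	var upload bytes.Buffer // stand-in for the real upload writer
	h := sha256.New()
	tw := tar.NewWriter(io.MultiWriter(&upload, h))

	content := []byte("#!/bin/sh\necho hello\n")
	hdr := &tar.Header{
		Name: "nix/store/aaa-hello/bin/hello",
		Mode: 0755,
		Size: int64(len(content)),
	}
	if err := tw.WriteHeader(hdr); err != nil {
		panic(err)
	}
	if _, err := tw.Write(content); err != nil {
		panic(err)
	}
	if err := tw.Close(); err != nil {
		panic(err)
	}

	fmt.Printf("layer size %d, digest sha256:%x\n", upload.Len(), h.Sum(nil))
}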

@tazjin merged commit a931de2 into master on Oct 3, 2019
@tazjin deleted the feat/granular-builder branch on October 3, 2019 at 12:21