Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make it easy to support estargz #866

Closed
mattmoor opened this issue Dec 10, 2020 · 9 comments · Fixed by #871
Closed

Make it easy to support estargz #866

mattmoor opened this issue Dec 10, 2020 · 9 comments · Fixed by #871

Comments

@mattmoor
Copy link
Collaborator

See the proposal here: opencontainers/image-spec#815

Given the following factors:

  1. estargz is designed for backwards compatibility with existing clients,
  2. images must be built with estargz to even have the option of lazy loading (they can still be eager!),
  3. the volume of build tools leveraging GGCR to produce images (ko, buildpacks, kaniko, bazel(?)).

I propose that we make estargz either the new default we push for images, or trivially easy to consume.

@jonjohnsonjr
Copy link
Collaborator

trivially easy to consume

I'd be okay with something like export GGCR_EXPERIMENT_ESTARGZ=1?

@mattmoor
Copy link
Collaborator Author

Fortunately the optimize code path in the stargz snapshotter sort of lays this out for us already (since they use ggcr). I think what we effectively want is a way to make this easy: https://github.com/containerd/stargz-snapshotter/blob/a85d05d70ed851a0f4cb0744ecd0153fe3acafcf/cmd/ctr-remote/commands/optimize.go#L400-L412

@mattmoor
Copy link
Collaborator Author

I think the simplest starting point would be to write a function that can convert images and image indices to estargz and harness that in crane. Perhaps an estargz package, with functions for converting layers, images and image indices.

I suspect that we may want to create an extended v1.Layer interface that can pass annotations, e.g.

type AnnotatedLayer interface {
      Layer
      Annotations() map[string]string
}

Then here we can check whether layers implement the type and augment the addendum passed to Append:

additions = append(additions, Addendum{Layer: layer})

I suspect that just doing the conversion will force us to confront a number of the issues, and also sets things up so that downstream tooling that wants to consciously produce estargz images need not start by building those images from scratch.


With the ability to convert images, we could then look at ko as a next step. It does its own gzipping, and given a good library for estargz I think we could simply take the resulting TOC digest and annotate it on the addendum here: https://github.com/google/ko/blob/06ebf033f04236836da625cb0477a002f6a06639/pkg/build/gobuild.go#L506

With the ability to convert the remote.Image base, and estargz-ify the kodata and ko-app layers, we should have ko in the bag, which gives us a very large corpus of real-world things to test it on.

@mattmoor
Copy link
Collaborator Author

Looking at kaniko, if we were to allow the Addendum.Layer to provide annotations in Append itself (vs. just AppendLayers as I linked above), and we were to change tarball.LayerFromFile to produce an annotated estargz layer, then I believe it would require no code changes.

Here's one of a couple places that it appends a layer: https://github.com/GoogleContainerTools/kaniko/blob/c982956c15edcbc90ab9e05db781fb994cac3165/pkg/executor/build.go#L485-L493

Here's one of a couple places that it constructs a layer with tarball.LayerFromFile: https://github.com/GoogleContainerTools/kaniko/blob/c982956c15edcbc90ab9e05db781fb994cac3165/pkg/executor/build.go#L476

@mattmoor
Copy link
Collaborator Author

I suspect that something similar would be true for buildpacks: https://github.com/buildpacks/imgutil/blob/339e186e495ae2c491e32cfd0c84aab77fc0e99e/remote/remote.go#L415-L423

@imjasonh
Copy link
Collaborator

Some thoughts:

  • +1 to GGCR_EXPERIMENT_ESTARGZ=1 and making this otherwise transparent to clients (ko, kaniko, buildpacks) while we gather data.
  • If we add an eStargz package it should be internal at first so we can iterate on the surface before committing to it.

@jonjohnsonjr
Copy link
Collaborator

I suspect that we may want to create an extended v1.Layer interface that can pass annotations, e.g.

We already allow for this via an optional Descriptor method, which we access through partial.Descriptor (needs to live in partial so we can unwrap the partial layer and/or image[ index] implementations), which we're already doing in mutate. I guess we could also look for Annotations implementations in partial.Descriptor, but what we have should be good enough?

Does estargz allow for a one-shot conversion? I'm guessing yes, as long as we don't need the optimize step. That was one of the niceties of stargz for stargzify (iirc, it was very important to @bradfitz that we do this in one pass and not buffer). I can figure this out myself, but just mentioning it so I don't forget. If we can do this in a streaming way, it will have minimal perf implications.

@mattmoor
Copy link
Collaborator Author

Not 100% yet, but if you pull the second PR into ko and enable via the env var:

crane manifest gcr.io/mattmoor-knative/extract-digest-c5d57a4440b9fe3a836df6e4b8382b05@sha256:261e6120d3c717da16c2d60a82da54451cbcc7a621f1e8f7fba5a12ad758446c | jq .
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 1065,
    "digest": "sha256:efc56fd396ed4c4dd97b4928ec63aafa108e6c930e6599cadf47d079e2157184"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 641550,
      "digest": "sha256:e59bd8947ac7af2c8f4403b183326986ab554bbc262739cf5d9d596c7c7a3aca"
    },
    {
      "mediaType": "",
      "size": 533057,
      "digest": "sha256:7a6f7f3b38af81adcdb1e9dc887ffe6f86ee9edb0c0cf5567fb91efcf4ff2f56",
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:642a8791d57e2e6f1db468f971b716871dcfcbeb200d6d6f0e37c3101256f1d0"
      }
    },
    {
      "mediaType": "",
      "size": 1483815,
      "digest": "sha256:b9e513968f709fc21295b452faeb2f80b8d2b0a4416f7c449e74bffdf15a1e84",
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:7a5ed3b3f23918599c5a8f9ab7e1a251b5bfd5e425056a2b32c8452d3bd07aa1"
      }
    }
  ]
}

I'll fix the mimeType, but this looks right to me. 😎

@mattmoor
Copy link
Collaborator Author

Alright, the latest looks right:

crane manifest gcr.io/mattmoor-knative/extract-digest-c5d57a4440b9fe3a836df6e4b8382b05@sha256:fb717a5cb9822011be414cf429651a661027e2c726dc6b89a766840c439d6191 | jq .
{
  "schemaVersion": 2,
  "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
  "config": {
    "mediaType": "application/vnd.docker.container.image.v1+json",
    "size": 1065,
    "digest": "sha256:ab87a0437ea9688114206ab75730988ddf9a4ca7a7a27ef570df54e8e4ba3a23"
  },
  "layers": [
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 641550,
      "digest": "sha256:e59bd8947ac7af2c8f4403b183326986ab554bbc262739cf5d9d596c7c7a3aca"
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 544943,
      "digest": "sha256:8807d1112b123b1dc508beccfdffa50a299a8555293774e2c31dcc767ae83043",
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:f6400b45186b7ba102e585b3558686d4e86b08569e8af316bed9afb709970e9c"
      }
    },
    {
      "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
      "size": 1483815,
      "digest": "sha256:b9e513968f709fc21295b452faeb2f80b8d2b0a4416f7c449e74bffdf15a1e84",
      "annotations": {
        "containerd.io/snapshot/stargz/toc.digest": "sha256:7a5ed3b3f23918599c5a8f9ab7e1a251b5bfd5e425056a2b32c8452d3bd07aa1"
      }
    }
  ]
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants