Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extender changes for run image extension #1022

Merged
merged 42 commits into from
Mar 31, 2023
Merged

Conversation

natalieparellano
Copy link
Member

Fixes #998

Branching off other feature branch for now, will repoint when that is merged

- The logic to update the default path for TOML files was repeated across phases
- In general it is safe to provide default values for inputs that might not be relevant to the current phase,
  as these will be ignored when constructing a new service for the phase;
  e.g., platform.LifecycleInputs.OrderPath will be ignored when constructing a lifecycle.Exporter
- As more inputs are shared across phases (e.g., analyzed.toml is now an input to the detect phase),
  duplicating the logic for providing default values is becoming more cumbersome

Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
…= 0.10

Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
…kage

Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
… the platform spec

Signed-off-by: Natalie Arellano <narellano@vmware.com>
…mage.extend = <true or false, default false>

Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
… should fail if the selected base image is not found in run.toml.

Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
- digest ref for run image
- target data for run image

Additionally the restorer will download the run image manifest & config when extend is true

Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Because we change the media types to be oci types (vs docker types) this changes the digest of the image

Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
imgutil/layout/sparse modifies the image media types which we don't want

Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
@natalieparellano natalieparellano changed the title Runext/extend 998 Extender changes for run image extension Mar 1, 2023
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Base automatically changed from runext/restore-997 to main March 28, 2023 15:23
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
@natalieparellano natalieparellano marked this pull request as ready for review March 28, 2023 18:24
@natalieparellano natalieparellano requested a review from a team as a code owner March 28, 2023 18:24
Comment on lines +76 to +79
var skipLayers bool
if boolEnv(EnvSkipLayers) || boolEnv(EnvSkipRestore) {
skipLayers = true
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated bug

@@ -59,7 +59,7 @@ func (l Path) writeImage(img v1.Image) error {
if err != nil {
return err
}
if err := l.WriteBlob(cfgName, io.NopCloser(bytes.NewReader(cfgBlob))); err != nil {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure why this stood out to me

Signed-off-by: Natalie Arellano <narellano@vmware.com>
Comment on lines +60 to +62
if platformAPI.LessThan("0.10") {
continue
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Speeds the tests up a little

h.AssertStringDoesNotContain(t, secondOutput, "ca-certificates") // shows that cache layer was used
h.AssertStringContains(t, secondOutput, "Hello Extensions buildpack\ncurl") // output by buildpack, shows that curl is still installed in the unpacked cached layer
h.AssertStringDoesNotContain(t, secondOutput, "ca-certificates") // shows that first cache layer was used
h.AssertStringDoesNotContain(t, secondOutput, "No cached layer found for cmd RUN apt-get update && apt-get install -y tree") // shows that second cache layer was used
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verifies we've fixed a bug that exists today where we only pull from the cache on the first Dockerfile (it has to do with non-reproducible image create time).

Signed-off-by: Natalie Arellano <narellano@vmware.com>
}

//go:generate mockgen -package testmock -destination testmock/dockerfile_applier.go github.com/buildpacks/lifecycle DockerfileApplier
type DockerfileApplier interface {
Apply(workspace string, digest string, dockerfiles []extend.Dockerfile, options extend.Options) error
ImageFor(reference string) (v1.Image, error)
Apply(dockerfile extend.Dockerfile, toBaseImage v1.Image, withBuildOptions extend.Options, logger log.Logger) (v1.Image, error)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pretty significant refactor, now instead of taking a list of Dockerfiles the applier takes a single Dockerfile. This allows us to consolidate all the CNB-spec logic in the lifecycle package without it bleeding into internal/extend/kaniko.

return fmt.Errorf("getting image hash: %w", err)
}
toPath := filepath.Join(e.ExtendedDir, "run", imageHash.String())
layoutPath, err := selective.Write(toPath, empty.Index) // FIXME: this should use the imgutil layout/sparse package instead, but for some reason sparse.NewImage().Save() fails when the provided base image is already sparse
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was a bummer to discover. When we update imgutil we can remove some logic (e.g., topLayer) from this file.

func NewDockerfileApplier(logger log.Logger) *DockerfileApplier {
return &DockerfileApplier{
logger: logger,
func (a *DockerfileApplier) Cleanup() error {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could cause problems when images are extended in parallel (if the kaniko directory is re-used, which it might be). It may be safer to move this logic to the exporter.

@natalieparellano
Copy link
Member Author

@jabrown85 @joe-kimmel-vmw this one is finally ready for review :)

Copy link
Contributor

@joe-kimmel-vmw joe-kimmel-vmw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left you a couple questions but idk, i'm not actually an approver anyhow 😆 feel free to ignore me, or tell me why i'm wrong / what i'm missing

h.AssertStringDoesNotContain(t, secondOutput, "Did not find cache key, pulling remote image...")
h.AssertStringDoesNotContain(t, secondOutput, "Error while retrieving image from cache: oci")
h.AssertStringDoesNotContain(t, secondOutput, "ca-certificates") // shows that first cache layer was used
h.AssertStringDoesNotContain(t, secondOutput, "No cached layer found for cmd RUN apt-get update && apt-get install -y tree") // shows that second cache layer was used
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the options are few here, but I'm thinking that somebody could come along later and change a log string, and not realize that they've invalidated a unit test (because if that string changes, then the log output will never include it). there's no great options but things that come to mind include:

  • a test that asserts that the message does occur, with some comments explaining that if this test is failing then some other maintenance is needed also wherever this string occurs.

  • make the string be a constant in the module. it can be 'private'/lowercase if you also put the unit test in that module, which i think there's Opinions about in some circles but technically files that end in _test.go aren't compiled or linked into the executable targets anyhow?

  • find a non-log-output indicator. (I'm guessing if there was a good one we'd be using it already tho)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are so right.

For line 3, ca-certificates, the positive assertion actually does exist here. For line 4, No cached layer found..., I'll add it in a43a606. For these lines, we could maybe add some sort of time-based assertion (since using the cache will be faster) and also I think they are also covered by this line since generating a new layer will cause the image SHA to be different.

For lines 1-2, there's probably no good option, and maybe we don't even need these tests anymore. In development I was trying to make sure that I used a particular code path but the end result of either was the same. For now I added constants in 007da92.

group.Go(func() error {
return copyLayer(currentLayer, toPath)
})
case currentHash.String() == origTopLayerHash:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't tell if i'm just old fashioned here, but it's confusing to me to see a switch statement that is looking at different variables. in my head switch is "on" a single variable and we check different value-cases of that variable, but i guess i'm just not hip to the golang idioms.

extender.go Outdated

var (
configFile *v1.ConfigFile
rebasable = true // for now, we don't require the initial base image to have io.buildpacks.rebasable=true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's gonna be a tough one right? bc if we don't start that way, then technically we'd break backwards compatibility to ever require it? is the issue that basically nobody's gonna decorate their images like that, so we kind of just say "the absence of rebasable=false label is true"?

I guess i'm questioning whether the comment should imply that in the future we'll really have a default of rebaseable false, given that we're starting with a default of rebaseable true. are we skirting a spec change by saying "yeah yeah, we'll do it later"?? 😆

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair point - I'l remove "for now" because I don't think we'll ever actually make this a requirement.

buildOptions,
logger,
); err != nil {
return nil, fmt.Errorf("applying dockerfile to image: %w", err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if/when we return partway through this loop what does that mean for the user's docker images? are they going to be partially extended and in an inconsistent or unusable state? is there a way to ... "roll back the transaction"? or maybe indicate to the user "our condolences for the loss of your docker image. you have a backup, right?" Or is the UX cool, here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great question! It should be okay - we're using the filesystem as our export target. Specifically, we're writing to the kaniko dir as a "scratch" workspace and only copying the image layers to <extended>/run/<image> after all Dockerfiles have been applied.

@@ -76,7 +76,7 @@ type OSDistribution struct {
Version string `json:"version" toml:"version"`
}

// Satisfies treats optional fields (ArchVariant and Distributions) as wildcards if empty, returns true if
// IsSatisfiedBy treats optional fields (ArchVariant and Distributions) as wildcards if empty, returns true if
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can thank my IDE :)

Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
Signed-off-by: Natalie Arellano <narellano@vmware.com>
@@ -39,6 +39,11 @@ var (
extendTest *PhaseTest
)

const (
msgPullingRemoteImage = "Did not find cache key, pulling remote image"
msgErrRetrievingImageFromCache = "Error while retrieving image from cache: oci"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i see -- ideally imo these constants would be used in the (production) code that generates the logs, and then consumed from the tests, so that it's clear that those log lines are relied on someplace else outside the immediate domain, and the tests would update effortlessly with the constants.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ohhhh I see... those log lines are emitted by kaniko :/

@natalieparellano natalieparellano merged commit 21fb915 into main Mar 31, 2023
@natalieparellano natalieparellano deleted the runext/extend-998 branch March 31, 2023 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RFC #0105] Extender changes for run image extension (Dockerfiles phase 3)
2 participants