Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deduplicate layer IDs to handle cases where duplicate layers are in the layer tree #1172

Merged
merged 1 commit into from
Jun 5, 2024

Conversation

komish
Copy link
Contributor

@komish komish commented May 31, 2024

In certain edge cases, a container builder may produce multiple "empty" layers that may have the same digest value. When parsing those layers, we fail to find an RPM database (which is expected) and therefore carry the previous layer's package list forward.

We run into issues when layers with the same ID and no modified files (e.g. "empty" layers) are separated by layers WITH package changes, such that when we carry forward file lists from previous layers, we would be impacting an earlier entry's file list because we use a map where the key is the layer ID. Concrete example below[1].

This PR changes our mapping to glue together a layer's ID and its relative position in the image, which prevents overwriting previous layer values and fixes a bug in how we detect modified files layer-over-layer.

It also adds logging to map DiffID, LayerID, and the deduped/calculated value in trace logs to assist with debugging.

[1]

Assume we have 4 layers with these modified files.

-- base layer omitted --
0: sha256:1111 - [] 
1: sha256:2222 - ["/modified", "/path/to/rpmdb"]
2: sha256:1111 - []
3: sha256:3333 - ["/unrelated"]

We parse layer 0 and see it doesn't have any files (nor RPMDB), so we store a mapping of that layer's ID ("sha256:1111")to a list of packages that might be associated (which will be empty in this example because the previous, omitted layer is the base layer.

As soon as layer 1 (first layer to contain an RPMDB) is evaluated, that becomes our source of truth. We store the package lists associated with it and use those to inform the "do not modify" list. In our mapping, the package information is mapped to this layer's ID as the map key ("sha256:2222").

When layer 2 is parsed and found to be empty, we map its package list to the previous layer (1) package list. Because the layer ID of layer 2 matches that of layer 0, we've now inadvertently populated layer 0's package list. This is the problem.

Later, when we validate each of these layers, we detect that layer 0 has packages (the same as layer 1), and treat that as our source of truth. Layer 1, then, is considered to have modified all of the same files as layer 0 - which fails the check.


To solve this, I bind the layer ID with its index position (e.g. "00-sha256:1111"). It makes things a little bit heavier, but barring refactoring to remove the map AND array usage and/or change the way this works - this seemed simpler and effective enough.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 31, 2024
@acornett21
Copy link
Contributor

I'm okay with this implementation, but also want to point out that we could also remove the empty layers before we build the maps/associations (ie before we iterate over the layers). Though, with that type of implementation Im not sure what the cutoff point for empty means size wise, since crane we are dealing with the uncompressed values and they are not zero at that point.

It would be nice to know others thoughts on various approaches as well.

@komish
Copy link
Contributor Author

komish commented May 31, 2024

+1 @acornett21 I briefly tested skipping empty layers based on size - and just ended up at "how do I confidently determine that the layer is truly empty" aside from looking at content. The smallest value I saw returned from layer.Size() was 34. I didn't test much beyond that, but it's another option. Will wait for comment.

@komish komish requested a review from bcrochet May 31, 2024 19:54
@acornett21
Copy link
Contributor

how do I confidently determine that the layer is truly empty

This is where I landed as well, I thought I was missing something as well with the size calculation. The one advantage of this I see, is it could speed up our processing by spiking some layers. But if don't have confidence on what the min size should be, they the saying cycles point is probably moot.

@dcibot
Copy link

dcibot commented May 31, 2024

Copy link
Contributor

@bcrochet bcrochet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 4, 2024
@jfrancin
Copy link
Contributor

jfrancin commented Jun 4, 2024

I hope this will get reviewed soon - have a partner waiting on the new Preflight version with this fix...

@komish
Copy link
Contributor Author

komish commented Jun 4, 2024

@jfrancin Yep, sorry for the delay - I've been trying to replicate the failing image with little success, and finally cracked it just a bit ago. Whipping up a test case, and we should be set.

@acornett21
Copy link
Contributor

@jfrancin we are well aware of the urgency, like mentioned we've been trying to understand how the partner built an image with multiple of the same layer (and empty ones at that), and come with with a repeatable test so we do not have an regressions. Figuring out how they got an image in this state, has been extremely difficult to replicate.

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 4, 2024
@komish
Copy link
Contributor Author

komish commented Jun 4, 2024

@acornett21 @bcrochet A note -

This PR reconfigures HasModifiedFiles such that the packageDist value is extracted in a separate function from the package file map building process. This simplifies testing of the file mapping process by allowing us to build an ImageReference fragment that doesn't contain the extracted filesystem needed by the packageDist logic.

As a result, several return signatures have changed, dropping the returned packageDist.

Also note that the test written here uses a static fixture. The containerfile used to build it is included - but using a static fixture saves us storing the image in-repo (~110m). It's been documented as a TODO item to optimize at a later date.

…he layer tree

Signed-off-by: Jose R. Gonzalez <komish@flutes.dev>
@coveralls
Copy link

Coverage Status

coverage: 84.782% (+3.2%) from 81.591%
when pulling 517a88b on komish:fix-hmf-dupe-layers
into 354d10d on redhat-openshift-ecosystem:main.

Copy link
Contributor

@acornett21 acornett21 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of comments, nothing blocking.

@dcibot
Copy link

dcibot commented Jun 4, 2024

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 5, 2024
Copy link
Contributor

@bcrochet bcrochet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

Copy link

openshift-ci bot commented Jun 5, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: acornett21, bcrochet, komish

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [acornett21,bcrochet,komish]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@komish komish merged commit 3bdeab0 into redhat-openshift-ecosystem:main Jun 5, 2024
5 checks passed
@komish komish deleted the fix-hmf-dupe-layers branch June 5, 2024 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants