Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP First pass for multi architecture catalogs #568

Closed
wants to merge 19 commits into from

Conversation

jchunkins
Copy link
Contributor

@jchunkins jchunkins commented Feb 28, 2023

Description

  • Numerous changes to read/write single and multi arch images. Each code path has to handle the parent/child hierarchy of indexes, usually via recursion.
  • In operator.go
    • Made the operator rendering code "group together" the DeclarativeConfig and IncludeConfig and keep track of them on a per platform basis. Introduced this and other new data structures to pass between the various functions in pkg/cli/mirror as well as passing them to other packages like pkg/metadata. Introduced pkg/cli/mirror/operatorcatalog to contain structs which are used in the aforementioned packages to avoid circular dependencies.
    • Make writeLayout() attempt to get platform information from image configuration and fall back to linux/amd64 if it can't
    • Make writeConfigs() write content in new single / multi arch aware directory structure
  • In catalog_images.go
    • Make the rebuildCatalogs() function handle single / multi arch catalogs
  • In fbc_operators.go
    • Make extractDeclarativeConfigFromImage() function (which replaced other existing functions using the go-containerregistry API instead.
    • Make getManifest() function multi arch aware
  • In util.go
    • Add getCatalogMetadataByPlatform() function to capture metadata about platform specific catalogs and returns it to the caller. This information is now passed around various locations in the code.
    • Add getDigestFromOCILayout() function to get the hash associated with a given platform within an OCI layout
  • metadata.go
    • Add ASCII art to describe the new layout structure for the workspace source directory
  • pkg/image/builder
    • Make the image builder single / multi arch aware
    • Allow the API to provide an optional (i.e. can be nil) matcher. A matcher is satisfies a given criteria (e.g. a specific platform) to allow the builder to only update images that match the provided matcher criteria.
  • update bunch of test cases and introduce new tests
  • added docs as to clarify pieces as I understood them, and some of these changes are unrelated to the core purpose of this PR
  • updated vendoring with the make sanity target

Fixes # https://issues.redhat.com/browse/CFE-760

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Test With Docker (mirror to mirror)
    • Use a ImageSetConfiguration which points to a "fat manifest" catalog:
    kind: ImageSetConfiguration
    apiVersion: mirror.openshift.io/v1alpha2
    storageConfig:
      local:
        path: /tmp/localstorage
    mirror:
      operators:
      - catalog: icr.io/cpopen/ibm-zcon-zosconnect-catalog@sha256:6f02ecef46020bcd21bdd24a01f435023d5fc3943972ef0d9769d5276e178e76
    • Start a local docker registry to act as the destination
    docker run -d -p 5000:5000 --name registry registry:2
    
    • Run oc mirror like this (no errors should occur):
    oc mirror -c "/path/to/ImageSetConfiguration.yml" --dest-skip-tls --dest-use-http docker://localhost:5000
    
    • Throw away your local docker registry
    docker stop registry
    docker rm registry
    
  • Test With Docker (mirror to disk)
    • Use a ImageSetConfiguration which points to a "fat manifest" catalog:
    kind: ImageSetConfiguration
    apiVersion: mirror.openshift.io/v1alpha2
    storageConfig:
      local:
        path: /tmp/localstorage
    mirror:
      operators:
      - catalog: icr.io/cpopen/ibm-zcon-zosconnect-catalog@sha256:6f02ecef46020bcd21bdd24a01f435023d5fc3943972ef0d9769d5276e178e76
    • Run oc mirror like this (no errors should occur):
    oc mirror -c "/path/to/ImageSetConfiguration.yml" file://tmp/mirrortodisktest
    
  • Test With OCI
    • Pull a docker "fat manifest" image into a location on the file system:
    skopeo copy --all --format v2s2 docker://icr.io/cpopen/ibm-zcon-zosconnect-catalog@sha256:6f02ecef46020bcd21bdd24a01f435023d5fc3943972ef0d9769d5276e178e76 oci:///Users/jhunkins/temp/ocmirror
    
    • Use a ImageSetConfiguration which points to this newly downloaded image like this:
    kind: ImageSetConfiguration
    apiVersion: mirror.openshift.io/v1alpha2
    storageConfig:
      local:
        path: /tmp/localstorage
    mirror:
      operators:
      - catalog: oci:///Users/jhunkins/temp/ocmirror
    • Start a local docker registry to act as the destination
    docker run -d -p 5000:5000 --name registry registry:2
    
    • Run oc mirror like this (no errors should occur):
    oc mirror -c "/path/to/ImageSetConfiguration.yml" --use-oci-feature --oci-insecure-signature-policy --dest-skip-tls --dest-use-http docker://localhost:5000
    
    • Throw away your local docker registry
    docker stop registry
    docker rm registry
    

Test Configuration:

  • Firmware version:
  • Hardware:
  • Toolchain:
  • SDK:

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 28, 2023
@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Feb 28, 2023
@openshift-ci
Copy link

openshift-ci bot commented Feb 28, 2023

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci
Copy link

openshift-ci bot commented Feb 28, 2023

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jchunkins
Once this PR has been reviewed and has the lgtm label, please assign soltysh for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jchunkins jchunkins force-pushed the CFE-760-manifest-list branch 2 times, most recently from d62464b to 25d6beb Compare March 1, 2023 04:36
@openshift-merge-robot openshift-merge-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Mar 1, 2023
• image.TypedImageMapping: the source/destination mapping for the catalog (this contains all images for single or multiple architectures)

• error: non-nil if error occurs, nil otherwise
*/
func (o *MirrorOptions) rebuildCatalogs(ctx context.Context, dstDir string) (image.TypedImageMapping, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add some unit tests , the cognitive complexity is high so we need to grasp this better

pkg/cli/mirror/fbc_operators_test.go Show resolved Hide resolved
pkg/config/load_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

@sherine-k sherine-k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jchunkins
Our intention is not to block this PR any longer than needed, because the OCP branching is happening tomorrow, and all features need to be merged before that date.
This change is fairly complex, and will come with a certain amount of risk on oc-mirror 4.13 release, so is it possible to go through the code together, discuss what's still missing to make this stable (unit & E2E are failing on our end) , eventually demo it so that we can make it merge as soon as possible without putting the release to risk.

Comment on lines +218 to +220
destRef := image.TypedImage{}
destRef.Type = imagesource.DestinationRegistry
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we sure this will work also for MirrorToDisk & DiskToMirror workflows?

pkg/cli/mirror/util.go Outdated Show resolved Hide resolved
@@ -313,13 +313,14 @@ func (o *MirrorOptions) findFBCConfig(ctx context.Context, imagePath, catalogCon

//Use the label in the config layer to determine the
//folder containing the related images, when untarring layers
cfgDirName, err := getConfigPathFromConfigLayer(imagePath, string(manifest.ConfigInfo().Digest))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An idea would be to feed findFBCConfig the digestMap from getImageDigests, and to have it untar the right blob for each of these.

We could even imagine creating imageInfo with an additional field for the location of the contents folder (1 or more olm index.json) to have the info and use it later , when calling renderDC

pkg/cli/mirror/util.go Outdated Show resolved Hide resolved

• error: non-nil if an error occurs, nil otherwise
*/
func getImageDigests(ctx context.Context, imageRef string, layoutPath *layout.Path, insecure bool) (map[OperatorCatalogPlatform]CatalogMetadata, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it makes sense to move this function in fbc_operators? It has to do with reading the content of oci images. WDYT?

@aguidirh
Copy link
Contributor

aguidirh commented Mar 3, 2023

Hi @jchunkins,

Could you please convert this PR from draft to WIP ? Adding WIP in the title will prevent it to be merged.

This way we know where the tests are failing and we can fix them.

@jchunkins jchunkins changed the title First pass for multi architecture catalogs WIP First pass for multi architecture catalogs Mar 3, 2023
@jchunkins jchunkins marked this pull request as ready for review March 3, 2023 13:52
@jchunkins
Copy link
Contributor Author

@aguidirh I've added WIP and moved it out of draft.

@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 3, 2023
@@ -92,6 +92,7 @@ func ParseReference(ref string) (TypedImageReference, error) {

dstType := DestinationOCI

// TODO: this assumes you can parse it into a docker image ref (and you can’t do that since its just a path).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jchunkins
ParseImageReference doesn't make assumptions about this being only docker (registry located)
This is why it can be used for oci:// prefixed ref.
The result needs to be placed back into a DockerImageReference struct, but that's just becaue we need to return a TypedImageReference.
Does this make sense?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get what you're saying... DockerImageReference has to be provided, but it is not really a valid docker reference for an OCI path. The DockerImageReference is constructed with the values that come from the result of v1alpha2.ParseImageReference, but this content is going to be variable depending on the path that's feed into v1alpha2.ParseImageReference.

In my next push, there's a test case called TestV1A2ParseImageReferenceOCIRefs in image_test.go which covers all of the bases of what v1alpha2.ParseImageReference does with an OCI path (or at least I think it does).

Since image.ParseImageReference function is called all over the place, it's hard to talk about specifically, but one example is in create.go in the run function. It iterates over the []v1alpha2.Operator slice, and uses the resulting parsedRef.Ref.Name as an image reference. If you look in the test case I mentioned, that equates to the expRepo field in the test case values. Essentially the last segment of the path is used as the reference. In some edge cases, the parsing results is an empty string (e.g. if you've been given / as the path or if the path is relative without any directory name provided). These edge cases are likely to result in an error (and maybe that's OK).

To flesh out what I am talking about:

  1. Start with "oci:///Users/jhunkins/temp/ocmirror" as a reference
  2. This becomes "ocmirror"
  3. Now "ocmirror" gets converted into a go container registry reference and stored in CatalogMetadata.CatalogRef. That effectively looks something like this:
index.docker.io/library/ocmirror@sha256:dc1b9846d7994450e74b9cde2e621f8c1d98cdf1debd591db291785dd3fc6446

In the end we have various ways to prevent CatalogMetadata.CatalogRef from being used as a real image reference. We just need to be vigilant to not accidentally use image.ParseImageReference results in ways that don't take into account the fact that it could be a bogus reference for OCI refs. My comment was intended to drive this conversation. I can change the comment to just be a warning that states this is generating a bogus docker image reference for OCI and leave it at that.

"github.com/openshift/oc-mirror/pkg/config"
"github.com/openshift/oc-mirror/pkg/image"
"github.com/openshift/oc-mirror/pkg/metadata/storage"
)

// Create will plan a mirroring operation based on provided configuration
func (o *MirrorOptions) Create(ctx context.Context, cfg v1alpha2.ImageSetConfiguration) (v1alpha2.Metadata, image.TypedImageMapping, error) {
func (o *MirrorOptions) Create(ctx context.Context, cfg v1alpha2.ImageSetConfiguration) (v1alpha2.Metadata, image.TypedImageMapping, map[string]map[oc.OperatorCatalogPlatform]oc.CatalogMetadata, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to include map[string]map[oc.OperatorCatalogPlatform]oc.CatalogMetadata in MirrorOptions in order to pass it more seemlessly accross methods?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered doing this, but I figured if I did that, we might need to use some form of synchronization (i.e. sync.Mutex or sync.Map) if this code ever used multi threading. Passing the map around like this avoids this issue since it is not shared. Also, as it is right now, testing individual functions is a bit easier since I don't have to construct MirrorOptions for those tests.

I've already done the painful part of refactoring things, so it is easier to leave the code as-is. If you'd prefer it, I can change it to storing it in MirrorOptions.

- Move OperatorCatalogPlatform into operator_platform.go
- change the render code to standardize on
map[OperatorCatalogPlatform]CatalogMetadata so that we can capture
new information in this map as the code goes through
getImageDigests -> renderDCFunc -> plan. To do this, renderDCFunc now
uses this map as an in/out parameter.
- CatalogMetadata.catalogRef is now a single ref rather than a slice
- change getImageDigests(...) to getCatalogMetadataByPlatform(...)
- Migrate OperatorCatalogPlatform & CatalogMetadata to its own package
to avoid circular dependencies
- Use map[OperatorCatalogPlatform]CatalogMetadata throughout the
planning and storage phases
- replace findFBCConfig with  extractDeclarativeConfigFromImage and
removed code which deals with layer extraction. This now uses
go-containerregistry which knows how to handle extraction properly.
- getCatalogMetadataByPlatform now handles DC extraction for OCI images
and places the content into olm_artifacts
- add more documentation
- TestGetCatalogMetadataByPlatform was becoming too complex to maintain
so it was re-written to just start a fake server, push content to it,
and then run the tests that talk to this server
- TestGetCatalogMetadataByPlatformFromOCILayout no longer uses the
"hello" test image since it does not contain an actual catalog. Also
added a single image test case to this test
- Add a single "testonly" image to use instead of
pkg/cli/mirror/testdata/artifacts/rhop-ctlg-oci which is actually broken
and therefore can't be pushed to a registry.
- in the OCI pathway, the mappings that are generated don't have the
right values for the "type", which is set to file://
- to fix this issue, after we write out the mapping file, reread it back
which fixes the source / file mapping
- also needed to add the results of this process into the "allMappings"
@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 16, 2023
@sherine-k
Copy link
Contributor

Hi @zhouying7780
This PR introduces a new feature - multi-arch support for catalogs.
It also refactors the way catalogs are mirrored (for all workflows: mirrorToMirror, mirrorToDisk + diskToMirror, oci as well as non-oci).
So can you help us validate this PR against regressions, and also add needed test cases to cover the multi-arch scenarios?
Thank you

@@ -605,7 +606,7 @@ func (o *MirrorOptions) mirrorToMirrorWrapper(ctx context.Context, cfg v1alpha2.
if err := bundle.MakeWorkspaceDirs(o.Dir); err != nil {
return err
}
meta, mapping, err := o.Create(ctx, cfg)
meta, mapping, allCatalogs, err := o.Create(ctx, cfg)
if err != nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feature probably needs oc-mirror to grow a new option to mirror the manifestlist and keep the legacy behavior as such.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@zhouying7780
Copy link

I tested this pr , and this is the result , when I use the oc image info to check the mirrored image , I found the mirror image is not correct :

oc image info localhost:5000/mulitoperator/cpopen/ibm-zcon-zosconnect-catalog:6f02ec --insecure --filter-by-os=linux/amd64

Warning: the default reading order of registry auth file will be changed from "${HOME}/.docker/config.json" to podman registry config locations in the future version of oc. "${HOME}/.docker/config.json" is deprecated, but can still be used for storing credentials as a fallback. See https://github.com/containers/image/blob/main/docs/containers-auth.json.5.md for the order of podman registry config locations.
error: the image is a manifest list and contains multiple images - use --filter-by-os to select from:

OS DIGEST
/ sha256:777486079eb21efd26af3d50332f5a70e1fd24a95245227dac41aab6d8871832

@zhouying7780
Copy link

if we mirrored *manifestlist.DeserializedManifestList type , then we should mirror linux/amd64 and linux/s390x
if we only mirror linux/amd64 , then we should not mirror the *manifestlist.DeserializedManifestList type

@zhouying7780
Copy link

And I think we should support to filter the architectures like the ocp release image . This is also needed by customer. This could be as new feature request.

@zhouying7780
Copy link

And also we need to support list operators which is multi-arch.

- certain code paths that use GetUniqueName would fail if a name and
digest were both missing. Updated function to use latest tag as a
fallback when this happens
- change how digest is fetched from OCI layout path. Previous
implementation used the containers/image project which had some
limitations because it would throw ErrMoreThanOneImage error.
Also the function was not really taking into account some of the
variations like index.json that directly referenced a fat manifest
rather than using indirection into the blobs directory. New
implementation now does its best to either return the digest of the
manifest list, or the image config digest.
- changed function name from getManifest() to getFirstDigestFromPath()
- changed how the error conditions are logged in this revised function
- add/update test cases
An OCI layout can reference a manifest list either directly or from an
"indirect index reference" found in the blobs. This "indirect index
reference" is NOT possible for a docker registry. This fix makes
the image builder "discover" the right index to push, so that you get
the correct content pushed into the remote registry. See code comments
for details.
- Took the changes provided by @sherine-k and made some modifications:

1. add function to look for "legacy" & "platform specific" catalog keys
so that we cover both variants in the metadata
1. add documentation
@openshift-ci
Copy link

openshift-ci bot commented Apr 4, 2023

@jchunkins: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@jchunkins
Copy link
Contributor Author

@zhouying7780 It's not entirely clear to me what scenario you ran, but I suspect that my change in commit 97ffd13 resolves the issue you hit. I'd be interested to know if that fixes the issue or not. If it does not, I'd like to know the exact scenario you ran so I can look at it more closely.

@sherine-k
Copy link
Contributor

Hi @jchunkins
As discussed last Thursday, this PR impacts all other oc-mirror users on execution time. Our analysis this week showed us it might not be aligning to what the rest of the CNCF is doing.
There seems to be a big misunderstanding about the expectations around multi-arch support.

With multi-arch, the idea is to get the same software on different hardware.
For operator catalogs that are built downstream, all image references (operators, operands, other related images) are manifest-lists. The philosophy is: The runtime will auto-select the correct arch depending on the hardware running underneath.
image

The current version of oc-mirror in fact already mirrors ALL the images from ALL platforms available in each of these manifest-lists. We verified this this week.

Although this PR will not have significant impact on mirroring volume, performance will be heavily impacted.
But more importantly, with this PR, we would allow a way to get slightly different software depending on the hardware.
Doing this would be a stark difference to how the rest of Kube and the CNCF ecosystem looks at this.
All packaging, from plain YAML, kustomize, Helm to OLM (CSV, bundle, catalog) is arch-less. The only multi-arch thing is the images themselves containing differently compiled binaries.

Maybe it's important to understand better your use case. It seems that the catalog content you deal with is not the same between a catalog's platformA manifest, and platformB manifest.
Can you elaborate more?

One thing to add is that OLM on cluster does not expect the architecture of the bundle to matter. Because bundles and catalogs are arch-less in OLM.

MaxPerRegistry int // Number of concurrent requests allowed per registry
UseOCIFeature bool // Use the new oci feature for oc mirror (oci formatted copy)
OCIRegistriesConfig string // Registries config file location (used only with --use-oci-feature flag)
OCIInsecureSignaturePolicy bool // If set, OCI catalog push will not try to push signatures
MaxNestedPaths int
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+100 for adding useful comments on each flag

@jchunkins
Copy link
Contributor Author

Closing this in favor of the replacement PR #611

@jchunkins jchunkins closed this Apr 30, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants