Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OCPBUGS-17545: Improve extracting opm binary from catalogs #676

Merged
merged 6 commits into from Sep 7, 2023

Conversation

sherine-k
Copy link
Contributor

@sherine-k sherine-k commented Aug 16, 2023

Description

This PR improves the way the opm binary is extracted from catalogs by leveraging the ENTRYPOINT from the container image config.
It also forces pulling the container image that matches the runtime architecture and OS.

Fixes # OCPBUGS-17545

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

This was tested both on amd64 and arm64 OSs with:

 ./bin/oc-mirror -c iscv2.yaml docker://localhost:5000/aws --dest-use-http --dest-skip-tls

and the following config:

kind: ImageSetConfiguration
apiVersion: mirror.openshift.io/v1alpha2
mirror:
  # additionalImages: 
  #   - name: registry.redhat.io/ubi8/ubi:latest  
  operators:
    - catalog: registry.redhat.io/redhat/redhat-operator-index:v4.13
      packages:
      - name: node-observability-operator
      # - name: aws-load-balancer-operator  
      # - name: 3scale-operator

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@openshift-ci-robot openshift-ci-robot added jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. labels Aug 16, 2023
@openshift-ci-robot
Copy link

@sherine-k: This pull request references Jira Issue OCPBUGS-17545, which is invalid:

  • expected the bug to target the "4.14.0" version, but it targets "4.14" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Description

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Fixes # (issue)

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Test A
  • Test B

Test Configuration:

  • Firmware version:
  • Hardware:
  • Toolchain:
  • SDK:

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Aug 16, 2023
@openshift-ci
Copy link

openshift-ci bot commented Aug 16, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sherine-k

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Aug 16, 2023
@sherine-k sherine-k changed the title OCPBUGS-17545: WIP: Improve extracting opm binary from catalogs OCPBUGS-17545: Improve extracting opm binary from catalogs Aug 16, 2023
@openshift-ci-robot
Copy link

@sherine-k: This pull request references Jira Issue OCPBUGS-17545, which is invalid:

  • expected the bug to target the "4.14.0" version, but it targets "4.14" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Description

This PR improves the way the opm binary is extracted from catalogs by leveraging the ENTRYPOINT from the container image config.
It also forces pulling the container image that matches the runtime architecture and OS.

Fixes # OCPBUGS-17545

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

This was tested both on amd64 and arm64 OSs with:

./bin/oc-mirror -c iscv2.yaml docker://localhost:5000/aws --dest-use-http --dest-skip-tls

and the following config:

kind: ImageSetConfiguration
apiVersion: mirror.openshift.io/v1alpha2
mirror:
 # additionalImages: 
 #   - name: registry.redhat.io/ubi8/ubi:latest  
 operators:
   - catalog: registry.redhat.io/redhat/redhat-operator-index:v4.13
     packages:
     - name: node-observability-operator
     # - name: aws-load-balancer-operator  
     # - name: 3scale-operator

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@sherine-k
Copy link
Contributor Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added the jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. label Aug 16, 2023
@openshift-ci-robot
Copy link

@sherine-k: This pull request references Jira Issue OCPBUGS-17545, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.14.0) matches configured target version for branch (4.14.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @zhouying7780

In response to this:

/jira refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot removed the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label Aug 16, 2023
return err
}
entrypoint := cfgf.Config.Entrypoint
opmBin := strings.TrimPrefix(entrypoint[0], (string)(os.PathSeparator))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are we confident of the fact that there will always be an entrypoint[0] - should we check first and error if we don't have the correct length ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth reading these docs to get a sense of what's possible for cmd and entrypoint:
https://docs.docker.com/engine/reference/builder/#cmd
https://docs.docker.com/engine/reference/builder/#entrypoint

This doc is also quite helpful in understanding how cmd and entry point can be used independently or together:
https://docs.docker.com/engine/reference/builder/#understand-how-cmd-and-entrypoint-interact

It's critical to understand the shell form vs. exec form, which applies to cmd and entrypoint. The table that shows how cmd and entry point interact is useful to see how things will behave.

return fmt.Errorf("opm binary not found after extracting %q from catalog image %v", opmBinaryExtracted, srcRef)
}
opmBinaryFileName := filepath.Join(ctlgSrcDir, config.OpmBinDir, "opm")
err = copyOPMBinary(img, opmBinaryFileName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about the windows-amd64-opm and darwin-amd64-opm binaries (is it ok to ignore them ?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @lmzuccarelli
opmBinaryFileName is the target file where we will store the opm binary corresponding to the runtime platform.
copyOPMBinary and more exactly extractOPMBinary will take care of extracting the right binary by looking at the entrypoint of the image in img parameter.
That source binary might be windows-amd64-opm or darwin-amd64-opm, but we're copying it as opm so that we simplify the following steps during processCatalogRefs as to call command opm under the catalogsource folder.
What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was going to be my comment as well, this seems like we lost useful functionality.

I'd also like to see a mode where we can tell oc-mirror to use the already installed opm binary (e.g. --use-preinstalled-opm). I realize this is not something you might want to advertise, so maybe making a hidden flag for this option would be one way to go. My rationale for this is that when I am developing/testing I use a Mac, and any failure in finding a working opm executable is going to prevent me from using oc-mirror.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sherine-k - thanks for the explanation !!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It just occurred to me that entrypoint is not going to work... here's why:

$ skopeo --override-os linux inspect docker://registry.redhat.io/openshift4/ose-operator-registry:v4.13 --config | jq '[.config.Entrypoint, .config.Cmd]'
[
  [
    "/bin/registry-server"
  ],
  [
    "--database",
    "/bundles.db"
  ]
]

The entrypoint has nothing to do with the opm executable.

Copy link
Contributor Author

@sherine-k sherine-k Aug 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$ skopeo --override-os linux inspect docker://registry.redhat.io/openshift4/ose-operator-registry:v4.13 --config | jq '[.config.Entrypoint, .config.Cmd]'

This is doesn't look like an FBC catalog. Is it?
If that is true, the fact that we only do all this logic when we detect --cache-dir in the CMD will allow us to skip all this for such catalogs.

Copy link
Contributor Author

@sherine-k sherine-k Aug 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was going to be my comment as well, this seems like we lost useful functionality.

Can you explain why you think that?
In the previous implementation we were trying to blindly find anything *opm. In this implementation we are trying to extract the exact binary with which this image is running.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding ose-operator-registry... I forgot that this is used as the base image in creating a new image that would have opm as the entrypoint. So disregard this comment. Doing too many things at once and did not think this through.

@zhouying7780
Copy link

/label qe-approved

return fmt.Errorf("opm binary not found after extracting %q from catalog image %v", opmBinaryExtracted, srcRef)
}
opmBinaryFileName := filepath.Join(ctlgSrcDir, config.OpmBinDir, "opm")
err = copyOPMBinary(img, opmBinaryFileName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was going to be my comment as well, this seems like we lost useful functionality.

I'd also like to see a mode where we can tell oc-mirror to use the already installed opm binary (e.g. --use-preinstalled-opm). I realize this is not something you might want to advertise, so maybe making a hidden flag for this option would be one way to go. My rationale for this is that when I am developing/testing I use a Mac, and any failure in finding a working opm executable is going to prevent me from using oc-mirror.

return err
}
entrypoint := cfgf.Config.Entrypoint
opmBin := strings.TrimPrefix(entrypoint[0], (string)(os.PathSeparator))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's worth reading these docs to get a sense of what's possible for cmd and entrypoint:
https://docs.docker.com/engine/reference/builder/#cmd
https://docs.docker.com/engine/reference/builder/#entrypoint

This doc is also quite helpful in understanding how cmd and entry point can be used independently or together:
https://docs.docker.com/engine/reference/builder/#understand-how-cmd-and-entrypoint-interact

It's critical to understand the shell form vs. exec form, which applies to cmd and entrypoint. The table that shows how cmd and entry point interact is useful to see how things will behave.

// if err := os.Symlink(absLinkTarget, filepath.Join("/tmp", header.Name)); err != nil {
// panic(err)
// }
return extractOPMBinary(linkTarget, img, target)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't (or at least should not) run this recursively. This is going to force each depth of recursion to:

  1. collapse the layers of the image into a flattened tar via mutate.Extract(img)
  2. walk the whole file system tree (a catalog image can have thousands of files in it, most relating to the operating system)

Imagine the scenario where they did something like this:

/opm -> /foo/opm
/foo/opm -> /bar/opm
/bar/opm

Depending on the order these files are processed, you could get lucky and find /bar/opm first, or at worst have to do two levels of recursion before you find it.

Seems like this should be iterating once and keeping track of what you find along the way and then choose the "winner".

Copy link
Contributor

@lmzuccarelli lmzuccarelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sherine-k - this I feel is a cleaner solution. I will leave @jchunkins to verify and also ensure his concerns were addressed

@aguidirh
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Aug 31, 2023
Comment on lines 570 to 583
// if it's a file create it
f, err := os.OpenFile(target, os.O_CREATE|os.O_RDWR, os.FileMode(header.Mode))
if err != nil {
return err
}

// copy over contents
if _, err := io.Copy(f, tr); err != nil {
return err
}

// manually close here after each file operation; defering would cause each file close
// to wait until all operations have completed.
f.Close()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this extracting every single file it encounters? Seems like you could check the header.Mode and see if its a least an executable before making the copy


}
opmBinPathComponents := strings.Split(opmBin, string(filepath.Separator))
for link, src := range symLinks {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused by the names here... maybe more documentation would help with understanding what you're doing.

As far as I can tell, the symLink map contains the following after iterating over the image contents:

symLink key == <destination folder>/<"name"> (i.e. this is the thing we might have just extracted ... see my other comment about header.Mode)
symLink value == <destination folder>/<"symlink name"> (i.e. this is just a path to something that does not exist yet)

Can you tell me what the values of header.Name or header.Linkname are? Is it just the base name (e.g. opm) or the path plus the file name (e.g. foo/opm)? I assume it's a path to a file or symlink, right?

Now when you iterate over linkMap, the variables don't seem to match, which is confusing me a bit:

link == <destination folder>/<path to file name> (i.e. an actual extracted file)
src == <destination folder>/<path to symlink name>

Let's discuss this in scrum since I am not following

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about the scenario where the path that is provided is a direct path to the binary? This seems like it assumes its going to be a symbolic link.

@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Sep 5, 2023
@openshift-ci
Copy link

openshift-ci bot commented Sep 5, 2023

@sherine-k: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Copy link
Contributor

@lmzuccarelli lmzuccarelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall this looks good. Great work @sherine-k

// Following https://docs.docker.com/engine/reference/builder/#understand-how-cmd-and-entrypoint-interact,
// identifying the binary that will be run by the catalog:
entrypoint := cfgf.Config.Entrypoint
if len(entrypoint) > 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice - good example of defensive programming :)

// All layers of the image are flattened.
// All contents (files, folders and symLinks are extracted)
// Errors related to creation of symbolic links are not considered as blockers
// unless they relate to the `opmBin` path.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments are clear and concise

// descriptor=".../oc-mirror-workspace/src/catalogs/.../extracted/usr/bin/darwin-amd64-opm"; ".../oc-mirror-workspace/src/catalogs/.../extracted/bin/registry/darwin-amd64-opm"
// descriptor=".../oc-mirror-workspace/src/catalogs/.../extracted/usr/bin/opm"; ".../oc-mirror-workspace/src/catalogs/.../extracted/bin/registry/opm"
// descriptor=".../oc-mirror-workspace/src/catalogs/.../extracted/etc/alternatives/easy_install-3"; ".../oc-mirror-workspace/src/catalogs/.../extracted/usr/bin/easy_install-3.6"
// descriptor=".../oc-mirror-workspace/src/catalogs/.../extracted/usr/bin/easy_install-3.6"; ".../oc-mirror-workspace/src/catalogs/.../extracted/hostname"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also good to have some examples of what physically can be expected

@lmzuccarelli
Copy link
Contributor

@sherine-k - lets work towards getting this merged

@lmzuccarelli
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 5, 2023
@zhouying7780
Copy link

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved Signifies that QE has signed off on this PR label Sep 6, 2023
@jerpeter1
Copy link
Member

/label acknowledge-critical-fixes-only

@openshift-ci openshift-ci bot added the acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. label Sep 7, 2023
@openshift-merge-robot openshift-merge-robot merged commit 00c48bb into openshift:main Sep 7, 2023
5 checks passed
@openshift-ci-robot
Copy link

@sherine-k: Jira Issue OCPBUGS-17545: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-17545 has been moved to the MODIFIED state.

In response to this:

Description

This PR improves the way the opm binary is extracted from catalogs by leveraging the ENTRYPOINT from the container image config.
It also forces pulling the container image that matches the runtime architecture and OS.

Fixes # OCPBUGS-17545

Type of change

Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

This was tested both on amd64 and arm64 OSs with:

./bin/oc-mirror -c iscv2.yaml docker://localhost:5000/aws --dest-use-http --dest-skip-tls

and the following config:

kind: ImageSetConfiguration
apiVersion: mirror.openshift.io/v1alpha2
mirror:
 # additionalImages: 
 #   - name: registry.redhat.io/ubi8/ubi:latest  
 operators:
   - catalog: registry.redhat.io/redhat/redhat-operator-index:v4.13
     packages:
     - name: node-observability-operator
     # - name: aws-load-balancer-operator  
     # - name: 3scale-operator

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

sherine-k added a commit to sherine-k/oc-mirror that referenced this pull request Sep 8, 2023
…#676)

* Improve extracting opm binary from catalogs

* Fix OCPBUGS-17545 - Force pull catalog by platform architecture

* Rewrite extractOPMBinary in a non-recursive way

* Attempt 2 -  Rewrite extractOPMBinary in a non-recursive way
This new implementation extracts the whole catalog to a local folder in order to make it easier to grab the opm binary.

* Account for cases where catalog image's entrypoint might be empty

* Add comments to opm binary extraction logic
openshift-merge-robot pushed a commit that referenced this pull request Sep 11, 2023
…o deploy because of invalid caches (#691)

* CFE-825: As a oc-mirror user, I would like mirrored operator catalogs to have valid caches (#651)

* Regenerate opm cache when building catalog images

* Recreate the cache of local OCI catalogs

* Incomplete - Include opm cache integrity verification to E2E
Fix code regarding finding the location of the opm binary

* TEMPORARY: use a different repository for source catalogs
Fix E2E test catalog_full

* Make cache regeneration during catalog rebuild conditional based on presence of argument `--cache-dir` in the original image container CMD

* Fix review comments

* OCPBUGS-17545: Improve extracting opm binary from catalogs (#676)

* Improve extracting opm binary from catalogs

* Fix OCPBUGS-17545 - Force pull catalog by platform architecture

* Rewrite extractOPMBinary in a non-recursive way

* Attempt 2 -  Rewrite extractOPMBinary in a non-recursive way
This new implementation extracts the whole catalog to a local folder in order to make it easier to grab the opm binary.

* Account for cases where catalog image's entrypoint might be empty

* Add comments to opm binary extraction logic

* Fix E2E tests after conflict resolution
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
acknowledge-critical-fixes-only Indicates if the issuer of the label is OK with the policy. approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/severity-important Referenced Jira bug's severity is important for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. qe-approved Signifies that QE has signed off on this PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants