New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-17545: Improve extracting opm binary from catalogs #676
Conversation
@sherine-k: This pull request references Jira Issue OCPBUGS-17545, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: sherine-k The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@sherine-k: This pull request references Jira Issue OCPBUGS-17545, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/jira refresh |
@sherine-k: This pull request references Jira Issue OCPBUGS-17545, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
pkg/cli/mirror/catalog_images.go
Outdated
return err | ||
} | ||
entrypoint := cfgf.Config.Entrypoint | ||
opmBin := strings.TrimPrefix(entrypoint[0], (string)(os.PathSeparator)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we confident of the fact that there will always be an entrypoint[0] - should we check first and error if we don't have the correct length ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth reading these docs to get a sense of what's possible for cmd and entrypoint:
https://docs.docker.com/engine/reference/builder/#cmd
https://docs.docker.com/engine/reference/builder/#entrypoint
This doc is also quite helpful in understanding how cmd and entry point can be used independently or together:
https://docs.docker.com/engine/reference/builder/#understand-how-cmd-and-entrypoint-interact
It's critical to understand the shell form vs. exec form, which applies to cmd and entrypoint. The table that shows how cmd and entry point interact is useful to see how things will behave.
pkg/cli/mirror/catalog_images.go
Outdated
return fmt.Errorf("opm binary not found after extracting %q from catalog image %v", opmBinaryExtracted, srcRef) | ||
} | ||
opmBinaryFileName := filepath.Join(ctlgSrcDir, config.OpmBinDir, "opm") | ||
err = copyOPMBinary(img, opmBinaryFileName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about the windows-amd64-opm and darwin-amd64-opm binaries (is it ok to ignore them ?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @lmzuccarelli
opmBinaryFileName
is the target file where we will store the opm binary corresponding to the runtime platform.
copyOPMBinary
and more exactly extractOPMBinary
will take care of extracting the right binary by looking at the entrypoint of the image in img
parameter.
That source binary might be windows-amd64-opm
or darwin-amd64-opm
, but we're copying it as opm
so that we simplify the following steps during processCatalogRefs
as to call command opm under the catalogsource folder.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was going to be my comment as well, this seems like we lost useful functionality.
I'd also like to see a mode where we can tell oc-mirror to use the already installed opm binary (e.g. --use-preinstalled-opm
). I realize this is not something you might want to advertise, so maybe making a hidden flag for this option would be one way to go. My rationale for this is that when I am developing/testing I use a Mac, and any failure in finding a working opm executable is going to prevent me from using oc-mirror.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sherine-k - thanks for the explanation !!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It just occurred to me that entrypoint is not going to work... here's why:
$ skopeo --override-os linux inspect docker://registry.redhat.io/openshift4/ose-operator-registry:v4.13 --config | jq '[.config.Entrypoint, .config.Cmd]'
[
[
"/bin/registry-server"
],
[
"--database",
"/bundles.db"
]
]
The entrypoint has nothing to do with the opm executable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$ skopeo --override-os linux inspect docker://registry.redhat.io/openshift4/ose-operator-registry:v4.13 --config | jq '[.config.Entrypoint, .config.Cmd]'
This is doesn't look like an FBC catalog. Is it?
If that is true, the fact that we only do all this logic when we detect --cache-dir in the CMD will allow us to skip all this for such catalogs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was going to be my comment as well, this seems like we lost useful functionality.
Can you explain why you think that?
In the previous implementation we were trying to blindly find anything *opm. In this implementation we are trying to extract the exact binary with which this image is running.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding ose-operator-registry
... I forgot that this is used as the base image in creating a new image that would have opm as the entrypoint. So disregard this comment. Doing too many things at once and did not think this through.
/label qe-approved |
pkg/cli/mirror/catalog_images.go
Outdated
return fmt.Errorf("opm binary not found after extracting %q from catalog image %v", opmBinaryExtracted, srcRef) | ||
} | ||
opmBinaryFileName := filepath.Join(ctlgSrcDir, config.OpmBinDir, "opm") | ||
err = copyOPMBinary(img, opmBinaryFileName) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was going to be my comment as well, this seems like we lost useful functionality.
I'd also like to see a mode where we can tell oc-mirror to use the already installed opm binary (e.g. --use-preinstalled-opm
). I realize this is not something you might want to advertise, so maybe making a hidden flag for this option would be one way to go. My rationale for this is that when I am developing/testing I use a Mac, and any failure in finding a working opm executable is going to prevent me from using oc-mirror.
pkg/cli/mirror/catalog_images.go
Outdated
return err | ||
} | ||
entrypoint := cfgf.Config.Entrypoint | ||
opmBin := strings.TrimPrefix(entrypoint[0], (string)(os.PathSeparator)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's worth reading these docs to get a sense of what's possible for cmd and entrypoint:
https://docs.docker.com/engine/reference/builder/#cmd
https://docs.docker.com/engine/reference/builder/#entrypoint
This doc is also quite helpful in understanding how cmd and entry point can be used independently or together:
https://docs.docker.com/engine/reference/builder/#understand-how-cmd-and-entrypoint-interact
It's critical to understand the shell form vs. exec form, which applies to cmd and entrypoint. The table that shows how cmd and entry point interact is useful to see how things will behave.
pkg/cli/mirror/catalog_images.go
Outdated
// if err := os.Symlink(absLinkTarget, filepath.Join("/tmp", header.Name)); err != nil { | ||
// panic(err) | ||
// } | ||
return extractOPMBinary(linkTarget, img, target) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can't (or at least should not) run this recursively. This is going to force each depth of recursion to:
- collapse the layers of the image into a flattened tar via mutate.Extract(img)
- walk the whole file system tree (a catalog image can have thousands of files in it, most relating to the operating system)
Imagine the scenario where they did something like this:
/opm -> /foo/opm
/foo/opm -> /bar/opm
/bar/opm
Depending on the order these files are processed, you could get lucky and find /bar/opm
first, or at worst have to do two levels of recursion before you find it.
Seems like this should be iterating once and keeping track of what you find along the way and then choose the "winner".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sherine-k - this I feel is a cleaner solution. I will leave @jchunkins to verify and also ensure his concerns were addressed
This new implementation extracts the whole catalog to a local folder in order to make it easier to grab the opm binary.
63cc5d1
to
ccb512d
Compare
/lgtm |
pkg/cli/mirror/catalog_images.go
Outdated
// if it's a file create it | ||
f, err := os.OpenFile(target, os.O_CREATE|os.O_RDWR, os.FileMode(header.Mode)) | ||
if err != nil { | ||
return err | ||
} | ||
|
||
// copy over contents | ||
if _, err := io.Copy(f, tr); err != nil { | ||
return err | ||
} | ||
|
||
// manually close here after each file operation; defering would cause each file close | ||
// to wait until all operations have completed. | ||
f.Close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't this extracting every single file it encounters? Seems like you could check the header.Mode and see if its a least an executable before making the copy
pkg/cli/mirror/catalog_images.go
Outdated
|
||
} | ||
opmBinPathComponents := strings.Split(opmBin, string(filepath.Separator)) | ||
for link, src := range symLinks { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am confused by the names here... maybe more documentation would help with understanding what you're doing.
As far as I can tell, the symLink map contains the following after iterating over the image contents:
symLink key == <destination folder>/<"name">
(i.e. this is the thing we might have just extracted ... see my other comment about header.Mode)
symLink value == <destination folder>/<"symlink name">
(i.e. this is just a path to something that does not exist yet)
Can you tell me what the values of header.Name or header.Linkname are? Is it just the base name (e.g. opm
) or the path plus the file name (e.g. foo/opm
)? I assume it's a path to a file or symlink, right?
Now when you iterate over linkMap, the variables don't seem to match, which is confusing me a bit:
link
== <destination folder>/<path to file name>
(i.e. an actual extracted file)
src
== <destination folder>/<path to symlink name>
Let's discuss this in scrum since I am not following
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about the scenario where the path that is provided is a direct path to the binary? This seems like it assumes its going to be a symbolic link.
@sherine-k: all tests passed! Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overall this looks good. Great work @sherine-k
// Following https://docs.docker.com/engine/reference/builder/#understand-how-cmd-and-entrypoint-interact, | ||
// identifying the binary that will be run by the catalog: | ||
entrypoint := cfgf.Config.Entrypoint | ||
if len(entrypoint) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice - good example of defensive programming :)
// All layers of the image are flattened. | ||
// All contents (files, folders and symLinks are extracted) | ||
// Errors related to creation of symbolic links are not considered as blockers | ||
// unless they relate to the `opmBin` path. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments are clear and concise
// descriptor=".../oc-mirror-workspace/src/catalogs/.../extracted/usr/bin/darwin-amd64-opm"; ".../oc-mirror-workspace/src/catalogs/.../extracted/bin/registry/darwin-amd64-opm" | ||
// descriptor=".../oc-mirror-workspace/src/catalogs/.../extracted/usr/bin/opm"; ".../oc-mirror-workspace/src/catalogs/.../extracted/bin/registry/opm" | ||
// descriptor=".../oc-mirror-workspace/src/catalogs/.../extracted/etc/alternatives/easy_install-3"; ".../oc-mirror-workspace/src/catalogs/.../extracted/usr/bin/easy_install-3.6" | ||
// descriptor=".../oc-mirror-workspace/src/catalogs/.../extracted/usr/bin/easy_install-3.6"; ".../oc-mirror-workspace/src/catalogs/.../extracted/hostname" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also good to have some examples of what physically can be expected
@sherine-k - lets work towards getting this merged |
/lgtm |
/label qe-approved |
/label acknowledge-critical-fixes-only |
@sherine-k: Jira Issue OCPBUGS-17545: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-17545 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
…#676) * Improve extracting opm binary from catalogs * Fix OCPBUGS-17545 - Force pull catalog by platform architecture * Rewrite extractOPMBinary in a non-recursive way * Attempt 2 - Rewrite extractOPMBinary in a non-recursive way This new implementation extracts the whole catalog to a local folder in order to make it easier to grab the opm binary. * Account for cases where catalog image's entrypoint might be empty * Add comments to opm binary extraction logic
…o deploy because of invalid caches (#691) * CFE-825: As a oc-mirror user, I would like mirrored operator catalogs to have valid caches (#651) * Regenerate opm cache when building catalog images * Recreate the cache of local OCI catalogs * Incomplete - Include opm cache integrity verification to E2E Fix code regarding finding the location of the opm binary * TEMPORARY: use a different repository for source catalogs Fix E2E test catalog_full * Make cache regeneration during catalog rebuild conditional based on presence of argument `--cache-dir` in the original image container CMD * Fix review comments * OCPBUGS-17545: Improve extracting opm binary from catalogs (#676) * Improve extracting opm binary from catalogs * Fix OCPBUGS-17545 - Force pull catalog by platform architecture * Rewrite extractOPMBinary in a non-recursive way * Attempt 2 - Rewrite extractOPMBinary in a non-recursive way This new implementation extracts the whole catalog to a local folder in order to make it easier to grab the opm binary. * Account for cases where catalog image's entrypoint might be empty * Add comments to opm binary extraction logic * Fix E2E tests after conflict resolution
Description
This PR improves the way the opm binary is extracted from catalogs by leveraging the ENTRYPOINT from the container image config.
It also forces pulling the container image that matches the runtime architecture and OS.
Fixes # OCPBUGS-17545
Type of change
Please delete options that are not relevant.
How Has This Been Tested?
This was tested both on amd64 and arm64 OSs with:
and the following config:
Checklist: