Bug 1826533: pkg/cli/admin/upgrade: Client-side by-tag guard #390

@wking wking commented Apr 16, 2020

Adding a client-side guard like the server-side guard from openshift/cluster-version-operator@55e3cb450f (openshift/cluster-version-operator#170). This isn't that big a deal, because the server-side guard exists, but it does save users the effort of pushing a by-tag pullspec into ClusterVersion and then having to circle back to notice that the cluster-version operator is complaining about the verification failure.
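As a rough illustration of what such a guard checks, here is a minimal, self-contained sketch. It is not the code from this pull request: `isByDigest`, `checkToImage`, the regular expression, and the example pullspecs are all hypothetical, and it assumes only `sha256` digests matter. The real change may well rely on a proper image-reference parser instead of a regular expression.

```go
package main

import (
	"errors"
	"fmt"
	"regexp"
	"strings"
)

// digestSuffix matches a trailing "@sha256:<64 hex chars>" digest.
// Assumption: only sha256 digests are considered in this sketch.
var digestSuffix = regexp.MustCompile(`@sha256:[a-f0-9]{64}$`)

// exampleByDigest is a made-up by-digest pullspec used for illustration.
var exampleByDigest = "quay.io/example/release@sha256:" + strings.Repeat("0", 64)

// isByDigest reports whether pullspec pins its content by digest.
func isByDigest(pullspec string) bool {
	return digestSuffix.MatchString(pullspec)
}

// checkToImage is a hypothetical stand-in for the guard: by-digest
// pullspecs pass silently, by-tag pullspecs pass with a warning when
// force is set, and are rejected otherwise.
func checkToImage(pullspec string, force bool) (warning string, err error) {
	if isByDigest(pullspec) {
		return "", nil
	}
	if force {
		return "warning: using a by-tag pull spec; a by-digest pull spec would be safer", nil
	}
	return "", errors.New("--to-image must be a by-digest pull spec unless --force is also set")
}

func main() {
	for _, spec := range []string{"quay.io/example/release:4.5.0", exampleByDigest} {
		warning, err := checkToImage(spec, false)
		fmt.Printf("%s: warning=%q error=%v\n", spec, warning, err)
	}
}
```

Returning the warning as a string rather than printing it keeps the sketch easy to test; the caller would decide where to write it.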

The warning on --force (instead of a hard failure) is because @smarterclayton considers blocking this flow completely to be a breaking API change, and also considers that failing those folks early is more invasive than letting them continue to slip through here and have some subset of them potentially blow up later if:

  1. You ask to update to a by-tag pullspec.
  2. Cluster updates.
  3. Someone clobbers the tag you used in the registry to point it at a different release.
  4. Cluster continues on, blissfully unaware.
  5. CVO gets rescheduled for whatever reason.
  6. Cluster pulls a fresh registry image for the new pod, but it's by-tag, so you get the new content.
  7. CVO thinks it's still in reconciling mode, because the pullspec hasn't changed.
  8. World explodes as the new manifests get applied in a parallel, randomized order.

Or maybe the new CVO defaults to starting in install mode or some such. But still, random release rollouts triggered by CVO pod restarts are more excitement than I'd wish on anyone, even folks who use --force.
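The failure sequence above hinges on tags being mutable while digests are not. A toy sketch of that difference, assuming a purely in-memory stand-in for a registry (the `registry` type, its `resolve` method, and the short fake digests are hypothetical and do not talk to any real registry):

```go
package main

import (
	"fmt"
	"strings"
)

// registry is a toy in-memory stand-in for an image registry.
// Tags are mutable pointers to digests; digests name content directly.
type registry struct {
	tags map[string]string // tag -> digest
}

// resolve returns the digest that ref currently points at. References
// that already start with "sha256:" are digests and resolve to themselves.
func (r *registry) resolve(ref string) (string, bool) {
	if strings.HasPrefix(ref, "sha256:") {
		return ref, true
	}
	digest, ok := r.tags[ref]
	return digest, ok
}

func main() {
	r := &registry{tags: map[string]string{"4.5.0": "sha256:aaa"}}

	first, _ := r.resolve("4.5.0")  // the original CVO pod pulls by tag
	r.tags["4.5.0"] = "sha256:bbb"  // someone clobbers the tag
	second, _ := r.resolve("4.5.0") // a rescheduled CVO pod pulls again

	fmt.Println("by-tag content unchanged:", first == second)

	pinnedBefore, _ := r.resolve("sha256:aaa")
	pinnedAfter, _ := r.resolve("sha256:aaa")
	fmt.Println("by-digest content unchanged:", pinnedBefore == pinnedAfter)
}
```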

It would also be acceptable to have oc attempt to resolve the by-tag pullspec into a by-digest pullspec, but there's no guarantee that the host running oc has access to the same registry that the in-cluster CRI-O will be pulling from, so getting consistent client-side resolution would be tricky.

if o.Force {
	fmt.Fprintln(o.ErrOut, "warning: Using by-tag pull specs is dangerous, and while we still allow it in combination with --force for backward compatibility, it would be much safer to pass a by-digest pull spec instead")
} else {
	return fmt.Errorf("--to-image must be a by-digest pull spec, unless --force is also set, because release images that are not accessed via digest cannot be verified by the cluster. Even when --force is set, using tags is not recommended, although we continue to allow it for backwards compatibility")
}
Contributor

nit: this line is way too long for single-line output. Maybe we should make it multiline.

Member Author
Won't the user's terminal/pager wrap this as appropriate? Should we use an output formatter that understands the terminal width when output is a terminal?
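For reference, a width-aware formatter along those lines could be sketched as below. This is only an illustration, not something proposed in this pull request: `wrap` is a hypothetical helper, the width is passed in as a parameter, and detecting the actual terminal width (for example with `GetSize` from `golang.org/x/term`) is left out.

```go
package main

import (
	"fmt"
	"strings"
)

// wrap breaks text into lines of at most width bytes where possible,
// splitting only at whitespace. A single word longer than width is
// kept whole on its own line rather than being cut.
func wrap(text string, width int) string {
	var lines []string
	line := ""
	for _, word := range strings.Fields(text) {
		switch {
		case line == "":
			line = word
		case len(line)+1+len(word) <= width:
			line += " " + word
		default:
			lines = append(lines, line)
			line = word
		}
	}
	if line != "" {
		lines = append(lines, line)
	}
	return strings.Join(lines, "\n")
}

func main() {
	msg := "--to-image must be a by-digest pull spec, unless --force is also set, " +
		"because release images that are not accessed via digest cannot be verified by the cluster."
	fmt.Println(wrap(msg, 40))
}
```

Counting bytes rather than display columns is a simplification that is adequate for ASCII messages like the ones in this change.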

@wking wking changed the title pkg/cli/admin/upgrade: Client-side by-tag guard Bug 1826533: pkg/cli/admin/upgrade: Client-side by-tag guard Apr 21, 2020
@openshift-ci-robot openshift-ci-robot added bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Apr 21, 2020
@openshift-ci-robot

@wking: This pull request references Bugzilla bug 1826533, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.5.0) matches configured target release for branch (4.5.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1826533: pkg/cli/admin/upgrade: Client-side by-tag guard

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Member

@soltysh soltysh left a comment

/approve

pkg/cli/admin/upgrade/upgrade.go Outdated
@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 22, 2020
Adding a client-side guard like the server-side guard [1] from
openshift/cluster-version-operator@55e3cb450f (verifier: Add public
key verification of release image digests, 2019-04-19,
openshift/cluster-version-operator#170).  This isn't that big a deal,
because the server-side guard exists, but it does save users the
effort of pushing a by-tag pullspec into ClusterVersion and then
having to circle back around to notice that the cluster-version
operator is complaining about the verification failure.

The warning on --force (instead of a hard failure) is because Clayton
considers blocking this flow completely to be a breaking API change,
and also considers that failing those folks early is more invasive
than letting them continue to slip through here and have some subset
of them potentially blow up later if:

1. You ask to update to a by-tag pullspec.
2. Cluster updates.
3. Someone clobbers the tag you used in the registry to point it at a
   different release.
4. Cluster continues on, blissfully unaware.
5. CVO gets rescheduled for whatever reason.
6. Cluster pulls a fresh registry image for the new pod, but it's
   by-tag, so you get the new content.
7. CVO thinks it's still in reconciling mode, because the pullspec
   hasn't changed.
8. World explodes as the new manifests get applied in a parallel,
   randomized order.

Or maybe the new CVO defaults to starting in install mode or some
such.  But still, random release rollouts triggered by CVO pod
restarts are more excitement than I'd wish on anyone, even folks who
use --force.

It would also be acceptable to have oc attempt to resolve the by-tag
pullspec into a by-digest pullspec, but there's no guarantee that the
host running 'oc' has access to the same registry that the in-cluster
CRI-O will be pulling from, so getting consistent client-side
resolution would be tricky.

[1]: https://github.com/openshift/cluster-version-operator/blame/89cb270523675e27a2f54918431170946636f5d5/pkg/verify/verify.go#L203-L204
@wking wking force-pushed the enforce-by-digest-upgrade-when-possible branch from 497ae17 to 43cb3c8 Compare May 5, 2020 04:38
@wking
Member Author

wking commented May 5, 2020

/retest

@wking
Member Author

wking commented May 7, 2020

e2e-cmd timed out on install.

/retest

Member

@soltysh soltysh left a comment

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label May 7, 2020
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: soltysh, wking

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot

@wking: The following test failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/images 43cb3c8 link /test images

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci-robot

@wking: All pull requests linked via external trackers have merged: openshift/oc#387, openshift/oc#388, openshift/oc#389, openshift/oc#390. Bugzilla bug 1826533 has been moved to the MODIFIED state.


@wking wking deleted the enforce-by-digest-upgrade-when-possible branch May 7, 2020 23:33
wking added a commit to wking/oc that referenced this pull request Jan 28, 2023
…lspecs

'oc adm release mirror ...' should tag both the main release image and
all the referenced component images when it pushes to the registry.
Ideally all downstream consumers are using by-digest pullspecs,
because those are immutable, and therefore safer, as described in
43cb3c8 (pkg/cli/admin/upgrade: Client-side by-tag guard,
2020-04-16, openshift#390).  But many image registries garbage-collect untagged
images, so when we push into a registry we need to push into a tag to
avoid that garbage collection.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-low Referenced Bugzilla bug's severity is low for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.

6 participants