data/manifests/bootkube/cvo-overrides: Drop the explicit upstream #4112

wking · 2020-08-28T16:39:58Z

Not bug-worthy, but floating now for 4.7 to centralize discussion.

The cluster-version operator has been falling back to a default URI when the ClusterVersion upstream is empty since way back, and this behavior is enshrined in the API. Drop the installer-side opinion, so:

* There is a single location where we version-control the default upstream (the CVO).
* Folks consuming in-cluster ClusterVersion objects have one less property to distract them (unless they want to override the default, in which case it's not distracting).
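
For readers unfamiliar with the fallback, here is a minimal Go sketch of the defaulting behavior described above. It is not the CVO's actual code: `defaultUpstream`, `clusterVersionSpec`, and `effectiveUpstream` are hypothetical names, and the URLs are placeholders rather than real update-service locations.

```go
package main

import "fmt"

// defaultUpstream is a placeholder for illustration only; the real default
// is version-controlled in the CVO source.
const defaultUpstream = "https://updates.example.com/graph"

// clusterVersionSpec is a hypothetical, trimmed-down stand-in for the
// ClusterVersion spec; only the property discussed here is modeled.
type clusterVersionSpec struct {
	Upstream string
}

// effectiveUpstream returns the explicit upstream when one is set, and
// falls back to the built-in default when the property is empty.
func effectiveUpstream(spec clusterVersionSpec) string {
	if spec.Upstream == "" {
		return defaultUpstream
	}
	return spec.Upstream
}

func main() {
	// Unset upstream: the consumer picks the default.
	fmt.Println(effectiveUpstream(clusterVersionSpec{}))
	// Explicit upstream (for example, a local update service): used as-is.
	fmt.Println(effectiveUpstream(clusterVersionSpec{Upstream: "https://osus.example.com/graph"}))
}
```

With the installer no longer writing the property, every new cluster would take the first branch unless an admin sets an override.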

Related to openshift/openshift-docs#22252.

CC @vrutkovs, since I'm not sure how this will square with FCOS and a71f424. FCOS options:

* Continue to override the CVO default with an explicit ClusterVersion upstream. No change from today, but breaks the nominal API of "By default it will use the appropriate update server for the cluster".
* Teach the CVO to recognize FCOS vs. OCP and use the right default for each.
* Fork the CVO and add an fcos branch, as you currently do for the installer.

Thoughts?

The cluster-version operator has been falling back to a default URI when the ClusterVersion upstream is empty since way back [1,2], and this behavior is enshrined in the API [3]. Drop the installer-side opinion, so:

* There is a single location where we version-control the default upstream (the CVO).
* Folks consuming in-cluster ClusterVersion objects have one less property to distract them (unless they want to override the default, in which case it's not distracting).

[1]: https://github.com/openshift/cluster-version-operator/blame/2c4931dc283c551938be1a00fee290de0b79d99c/pkg/cvo/availableupdates.go#L27-L31
[2]: openshift/cluster-version-operator@ab4d84a#diff-78f2af341fa49292dd6930f378018867R24
[3]: https://github.com/openshift/api/blame/0422dc17083e9e8df18d029f3f34322e96e9c326/config/v1/types_cluster_version.go#L56-L57

vrutkovs · 2020-08-28T16:53:07Z

OKD installer is using a fork, eventually we'll most likely template that file to override upstream URL. So +1 from me on this

wking · 2020-08-28T16:54:49Z

> ...eventually we'll most likely template that file to override upstream URL...

You're not concerned about "what if a user clears upstream in their FCOS cluster, expecting the CVO to default to 'the appropriate update server for the cluster' as defined in the API, but they get pointed at the OCP update service instead"?

vrutkovs · 2020-08-28T17:37:17Z

OKD is open to experiments and there is no official support :) In any case I'd prefer to avoid forking CVO - that'd require additional tests, which are hard to keep track of as most of them are optional

wking · 2020-08-28T20:51:35Z

> In any case I'd prefer to avoid forking CVO...

So an alternative option would be to have something about preferred upstream get baked into the release as metadata, and have the CVO load that dynamically. Not sure how that would work if in the future we decide to shard by region or some such, but we could set the metadata differently when we build OCP and OKD releases. Anyhow, sounds like we don't need to work out how to handle OKD defaulting before we're comfortable removing this for the OCP installer in this PR :).
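
As a rough illustration of the release-metadata idea floated above, the following Go sketch reads a preferred upstream from a JSON document and falls back to a compiled-in default when none is given. Nothing here reflects an existing release-metadata schema: the `preferredUpstream` field, the `releaseMetadata` type, and the URLs are all assumptions made up for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// fallbackUpstream is a placeholder default, not a real service URL.
const fallbackUpstream = "https://updates.example.com/graph"

// releaseMetadata models a hypothetical metadata document shipped in the
// release image. The preferredUpstream field is invented for this sketch.
type releaseMetadata struct {
	PreferredUpstream string `json:"preferredUpstream,omitempty"`
}

// upstreamFromMetadata returns the upstream named in the metadata, or the
// compiled-in fallback when the metadata does not name one.
func upstreamFromMetadata(raw []byte) (string, error) {
	var meta releaseMetadata
	if err := json.Unmarshal(raw, &meta); err != nil {
		return "", fmt.Errorf("parsing release metadata: %w", err)
	}
	if meta.PreferredUpstream == "" {
		return fallbackUpstream, nil
	}
	return meta.PreferredUpstream, nil
}

func main() {
	// A release built with its own preferred upstream (for example, OKD).
	okd := []byte(`{"preferredUpstream": "https://okd-updates.example.com/graph"}`)
	upstream, err := upstreamFromMetadata(okd)
	if err != nil {
		panic(err)
	}
	fmt.Println(upstream)
}
```

Under this approach OCP and OKD builds would differ only in the metadata they ship, so no CVO fork would be needed. The sketch does not address the regional-sharding question raised above.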

abhinavdahiya · 2020-10-05T20:42:17Z

If the CVO team feels that the defaulting in CVO's code should be good enough, I don't see a reason why the installer would enforce this for now. If that changes we can bring it back.

/approve

wking · 2020-10-06T15:07:32Z

/retest

openshift-ci-robot · 2020-10-06T19:38:56Z

@wking: The following tests failed, say /retest to rerun all failed tests:

Test name	Commit	Details	Rerun command
ci/prow/e2e-crc	`c9095b3`	link	`/test e2e-crc`
ci/prow/e2e-aws-workers-rhel7	`c9095b3`	link	`/test e2e-aws-workers-rhel7`
ci/prow/e2e-ovirt	`c9095b3`	link	`/test e2e-ovirt`
ci/prow/e2e-libvirt	`c9095b3`	link	`/test e2e-libvirt`
ci/prow/e2e-aws	`c9095b3`	link	`/test e2e-aws`

wking · 2020-11-04T05:48:54Z

/retest

wking · 2020-11-04T14:26:59Z

/assign @vrutkovs

vrutkovs

/lgtm

openshift-ci-robot · 2020-11-04T14:32:23Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, vrutkovs

Needs approval from an approver in each of these files:

~~OWNERS~~ [abhinavdahiya]

openshift-bot · 2020-11-04T14:46:06Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-merge-robot · 2020-11-04T15:42:48Z

@wking: The following tests failed, say /retest to rerun all failed tests:

Test name	Commit	Details	Rerun command
ci/prow/e2e-crc	`c9095b3`	link	`/test e2e-crc`
ci/prow/e2e-ovirt	`c9095b3`	link	`/test e2e-ovirt`

openshift-bot · 2020-11-04T16:30:07Z

/retest

Please review the full test history for this PR and help us cut down flakes.

…sion

Hopefully we never actually have to stuff a CVO-generated ClusterVersion into the cluster; it's just for recovery after admins accidentally delete their existing ClusterVersion. But if we ever do hit this code, we want to push it without spec.upstream, to allow later ClusterVersion-consuming code to say "ah, user doesn't care which upstream I use, so I'll use the best default I'm aware of". This effectively pushes default choice from the CVO that creates the ClusterVersion out to the CVO that consumes the ClusterVersion, and that later CVO is almost certainly more current on which default is best.

Similar to openshift/installer@c9095b3451 (data/manifests/bootkube/cvo-overrides: Drop the explicit upstream, 2020-08-28, openshift/installer#4112).
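
To make "push it without spec.upstream" concrete, here is a small Go sketch that serializes a replacement spec with the upstream left unset. The `replacementSpec` type is a simplified, hypothetical stand-in, not the real openshift/api type; the point is only that an `omitempty` field left empty does not appear in the serialized object, so later consumers see no upstream opinion at all.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// replacementSpec is a hypothetical, trimmed-down stand-in for the spec of
// a CVO-generated replacement ClusterVersion.
type replacementSpec struct {
	ClusterID string `json:"clusterID"`
	Channel   string `json:"channel,omitempty"`
	Upstream  string `json:"upstream,omitempty"`
}

// marshalReplacement builds a replacement spec that deliberately leaves
// Upstream empty, deferring the choice to whichever CVO later reads it.
func marshalReplacement(clusterID string) ([]byte, error) {
	return json.Marshal(replacementSpec{ClusterID: clusterID})
}

func main() {
	raw, err := marshalReplacement("abc")
	if err != nil {
		panic(err)
	}
	// The serialized spec carries no "upstream" key.
	fmt.Println(string(raw))
}
```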

In 4.1, the installer used to explicitly set upstream to our default URI. But in openshift/installer#c9095b34518a0 (data/manifests/bootkube/cvo-overrides: Drop the explicit update, 2020-08-28, openshift/installer#4112), which landed in 4.7 and was not backported, I'd stopped doing that. In clusters born in 4.7 and later, the installer will leave upstream unset, and the cluster-version operator will default to making a reasonable choice. We still need to talk about explicit upstreams in the case where folks are pointing their cluster at a local OpenShift Update Service, but this commit drops the properties where we were incidentally pointing at the default, Red-Hat-hosted location, because explicitly setting that value is an anti-pattern that makes it harder for clusters to adapt if we try to move our default location elsewhere in the future. Also restore a closing brace and dangling comma to clean up after c0fc03d (osdocs-2368: updating 4.8 references to 4.9, 2021-10-01, openshift#36974), which also removed some of the stale 'upstream' references.

…e throttling

Since [1], clusters born in 4.7 or later default to not having an explicit spec.upstream, so they rely on the CVO's internal default. However, in that case, u.Upstream will be an empty string, while by this point in syncAvailableUpdates, upstream will have been set to the default value.

This commit splits up the "do we really need to check?" logic into a number of distinct cases, and gives them more specific logging, to make it easier to understand and confirm the desired behavior:

a. If we have no cached data, we need to pull a graph.
b. If it's been over minimumUpdateCheckInterval since our last check, we need to pull a graph. Even if nothing has changed on our side, our data is sufficiently stale to need a refresh.
c. If the channel has changed, we have different interests, and we need to pull a graph to hear what the upstream recommends for this new set of interests.
d. If the upstream hasn't changed, because:
   i. The current upstream (explicitly or by default) matches the old explicit upstream, or
   ii. The current upstream (explicitly or by default) matches the default, and the old upstream was unset.
   then everything's the same on our side, and our cached graph is recent, so we don't need to do anything.
e. Otherwise, the upstream has changed, and we need to pull a graph to see what our new guide has to suggest.

Cases for upstream:

* A -> A: Handled by d.i.
* A -> B: Handled by e.
* A -> unset (defaulted) or default: Handled by e.
* Unset or default -> A: Handled by e.
* Default -> default: Handled in d.i.
* Default -> unset (defaulted): Handled in d.i.
* Unset -> default: Handled by d.ii, new in this commit, previously resulted in an excessive pull.
* Unset -> unset (defaulted): Handled by d.ii, new in this commit, previously resulted in an excessive pull.

[1]: openshift/installer#4112
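
The case analysis above can be sketched in Go roughly as follows. This is an illustration of the decision table, not the code from the referenced commit: `cachedUpdates` and `needsGraphPull` are hypothetical names, the `defaultUpstream` URL is a placeholder, and the value given to `minimumUpdateCheckInterval` is an assumption (the commit message names the constant but not its value).

```go
package main

import (
	"fmt"
	"time"
)

const (
	// Placeholder default; the real value lives in the CVO source.
	defaultUpstream = "https://updates.example.com/graph"
	// Assumed value for illustration only.
	minimumUpdateCheckInterval = 2 * time.Minute
)

// cachedUpdates is a hypothetical record of the previous check. Upstream is
// stored as it appeared in the spec, so it is empty when the spec was unset.
type cachedUpdates struct {
	Upstream    string
	Channel     string
	LastAttempt time.Time
}

// needsGraphPull reports whether a fresh graph pull is needed, and which of
// the cases a through e made the decision.
func needsGraphPull(cached *cachedUpdates, specUpstream, channel string, now time.Time) (bool, string) {
	// Resolve the current upstream, applying the default when unset.
	upstream := specUpstream
	if upstream == "" {
		upstream = defaultUpstream
	}
	switch {
	case cached == nil:
		return true, "a: no cached data"
	case now.Sub(cached.LastAttempt) > minimumUpdateCheckInterval:
		return true, "b: cached data is stale"
	case cached.Channel != channel:
		return true, "c: channel changed"
	case cached.Upstream == upstream:
		return false, "d.i: upstream matches the old explicit upstream"
	case cached.Upstream == "" && upstream == defaultUpstream:
		return false, "d.ii: upstream is the default and the old upstream was unset"
	default:
		return true, "e: upstream changed"
	}
}

func main() {
	now := time.Now()
	cached := &cachedUpdates{Channel: "stable", LastAttempt: now}
	// Unset -> unset (defaulted): no pull needed.
	pull, reason := needsGraphPull(cached, "", "stable", now)
	fmt.Println(pull, reason)
}
```

Keeping the cached upstream in its unresolved form is what lets the sketch distinguish d.i from d.ii. Comparing only resolved values would also avoid the excessive pull, but it would treat "unset" and "explicitly set to the default" as the same state.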

openshift-ci-robot requested review from jhixson74 and mtnbikenc August 28, 2020 16:40

openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 5, 2020

openshift-ci-robot assigned vrutkovs Nov 4, 2020

wking changed the title ~~data/manifests/bootkube/cvo-overrides: Drop the explicit update~~ data/manifests/bootkube/cvo-overrides: Drop the explicit upstream Nov 4, 2020

vrutkovs approved these changes Nov 4, 2020

openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 4, 2020

openshift-merge-robot merged commit e176973 into openshift:master Nov 4, 2020

wking deleted the drop-upstream-opinion branch November 4, 2020 23:34

cblecker mentioned this pull request Mar 17, 2021

Add defaulting for clusterversion upstream configuration openshift/managed-upgrade-operator#209

Merged

3 tasks

wking mentioned this pull request Aug 17, 2021

modules: Drop 'upstream' from ClusterVersion examples openshift/openshift-docs#35567

Merged

wking mentioned this pull request Aug 20, 2021

pkg/cvo: Drop the explicit 'upstream' from our replacement ClusterVersion openshift/cluster-version-operator#640

Merged

wking mentioned this pull request Aug 20, 2021

Versioning cincinnati api and json schema openshift/enhancements#870

Merged

florkbr mentioned this pull request Sep 1, 2021

Console 2271: allow for configuring upstream server for air gapped envs openshift/console#9957

Merged

wking mentioned this pull request Dec 17, 2021

Bug 2033745: pkg/cvo/availableupdates: Acount for default upstream in recent-change throttling openshift/cluster-version-operator#718

Merged

wking mentioned this pull request Feb 10, 2022

Add user selectable capabilities in installconfig #5605

Closed
