docs/README: Useless bump to get a new commit hash #2584
@wking: No Bugzilla bug is referenced in the title of this pull request. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
The MCO has a bug where it relies on its commit hash to mark the target version of rendered MachineConfigs [1]. We didn't get a bump, or even a rebuild between 4.7.11 and 4.7.12:

    $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.7.11-x86_64 | grep machine-config-operator
      machine-config-operator  https://github.com/openshift/machine-config-operator  e3863b0
    $ oc adm release info --commits quay.io/openshift-release-dev/ocp-release:4.7.12-x86_64 | grep machine-config-operator
      machine-config-operator  https://github.com/openshift/machine-config-operator  e3863b0

So the MCO starts updating the pools between the two releases and immediately says "ahh, looks like I've already finished updating", when it hasn't. Lots of example jobs linked from [2], including [3]:

    the "master" pool should be updated before the CVO reports available at the new version

From that job:

    $ curl -s https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1395451649530531840/build-log.txt | grep 'clusteroperator/machine-config.*version'
    INFO[2021-05-20T20:59:59Z] May 20 20:39:58.483 I /machine-config reason/OperatorVersionChanged clusteroperator/machine-config-operator started a version change from [{operator 4.7.11}] to [{operator 4.7.12}]
    INFO[2021-05-20T20:59:59Z] May 20 20:40:04.662 I /machine-config reason/OperatorVersionChanged clusteroperator/machine-config-operator version changed from [{operator 4.7.11}] to [{operator 4.7.12}]
    INFO[2021-05-20T20:59:59Z] May 20 20:40:04.815 I clusteroperator/machine-config versions: operator 4.7.11 -> 4.7.12
    INFO[2021-05-20T20:59:59Z] May 20 20:40:05.420 W clusteroperator/machine-config changed Progressing to False: Cluster version is 4.7.12

The machine-config operator didn't actually roll all the control-plane nodes in six seconds.
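The commit comparison in the transcripts could be automated along these lines. This is only a sketch: the helper names `component_commit` and `commit_unchanged` are made up for illustration, and it assumes each row of `oc adm release info --commits` output has the three whitespace-separated columns seen in the transcripts (name, repository URL, commit).

```python
from typing import Optional


def component_commit(release_info_output: str, component: str) -> Optional[str]:
    """Return the commit column for `component`, or None if it is not listed."""
    for line in release_info_output.splitlines():
        fields = line.split()
        # Assumed row shape, matching the transcripts: name, repo URL, commit.
        if len(fields) == 3 and fields[0] == component:
            return fields[2]
    return None


def commit_unchanged(old_output: str, new_output: str, component: str) -> bool:
    """True when the component is listed in both outputs with the same commit."""
    old = component_commit(old_output, component)
    new = component_commit(new_output, component)
    return old is not None and old == new


# Row as reported for both 4.7.11 and 4.7.12 in the transcripts.
row = (
    "machine-config-operator  "
    "https://github.com/openshift/machine-config-operator  e3863b0"
)
print(commit_unchanged(row, row, "machine-config-operator"))  # prints True
```

A `True` result for two consecutive releases would flag the situation described here, where the MCO has no new hash to distinguish the target release from the current one.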
The useless docs bump will give us a new commit hash, so the MCO will understand that a 4.7.11 -> 4.7.13 bump is a real update that takes some time to roll out. Once we get [1] fixed, we won't need hacks like this for future releases.

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1955929#c1
[2]: https://amd64.ocp.releases.ci.openshift.org/releasestream/4-stable/release/4.7.12#upgrades-from
[3]: https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1395451649530531840
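As a toy model of why a new hash helps (not the MCO's actual code): if "am I already at the target?" is decided by commit hash alone, two releases built from the same commit are indistinguishable. The `Release` type, the `looks_already_updated` function, and the `abc1234` hash for 4.7.13 are all hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    version: str     # release version, e.g. "4.7.11"
    mco_commit: str  # commit hash the MCO in that release was built from


def looks_already_updated(current: Release, target: Release) -> bool:
    # Hypothetical check keyed on the commit hash only: equal hashes look
    # exactly like a rollout that has already finished.
    return current.mco_commit == target.mco_commit


r4_7_11 = Release("4.7.11", "e3863b0")
r4_7_12 = Release("4.7.12", "e3863b0")  # same hash, per the release info output
r4_7_13 = Release("4.7.13", "abc1234")  # placeholder hash, not the real one

print(looks_already_updated(r4_7_11, r4_7_12))  # prints True: update skipped
print(looks_already_updated(r4_7_11, r4_7_13))  # prints False: real rollout
```

Under this model, any commit at all (even a docs-only one) changes the hash and restores the distinction, which is all the workaround needs.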
d2621b4 to 31638f9
/lgtm
as a one-off to get a 4.7 release, this seems fine to me
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: wking, yuqi-zhang
The full list of commands accepted by this bot can be found here. The pull request process is described here.
Manually setting labels for this hack workaround that cannot possibly break anything ;)
/override ci/prow/e2e-agnostic-upgrade
@yuqi-zhang: Overrode contexts on behalf of yuqi-zhang: ci/prow/e2e-agnostic-upgrade, ci/prow/e2e-aws-serial In response to this:
Does this give a hint that we need to backport fixes more frequently? ;)