OTA-1160: pkg/cvo/reconciliation_issues: Publish ClusterOperator transitionStart #1044

wking
Member

@wking wking commented Apr 24, 2024

Allow ClusterVersion status consumers to make their own decisions about how long to wait before complaining about a slow-to-update ClusterOperator.

@openshift-ci openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) Apr 24, 2024
@wking wking changed the title from "pkg/cvo/reconciliation_issues: Publish ClusterOperator transitionStart" to "OTA-1160: pkg/cvo/reconciliation_issues: Publish ClusterOperator transitionStart" Apr 24, 2024
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (Indicates that this PR references a valid Jira ticket of any type.) Apr 24, 2024
@openshift-ci-robot
Contributor

openshift-ci-robot commented Apr 24, 2024

@wking: This pull request references OTA-1160 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.16.0" version, but no target version was set.

@wking wking force-pushed the publish-cluster-operator-start-time branch from c35fe06 to a804d97 on April 24, 2024 23:26
@wking
Member Author

wking commented Apr 25, 2024

Testing in Cluster Bot via launch 4.16,openshift/cluster-version-operator#1044 aws,techpreview (logs), I sent the cluster back to 4.16.0-ec.5 with:

$ oc adm upgrade --force --allow-explicit-upgrade --to-image quay.io/openshift-release-dev/ocp-release@sha256:f5c9cf5a461434e775af2b946fc2a8aee7240f5b05d4141a588e527adcad7af9

Maybe I didn't need to --force that? Anyhow, I let it run for a bit, until etcd completed:

$ oc adm upgrade
info: An upgrade is in progress. Working towards 4.16.0-ec.5: 113 of 952 done (11% complete), waiting on kube-apiserver
...

Then I turned around and went back to my original release images (I had to --force this time, because CI releases are unsigned):

$ oc adm upgrade --force --allow-upgrade-with-warnings --allow-explicit-upgrade --to-image registry.build03.ci.openshift.org/ci-ln-cm7dfnt/release@sha256:370d219ae7612d27ac08ab6de38e452c71bb4f770c692577fdd4493242ff3efd

And shortly after, here's the CVO waiting on etcd and kube-apiserver, and giving us a start time for each:

$ date --utc --iso=m
2024-04-25T04:50+00:00
$ oc get -o json clusterversion version | jq -r '.status.conditions[] | select(.type == "ReconciliationIssues") | .message | fromjson'
{
  "message": "Cluster operators etcd, kube-apiserver are updating versions",
  "children": [
    {
      "message": "[Cluster operator etcd is updating versions, Cluster operator kube-apiserver is updating versions]",
      "children": [
        {
          "message": "Cluster operator etcd is updating versions",
          "children": [
            {
              "message": "cluster operator etcd is available and not degraded but has not finished updating to target version"
            }
          ],
          "effect": "None",
          "manifest": {
            "originalFilename": "0000_20_etcd-operator_07_clusteroperator.yaml",
            "group": "config.openshift.io",
            "kind": "ClusterOperator",
            "name": "etcd"
          },
          "transitionStart": "2024-04-25T04:50:25.572524285Z"
        },
        {
          "message": "Cluster operator kube-apiserver is updating versions",
          "children": [
            {
              "message": "cluster operator kube-apiserver is available and not degraded but has not finished updating to target version"
            }
          ],
          "effect": "None",
          "manifest": {
            "originalFilename": "0000_20_kube-apiserver-operator_07_clusteroperator.yaml",
            "group": "config.openshift.io",
            "kind": "ClusterOperator",
            "name": "kube-apiserver"
          },
          "transitionStart": "2024-04-25T04:50:25.572311511Z"
        }
      ]
    }
  ],
  "effect": "None"
}
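
For consumers who would rather not eyeball the nesting, here is a minimal Python sketch of pulling the start times out of a message like the one above. The function name `collect_transition_starts` is hypothetical, and the message shape is inferred only from this sample output; the property is tech-preview, so the structure may change.

```python
import json


def collect_transition_starts(node):
    """Map (kind, name) to transitionStart for every node that carries both."""
    found = {}
    manifest = node.get("manifest")
    start = node.get("transitionStart")
    if manifest and start:
        found[(manifest["kind"], manifest["name"])] = start
    # Nodes without a manifest (the aggregate levels) are only containers.
    for child in node.get("children", []):
        found.update(collect_transition_starts(child))
    return found


# Trimmed copy of the message above. In practice this string would come
# from the ReconciliationIssues condition's .message field.
raw_message = """
{
  "message": "Cluster operators etcd, kube-apiserver are updating versions",
  "children": [
    {
      "children": [
        {
          "message": "Cluster operator etcd is updating versions",
          "manifest": {"kind": "ClusterOperator", "name": "etcd"},
          "transitionStart": "2024-04-25T04:50:25.572524285Z"
        },
        {
          "message": "Cluster operator kube-apiserver is updating versions",
          "manifest": {"kind": "ClusterOperator", "name": "kube-apiserver"},
          "transitionStart": "2024-04-25T04:50:25.572311511Z"
        }
      ]
    }
  ],
  "effect": "None"
}
"""

starts = collect_transition_starts(json.loads(raw_message))
for (kind, name), start in sorted(starts.items()):
    print(kind, name, start)
```

Keying on kind and name together is a guess at what is safe; it avoids collisions if manifests of different kinds ever share a name.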

And a bit later, once kube-apiserver (which had never gotten all the way through to 4.16.0-ec.5) had completed, the CVO was just waiting on etcd:

$ oc get -o json clusterversion version | jq -r '.status.conditions[] | select(.type == "ReconciliationIssues") | .message | fromjson'
{
  "message": "Cluster operator etcd is updating versions",
  "children": [
    {
      "message": "cluster operator etcd is available and not degraded but has not finished updating to target version"
    }
  ],
  "effect": "None",
  "manifest": {
    "originalFilename": "0000_20_etcd-operator_07_clusteroperator.yaml",
    "group": "config.openshift.io",
    "kind": "ClusterOperator",
    "name": "etcd"
  },
  "transitionStart": "2024-04-25T04:50:25.572524285Z"
}
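
As a rough illustration of the "make their own decisions" idea, a consumer could compare transitionStart against its own patience threshold. This is only a sketch: `parse_rfc3339`, `is_slow`, the fixed `now` value, and the thresholds are all hypothetical and not part of this pull request.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_rfc3339(stamp):
    """Parse a timestamp like 2024-04-25T04:50:25.572524285Z into an aware datetime."""

    def to_micros(match):
        # datetime keeps microseconds only, so cut or pad to six digits.
        return "." + match.group(1)[:6].ljust(6, "0")

    normalized = re.sub(r"\.(\d+)", to_micros, stamp)
    if normalized.endswith("Z"):
        normalized = normalized[:-1] + "+00:00"
    return datetime.fromisoformat(normalized)


def is_slow(transition_start, now, patience):
    """Report whether the transition has been running longer than patience."""
    return now - parse_rfc3339(transition_start) > patience


# Hypothetical "current" time, roughly ten minutes after the sample start.
now = datetime(2024, 4, 25, 5, 0, tzinfo=timezone.utc)
etcd_start = "2024-04-25T04:50:25.572524285Z"

print(is_slow(etcd_start, now, timedelta(minutes=5)))
print(is_slow(etcd_start, now, timedelta(minutes=30)))
```

Passing `now` in explicitly keeps the check easy to test; a real consumer would presumably use `datetime.now(timezone.utc)` and pick a threshold per operator.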

@petr-muller
Member

We're definitely not breaking Hypershift with this feature-gated change
/override ci/prow/e2e-hypershift ci/prow/e2e-hypershift-conformance

@openshift-ci openshift-ci bot added the lgtm label (Indicates that a PR is ready to be merged.) Apr 25, 2024
Contributor

openshift-ci bot commented Apr 25, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: petr-muller, wking

Contributor

openshift-ci bot commented Apr 25, 2024

@petr-muller: Overrode contexts on behalf of petr-muller: ci/prow/e2e-hypershift, ci/prow/e2e-hypershift-conformance

@wking
Member Author

wking commented Apr 25, 2024

Waiving pre-merge QE for this tech-preview, currently unconsumed content.

/label no-qe

@openshift-ci openshift-ci bot added the no-qe label (Allows PRs to merge without qe-approved label) Apr 25, 2024
@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 53a8fbf and 2 for PR HEAD a804d97 in total

@wking
Member Author

wking commented Apr 25, 2024

/retest-required

@wking
Member Author

wking commented Apr 25, 2024

not sure what's up with HyperShift, but that job won't care about this tech-preview property.

/retest-required

@wking
Member Author

wking commented Apr 26, 2024

well, e2e-hypershift-conformance passed. Still dunno what's going on with the other.

/retest-required

@wking
Member Author

wking commented Apr 26, 2024

wait, I'm a root approver for this repo 🤦 :

/override ci/prow/e2e-hypershift

Contributor

openshift-ci bot commented Apr 26, 2024

@wking: Overrode contexts on behalf of wking: ci/prow/e2e-hypershift

Contributor

openshift-ci bot commented Apr 26, 2024

@wking: all tests passed!

@openshift-merge-bot openshift-merge-bot bot merged commit 0b3f507 into openshift:master Apr 26, 2024
11 checks passed
@wking wking deleted the publish-cluster-operator-start-time branch April 26, 2024 03:43
@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

This PR has been included in build cluster-version-operator-container-v4.16.0-202404251943.p0.g0b3f507.assembly.stream.el9 for distgit cluster-version-operator.
All builds following this will include this PR.

Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
lgtm: Indicates that a PR is ready to be merged.
no-qe: Allows PRs to merge without qe-approved label
4 participants