CNTRLPLANE-2545: Add ExternalOIDCWithUpstreamParity Prow jobs#77059
CNTRLPLANE-2545: Add ExternalOIDCWithUpstreamParity Prow jobs#77059xingxingxia wants to merge 2 commits intoopenshift:mainfrom
Conversation
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: xingxingxia The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@xingxingxia: This pull request references CNTRLPLANE-2545 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/payload-job-with-prs periodic-ci-openshift-cluster-authentication-operator-release-4.22-periodics-e2e-aws-external-oidc-upstream-parity openshift/origin#30906 |
|
@xingxingxia: the repo openshift/release does not contribute to the OpenShift official images, or the base branch is not currently having images promoted |
|
/retest |
1 similar comment
|
/retest |
|
this lgtm but I will leave it to @everettraven to give it a look and put the label. |
We do. Feature promotion criteria is that we have automated regression testing on all supported platforms. |
everettraven
left a comment
There was a problem hiding this comment.
A few comments, but overall this seems to be on the right track.
| TEST_SKIPS: \[OCPFeatureGate:ExternalOIDCWithUIDAndExtraClaimMappings\]\|\[OCPFeatureGate:ExternalOIDC\] | ||
| reverting to IntegratedOAuth | ||
| TEST_FOCUS: \[OCPFeatureGate:ExternalOIDC\] | ||
| TEST_SKIPS: \[OCPFeatureGate:ExternalOIDC\] reverting to IntegratedOAuth |
There was a problem hiding this comment.
I think that this is a good improvement to make, but what do you think about making that a separate PR to keep this one tightly scoped to adding the jobs needed for testing the feature in question?
By making this change as part of this PR, the impact of this change is significantly larger and requires broader review and is IMO scope creep for this PR.
There was a problem hiding this comment.
@everettraven , OK, I raise a separate #77530
| TEST_SUITE: openshift/auth/external-oidc | ||
| workflow: openshift-e2e-aws | ||
| - as: e2e-vsphere-external-oidc-upstream-parity | ||
| cron: 42 13 * * * |
There was a problem hiding this comment.
More for future reference for myself, how did you decide the cron configuration here?
There was a problem hiding this comment.
@everettraven , AI does it :)
We also have https://github.com/openshift/release/blob/main/ci-operator/config/openshift/openshift-tests-private/README.md#job-frequency :
$ ci-operator/config/openshift/openshift-tests-private/tools/generate-cron-entry.sh <test-name>-f<frequency-day> ci-operator/config/....the_path_to__periodics.yaml
E.g.:
$ ci-operator/config/openshift/openshift-tests-private/tools/generate-cron-entry.sh e2e-vsphere-external-oidc-upstream-parity-f1 ci-operator/config/openshift/cluster-authentication-operator/openshift-cluster-authentication-operator-release-4.22__periodics.yaml
cron: 49 12 * * *
| FEATURE_SET: TechPreviewNoUpgrade | ||
| OPENSHIFT_SKIP_EXTERNAL_TESTS: "True" | ||
| TEST_ARGS: --disable-monitor=legacy-cvo-invariants,legacy-test-framework-invariants | ||
| TEST_FOCUS: \[OCPFeatureGate:ExternalOIDCWithUpstreamParity\] |
There was a problem hiding this comment.
Just a focus itself may not be enough. We may need to shard this further until we improve the runtime of the tests for this feature overall.
Generally, these tests can take up to ~1hr each (worst case scenario) and it looks like openshift/origin#30906 will introduce 8 tests once all are enabled. That means worst case scenario this test job takes 8 hours to run which exceeds the default 4 hour limit. IIRC there was an issue with one of the platforms that made it so that 4 hour runs were the practical limit here and so we decided to limit to 4-5 tests per job to stay roughly within that 4 hour limit on a worst case but successful run.
There was a problem hiding this comment.
@everettraven yes, this was my worry too. I merged 4 negative tests to 2 tests respectively in https://github.com/openshift/origin/pull/30906/changes#r3050790964 and https://github.com/openshift/origin/pull/30906/changes#r3050795626 which looks reasonable.
c2a0d13 to
b5abcdb
Compare
b5abcdb to
10d7893
Compare
| # These lines are added for pj-rehearse with unmerged release PR + origin PR | ||
| # TODO: remove these lines | ||
| echo "Checking custom test binary for pj-rehearse with unmerged release PR + origin PR" | ||
| mkdir -p /tmp/custom-bin | ||
| oc image extract --path /usr/bin/openshift-tests:/tmp/custom-bin/ quay.io/xxia/tests:latest | ||
| chmod a+x /tmp/custom-bin/openshift-tests | ||
| export PATH=/tmp/custom-bin:$PATH | ||
| which openshift-tests | ||
|
|
There was a problem hiding this comment.
@everettraven , @ShazaAldawamneh , as Slack sync, I use this workaround to run pj-reherase. The workaround uses openshift-tests of quay.io/xxia/tests built from the unmerged origin PR. Once pj-reherase passes, will remove it.
|
[REHEARSALNOTIFIER]
A total of 19647 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs. A full list of affected jobs can be found here Interacting with pj-rehearseComment: Once you are satisfied with the results of the rehearsals, comment: |
|
Using this solution #77059 (comment) , running some pj-rehearse jobs first: |
|
@xingxingxia: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
@ShazaAldawamneh , @everettraven , hope this gives confidence. So running pj-rehearse jobs for remaining ones: /pj-rehearse periodic-ci-openshift-cluster-authentication-operator-release-4.22-periodics-e2e-azure-external-oidc-upstream-parity periodic-ci-openshift-cluster-authentication-operator-release-4.22-periodics-e2e-gcp-external-oidc-upstream-parity periodic-ci-openshift-cluster-authentication-operator-release-4.22-periodics-e2e-metal-ovn-dualstack-external-oidc-upstream-parity periodic-ci-openshift-cluster-authentication-operator-release-4.22-periodics-e2e-metal-ovn-ipv4-external-oidc-upstream-parity periodic-ci-openshift-cluster-authentication-operator-release-4.22-periodics-e2e-metal-ovn-ipv6-external-oidc-upstream-parity periodic-ci-openshift-cluster-authentication-operator-release-4.22-periodics-e2e-vsphere-external-oidc-upstream-parity |
|
@xingxingxia: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel. |
| TEST_SUITE: openshift/auth/external-oidc | ||
| workflow: openshift-e2e-aws-single-node | ||
| - as: e2e-aws-external-oidc-upstream-parity | ||
| interval: 168h |
There was a problem hiding this comment.
I'm quite surprised to see these set to 168h intervals. Do you know if this is because we haven't officially branched yet?
I don't think we are going to make promotion for this feature in 4.22 so we will want to run these daily for 4.23/5.0 as well to gather enough signal for promotion then, but maybe that needs to be a follow up after branching?
everettraven
left a comment
There was a problem hiding this comment.
Aside from the need to remove the temporary e2e image reference, this LGTM.
Adding a hold until we get that temporary workaround removed.
/hold
|
@xingxingxia: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Add ExternalOIDCWithUpstreamParity Prow jobs. I don't think we need to add all 8 platforms: aws, azure, gcp, vsphere, bm-ipv4, bm-ipv6, bm-dualstack, sno, because the feature itself is inherently not much related to every platform. So, to avoid too many jobs, I only select below representatives:
aws
vsphere
bm-ipv6
sno.
Additionally, I added TEST_FOCUS, so that we need not add
TEST_SKIPS: FG2|FG3for FG1 tests,TEST_SKIPS: FG1|FG3for FG2 tests,TEST_SKIPS: FG1|FG2for FG3 tests where the FG names are long.CC @ShazaAldawamneh , @everettraven