Bug 1862957: bump RHCOS images for FIPS fix #4066

Merged

Conversation

jlebon
Member

@jlebon jlebon commented Aug 18, 2020

This fixes a race condition during early boot where FIPS mode doesn't
always get turned on.

```
$ ./differ.py --first-endpoint art --first-version 46.82.202008111140-0 --second-endpoint art --second-version 46.82.202008181646-0
{
    "sources": {
        "46.82.202008111140-0": "https://releases-rhcos-art.cloud.privileged.psi.redhat.com/storage/releases/rhcos-4.6/46.82.202008111140-0/x86_64/commitmeta.json",
        "46.82.202008181646-0": "https://releases-rhcos-art.cloud.privileged.psi.redhat.com/storage/releases/rhcos-4.6/46.82.202008181646-0/x86_64/commitmeta.json"
    },
    "diff": {
        "container-selinux": {
            "46.82.202008111140-0": "container-selinux-2.135.0-1.module+el8.2.1+6849+893e4f4a.noarch",
            "46.82.202008181646-0": "container-selinux-2.144.0-1.rhaos4.6.el8.noarch"
        },
        "cri-o": {
            "46.82.202008111140-0": "cri-o-1.19.0-71.rhaos4.6.git19455e9.el8.x86_64",
            "46.82.202008181646-0": "cri-o-1.19.0-81.rhaos4.6.gitfe43b78.el8.x86_64"
        },
        "ignition": {
            "46.82.202008111140-0": "ignition-2.6.0-1.rhaos4.6.git947598e.el8.x86_64",
            "46.82.202008181646-0": "ignition-2.6.0-2.rhaos4.6.git947598e.el8.x86_64"
        },
        "openshift-clients": {
            "46.82.202008111140-0": "openshift-clients-4.6.0-202008102017.p0.git.3703.5b22fb1.el8.x86_64",
            "46.82.202008181646-0": "openshift-clients-4.6.0-202008122114.p0.git.3707.b058f59.el8.x86_64"
        },
        "openshift-hyperkube": {
            "46.82.202008111140-0": "openshift-hyperkube-4.6.0-202008110953.p0.git.93411.7f76b87.el8.x86_64",
            "46.82.202008181646-0": "openshift-hyperkube-4.6.0-202008170849.p0.git.93419.3b88f8b.el8.x86_64"
        }
    }
}
```
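The `diff` section above lists only the packages whose full version string changed between the two builds. A minimal Python sketch of that comparison follows. It assumes each build's package set has already been reduced to a name-to-NVRA mapping; the `diff_packages` helper and that input shape are hypothetical and not taken from `differ.py`. The `ignition` and `cri-o` values are copied from the output above, and `example-unchanged` is a made-up entry that only shows how unchanged packages drop out.

```python
import json


def diff_packages(first_version, first_pkgs, second_version, second_pkgs):
    """Return {name: {version: nvra}} for packages whose NVRA differs.

    Hypothetical helper: the name -> NVRA input mappings are an assumed
    shape, not the actual differ.py internals.
    """
    diff = {}
    for name in sorted(set(first_pkgs) | set(second_pkgs)):
        old = first_pkgs.get(name)
        new = second_pkgs.get(name)
        if old != new:
            diff[name] = {first_version: old, second_version: new}
    return diff


old_build = "46.82.202008111140-0"
new_build = "46.82.202008181646-0"

old_pkgs = {
    "cri-o": "cri-o-1.19.0-71.rhaos4.6.git19455e9.el8.x86_64",
    "ignition": "ignition-2.6.0-1.rhaos4.6.git947598e.el8.x86_64",
    # Made-up entry, present only to show that unchanged packages are skipped.
    "example-unchanged": "example-unchanged-1.0-1.el8.noarch",
}
new_pkgs = {
    "cri-o": "cri-o-1.19.0-81.rhaos4.6.gitfe43b78.el8.x86_64",
    "ignition": "ignition-2.6.0-2.rhaos4.6.git947598e.el8.x86_64",
    "example-unchanged": "example-unchanged-1.0-1.el8.noarch",
}

result = diff_packages(old_build, old_pkgs, new_build, new_pkgs)
print(json.dumps({"diff": result}, indent=4))
```

Note that the `ignition` entry differs only in its release field (`-1` to `-2`), which is consistent with a packaging-level fix rather than a new upstream version.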
@openshift-ci-robot
Contributor

@jlebon: This pull request references Bugzilla bug 1862957, which is valid. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.6.0) matches configured target release for branch (4.6.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)
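The three validations listed above can be sketched roughly as follows. This is an illustration only, not the bot's actual code: the `Bug` class and the `validate` function are hypothetical, and the inputs are the values reported in this comment.

```python
from dataclasses import dataclass


@dataclass
class Bug:
    """Simplified, hypothetical view of a Bugzilla bug."""
    is_open: bool
    target_release: str
    status: str


def validate(bug, expected_release, valid_states):
    """Run the three checks and return (passed, message) pairs."""
    return [
        (bug.is_open, "bug is open"),
        (bug.target_release == expected_release,
         f"target release {bug.target_release} vs configured {expected_release}"),
        (bug.status in valid_states,
         f"state {bug.status} checked against {sorted(valid_states)}"),
    ]


# Values as reported by the bot for bug 1862957 on this branch.
bug = Bug(is_open=True, target_release="4.6.0", status="POST")
results = validate(bug, "4.6.0", {"NEW", "ASSIGNED", "ON_DEV", "POST"})
for passed, message in results:
    print("ok  " if passed else "FAIL", message)
```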

In response to this:

Bug 1862957: bump RHCOS images for FIPS fix

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci-robot openshift-ci-robot added bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. labels Aug 18, 2020
@sdodson
Member

sdodson commented Aug 18, 2020

/test e2e-gcp
/test e2e-azure
/test e2e-metal

@sdodson
Member

sdodson commented Aug 18, 2020

/test e2e-vsphere

@sdodson
Member

sdodson commented Aug 18, 2020

/test e2e-aws-fips
Not sure if this job even exists.

@jlebon
Member Author

jlebon commented Aug 18, 2020

/test e2e-aws-fips

@sdodson
Member

sdodson commented Aug 18, 2020

/retest

@sdodson
Member

sdodson commented Aug 18, 2020

e2e-metal failed on the following test, which doesn't seem like a valid test for this job:
[sig-cluster-lifecycle][Feature:Machines][Early] Managed cluster should have same number of Machines and Nodes [Suite:openshift/conformance/parallel]

e2e-ovirt failed to bootstrap. Not going to block on it, but rerunning:
/test e2e-ovirt

e2e-openstack failed on the following in installation

 level=info msg="Cluster operator storage Progressing is True with ManilaCSIDriverOperatorCR_WaitForOperator: ManilaCSIDriverOperatorCRProgressing: Waiting for Manila operator to report status"
level=info msg="Cluster operator storage Available is False with ManilaCSIDriverOperatorCR_WaitForOperator::ManilaCSIDriverOperatorDeployment_WaitDeployment: ManilaCSIDriverOperatorCRAvailable: Waiting for Manila operator to report status\nManilaCSIDriverOperatorDeploymentAvailable: Waiting for a Deployment pod to start"

Not going to block on that either, but running it again.
/test e2e-openstack

@sdodson
Member

sdodson commented Aug 18, 2020

The machine == nodes check was added in openshift/origin#25419 for https://bugzilla.redhat.com/show_bug.cgi?id=1869654 which is closed as a dupe of https://bugzilla.redhat.com/show_bug.cgi?id=1862957 which this PR purports to fix. :-)

@sdodson
Member

sdodson commented Aug 19, 2020

/retest

@sdodson
Member

sdodson commented Aug 19, 2020

/lgtm
/approve
/hold
Clear the hold once AWS, GCP, and Azure yield successful installations.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 19, 2020
@openshift-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ashcrow, sdodson

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Aug 19, 2020
@deads2k
Contributor

deads2k commented Aug 19, 2020

/retest

@deads2k
Contributor

deads2k commented Aug 19, 2020

Well, FIPS works. That's new. :)

@jlebon
Member Author

jlebon commented Aug 19, 2020

e2e-ovirt is failing on what looks like a compilation issue:

 + go build -mod=vendor -ldflags ' -X github.com/openshift/installer/pkg/version.Raw=unreleased-master-3566-g9b51bec6717bc2475286e53b011585a0737efff1-dirty -X github.com/openshift/installer/pkg/version.Commit=9b51bec6717bc2475286e53b011585a0737efff1 -s -w' -tags ' release' -o bin/openshift-install ./cmd/openshift-install
go build google.golang.org/api/pubsub/v1: /usr/local/go/pkg/tool/linux_amd64/compile: signal: terminated
error: build error: running 'hack/build.sh' failed with exit code 1 

All the other ones look like they timed out creating the release payload? E.g.

2020/08/19 11:32:24 Create release image default-route-openshift-image-registry.apps.build01.ci.devcluster.openshift.com/ci-op-sdczxvlk/release:latest
{"component":"entrypoint","file":"prow/entrypoint/run.go:165","func":"k8s.io/test-infra/prow/entrypoint.Options.ExecuteProcess","level":"error","msg":"Process did not finish before 4h0m0s timeout","severity":"error","time":"2020-08-19T15:24:15Z"}
time="2020-08-19T15:24:15Z" level=info msg="Received signal." signal=interrupt
2020/08/19 15:24:15 error: Process interrupted with signal interrupt, cancelling execution...
2020/08/19 15:24:15 cleanup: Deleting release pod release-latest
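The timestamps in that log show how long the release image step was stuck. A quick Python sketch of the arithmetic, assuming both log lines use the same clock (the JSON line reports the same 15:24:15 instant in UTC) and that the 4h0m0s timeout is counted from process start:

```python
from datetime import datetime, timedelta

FMT = "%Y/%m/%d %H:%M:%S"

# Timestamps copied from the log lines above.
create_started = datetime.strptime("2020/08/19 11:32:24", FMT)
interrupted = datetime.strptime("2020/08/19 15:24:15", FMT)
timeout = timedelta(hours=4)

# Time spent waiting on "Create release image" before the interrupt.
stuck_for = interrupted - create_started

# Time the job had already used before reaching that step.
before_step = timeout - stuck_for

print("stuck in release image creation:", stuck_for)
print("elapsed before that step:", before_step)
```

Under those assumptions, nearly the whole timeout budget was spent in the release image step, with only a few minutes used before it.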

/retest

@cgwalters
Member

waiting for openshift-console URL: context deadline exceeded

Hmm...not a lot of hits for that.

/test e2e-azure

@jlebon
Member Author

jlebon commented Aug 19, 2020

Some progress, but still a bunch of failures related to release payload timeouts.

/retest

e2e-aws-workers-rhel7 is failing with what looks like a CLI version mismatch?

error: some steps failed:
  * could not run steps: step e2e-aws-workers-rhel7 failed: "e2e-aws-workers-rhel7" pre steps failed: "e2e-aws-workers-rhel7" pod "e2e-aws-workers-rhel7-ssh-bastion" failed: the pod ci-op-ctr8sv5i/e2e-aws-workers-rhel7-ssh-bastion failed after 18s (failed containers: test): ContainerFailed one or more containers exited

Container test exited with code 1, reason Error
---
argument "client" for "--dry-run" flag: strconv.ParseBool: parsing "client": invalid syntax


Usage:
  oc create clusterrolebinding NAME --clusterrole=NAME [--user=username] [--group=groupname] [--serviceaccount=namespace:serviceaccountname] [--dry-run] [flags]

Examples:
  # Create a ClusterRoleBinding for user1, user2, and group1 using the cluster-admin ClusterRole
  oc create clusterrolebinding cluster-admin --clusterrole=cluster-admin --user=user1 --user=user2 --group=group1

Options:
      --allow-missing-template-keys=true: If true, ignore any errors in templates when a field or map key is missing in the template. Only applies to golang and jsonpath output formats.
      --clusterrole='': ClusterRole this ClusterRoleBinding should reference
      --dry-run=false: If true, only print the object that would be sent, without sending it.
      --generator='clusterrolebinding.rbac.authorization.k8s.io/v1alpha1': The name of the API generator to use.
      --group=[]: Groups to bind to the clusterrole
  -o, --output='': Output format. One of: json|yaml|name|go-template|go-template-file|template|templatefile|jsonpath|jsonpath-file.
      --save-config=false: If true, the configuration of current object will be saved in its annotation. Otherwise, the annotation will be unchanged. This flag is useful when you want to perform kubectl apply on this object in the future.
      --serviceaccount=[]: Service accounts to bind to the clusterrole, in the format <namespace>:<name>
      --template='': Template string or path to template file to use when -o=go-template, -o=go-template-file. The template format is golang templates [http://golang.org/pkg/text/template/#pkg-overview].
      --validate=false: If true, use a schema to validate the input before sending it

Use "oc options" for a list of global command-line options (applies to all commands).

error: no objects passed to apply
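The usage text above shows `--dry-run=false`, so this `oc` build treats `--dry-run` as a boolean, while the caller passed `--dry-run=client`. The error message comes from Go's `strconv.ParseBool`, which accepts only a fixed set of strings. A small Python sketch that mirrors that accepted set (the `parse_bool` function is an illustration, not part of `oc`):

```python
# Strings accepted by Go's strconv.ParseBool.
TRUE_VALUES = {"1", "t", "T", "TRUE", "true", "True"}
FALSE_VALUES = {"0", "f", "F", "FALSE", "false", "False"}


def parse_bool(value):
    """Mimic strconv.ParseBool: return a bool or raise on anything else."""
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f'strconv.ParseBool: parsing "{value}": invalid syntax')


for candidate in ("true", "false", "client"):
    try:
        print(candidate, "->", parse_bool(candidate))
    except ValueError as err:
        print(candidate, "->", err)
```

This is consistent with the script having been written for a newer client, where `--dry-run` takes values such as `client`, but run here against an older binary that only understands the boolean form.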

e2e-metal still failing on openshift/origin#25419 (comment). I think @deads2k is looking at that.

e2e-azure timed out on the console:

level=info msg="Waiting up to 10m0s for the openshift-console route to be created..."
level=info msg="Cluster operator insights Disabled is False with AsExpected: "
level=fatal msg="waiting for openshift-console URL: context deadline exceeded"

Not sure how to debug that. Looked at some of the logs for the console-related pods, but not seeing anything obvious.

@cgwalters
Member

OK that last azure run looks like a flake without a BZ.

@cgwalters
Member

/test e2e-azure

@mfojtik
Member

mfojtik commented Aug 20, 2020

/hold cancel

AWS, GCP, and Azure yielded successful installations, let's make FIPS green again!

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 20, 2020
@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@cgwalters
Member

Hm, I just see two closed CVO bugs for "Cluster operator network is still updating" from that e2e-aws job.

@openshift-bot
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-ci-robot
Contributor

@jlebon: The following tests failed, say /retest to rerun all failed tests:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-crc | f3ac194 | link | `/test e2e-crc` |
| ci/prow/e2e-aws-workers-rhel7 | f3ac194 | link | `/test e2e-aws-workers-rhel7` |
| ci/prow/e2e-vsphere | f3ac194 | link | `/test e2e-vsphere` |

Full PR test history. Your PR dashboard.


@mfojtik
Member

mfojtik commented Aug 20, 2020

@cgwalters the e2e-aws just passed :-)

@openshift-merge-robot openshift-merge-robot merged commit 422c565 into openshift:master Aug 20, 2020
@openshift-ci-robot
Contributor

@jlebon: All pull requests linked via external trackers have merged: openshift/installer#4066. Bugzilla bug 1862957 has been moved to the MODIFIED state.

In response to this:

Bug 1862957: bump RHCOS images for FIPS fix

