stable: new release on 2022-09-19 (36.20220906.3.2) #560

Closed
34 tasks done
cverna opened this issue Sep 7, 2022 · 12 comments

Comments

@cverna
Member

cverna commented Sep 7, 2022

First, verify that you meet all the prerequisites

Edit the issue title to include today's date. Once the pipeline spits out the new version ID, you can append it to the title e.g. (31.20191117.3.0).

Pre-release

Promote testing changes to stable

Manual alternative

Sometimes you need to run the process manually, for example when you need to add an extra commit to change something in manifest.yaml. The steps are:

  • git fetch upstream
  • git checkout stable
  • git reset --hard upstream/stable
  • /path/to/fedora-coreos-releng-automation/scripts/promote-config.sh testing
  • Open PR against the stable branch on https://github.com/coreos/fedora-coreos-config
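
The manual steps above can be collected into a small wrapper. This is a minimal sketch, not part of the official tooling: the `run` helper, the `DRY_RUN` variable, and the `promote_testing_to_stable` function name are illustrative assumptions, and the path argument stands in for wherever your fedora-coreos-releng-automation checkout lives. Opening the PR remains a manual step.

```shell
# Print each command before running it; with DRY_RUN=1, only print.
run() {
    echo "+ $*"
    if [ "${DRY_RUN:-0}" != 1 ]; then
        "$@"
    fi
}

# Usage: promote_testing_to_stable /path/to/fedora-coreos-releng-automation
# Run from a fedora-coreos-config checkout that has an "upstream" remote.
promote_testing_to_stable() {
    releng_dir=$1   # hypothetical parameter: releng-automation checkout
    run git fetch upstream &&
        run git checkout stable &&
        run git reset --hard upstream/stable &&
        run "$releng_dir/scripts/promote-config.sh" testing
}
```

Setting `DRY_RUN=1` before calling the function prints the four commands without touching the checkout, which is a cheap way to confirm the path before `git reset --hard` discards local changes.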

Build

  • Start a build job (select stable, leave all other defaults). This will automatically run multi-arch builds.
  • Post links to the jobs as a comment to this issue
  • Wait for the jobs to finish and succeed
    • x86_64
    • aarch64
    • s390x

Sanity-check the build

Using the build browser for the stable stream:

  • Verify that the parent commit and version match the previous stable release (in the future, we'll want to integrate this check in the release job)
    • x86_64
    • aarch64
    • s390x
  • Check kola AWS runs to make sure they didn't fail
    • x86_64
    • aarch64
  • Check kola OpenStack runs to make sure they didn't fail
    • x86_64
    • aarch64
  • Check kola Azure run to make sure it didn't fail
  • Check kola GCP run to make sure it didn't fail

⚠️ Release ⚠️

IMPORTANT: This is the point of no return. Once the OSTree commit is imported into the unified repo, any machine that manually runs rpm-ostree upgrade will get the new update.

Run the release job

  • Run the release job, filling in the parameters with stable and the new version ID
  • Post a link to the job as a comment to this issue
  • Wait for job to finish
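
Because the release job is irreversible, it may be worth sanity-checking the version ID before submitting it. The sketch below is an assumption-laden helper, not part of the pipeline: the function name `check_version_id` is made up, the four-field layout is inferred from the IDs quoted in this issue (31.20191117.3.0 and 36.20220906.3.2), and the rule that stream code 3 means stable is likewise an assumption based on those examples.

```shell
# Usage: check_version_id STREAM VERSION
# Prints the parsed fields on success; returns non-zero on a bad ID.
# The body runs in a subshell so the IFS change does not leak out.
check_version_id() (
    stream=$1
    version=$2

    # Reject empty input, non-digit characters, and misplaced dots.
    case $version in
        '' | *[!0-9.]* | .* | *. | *..*)
            echo "malformed version ID: $version" >&2
            exit 1 ;;
    esac

    # Split on dots into the positional parameters.
    IFS=.
    set -- $version
    if [ $# -ne 4 ] || [ ${#2} -ne 8 ]; then
        echo "expected MAJOR.YYYYMMDD.STREAM.REV, got: $version" >&2
        exit 1
    fi

    # Assumption: stream code 3 denotes stable, as in the IDs in this issue.
    if [ "$stream" = stable ] && [ "$3" != 3 ]; then
        echo "stream code $3 does not look like a stable release" >&2
        exit 1
    fi

    echo "major=$1 date=$2 stream_code=$3 rev=$4"
)
```

For example, `check_version_id stable 36.20220906.3.2` prints the four parsed fields, while an ID with a different third field is rejected.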

At this point, Cincinnati will see the new release on its next refresh and create a corresponding node in the graph without edges pointing to it yet.

Refresh metadata (stream and updates)

  • Wait for all releases that will be released simultaneously to reach this step in the process
  • Go to the rollout workflow, click "Run workflow", and fill out the form
Manual alternative
  • Make sure your fedora-coreos-stream-generator binary is up-to-date.

From a checkout of this repo:

  • Update stream metadata, by running:
fedora-coreos-stream-generator -releases=https://fcos-builds.s3.amazonaws.com/prod/streams/stable/releases.json -output-file=streams/stable.json -pretty-print
  • Add a rollout. For example, for a 48-hour rollout starting at 10 AM ET the same day, run:
./rollout.py add stable <version> "10 am ET today" 48
  • Commit the changes and open a PR against the repo
  • Verify that the PR contains the expected OS versions
  • Post a link to the resulting PR as a comment to this issue
  • Review and approve the PR, then wait for a second person to approve it as well
  • Once approved, merge it and verify that the sync-stream-metadata job syncs the contents to S3
  • Verify the new version shows up on the download page
  • Verify the incoming edges are showing up in the update graph.
Update graph manual check
curl -H 'Accept: application/json' 'https://updates.coreos.fedoraproject.org/v1/graph?basearch=x86_64&stream=stable&rollout_wariness=0'
curl -H 'Accept: application/json' 'https://updates.coreos.fedoraproject.org/v1/graph?basearch=aarch64&stream=stable&rollout_wariness=0'
curl -H 'Accept: application/json' 'https://updates.coreos.fedoraproject.org/v1/graph?basearch=s390x&stream=stable&rollout_wariness=0'
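
The three curl calls above differ only in `basearch`, so the check can be looped. The following is a rough sketch under stated assumptions: the function names are invented, `jq` must be installed, and the graph document is assumed to have the shape `{"nodes": [{"version": ...}], "edges": [[from_index, to_index], ...]}`. If the actual response differs, the `jq` filter needs adjusting.

```shell
# Build the graph URL used in the manual check for one architecture.
graph_url() {
    printf 'https://updates.coreos.fedoraproject.org/v1/graph?basearch=%s&stream=%s&rollout_wariness=0\n' "$1" "$2"
}

# Read a graph document on stdin and print how many edges point at VERSION.
# Assumption: edges are [from_index, to_index] pairs into the nodes array.
count_incoming_edges() {
    jq --arg v "$1" '
        (.nodes | map(.version) | index($v)) as $i
        | [.edges[] | select(.[1] == $i)] | length'
}

# Usage: check_incoming_edges STREAM VERSION
check_incoming_edges() {
    for arch in x86_64 aarch64 s390x; do
        n=$(curl -fsS -H 'Accept: application/json' \
                "$(graph_url "$arch" "$1")" | count_incoming_edges "$2")
        if [ -z "$n" ]; then
            echo "$arch: could not read the graph" >&2
            return 1
        fi
        echo "$arch: $n incoming edge(s) for $2"
    done
}
```

A count of 0 for the new version would mean the node exists at most without incoming edges, which matches the state described right after the release job and before the rollout PR is merged.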

NOTE: In the future, most of these steps will be automated.

Housekeeping

  • If one doesn't already exist, open an issue in this repo for the next release in this stream. Use the approximate date of the release in the title.
  • Issues opened via the previous link will automatically create a linked Jira card. Assign the GitHub issue and Jira card to the next person in the rotation.
@dustymabe
Member

For the releases this cycle let's continue to skip ppc64le. Remove it from the list of arches for the build job when you kick it off. Context: coreos/fedora-coreos-tracker#987 (comment)

@prestist
Contributor

prestist commented Sep 19, 2022

Build:
x86_64: https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/job/build/254/
aarch64: https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/job/build-arch/317/ (I thought the build had failed and rebuilt it)
  2nd try: https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/job/build-arch/321/
  3rd try: https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/job/build-arch/335/
  4th try: https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/job/build-arch/352/
s390x: https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/job/build-arch/318/
  2nd try: https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/job/build-arch/333/
  3rd try: https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/job/build-arch/350/

09:35:02 + cosa remote-session create --image quay.io/coreos-assembler/coreos-assembler:main --expiration 4h
09:35:02 notice: failed to look up uid in /etc/passwd; enabling workaround
09:35:02 Cannot connect to Podman. Please verify your connection to the Linux system using podman system connection list, or try podman machine init and podman machine start to manage a new Linux VM
09:35:02 Error: unable to connect to Podman. failed to create sshClient: connection to bastion host (ssh://builder@/run/user//podman/podman.sock) failed: ssh: handshake failed: EOF
09:35:02 Error: exit status 125

s390x seems to be working now.

OK, after speaking with @bgilbert, and to make sure I don't have any conflicting builds, I am going to kick off the build process from the top again.

Clean-slate start: https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/job/build/272/ ... accidentally included ppc64le, so I am incrementing the 'z' again.

x86 https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/job/build/276/

@dustymabe
Member

The s390x run at build-arch#368 failed for an obscure reason.

We recently started checking whether generators fail in coreos/coreos-assembler#2929. The coreos-platform-chrony generator in the stable branch happens to have a space at the beginning of its regex, so the grep doesn't succeed during the ext.config.ignition.kargs test.

I inadvertently fixed this in coreos/fedora-coreos-config@8c07dcb, so testing and next were fine.

I just did a re-run of the s390x build and I modified the pipeline code to denylist the ext.config.ignition.kargs test so it should make it through this time.

That build is running over in build-arch#369.
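
The failure mode described above is easy to reproduce in isolation. The snippet below is only an illustration: the sample line, both patterns, and the function names are made up, and the real generator's regex is not shown in this thread.

```shell
# Return success if LINE matches the basic regular expression PATTERN.
# Usage: matches PATTERN LINE
matches() {
    printf '%s\n' "$2" | grep -q -e "$1"
}

demo_leading_space() {
    line='platform=example'   # hypothetical input, not the real data

    if matches 'platform=' "$line"; then
        echo "clean pattern: match"
    fi

    # The stray space is part of the pattern, so grep now requires a
    # literal space before "platform=" and exits non-zero.
    if ! matches ' platform=' "$line"; then
        echo "leading-space pattern: no match"
    fi
}
```

Once a caller starts treating a non-zero grep exit as a generator failure, a pattern like the second one turns a silent non-match into a hard error.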

dustymabe changed the title from "stable: new release on 2022-09-19" to "stable: new release on 2022-09-19 (36.20220906.3.2)" on Sep 22, 2022
@dustymabe
Member

dustymabe commented Sep 22, 2022

@prestist
Contributor

Thank you for tagging the builds; I had stepped away from looking at it since I triggered a new one. Thank you for re-triggering the s390x build!

@prestist
Contributor

prestist commented Sep 22, 2022

Looks like the new 390x arch failed again on connecting to activemq.
I am triggering it again. here 390x

@dustymabe
Member

> Looks like the new 390x arch failed again on connecting to activemq. I am triggering it again. here 390x

I killed that one because it didn't have the necessary modifications so that it wouldn't run the ext.config.ignition.kargs test. Started a new one in build-arch#371

@dustymabe
Member

dustymabe commented Sep 22, 2022

          AWS   Azure   GCP   OpenStack
x86_64    ✔️    ✔️      ✔️    ✔️
aarch64   ✔️                  two runs, see notes below

The first OpenStack aarch64 run failed in test setup (infra flake?).
The second OpenStack aarch64 run failed because of coreos/fedora-coreos-tracker#1292. We need to ignore that failure.

@prestist
Contributor

prestist commented Sep 22, 2022

OpenStack and s390x are now green. Proceeding.
Release build: https://jenkins-fedora-coreos-pipeline.apps.ocp.fedoraproject.org/job/release/118/

@prestist
Contributor

Rollout PR: #564

@dustymabe
Member

The second OpenStack aarch64 run failed but that's OK. See the update in #560 (comment)

@prestist
Contributor

Released 🎉
