OCPBUGS-33227: Use openshift-install binary for releases >= 4.16 #6304

Merged

Conversation

carbonin
Member

@carbonin carbonin commented May 9, 2024

In 4.16 the installer binaries started to be built against EL9 libraries rather than EL8. Additionally, support for baremetal installs was added to the openshift-install binary, which was also made statically linked.

This means that using openshift-baremetal-install from a 4.16 release on an EL8-based image will cause errors like:

Failed to prepare the installation due to an unexpected error: failed generating install config for cluster 47ff23ae-012a-421e-89a9-8ae1ca04a67f: error running openshift-install manifests, /data/install-config-generate/installercache/quay.io/openshift-release-dev/ocp-release:4.16.0-rc.0-x86_64/ln_1715091595_openshift-baremetal-install: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by /data/install-config-generate/installercache/quay.io/openshift-release-dev/ocp-release:4.16.0-rc.0-x86_64/ln_1715091595_openshift-baremetal-install)
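
The error above comes from the dynamic loader: the binary was linked against glibc 2.34 (the RHEL 9 version), while an EL8 image provides an older glibc. As a rough diagnostic sketch, the required version could be pulled out of such a message as shown below. The helper `requiredGlibc` and the regular expression are illustrative assumptions, not code from this PR.

```go
package main

import (
	"fmt"
	"regexp"
)

// glibcMismatch matches the dynamic loader's complaint about a missing
// glibc symbol version, e.g. "version `GLIBC_2.34' not found".
var glibcMismatch = regexp.MustCompile("version `GLIBC_([0-9]+(?:\\.[0-9]+)*)' not found")

// requiredGlibc is a hypothetical helper: it returns the glibc version the
// failing binary asked for, or false if msg is not a glibc mismatch.
func requiredGlibc(msg string) (string, bool) {
	m := glibcMismatch.FindStringSubmatch(msg)
	if m == nil {
		return "", false
	}
	return m[1], true
}

func main() {
	// Shortened excerpt of the error text quoted in the PR description.
	msg := "/lib64/libc.so.6: version `GLIBC_2.34' not found"
	if v, ok := requiredGlibc(msg); ok {
		fmt.Printf("binary requires glibc %s, which the base image does not provide\n", v)
	}
}
```

A statically linked binary never reaches this failure mode, because it does not resolve glibc symbols from the host image at run time.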

To address this, this commit moves installs for clusters >= 4.16 to openshift-install, which works correctly regardless of the container base image it runs on. This is only possible for 4.16 and later releases because the changes that allow openshift-install to do baremetal installations (and that statically link it) were only made for 4.16 and will not be backported to older releases.
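
The version gate described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the helper names `majorMinor` and `installerBinaryName` are hypothetical, and the version parsing is a simplified stand-in for whatever version library the service actually uses. Only the two binary names and the 4.16 threshold come from the description.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Binary names are taken from the PR description.
const (
	baremetalInstaller = "openshift-baremetal-install"
	defaultInstaller   = "openshift-install"
)

// majorMinor (hypothetical) extracts the leading "X.Y" from a version
// string such as "4.16.0-rc.0".
func majorMinor(version string) (int, int, error) {
	parts := strings.SplitN(version, ".", 3)
	if len(parts) < 2 {
		return 0, 0, fmt.Errorf("version %q has no minor component", version)
	}
	major, err := strconv.Atoi(parts[0])
	if err != nil {
		return 0, 0, fmt.Errorf("invalid major version in %q: %w", version, err)
	}
	// Drop any pre-release or build suffix attached to a bare "X.Y-foo".
	minorStr := parts[1]
	if i := strings.IndexAny(minorStr, "-+"); i >= 0 {
		minorStr = minorStr[:i]
	}
	minor, err := strconv.Atoi(minorStr)
	if err != nil {
		return 0, 0, fmt.Errorf("invalid minor version in %q: %w", version, err)
	}
	return major, minor, nil
}

// installerBinaryName (hypothetical) picks the statically linked binary for
// 4.16 and later, and the baremetal binary for older releases.
func installerBinaryName(version string) (string, error) {
	major, minor, err := majorMinor(version)
	if err != nil {
		return "", err
	}
	if major > 4 || (major == 4 && minor >= 16) {
		return defaultInstaller, nil
	}
	return baremetalInstaller, nil
}

func main() {
	for _, v := range []string{"4.15.12", "4.16.0-rc.0"} {
		name, err := installerBinaryName(v)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s -> %s\n", v, name)
	}
}
```

Comparing the minor component numerically matters here: a plain string comparison would rank "4.9" above "4.16".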

This is a temporary fix: installing FIPS-compliant clusters will still require openshift-baremetal-install, so the ultimate goal is to be able to run openshift-baremetal-install for any supported OCP version.

Note that this is only relevant when assisted-service is an EL8-based image, which is not currently the case. This PR is mostly meant for backport to earlier versions, which will still need to install 4.16.

List all the issues related to this PR

Related to https://issues.redhat.com/browse/CORS-3024
Resolves https://issues.redhat.com/browse/OCPBUGS-33227

  • New Feature
  • Enhancement
  • Bug fix
  • Tests
  • Documentation
  • CI/CD

What environments does this code impact?

  • Automation (CI, tools, etc)
  • Cloud
  • Operator Managed Deployments
  • None

How was this code tested?

Deployed manually and installed both 4.15 and 4.16 SNO successfully.
Also provided the patched image to telco QE for testing.

  • assisted-test-infra environment
  • dev-scripts environment
  • Reviewer's test appreciated
  • Waiting for CI to do a full test run
  • Manual (Elaborate on how it was tested)
  • No tests needed

Checklist

  • Title and description added to both, commit and PR.
  • Relevant issues have been associated (see [CONTRIBUTING] guide)
  • This change does not require a documentation update (docstring, docs, README, etc)
  • Does this change include unit-tests (note that code changes require unit-tests)

Reviewers Checklist

  • Are the title and description (in both PR and commit) meaningful and clear?
  • Is there a bug required (and linked) for this change?
  • Should this PR be backported?

@openshift-ci-robot openshift-ci-robot added jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. labels May 9, 2024
@openshift-ci-robot

@carbonin: This pull request references Jira Issue OCPBUGS-33227, which is invalid:

  • expected the bug to target the "4.16.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. label May 9, 2024
@openshift-ci openshift-ci bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label May 9, 2024

openshift-ci bot commented May 9, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: carbonin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 9, 2024
@carbonin
Member Author

carbonin commented May 9, 2024

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 9, 2024
@openshift-ci-robot

@carbonin: This pull request references Jira Issue OCPBUGS-33227, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.16.0) matches configured target version for branch (4.16.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (josclark@redhat.com), skipping review request.


internal/oc/release.go: outdated review thread (resolved)

codecov bot commented May 9, 2024

Codecov Report

Attention: Patch coverage is 68.18182%, with 7 lines in your changes missing coverage. Please review.

Project coverage is 68.28%. Comparing base (8ebb08d) to head (bab591c).
Report is 1 commit behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #6304      +/-   ##
==========================================
+ Coverage   68.27%   68.28%   +0.01%     
==========================================
  Files         241      241              
  Lines       35863    35873      +10     
==========================================
+ Hits        24486    24497      +11     
  Misses       9214     9214              
+ Partials     2163     2162       -1     

Files                                       Coverage Δ
internal/ignition/ignition.go               60.57% <0.00%> (ø)
internal/installercache/installercache.go   68.75% <60.00%> (-1.77%) ⬇️
internal/oc/release.go                      71.14% <75.00%> (+0.12%) ⬆️

... and 1 file with indirect coverage changes

In 4.16 the installer binaries started to be built against EL9 libraries
rather than EL8. Additionally, support for baremetal installs was added
to the `openshift-install` binary, which was also made statically linked.

This means that using `openshift-baremetal-install` from a 4.16 release
on an EL8-based image will cause errors like:

```
Failed to prepare the installation due to an unexpected error: failed generating install config for cluster 47ff23ae-012a-421e-89a9-8ae1ca04a67f: error running openshift-install manifests, /data/install-config-generate/installercache/quay.io/openshift-release-dev/ocp-release:4.16.0-rc.0-x86_64/ln_1715091595_openshift-baremetal-install: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by /data/install-config-generate/installercache/quay.io/openshift-release-dev/ocp-release:4.16.0-rc.0-x86_64/ln_1715091595_openshift-baremetal-install)
```

To address this, this commit moves installs for clusters >= 4.16 to
`openshift-install`, which works correctly regardless of the
container base image it runs on. This is only possible for 4.16 and
later releases because the changes that allow `openshift-install` to
do baremetal installations (and that statically link it) were only made
for 4.16 and will not be backported to older releases.

This is a temporary fix: installing FIPS-compliant clusters will still
require `openshift-baremetal-install`, so the ultimate goal is to
be able to run `openshift-baremetal-install` for any supported OCP version.

Note that this is only relevant when assisted-service is
an EL8-based image, which is not currently the case. This PR is mostly
meant for backport to earlier versions, which will still need to install
4.16.

Related to https://issues.redhat.com/browse/CORS-3024
Resolves https://issues.redhat.com/browse/OCPBUGS-33227
@tsorya
Contributor

tsorya commented May 9, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 9, 2024
@carbonin
Member Author

carbonin commented May 9, 2024

/test edge-e2e-metal-assisted

Operator didn't become ready, not an issue with this PR

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 904ee71 and 2 for PR HEAD bab591c in total

@tsorya
Contributor

tsorya commented May 10, 2024

/lgtm

@tsorya
Contributor

tsorya commented May 10, 2024

/retest

@carbonin
Member Author

/retest

@carbonin
Member Author

/hold

Looks like this may indeed be causing the test failure.

@openshift-ci openshift-ci bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 10, 2024
@carbonin
Member Author

/unhold

It wasn't causing the test failure. It's https://issues.redhat.com/browse/OCPBUGS-33493

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 14, 2024
@carbonin
Member Author

carbonin commented May 15, 2024

Spoke with @gamli75 and we're going to override to get this one in since it's blocking telco QE and they already tested the fix and verified it works.

/override ci/prow/edge-e2e-metal-assisted


openshift-ci bot commented May 15, 2024

@carbonin: Overrode contexts on behalf of carbonin: ci/prow/edge-e2e-metal-assisted

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.


openshift-ci bot commented May 15, 2024

@carbonin: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit fdf233a into openshift:master May 15, 2024
14 checks passed
@openshift-ci-robot

@carbonin: Jira Issue OCPBUGS-33227: All pull requests linked via external trackers have merged:

Jira Issue OCPBUGS-33227 has been moved to the MODIFIED state.

@gamli75
Contributor

gamli75 commented May 15, 2024

/cherry-pick release-ocm-2.10

@openshift-cherrypick-robot

@gamli75: new pull request created: #6319

@openshift-bot
Contributor

[ART PR BUILD NOTIFIER]

This PR has been included in build ose-agent-installer-api-server-container-v4.16.0-202405151511.p0.gfdf233a.assembly.stream.el9 for distgit ose-agent-installer-api-server.
All builds following this will include this PR.

@zaneb
Member

zaneb commented May 22, 2024

@carbonin doesn't this break FIPS on the agent-based installer?
I don't understand how this passed CI, because we should be testing with FIPS enabled.

zaneb added a commit to zaneb/assisted-service that referenced this pull request May 22, 2024
….16 (openshift#6304)"

We must always use the openshift-baremetal-install binary because
otherwise enabling FIPS is not possible. The agent-based installer
depends on this.

This reverts commit fdf233a.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/severity-critical Referenced Jira bug's severity is critical for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

9 participants