Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 1894144: baremetal: pin libvirt to 4.5.0 #4339

Merged
merged 1 commit into from Nov 4, 2020

Conversation

stbenjam
Copy link
Member

@stbenjam stbenjam commented Nov 3, 2020

The latest version of RHEL ships with libvirt 6.x, which are now in
yum repositories used to build the CI containers. However, when running
an installer built that's linked against it on an older RHEL host, the
installer fails:

$ ocp/ostest/openshift-baremetal-install --dir ocp/ostest --log-level=debug create manifests
ocp/ostest/openshift-baremetal-install: /lib64/libvirt.so.0: version
```LIBVIRT_5.2.0' not found (required by
ocp/ostest/openshift-baremetal-install)
[...snip...]

According to the RHEL 8 ABI Guarantees,
"an application is only guaranteed to run correctly if it executes in an
environment that is as new as the environment it was built in or newer."
Therefore, we must build openshift baremetal installer against the
oldest version of the libvirt libraries we expect it to run against,
which is 4.5.x.

Long term, the libvirt team is looking at not linking to libvirt at
compile time, but rather use dlopen("libvirt.so") at runtime, and then
dlsym("funcname") to resolve functions at time they are first invoked.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 3, 2020
@berrange
Copy link
Contributor

berrange commented Nov 3, 2020

Theoretically, RHEL is not supposed to do this,
as the X.Y of core software should remain to ensure a stable ABI, but I
guess as libvirt is now distributed as part of AppStream there are looser
requirements.

RHEL ABI compatibility only applies in the forwards direction. ie if you build something in RHEL-8.2, you can run it in 8.3, 8.4, ...9.x, etc. There is no expectation of backwards ABI compatibility, so you can not expect to run on 8.1, 8.0, 7.x, if you built on 8.2.

IOW, to take advantage of the RHEL ABI compatibility you need to build against the oldest version that you intend to be able to run on. Libvirt API is append-only, so you will get full forwards compatibility.

The main limitation with this is that you cannot opt-in to use of newer APIs than is available on your oldest platform.

@stbenjam
Copy link
Member Author

stbenjam commented Nov 3, 2020

Theoretically, RHEL is not supposed to do this,
as the X.Y of core software should remain to ensure a stable ABI, but I
guess as libvirt is now distributed as part of AppStream there are looser
requirements.

RHEL ABI compatibility only applies in the forwards direction. ie if you build something in RHEL-8.2, you can run it in 8.3, 8.4, ...9.x, etc. There is no expectation of backwards ABI compatibility, so you can not expect to run on 8.1, 8.0, 7.x, if you built on 8.2.

IOW, to take advantage of the RHEL ABI compatibility you need to build against the oldest version that you intend to be able to run on. Libvirt API is append-only, so you will get full forwards compatibility.

The main limitation with this is that you cannot opt-in to use of newer APIs than is available on your oldest platform.

Thanks for clarifying - I just had a similar discussion on Slack with @cfergeau and came to the same conclusion after reading https://access.redhat.com/articles/rhel8-abi-compatibility.

What's not clear is how OpenShift CI handles this, we're sort of stuck at whatever repos we're given in the particular container, and it seems CI pushes them forward automatically. As now all of a sudden they have the latest libvirt versions.

The nightly brew OCP versions do not have this problem, it seems to be built consistently against RHEL 8.0 or something close, as it has libvirt 4.5.0.

@stbenjam
Copy link
Member Author

stbenjam commented Nov 3, 2020

My attempt at pinning to 4.5.0 seems to have worked, but now I have some frankenstein with mostly RHEL 8.2-ish packages, and then an older libvirt.

https://gcsweb-ci.apps.ci.l2s4.p1.openshiftapps.com/gcs/origin-ci-test/pr-logs/pull/openshift_installer/4339/pull-ci-openshift-installer-master-images/1323664566072119296/artifacts/build-logs/baremetal-installer.log

@stbenjam
Copy link
Member Author

stbenjam commented Nov 3, 2020

@cfergeau suggested maybe we try building with libvirt-go 4.5 but https://github.com/dmacvicar/terraform-provider-libvirt/ is pinned to at least 5.0, so it may be an accident that any of this actually works today.

@berrange
Copy link
Contributor

berrange commented Nov 3, 2020

FWIW, I'm increasingly of the opinion that we need to change the way the libvirt-go bindings work. Instead of linking to libvirt at compile time, I think it'll be more useful to Go app developers if we instead use dlopen("libvirt.so") at runtime, and then dlsym("funcname") to resolve functions at time they are first invoked. This would avoid the need to have libvirt installed on the host at all when building libvirt-go, and users would only need it installed at runtime if they actually intend to use its functionality.

@stbenjam
Copy link
Member Author

stbenjam commented Nov 3, 2020

FWIW, I'm increasingly of the opinion that we need to change the way the libvirt-go bindings work. Instead of linking to libvirt at compile time, I think it'll be more useful to Go app developers if we instead use dlopen("libvirt.so") at runtime, and then dlsym("funcname") to resolve functions at time they are first invoked. This would avoid the need to have libvirt installed on the host at all when building libvirt-go, and users would only need it installed at runtime if they actually intend to use its functionality.

@berrange This would solve a big problem for us. Today, we ship an entirely separate installer for baremetal (which uses a libvirt bootstrapping VM). What would be involved in such a change?

Also: do you think the way this PR is done makes sense to work around the problem we're currently seeing?

@berrange
Copy link
Contributor

berrange commented Nov 3, 2020

@berrange This would solve a big problem for us. Today, we ship an entirely separate installer for baremetal (which uses a libvirt bootstrapping VM). What would be involved in such a change?

I've been doing experimentation on this part time. Mostly I've been trying to make it possible to auto-generate alot of the required binding code, because it is way too much work to do the conversion manually. It has been low priority because no one expressed a clear and urgent need for it. Since I know your use case now though, I can justify making it a higher priority on my todo list.

Also: do you think the way this PR is done makes sense to work around the problem we're currently seeing?

I don't know much about your CI/deployment needs, but superficially it looks like a reasonable minimum effort approach to try to solve it.

The latest version of RHEL ships with libvirt 6.x, which are now in
yum repositories used to build the CI containers. However, when running
an installer built that's linked against it on an older RHEL host, the
installer fails:

```
$ ocp/ostest/openshift-baremetal-install --dir ocp/ostest --log-level=debug create manifests
ocp/ostest/openshift-baremetal-install: /lib64/libvirt.so.0: version
```LIBVIRT_5.2.0' not found (required by
ocp/ostest/openshift-baremetal-install)
[...snip...]
```

According to the [RHEL 8 ABI Guarantees](https://access.redhat.com/articles/rhel8-abi-compatibility),
"an application is only guaranteed to run correctly if it executes in an
environment that is as new as the environment it was built in or newer."
Therefore, we must build openshift baremetal installer against the
oldest version of the libvirt libraries we expect it to run against,
which is 4.5.x.

Long term, the libvirt team is looking at not linking to libvirt at
compile time, but rather use `dlopen("libvirt.so")` at runtime, and then
`dlsym("funcname")` to resolve functions at time they are first invoked.
@stbenjam
Copy link
Member Author

stbenjam commented Nov 3, 2020

/retitle Bug 1894144: baremetal: pin libvirt to 4.5.0

@openshift-ci-robot openshift-ci-robot changed the title [WIP] baremetal: pin libvirt libs to 4.5 Bug 1894144: baremetal: pin libvirt to 4.5.0 Nov 3, 2020
@openshift-ci-robot openshift-ci-robot added bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Nov 3, 2020
@openshift-ci-robot
Copy link
Contributor

@stbenjam: This pull request references Bugzilla bug 1894144, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.7.0) matches configured target release for branch (4.7.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

In response to this:

Bug 1894144: baremetal: pin libvirt to 4.5.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@hardys
Copy link
Contributor

hardys commented Nov 3, 2020

Seems like a reasonable workaround to unblock CI while we figure out a better long-term plan.

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Nov 3, 2020
@staebler
Copy link
Contributor

staebler commented Nov 3, 2020

/approve

@openshift-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: staebler

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 3, 2020
@stbenjam
Copy link
Member Author

stbenjam commented Nov 3, 2020

/cherry-pick release-4.6

We’ll likely need this for 4.6, too.

@openshift-cherrypick-robot

@stbenjam: once the present PR merges, I will cherry-pick it on top of release-4.6 in a new PR and assign it to you.

In response to this:

/cherry-pick release-4.6

We’ll likely need this for 4.6, too.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

7 similar comments
@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot
Copy link
Contributor

@stbenjam: The following tests failed, say /retest to rerun all failed tests:

Test name Commit Details Rerun command
ci/prow/e2e-ovirt 90bc26f link /test e2e-ovirt
ci/prow/e2e-crc 90bc26f link /test e2e-crc

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 9af0315 into openshift:master Nov 4, 2020
@openshift-ci-robot
Copy link
Contributor

@stbenjam: All pull requests linked via external trackers have merged:

Bugzilla bug 1894144 has been moved to the MODIFIED state.

In response to this:

Bug 1894144: baremetal: pin libvirt to 4.5.0

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-cherrypick-robot

@stbenjam: new pull request created: #4340

In response to this:

/cherry-pick release-4.6

We’ll likely need this for 4.6, too.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-high Referenced Bugzilla bug's severity is high for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants