vmware: kernel_lockdown breaks guestinfo fetching #1092

Open
lucab opened this issue Sep 11, 2020 · 5 comments

@lucab
Contributor

lucab commented Sep 11, 2020

Operating System Version

RHCOS 4.6 nightly (likely recent FCOS too, haven't directly checked)

Ignition Version

2.6.0

Environment

VMware vSphere 7.0, with EFI and Secure Boot enabled.

Reproduction Steps

  1. Before the VM's first boot, enable UEFI Secure Boot by following https://docs.vmware.com/en/VMware-vSphere/7.0/com.vmware.vsphere.vm_admin.doc/GUID-898217D4-689D-4EB5-866C-888353FE241C.html

Actual Behavior

Enabling Secure Boot turns on kernel_lockdown, which in turn blocks the iopl call. Ignition makes that call to get access to the I/O ports used by the hypervisor backdoor.

Symptoms are:

  • fetch stages failing with "operation not permitted"
  • the kernel logging "Lockdown: iopl is restricted" on the console
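
To confirm that lockdown is what blocks the call, the lockdown state can be read from securityfs. The following is a minimal sketch, assuming the kernel exposes `/sys/kernel/security/lockdown` (mainline kernels do since 5.4; distribution kernels with backported patches may differ). The helper name `activeLockdownMode` is made up for this example and is not part of Ignition.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// lockdownPath is the securityfs file that reports the lockdown state.
// Assumption: securityfs is mounted at /sys/kernel/security.
const lockdownPath = "/sys/kernel/security/lockdown"

// activeLockdownMode returns the bracketed entry from the file contents,
// e.g. "none [integrity] confidentiality" yields "integrity".
// It returns "" when no entry is bracketed.
func activeLockdownMode(contents string) string {
	for _, field := range strings.Fields(contents) {
		if strings.HasPrefix(field, "[") && strings.HasSuffix(field, "]") {
			return strings.Trim(field, "[]")
		}
	}
	return ""
}

func main() {
	data, err := os.ReadFile(lockdownPath)
	if err != nil {
		// The file is missing when the kernel lacks lockdown support
		// or securityfs is not mounted.
		fmt.Println("lockdown state unavailable:", err)
		return
	}
	mode := activeLockdownMode(string(data))
	// iopl is restricted in any mode other than "none".
	fmt.Printf("lockdown mode: %q, iopl restricted: %t\n", mode, mode != "" && mode != "none")
}
```

Running this from the emergency console should show whether the failing fetch stage coincides with an active lockdown mode.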

Below is a screenshot from the emergency console:

[Screenshot: ignition-vmware-eperm]

Ref:

@lucab lucab changed the title vmware: kernel lockdown breaks guestinfo fetching vmware: kernel_lockdown breaks guestinfo fetching Sep 11, 2020
@lucab
Contributor Author

lucab commented Sep 15, 2020

For reference, this used to work on v0.x releases by lucky accident, and possibly not deterministically. The bug only affects v2.x releases, starting with v2.0.0-beta.

On 0.35.0, Ignition uses an ancient and incomplete fork of the vmware library: https://github.com/coreos/ignition/tree/v0.35.0/vendor/github.com/sigma. That version does not perform any iopl to properly set access rights to the I/O ports.
In theory this can cause non-deterministic failures depending on the I/O port choice (the hypervisor picks a port on each backdoor access), although I was not able to hit one in practice after a dozen manual attempts.

The library was updated in #793, which went into v2.0.0-beta. It now performs an iopl in its initialization path: https://github.com/coreos/ignition/blob/v2.6.0/vendor/github.com/vmware/vmw-guestinfo/vmcheck/vmcheck.go#L34-L37.
This is technically correct (as it avoids possible non-deterministic failures on port access), but it leads to all 2.x releases not working when kernel_lockdown is active.

@lucab
Contributor Author

lucab commented Sep 28, 2020

The underlying library bug is tracked at vmware-archive/vmw-guestinfo#20. I posted a short-term hotfix at vmware-archive/vmw-guestinfo#21, which alleviates the symptoms in Ignition in most cases.
However, we likely need help from the library owners for a longer-term solution, which is still pending.

@bgilbert
Contributor

This is not fixed upstream. Reopening.

@bgilbert bgilbert reopened this Mar 17, 2022
@miabbott
Member

Heh, we started carrying a patch in Fedora (https://src.fedoraproject.org/rpms/ignition/pull-request/91) just in time for the upstream PR to get merged - vmware-archive/vmw-guestinfo#21 (comment)

@bgilbert
Contributor

#1332 revendors vmw-guestinfo to pick up the upstream workaround. Leaving this open to track a potential longer-term fix.

bgilbert added a commit to coreos/coreos-assembler that referenced this issue Mar 18, 2022
Disable Secure Boot due to coreos/ignition#1092.

Co-authored-by: Joseph Callen <jcallen@redhat.com>
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this issue Oct 10, 2023
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this issue Oct 10, 2023
yasminvalim pushed a commit that referenced this issue Mar 12, 2024
Fetch-Offline:
--------------

Right now, even if fetch-offline gets ErrNeedNet, it might've still
logged info about configs which it did fetch before hitting the error.
This then results in double-logging of e.g. the base config and at least
the first layer of user configs when fetch re-fetches them.

But it's also misleading, because anything which runs between
fetch-offline and fetch and sees the journal messages will think
that Ignition did successfully fetch and cache the merged user config,
when it did not.

And sadly, we still have code which peeks at the cached config for
$reasons (legacy-style RHCOS LUKS is one of them, RHCOS FIPS support
is another), and those bits get thrown off by seeing the logging
messages yet not seeing a cached Ignition config.

Let's tweak things so that we buffer those messages and only actually
write them out once we've successfully acquired the configs.

While we're here, clean up the base config logging hack now that the
fetch stages are canonical.

VMware Kernel Lockdown:
-----------------------

This is a quickfix to avoid performing an `iopl`, which is blocked by
kernel_lockdown under SecureBoot.

Refs:
 * https://bugzilla.redhat.com/show_bug.cgi?id=1877995
 * https://github.com/lucab/vmw_backdoor-rs/issues/6
 * #1092