Consider deleting userdata from provider after Ignition completes #1315

bgilbert · 2022-01-29T21:54:19Z

Feature Request

Environment

At least VirtualBox, VMware

Desired Feature

Some platforms allow modifying or removing userdata using APIs accessible from the VM instance. On those platforms, consider deleting the Ignition config from the platform after Ignition completes.

Other Information

Ignition configs may contain sensitive information that should not be accessible from workloads running on the machine. We currently do nothing to help users with sensitive userdata, and there is no documentation about the issue (coreos/fedora-coreos-docs#306). On platforms with network metadata services, the user can write an Ignition config that firewalls off the metadata service (not documented in Fedora CoreOS: coreos/fedora-coreos-docs#247). On platforms with in-hypervisor metadata accessible by non-root users (at least VMware), firewalling isn't an option, and the user must delete the metadata themselves.

Various minimalistic fixes are possible. We could publish docs on adding a systemd service that disables userdata access. If an external tool is necessary, we could ship it or recommend adding it to the OS. Or we could recommend that OSes check the platform ID and disable userdata access themselves.

However, we generally try to keep Ignition secure by default. I think it would make sense to delete userdata automatically wherever we have an API to do so. This is most important for platforms that can't be remediated by firewalling, but is still a worthwhile improvement on the others. On platforms where we can't delete userdata, we could log a warning recommending that users firewall or otherwise mitigate their risk.

In principle this would be a compatibility break, since it's possible for other software in the instance (or on a VM host) to intentionally access userdata. (That would be hacky, since the userdata must be formatted as an Ignition config for Ignition to care about it, but it's conceivable.) If we're concerned about this, we could add an Ignition config field (ignition.config.delete?) configuring whether Ignition performs the deletion. I think the default should certainly be true after the next major spec bump, but the short-term default is less clear, especially for old specs that don't have the field.

We'd need to delete userdata after Ignition completes, not after fetch succeeds. There are two reasons: kargs changes may reboot the machine and refetch userdata, and a failed Ignition run may be retried by the OS (since Ignition is mostly idempotent).

Platform notes

AWS

Network metadata service. Userdata can be changed with the ModifyInstanceAttribute control plane request, but there's no instance-side API. Control plane access is probably out of scope for us.

Azure

On FCOS and RHCOS, Azure "custom data" access is already disabled at runtime, outside Ignition's control. Azure doesn't allow full access to a VM until it "checks in" to report that it has provisioned successfully, and this checkin causes Azure to disable access to the config drive. Checkin is handled by Afterburn, which runs in the initramfs on RHCOS (to avoid long delays for console access) and in the real root on FCOS (for eventual integration with automatic rollback: coreos/fedora-coreos-tracker#47). It may or may not make sense to move this into Ignition.

The new Azure IMDS userdata mechanism (#1351) does not act this way. The instance can access userdata even after checkin, and there's no instance-side API to disable it.

DigitalOcean

Network metadata service does not appear to support changing userdata.

GCP

Network metadata service does not appear to support changing regular metadata attributes via the metadata service. Guest attributes can be changed from the guest, but the user-data attribute is not one of those. Some additional discussion in coreos/fedora-coreos-docs#306.

Packet

Network metadata service does not appear to support changing userdata.

qemu

fw_cfg userdata is only accessible by root. Write support was removed in qemu/qemu@023e314.

We also support config drives on non-x86_64.

VirtualBox

Guest properties can be modified from inside the guest. They're documented as being accessible only to root, but it still seems useful to remove userdata.

VMware

Userdata is accessible by non-root users. It can be stored directly in guestinfo or in an XML document in guestinfo, and we'd have to handle both cases. See #1300.

The text was updated successfully, but these errors were encountered:

jlebon · 2022-01-31T17:49:35Z

For VMware, it looks like cloud-init supports deleting the user-data, but it's opt-in:

https://github.com/canonical/cloud-init/blob/main/doc/rtd/topics/datasources/vmware.rst#redacting-sensitive-information
https://github.com/canonical/cloud-init/blob/0362abcd762d1bc3a359e7514113683b76d79352/cloudinit/sources/DataSourceVMware.py#L283-L293
https://github.com/canonical/cloud-init/blob/0362abcd762d1bc3a359e7514113683b76d79352/cloudinit/sources/DataSourceVMware.py#L485

jlebon · 2022-01-31T17:54:57Z

Torn on whether we should do this by default or make it opt-in too. The fact that it's also only possible on some platforms is a concern.

Maybe cleanest is a delete key which defaults to false, but if set to true, then we delete it on platforms where we can, but otherwise clearly error out where we can't. That way, users aren't misled into thinking their secrets are safe and can take alternative measures (e.g. firewalling or using platform APIs with higher creds).

bgilbert · 2022-01-31T22:12:00Z

I don't like the idea of making a security feature opt-in, even if we can't support it everywhere. But also, as a practical matter, Butane and Ignition configs aren't inherently platform-specific. (CT generated Ignition configs that were, and it always felt awkward.) For any deployments that use the same config on dev and prod platforms with different support for this option, we'd be forcing them to either maintain separate configs or accept the lower-security option in both places.

I'm leaning toward deleting configs by default, maybe even without an override. I can construct hypothetical cases where someone might need the config left in place, but none of them feel particularly natural. We can always go back and add an override if it turns out someone needs it.

(Alternative: we could have a tri-state, keep/delete/try-delete, defaulting to try-delete.)

jlebon · 2022-02-01T15:11:20Z

I don't like the idea of making a security feature opt-in, even if we can't support it everywhere.

I agree, though my concern is that latter part of that sentence. We could log a warning as you suggested, though that'd be easy to miss and wouldn't be accurate because the user might've taken appropriate measures already and we have no good way to detect that.

But also, as a practical matter, Butane and Ignition configs aren't inherently platform-specific.

True, but whether your config holds secrets is config-specific. :)

(Alternative: we could have a tri-state, keep/delete/try-delete, defaulting to try-delete.)

Hmm, yeah that could work.

For platforms in which we can only firewall, maybe we could for FCOS/RHCOS have Butane sugar to make that easier. E.g. have it create a stamp file and then add a systemd generator to https://github.com/coreos/fedora-coreos-config which sets up the rules for the specific platform we're on?

cgwalters · 2022-02-02T15:16:05Z

As openshift/enhancements#443 notes, the OpenShift SDN currently blocks link-local addresses which is the main mitigation against this from an OCP perspective.

However, note that there definitely exist containers which expect to be able to retrieve user data; those containers today can join the host network namespace to skip the SDN firewalling. If the firewalling is at the host level they'd have to go through further hoops to e.g. remove the iptables rules inside their own netns.

I found kubernetes-sigs/cluster-api-provider-aws#1875 in my web history which is related too.

bgilbert · 2022-02-04T09:17:33Z

@jlebon and I discussed this more OOB. Some (reconstructed) notes:

We've never published any guidance on secret storage. The obvious way to do this "properly" with Ignition is to put secrets in child configs hosted on a special server, which deletes them once provisioning is complete. That requires a) the parent config to be unique for each individual node, b) the secrets to be stored at a unique URL for each node being provisioned, and c) Ignition to inform the server once provisioning is complete.

We could build a component that acts as the server (as a deployable service; it's not obvious how to do it securely as a hosted one), but the user's infrastructure would still have to handle (a) and (b), so probably a dedicated server component is not helpful enough to be worthwhile. Maybe this functionality would make more sense integrated into tools like Matchbox. (c) could be implemented by having the config server send an HTTP reply header giving Ignition a callback URL to hit once provisioning is complete.

On the other hand, tools for cluster-level secret storage already exist and maybe we shouldn't reinvent the wheel. People are putting secrets in configs, though, so we shouldn't ignore this issue entirely.
There's a legitimate use case for not deleting configs: nodes might be reprovisioned and expect to re-fetch the Ignition config from userdata. For example, VMware nodes might run entirely from live PXE, so every boot would need to fetch the Ignition config. I don't think this is a common use case today - for example, PXE nodes probably use ignition.platform.id=metal ignition.config.url= - but it's a supported one. So we probably can't just implement try-delete without an override.
We didn't really make progress on the schema question. I don't think Butane firewall sugar will work (Butane would have to know both the platform ID and the right way to do firewalling for the user's cloud orchestrator), and suspect defaulting to delete will just cause everyone to set try-delete without actually creating the firewall rules we want to encourage. @jlebon doesn't want to implicitly encourage users to skip the needed firewall rules on platforms where that matters.
Userdata deletion should notionally happen when firstboot is marked complete, which is the OS' responsibility. It seems safe to do the deletion first (Ignition will just no-op if it reruns) but that might be confusing.
The technical implementation of userdata deletion is straightforward and best done in the Ignition codebase. In particular, on VMware we sometimes have to modify an XML document and write it back to the guestinfo property. In the short term, it might make sense to implement config deletion as a separate tool (or multicall binary) in this repo and let the OS invoke it, applying whatever policy it wants (perhaps driven by a config file written by the Ignition config). That lets OSes evolve the policy over time without encoding policy in a stable spec. Once we have a better sense of the tradeoffs, we can always drop the separate binary and have Ignition implement a policy directly.

bgilbert · 2022-02-04T09:20:41Z

@cgwalters

However, note that there definitely exist containers which expect to be able to retrieve user data

Really? Why?

cgwalters · 2022-02-04T14:59:50Z

Really? Why?

I think in some cases e.g. kubelet uses this to determine which availability zone it's in. It's not about accessing the user data, but more about instance metadata in general; an IP-level firewall can't block one but not the other.

There's definitely also 3rd party containers which read instance metadata.

I hesitate to say it, but the real solution may need to be something like a transparent proxy which knows about each cloud and filters out just the user data to start. i.e. we use iptables to redirect requests to the proxy process.

cgwalters · 2022-02-04T15:02:27Z

(My initial comment mentioned "user data" when I definitely meant "instance metadata")

pothos · 2022-04-22T04:33:51Z

With #1351 the situation for Azure changed when IMDS is used which will stay available for all clients on the instance (but the docs call out that secrets shouldn't be placed there).

bgilbert · 2022-04-22T06:26:26Z

Thanks, updated the description.

bgilbert added kind/enhancement area/security platform/gcp platform/virtualbox platform/vmware labels Jan 29, 2022

bgilbert mentioned this issue Jan 29, 2022

Security when using vmware to store the ignition config? #1300

Closed

bgilbert added the jira for syncing to jira label Feb 4, 2022

bgilbert removed the platform/gcp label Mar 21, 2022

bgilbert mentioned this issue Apr 20, 2022

Delete userdata from VirtualBox/VMware after Ignition completes #1350

Merged

bgilbert closed this as completed in #1350 May 3, 2022

GoVulnBot mentioned this issue May 17, 2022

x/vulndb: potential Go vuln in github.com/coreos/ignition: CVE-2022-1706 golang/vulndb#451

Closed

jba mentioned this issue May 17, 2022

x/vulndb: potential Go vuln in github.com/coreos/ignition: CVE-2022-1706 jba/nested-modules#316

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider deleting userdata from provider after Ignition completes #1315

Consider deleting userdata from provider after Ignition completes #1315

bgilbert commented Jan 29, 2022 •

edited

jlebon commented Jan 31, 2022

jlebon commented Jan 31, 2022

bgilbert commented Jan 31, 2022

jlebon commented Feb 1, 2022

cgwalters commented Feb 2, 2022

bgilbert commented Feb 4, 2022

bgilbert commented Feb 4, 2022

cgwalters commented Feb 4, 2022

cgwalters commented Feb 4, 2022

pothos commented Apr 22, 2022

bgilbert commented Apr 22, 2022

Consider deleting userdata from provider after Ignition completes #1315

Consider deleting userdata from provider after Ignition completes #1315

Comments

bgilbert commented Jan 29, 2022 • edited

Feature Request

Environment

Desired Feature

Other Information

Platform notes

AWS

Azure

DigitalOcean

GCP

Packet

qemu

VirtualBox

VMware

jlebon commented Jan 31, 2022

jlebon commented Jan 31, 2022

bgilbert commented Jan 31, 2022

jlebon commented Feb 1, 2022

cgwalters commented Feb 2, 2022

bgilbert commented Feb 4, 2022

bgilbert commented Feb 4, 2022

cgwalters commented Feb 4, 2022

cgwalters commented Feb 4, 2022

pothos commented Apr 22, 2022

bgilbert commented Apr 22, 2022

bgilbert commented Jan 29, 2022 •

edited