
FAQ: AWS private regions #396

Merged

Conversation

cgwalters
Member

This has come up a few times.

@cgwalters
Member Author

Didn't test this e2e, but it's how it should work.

@cgwalters
Member Author

Some people are hitting:

```
IMPORTIMAGETASKS        x86_64  rhcos-43.81.201912030353.0-aws.x86_64.vmdk     import-ami-fg6w6vod     Linux   deleted ClientError: EFI partition detected. UEFI booting is not supported in EC2.
```

Offhand, I don't know why our pipeline isn't hitting this. One theory is that it's somehow specific to the `aws` CLI tool, but that seems unlikely. Another possibility is that only newer EC2 regions implement this check; we're uploading to the One True Region (us-east-1) and replicating from there.
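
For reference, a sketch of the kind of VM Import flow that surfaces this error; the bucket, key, and task ID here are placeholders, not values from our pipeline:

```sh
# Stage the VMDK in S3 (placeholder bucket/key).
aws s3 cp rhcos-aws.x86_64.vmdk s3://my-import-bucket/rhcos-aws.x86_64.vmdk

# Kick off the VM Import path that performs the EFI check.
aws ec2 import-image \
  --disk-containers "Format=VMDK,UserBucket={S3Bucket=my-import-bucket,S3Key=rhcos-aws.x86_64.vmdk}"

# Poll the task; in affected regions it ends up "deleted" with the
# ClientError above. (Task ID is a placeholder.)
aws ec2 describe-import-image-tasks --import-task-ids import-ami-xxxxxxxx
```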

@cgwalters
Member Author

This all said, maybe we should just delete the UEFI partition in our EC2 images. Part of me feels that it unnecessarily breaks the uniformity we have, and it's also explicitly against where we want to go in the future (using UEFI more across the board) but...

@darkmuggle

Uniformity for a dead UEFI partition that we know won't be used doesn't really buy us much; in this case, the uniformity is academic. Dropping the UEFI partition seems like a minor thing, since we can just delete the partition and not change anything else. And with GPT partitions, we can still keep root on partition 4.

@jlebon
Member

jlebon commented Jan 14, 2020

I think the property of having each platform image be a simple transform step away from the others is really nice, but I'm not strongly opposed to dropping it. There's something subtle going on here, though, if neither RHCOS nor FCOS hits this. Probably worth investigating a bit before using the nuclear option?

Note also if we do this, we'll probably have to adapt the mount generator too.

@jaredhocutt

Due to the UEFI issue described by @cgwalters, I've been working to find a workaround for getting an RHCOS image into AWS manually (especially in private AWS regions). Here are the details for how I used the RHCOS bare metal BIOS raw image and modified it to work: https://github.com/jaredhocutt/openshift4-aws/tree/master/rhcos#how-we-got-it-to-work

It's not ideal and I wouldn't expect anyone to actually do it that way if they want a supported cluster, but I did want to pass along what I've figured out.

@openshift-ci-robot added the needs-rebase label on Jan 22, 2020
@cgwalters
Member Author

> Here are the details for how I used the RHCOS bare metal BIOS raw image and modified it to work: https://github.com/jaredhocutt/openshift4-aws/tree/master/rhcos#how-we-got-it-to-work

Eek, no, please don't do it that way. By snapshotting a booted system, you've saved machine-specific state like the SSH host keys (so each machine will have the same host key, random seed, etc.)

What you want to do is zap the partition offline. You should be able to do this by getting the raw VMDK file and using any partitioning program (`fdisk`, etc.) on it.

As mentioned above, you'll then have a failed systemd unit on startup looking for it, so you'd probably need to replace the partition contents with something that isn't a FAT filesystem. (Or disable the unit, but that's a bit awkward to do in a way that persists across upgrades.)
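
A minimal sketch of that offline zap, assuming `qemu-img` and `sgdisk` are available and that the ESP turns out to be partition 2 (verify with the print step before deleting anything):

```sh
# Convert the VMDK to raw so ordinary partitioning tools can operate on it.
qemu-img convert -f vmdk -O raw rhcos-aws.x86_64.vmdk rhcos.img

# Print the GPT and identify the ESP (type code EF00).
sgdisk --print rhcos.img

# Delete the ESP -- assumed here to be partition 2; adjust to match the output above.
sgdisk --delete=2 rhcos.img

# Convert back to a stream-optimized VMDK for upload. Note the caveat above:
# deleting (rather than replacing) the partition leaves a failed mount unit at boot.
qemu-img convert -f raw -O vmdk -o subformat=streamOptimized rhcos.img rhcos-noefi.vmdk
```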

We're discussing potential upstream fixes here.

@jaredhocutt

> Eek, no, please don't do it that way. By snapshotting a booted system, you've saved machine-specific state like the SSH host keys (so each machine will have the same host key, random seed, etc.)

I mounted it as a secondary disk and did not boot it. So I did exactly what you said, just by mounting it to an EC2 instance instead of doing it on my laptop.

@cgwalters
Member Author

> I mounted it as a secondary disk and did not boot it. So I did exactly what you said, just by mounting it to an EC2 instance instead of doing it on my laptop.

Got it, sorry. Yes, that's fine.

@jaredhocutt

I was also able to figure out how to get the 4.3 AWS VMDK image to work, which I've added to the same GitHub page just below my details for the bare metal image in 4.2.

The big issue is that with the current 4.3 AWS VMDK, you cannot use `aws ec2 import-image` to import the image as-is, because it gives you the "EFI partition detected" error. It seems the `aws ec2 import-image` command tries to "help" you by inspecting the image, and as soon as it sees the EFI partition, it fails.

However, I was able to import the image just as a simple snapshot using `aws ec2 import-snapshot` and then use `aws ec2 register-image` to create the AMI. That worked, and I was able to boot the image with a `bootstrap.ign` file (I didn't go through with a full install though).

So this works for now, but we really need to have an image that we can use `aws ec2 import-image` with as-is.
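
For reference, a sketch of that working path with the `aws` CLI; the bucket, key, AMI name, and IDs are placeholders, and details like the root device name may differ:

```sh
# Import the VMDK as a plain snapshot -- this path skips the image
# "help" heuristics that trip on the EFI partition.
aws ec2 import-snapshot \
  --disk-container "Format=VMDK,UserBucket={S3Bucket=my-import-bucket,S3Key=rhcos-aws.x86_64.vmdk}"

# Poll until the task completes and note the resulting snapshot ID.
aws ec2 describe-import-snapshot-tasks --import-task-ids import-snap-xxxxxxxx

# Register an AMI on top of the imported snapshot.
aws ec2 register-image \
  --name rhcos-4.3 \
  --architecture x86_64 \
  --virtualization-type hvm \
  --ena-support \
  --root-device-name /dev/xvda \
  --block-device-mappings "DeviceName=/dev/xvda,Ebs={SnapshotId=snap-xxxxxxxx,VolumeType=gp2}"
```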

@jlebon
Member

jlebon commented Jan 22, 2020

> However, I was able to import the image just as a simple snapshot using `aws ec2 import-snapshot` and then use `aws ec2 register-image` to create the AMI

Ahh yup, this matches up with what `ore aws upload` does, so it makes sense that this works. (And actually, anyone can use `ore` to do this, but clearly the `aws` CLI is the standard tool.)

> So this works for now, but we really need to have an image that we can use `aws ec2 import-image` with as-is.

Hmm, might be worth asking AWS to refine that API to not erroneously reject images that have an EFI partition if they also have a BIOS boot partition. Or, barring that, to add some kind of "I know what I'm doing" flag.

@cgwalters
Member Author

> However, I was able to import the image just as a simple snapshot using `aws ec2 import-snapshot` and then use `aws ec2 register-image` to create the AMI. That worked, and I was able to boot the image with a `bootstrap.ign` file (I didn't go through with a full install though).

If that works consistently, then I think it's much simpler to just document it. I'll update this PR. And further, for OpenShift, the installer should have a high-level command for this.

@openshift-ci-robot added the size/S label and removed the needs-rebase label on Jan 23, 2020
@jaredhocutt

> If that works consistently, then I think it's much simpler to just document it. I'll update this PR. And further, for OpenShift, the installer should have a high-level command for this.

It may be simpler to document, but it's not how users of AWS expect it to work. The AWS documentation describes using `aws ec2 import-image` as the way to import AMIs. The RHCOS AWS image should be in a format that works with that command.

@jlebon
Member

jlebon commented Jan 24, 2020

> It may be simpler to document, but it's not how users of AWS expect it to work. The AWS documentation describes using `aws ec2 import-image` as the way to import AMIs. The RHCOS AWS image should be in a format that works with that command.

I've opened a support case with AWS to fix the problematic `ImportImage` heuristic.

@jlebon
Member

jlebon commented Jan 24, 2020

/lgtm

@openshift-ci-robot added the lgtm label on Jan 24, 2020
@openshift-merge-robot openshift-merge-robot merged commit a9f3a42 into openshift:master Jan 24, 2020
@jlebon
Member

jlebon commented Jan 27, 2020

Got a response from AWS about this. Essentially, the `ImportImage` API isn't just the equivalent of `ImportSnapshot` + `RegisterImage`. It's more part of the VM import/export path for people migrating their workloads to AWS: https://docs.aws.amazon.com/vm-import/latest/userguide/vmie_prereqs.html

As such, the API is much more invasive. For example, for Windows images, it'll detect UEFI boot partitions and convert them to MBR. It doesn't support Linux UEFI images.

But the point is that there's a mismatch of intent: its goal is to implement automatic conversion heuristics, which I don't think we want. So overall, I think we should stick with the `ImportSnapshot` workflow to be sure the final image is exactly as we intend.

@jaredhocutt

@jlebon Thanks for the update. In that case, when we document this method, it would be nice to do two things:

  1. Describe the steps for getting an RHCOS image into AWS, but also link to this page in the AWS documentation for reference: https://docs.aws.amazon.com/vm-import/latest/userguide/vmimport-import-snapshot.html

  2. Specifically call out that using the VM import documentation in AWS will not work. I suspect a lot of people will try that first, and when they finally come back to actually read the documentation, it would be nice for them to find a specific statement confirming that the VM import method will not work and that they should follow different instructions.

jlebon added a commit to jlebon/os that referenced this pull request Jan 31, 2020
Flesh things out a bit more based on discussions in
openshift#396.
@jlebon
Member

jlebon commented Jan 31, 2020

@jaredhocutt I posted a follow-up here: #398.

@jaredhocutt

> @jaredhocutt I posted a follow-up here: #398.

Awesome! Thanks @jlebon :)

@dmc5179

dmc5179 commented May 18, 2020

The AMI export process fails for the same reason: the UEFI partition. This means it is not possible to get RHCOS images onto AWS Snowball Edge devices. I've spoken with the AWS TAMs at the NGA, and they have said it is not possible to import snapshots and register images on Snowball Edge devices the way it can be done for standard AWS, as described in this issue.

cgwalters added a commit to cgwalters/fedora-coreos-config that referenced this pull request May 18, 2020
Nothing in the OS touches the ESP by default, so there's
no reason to mount it by default, particularly writable.
This is good for avoiding wear&tear on the filesystem, but
I am specifically doing this as preparation for potentially
removing the ESP from AWS images, because AWS `ImportImage`
chokes on its presence:
openshift/os#396
cgwalters added a commit to cgwalters/fedora-coreos-config that referenced this pull request May 25, 2020
Preparation for potentially removing the ESP from AWS images,
because AWS `ImportImage` chokes on its presence:
openshift/os#396
@jlebon
Member

jlebon commented Jan 14, 2021

Re: Snowball, it looks like the EFI partition is no longer an issue now, as Dan mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1794157#c14.
