Support for Vultr platform #355

Closed
9 tasks done
dghubble opened this issue Jan 28, 2020 · 39 comments

Comments

@dghubble
Member

dghubble commented Jan 28, 2020

Opened per ignition#918

Vultr is a cloud (and bare-metal) hosting provider that may be reasonable for Fedora CoreOS to assemble images for. Vultr supports uploading ISOs by URL and iPXE booting, so it's been flexible to develop and hack with. They accept raw images, similar to GCP. Vultr serves user-data for cloud-init, but it's free-form and I've used it to serve Ignition configs just fine. Vultr provides CoreOS images themselves (cloud-init only), though it was not a CoreOS publish target. The platform's willingness to let folks use any raw image they can boot means, to me, that it's pretty unlikely there are required agents or anything of that sort.

I'd like to be able to build a raw image for Vultr that knows about their user-data (i.e. platform-id vultr). Such a raw image could be uploaded and used directly by end users. Possible steps:

Hacks today:

  • Boot "installer" ISO - drops to emergency shell so you can install to disk with an ignition shim (ISO installer was deprecated)
  • Boot "live" ISO - boots, but not ideal because you can't ssh in to add an ignition shim :(
  • iPXE - you can use iPXE with their cloud VMs (though it's not a great fit and better suited for bare-metal). Currently panics on boot (separate bug, discuss separately)

Background

I've been running Fedora CoreOS on Vultr for a few months as part of some personal experimental infrastructure. I used the installer ISO approach to prepare a snapshot (rather big, but works) after disk install and it operates alright, at least for Kubernetes node use cases.

Why

Though Vultr is not a major cloud, it's quite flexible and developer-friendly. I'd consider Vultr (and DigitalOcean) in a separate category since support there is geared toward developer affection and interesting experiments, which spawns blog posts, tutorials, and just general good vibes. I think it's an underestimated value, and Fedora CoreOS could gain if partial or full support were developed.

Note: I don't have any affiliation with Vultr, just using it personally

@dghubble
Member Author

I've only had a bit of personal time, but was able to get coreos-assembler (quite different from CoreOS SDK) and build a vanilla image. Seems like I'd need the scripting to build a raw image similar to qemuvariant but specify the platform id to Ignition and bump Ignition somewhere, and then I could upload and test it out.

@bgilbert
Contributor

The live ISO should boot to a shell prompt on console, similar to the installer ISO failure case. I'm not sure I understand the problem you're having there; could you elaborate?

Adding the platform SGTM. We should certainly add Afterburn support in that case; it's not essential for boot but it's functionality we should provide consistently on all platforms.

@dghubble
Member Author

You're right, the current live ISO does automatic login. I must be recalling some prior build; I can't reproduce being stuck at a login prompt.

One complication with the live ISO approach is the RAM requirement. With the installer ISO, you can boot the $5 instance (1GB, 25GB disk), disk install with a tweaked ignition config to point to Vultr user-data, and snapshot for a 25GB snapshot (not great, but meh). The live ISO needs at least the larger instance (2GB, 50GB disk) and the snapshot ends up being 50GB. Arguably, Vultr could do disk snapshots differently or I could manipulate those raw images, but I know this is getting out of hand and isn't an ideal usage pattern.

That led to wanting to add Vultr awareness to Ignition and build the desired image directly.

@lucab
Contributor

lucab commented Jan 29, 2020

I agree having directly usable cloud images is a better path forward.

It sounds like the platform provides network configuration via DHCP, right? In that case, Afterburn is not in the hot path. I've opened a ticket there to add support for SSH keys and attributes, which can be done at any later point.

I've updated the top-comment with more ticket references.

So far though I haven't found any docs regarding the disk image format. That is, do they maybe support some compressed format, or perhaps VM specific ones (qcow/vmdk/etc)?

@dghubble
Member Author

dghubble commented Jan 29, 2020

I don't find a docs page, but from the Vultr "snapshot" dashboard (separate from their ISO upload dashboard) where an end-user provides a URL for Vultr to go fetch a raw image:

- Stored snapshots are currently free - pricing subject to change.
- We recommend using DHCP for networking. By default, Vultr instances are configured to use DHCP.
- Snapshots can only be restored to equal or bigger disks. If there is a single partition, it will be automatically expanded.
- Snapshot must be in RAW format (no VMDK, QCOW2, etc)
- Maximum uploaded snapshot size is 150 GB
- Snapshot must support VirtIO Ethernet/Disk Controllers.
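
The dashboard constraints above can be turned into a small pre-flight check before handing Vultr a URL. This is only a sketch: the function names are made up for illustration, and it assumes "150 GB" means 150 × 10^9 bytes, which the dashboard text does not specify.

```bash
# Pre-flight checks for a raw image before offering it to Vultr by URL.
# Assumption: "150 GB" is decimal (150 * 10^9 bytes); the dashboard text
# does not say whether decimal or binary units are meant.
MAX_SNAPSHOT_BYTES=$((150 * 1000 * 1000 * 1000))

# snapshot_size_ok BYTES: succeed if BYTES is within the upload limit.
snapshot_size_ok() {
    [ "$1" -le "$MAX_SNAPSHOT_BYTES" ]
}

# looks_like_raw_name FILE: succeed if the file name ends in .raw.
# This is a file-name heuristic only; it does not inspect the contents.
looks_like_raw_name() {
    case "$1" in
        *.raw) return 0 ;;
        *)     return 1 ;;
    esac
}
```

With GNU coreutils, the size can be passed in as `snapshot_size_ok "$(stat -c %s image.raw)"`. The VirtIO requirement cannot be checked this way; it depends on the drivers inside the image.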

@jlebon
Member

jlebon commented Jan 29, 2020

This was discussed in the community meeting today. There was no major opposition since it seems rather straightforward to support. So 👍 from me!

jlebon pushed a commit to coreos/coreos-assembler that referenced this issue Jan 31, 2020
* Build a raw image with the Ignition provider set to vultr
to correspond with coreos/ignition#918
* coreos/fedora-coreos-tracker#355
@dghubble
Member Author

dghubble commented Feb 18, 2020

I was able to build a Vultr image (with recent ignition and ignition-dracut) and use it for instances on Vultr via custom snapshots. 🙌

Custom override testing seems quite nice once the pattern is known (mentioned below):

# in the ignition source checkout
make install DESTDIR=/home/user/fcos/overrides/rootfs
# in the ignition-dracut source checkout
make install DESTDIR=/home/user/fcos/overrides/rootfs
# in the fcos working directory
cosa build
cosa buildextend-vultr

So after the next Ignition release / Bodhi update, I believe cosa buildextend-vultr images will work for folks who choose to make em.
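
For anyone repeating the override steps, here is a rough sketch that strings them together. The `run` helper, the `DRY_RUN` variable, and the function name are illustrative and not part of coreos-assembler. It assumes `cosa` is callable as a command or shell function, and that each source checkout supports `make install DESTDIR=...` as described above.

```bash
# run CMD...: print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# build_vultr_image WORKDIR SRC_DIR...
# WORKDIR is an initialized cosa working directory; each SRC_DIR is a source
# checkout (e.g. ignition, ignition-dracut) to install into the overrides.
# The body runs in a subshell so the caller's directory is left unchanged.
build_vultr_image() (
    workdir=$1
    shift
    for src in "$@"; do
        run make -C "$src" install DESTDIR="$workdir/overrides/rootfs" || exit 1
    done
    run cd "$workdir" || exit 1
    run cosa build || exit 1
    run cosa buildextend-vultr
)
```

Setting `DRY_RUN=1` before calling `build_vultr_image /home/user/fcos ./ignition ./ignition-dracut` prints the commands without running them, which is a cheap way to check the paths first.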

@lucab
Contributor

lucab commented Feb 18, 2020

@dghubble great news!

I guess a parallel path to explore would be to check with Vultr folks whether they have any interest in directly offering Fedora CoreOS templates. They currently seem to have CoreOS Container Linux templates, which would be EOL anyway before the end of 2020.

/cc @ddymko @Oogy

@ddymko

ddymko commented Feb 18, 2020

Hey all 👋

We are definitely interested!

I'll create an internal ticket to add in support for Fedora CoreOS snapshots and I'll update this ticket with the progress of that.

If there is anything else we can help with, please let us know.

@ddymko

ddymko commented Mar 3, 2020

Hey @lucab @dghubble

I just spoke to the image team; they are shooting for a mid-March release of a ton of new snapshots, which include Fedora CoreOS.

They let me know that if you would like an early preview build of the snapshot, we can add it to your Vultr accounts. This way you can do some sanity testing against the images we are going to be releasing.

Let me know if you guys are interested!

@dghubble
Member Author

dghubble commented Mar 3, 2020

I'm actively using my test build and am happy to try the Vultr previews (filed #DSB-47CQQ).

@ddymko

ddymko commented Mar 6, 2020

Hey @dghubble

The preview snapshot should be ready today and it will get added to your account as a deploy-able snapshot.

@dghubble
Member Author

It may be useful to have releases for ignition and ignition-dracut that contain the merged patches, to eliminate the temporary go build and override step mentioned above. It works, but I suspect it may be more difficult for Vultr to create their image until then.

ignition-2.1.1-5.git40c0b57.fc31.x86_64 seems too old.

@dustymabe
Member

Agreed. I think there are some other patches in ignition (coming soon) that we'll want so we should be cutting a release soon.

@dghubble
Member Author

With the new ignition and ignition-dracut releases, I built a new vultr image and now use it in prod:

cosa build
cosa buildextend-vultr
# produced fedora-coreos-31.20200327.dev.0-vultr.x86_64.raw <- date at time of testing

I also tested one preview image from Vultr folks and they're working on getting their build process established I believe.

@summatix

I was able to get Fedora CoreOS running on Vultr using coreos-installer. Seems to be working well so far.

@jlebon
Member

jlebon commented May 8, 2020

I added some more checkboxes to the original issue description for a few more items that are needed before it's fully plumbed through.

jlebon pushed a commit to coreos/fedora-coreos-pipeline that referenced this issue May 8, 2020
* Vultr images can't be uploaded yet, but it would be nice
to build them in an automated fashion
* coreos/fedora-coreos-tracker#355
@dghubble
Member Author

I can try to look into those missing plumbing items next week. Separately, I've done some testing with some snapshots with Vultr folks and that's looking promising. I'll let them chime in if they like, including about any remaining questions or hurdles.

@dghubble
Member Author

dghubble commented Jun 9, 2020

lucab pushed a commit to coreos/fedora-coreos-stream-generator that referenced this issue Jun 9, 2020
* FCOS Jenkins pipeline builds Vultr disk images, but they
need to be mentioned in stream metadata too
* coreos/fedora-coreos-tracker#355
@lucab
Contributor

lucab commented Jun 16, 2020

@dghubble @summatix @ddymko on a topic similar to #538, do machines on Vultr get their hostname from DHCP or do we need to wire it from the instance metadata?

@summatix

I'm not sure how machines get their hostname. By default it seems machines have a hostname of vultr.guest. When I install FCOS the hostname is automatically set to [PUBLIC IP].vultr.com.

@dghubble
Member Author

Checking an instance, the hostname seems to be set early on in NetworkManager, I don't find a lease file like in #538 so I'm not sure.

NetworkManager[816]: <info>  [1592466207.7101] hostname: hostname: using hostnamed
NetworkManager[816]: <info>  [1592466207.7104] hostname: hostname changed from (none) to "ams"
...
NetworkManager[816]: <info>  [1592466207.7271] dhcp-init: Using DHCP client 'internal'

Vultr accepts instance hostnames at creation via the UI or API and I do see those names set.

The hostname is also available via the metadata service at http://169.254.169.254/metadata/meta-data/hostname.
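
A minimal sketch of querying that endpoint from inside an instance. The base URL is copied as quoted in this comment; the function names and the 5-second timeout are illustrative choices, not anything Vultr or Afterburn prescribes.

```bash
# Base URL as quoted in the comment above.
METADATA_BASE="http://169.254.169.254/metadata/meta-data"

# metadata_url KEY: build the URL for a metadata key such as "hostname".
metadata_url() {
    printf '%s/%s\n' "$METADATA_BASE" "$1"
}

# fetch_metadata KEY: print the value, failing on HTTP errors or timeouts.
# Only works from inside an instance, where the link-local address is served.
fetch_metadata() {
    curl --fail --silent --show-error --max-time 5 "$(metadata_url "$1")"
}
```

On an instance, `fetch_metadata hostname` would print whatever name the service reports. Anywhere else, the request simply times out.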

@lucab
Contributor

lucab commented Jun 23, 2020

Thanks. My guess is that DHCP is not offering the hostname and NM is setting it based on reverse DNS.

@dustymabe
Member

> Checking an instance, the hostname seems to be set early on in NetworkManager, I don't find a lease file like in #538 so I'm not sure.

You don't see any files at /var/lib/NetworkManager/internal-*lease ?

@dghubble
Member Author

No lease files or domain in nmcli device show eth0

ls /var/lib/NetworkManager/
NetworkManager-intern.conf  secret_key  timestamps

There are PTR records, but they resolve to names that don't seem useful to me (given the address is known).

66.xx.yy.zz.in-addr.arpa. 3600 IN     PTR     zz.yy.xx.66.vultr.com.

Seems like systemd may be setting this early on, according to dmesg.

[   14.140851] systemd[1]: Set hostname to <ams>.
[   14.144278] systemd[1]: Initializing machine ID from KVM UUID.
[   14.375048] systemd[1]: Configuration file /usr/lib/systemd/system/ignition-firstboot-complete.service is marked executable. Please remove executable permission bits. Proceeding anyway.

@dustymabe
Member

hey @ddymko - any chance we could get some accounts on vultr for Fedora so we can run some tests on some features we're adding? find me in #fedora-coreos on freenode if you want to chat higher bandwidth.

@ddymko

ddymko commented Jul 10, 2020

@dustymabe sure thing! Create an account on vultr.com and create a support ticket. I'll get it squared away for you.

@dustymabe
Member

Thanks @ddymko - created an account. Will wait for more info from the partner team.

lucab pushed a commit to coreos/fedora-coreos-docs that referenced this issue Jul 14, 2020
* Add Provisioning Fedora CoreOS on Vultr docs that
shows uploading a snapshot image and creating an
instance using the `vultr-cli`
* Ignition config must include SSH authorized key setup

Related: coreos/fedora-coreos-tracker#355
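
Since the docs commit notes that the Ignition config must set up an SSH authorized key, here is a hedged sketch of generating a minimal config to pass as user-data. The function name is hypothetical, the key is inserted verbatim with no JSON escaping, and spec version 3.0.0 is an assumption about what the image's Ignition accepts.

```bash
# render_ignition SSH_PUBKEY: print a minimal Ignition config (spec 3.0.0)
# that authorizes one SSH public key for the "core" user.
# The key is inserted verbatim, so it must not contain quotes or backslashes.
render_ignition() {
    key=$1
    cat <<EOF
{
  "ignition": { "version": "3.0.0" },
  "passwd": {
    "users": [
      { "name": "core", "sshAuthorizedKeys": ["${key}"] }
    ]
  }
}
EOF
}
```

For example, `render_ignition "$(cat ~/.ssh/id_ed25519.pub)" > config.ign` writes a file that can be supplied as the instance's user-data.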
@summatix

The new native Fedora CoreOS server type is working well for me. Makes it much easier to get started with FCOS on Vultr.

The only issue I'm facing is in regards to private networking. If I set the /etc/NetworkManager/system-connections/private-net.nmconnection file statically via ignition then everything works as expected. The trouble is, the contents of this file cannot be completely known until after the server is provisioned (since Vultr automatically assigns an available private IP). When I was using coreos-installer I was able to call upon Vultr's metadata API in order to dynamically generate this file for ignition. Now the ignition is set statically via userdata before provisioning.

I was hoping Fedora CoreOS would be able to pick up the private networking automatically (e.g. via DHCP). Instead, if I don't configure private-net.nmconnection statically via ignition while private networking is enabled, the NetworkManager service fails to start up:

$ journalctl -u NetworkManager-wait-online
Starting Network Manager Wait Online...
NetworkManager-wait-online.service: Main process exited, code=exit>
NetworkManager-wait-online.service: Failed with result 'exit-code'.
Failed to start Network Manager Wait Online.

@dghubble
Member Author

Ignition is a first-boot initramfs disk manipulation tool, for writing units, network configs, etc. that are acted upon later, in userspace. I think your question is about the particular network config, static assignment vs DHCP assignment. For static assignment you'd need to reserve an IP with Vultr beforehand; DHCP (the default) works for the public interface, but I recall Vultr was still fleshing out the details for private networks.

Btw, the Vultr "native Fedora CoreOS" images you mention are built by Vultr. They may not be quite the same as the (new) build artifacts. I'm not sure what Vultr's final plan was (e.g. fetch FCOS published images or custom build).

@lucab
Contributor

lucab commented Jul 15, 2020

Self-note:

  • the current images on Vultr are a development build (32.20200618.dev.0) marked as stable but not following auto-updates for that stream
  • I do see lease files at /var/lib/NetworkManager/*.lease, no hostname in there
  • NM logs have a policy: set-hostname: set hostname to '136.244.81.212.vultr.com' (from address lookup)
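
For reference, the "address lookup" is an ordinary reverse-DNS (PTR) query. A small sketch of how the query name is derived from an IPv4 address; the function name is illustrative and no input validation is attempted.

```bash
# ptr_query_name IPV4: print the in-addr.arpa name queried for a PTR lookup.
# The four octets are reversed and suffixed with in-addr.arpa.
ptr_query_name() {
    local a b c d
    IFS=. read -r a b c d <<EOF
$1
EOF
    printf '%s.%s.%s.%s.in-addr.arpa\n' "$d" "$c" "$b" "$a"
}
```

For the address in the log line above, `ptr_query_name 136.244.81.212` gives `212.81.244.136.in-addr.arpa`. Running `dig +short -x 136.244.81.212` performs the actual lookup.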

@lucab
Contributor

lucab commented Jul 15, 2020

Hey @ddymko, we are assembling the final bits of this and we realized we have a small impedance mismatch on how the snapshot create-url API works, see coreos/coreos-assembler#1595.

Would it be possible to augment the snapshot-by-URL API to optionally get these additional parameters from the client?

  • a compression algorithm (in our case, xz), with your backend doing a decompression step
  • a digest (in our case, SHA-256), with your backend doing an integrity check before decompressing/importing
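
To make the request concrete, this is roughly the verify-then-decompress sequence being asked for, sketched as it would run client-side with `sha256sum` and `xz`. The function names are illustrative and nothing here reflects how Vultr's backend works.

```bash
# sha256_matches FILE EXPECTED_HEX: succeed if FILE has the given digest.
# EXPECTED_HEX must be lowercase hex, as printed by sha256sum.
sha256_matches() {
    local file=$1 expected=$2 actual
    actual=$(sha256sum "$file" | cut -d' ' -f1) || return 1
    [ "$actual" = "$expected" ]
}

# verify_and_decompress FILE.xz EXPECTED_HEX: check the digest of the
# compressed file first, then decompress it, keeping the .xz original.
verify_and_decompress() {
    local file=$1 expected=$2
    sha256_matches "$file" "$expected" || {
        echo "digest mismatch for $file" >&2
        return 1
    }
    xz --decompress --keep "$file"
}
```

Checking the digest before decompressing means a corrupted or tampered download is rejected before any further work is done on it.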

@ghost

ghost commented Jul 15, 2020

> The only issue I'm facing is in regards to private networking. If I set the /etc/NetworkManager/system-connections/private-net.nmconnection file statically via ignition then everything works as expected. The trouble is, the contents of this file cannot be completely known until after the server is provisioned (since Vultr automatically assigns an available private IP).

@summatix, based on your note, I updated the documentation for Vultr ignition files to make clear there isn't a chicken-and-egg problem. I added the following:


  • When you enable private networking, you may use any RFC1918 private address for your ignition files: 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16.
  • You may choose any RFC1918 address, as long as there are no conflicts with your other instances at that location.
  • Private networks can not communicate between locations, regardless of IP addressing. For example, server instances in Miami can not see private networks in Dallas.
  • The private IP addresses shown in the customer portal are suggestions. You are not required to use these suggested private IP addresses.
  • Private networks do not have DHCP, you must manually manage your IP address space or install your own DHCP server on your private network.
  • For optimal performance, we suggest setting your private network adapters' MTU to 1450 when configuring the NIC at the OS level.
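
Putting those points together, a static keyfile for the private interface might be generated like this. It is a sketch under assumptions: the function name is made up, the interface name is passed in because it is not stated in this thread, and the address is any non-conflicting RFC1918 address you choose.

```bash
# render_private_nmconnection IFACE CIDR: print a NetworkManager keyfile
# with a static IPv4 address and the suggested MTU of 1450.
# IFACE and CIDR are caller-supplied; nothing is looked up from Vultr.
render_private_nmconnection() {
    iface=$1
    cidr=$2
    cat <<EOF
[connection]
id=private-net
type=ethernet
interface-name=${iface}

[ethernet]
mtu=1450

[ipv4]
method=manual
address1=${cidr}
EOF
}
```

For example, `render_private_nmconnection ens7 10.0.0.5/16` (where `ens7` is a hypothetical interface name) produces content for /etc/NetworkManager/system-connections/private-net.nmconnection. NetworkManager expects such files to be readable only by root (mode 0600), so the Ignition file entry should set that mode.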

@ddymko

ddymko commented Jul 15, 2020

> Hey @ddymko, we are assembling the final bits of this and we realized we have a small impedance mismatch on how the snapshot create-url API works, see coreos/coreos-assembler#1595.
>
> Would it be possible to augment the snapshot-by-URL API to optionally get these additional parameters from the client?
>
> * a compression algorithm (in our case, xz), with your backend doing a decompression step
> * a digest (in our case, SHA-256), with your backend doing an integrity check before decompressing/importing

@lucab unfortunately this isn't something that we could add to snapshot create-url

@lucab
Contributor

lucab commented Jul 17, 2020

coreos/afterburn#451 added support for Vultr SSH keys, hostname, and metadata attributes to Afterburn upstream (not yet part of a tagged release).

@dustymabe
Member

Afterburn 4.5.0 was released and hit the testing stream in the last release. Anything left to do?

@dghubble
Member Author

I was going to check it out, to replace some manual hostname detection. Am I reading the package list right? https://getfedora.org/en/coreos?stream=testing says 32.20200809.2.1 is the latest at this time and has afterburn-4.4.2, same for next.

@dustymabe
Member

dustymabe commented Aug 24, 2020

@dghubble - correct. We've got a round of releases in progress now and those should have afterburn 4.5.0.

Sorry I realize my earlier statement was incorrect. Afterburn 4.5.0 will be in the testing release that should go out tomorrow.

@dghubble
Member Author

Looks good to me on f32.20200824.2.0 testing.

I think that completes all the items for Fedora CoreOS on vultr, awesome!

@dghubble dghubble changed the title Support for Vultr platform (discuss) Support for Vultr platform Aug 29, 2020