Cannot SSH into Flatcar 3227.2.0 instance created in OpenStack #817

Closed
bpetermannS11 opened this issue Jul 27, 2022 · 9 comments · Fixed by flatcar/init#76 or flatcar-archive/coreos-overlay#2246
Labels
kind/bug Something isn't working platform/openstack

Comments

@bpetermannS11

Description

Cannot SSH into an OpenStack VM created from a Flatcar 3227.2.0 image if the VM is provisioned with an ssh key (not with ignition user_data). Most likely the ssh key is not actually provisioned.

Impact

Our pipelines that publish Flatcar images to Glance include a basic test that checks whether it is possible to ssh into a VM built from the image, and that test now fails.

Environment and steps to reproduce

  1. Set-up:
    a. Use a Flatcar image downloaded from https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_openstack_image.img.bz2
    b. Create a glance image from it
    c. Create a compute instance based on the image with a key name (e.g. openstack server create with --key-name xxx, but without --user-data)
  2. Task: Try to ssh into the VM as user core
  3. Error: Public-key authentication is skipped and you get a password prompt

Expected behavior

ssh with public-key authentication works

Additional information

I tried to provision the same public key via ignition user-data instead and then ssh worked. So it's probably not a problem with the type of key or host key algorithm. With previous Flatcar image releases provisioning the ssh key worked fine.

Command line used to create a VM (where security group ssh is one that allows incoming port 22)

openstack server create <name> --key-name <key-name> --image <image-id> --flavor <flavor-name> --network <network-name> --security-group default --security-group ssh

or with terraform, including the key pair creation:

resource "tls_private_key" "ssh" {
  algorithm   = "RSA"
  rsa_bits    = 4096
}

resource "openstack_compute_keypair_v2" "ssh" {
  name       = "image_test"
  public_key = tls_private_key.ssh.public_key_openssh
}

resource "openstack_compute_instance_v2" "server" {
  name        = "image_test"

  image_id    = "31115d3f-50d9-4290-9cea-d2c362560290"
  flavor_name = "m1c.tiny"
  key_pair    = openstack_compute_keypair_v2.ssh.name

  security_groups = [
    "default",
    openstack_compute_secgroup_v2.sg_ssh.name,
  ]

  network {
    name = openstack_networking_network_v2.network.name
  }

  depends_on = [
    openstack_networking_subnet_v2.subnet,
  ]
}

I tried to understand how the current release differs from previous ones, and my guess is that cloudinit is no longer run.
Cloudinit used to query the metadata agent via http://169.254.169.254/2009-04-04/meta-data/public-keys/0/openssh-key and provide the returned key for the core user.
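To check by hand whether the metadata agent still serves the key, the lookup described above can be sketched roughly as follows. The base URL and path are the ones quoted in this issue; the function names and the index parameter are only illustrative, and `fetch_public_key` assumes `curl` is available inside the VM.

```shell
# Base URL and path are taken from the issue text; the index argument is an
# illustrative parameter (the issue only mentions key 0).
METADATA_BASE="http://169.254.169.254/2009-04-04/meta-data"

# Print the metadata URL for the public key with the given index (default 0).
public_key_url() {
    index="${1:-0}"
    printf '%s/public-keys/%s/openssh-key\n' "$METADATA_BASE" "$index"
}

# Fetch the key from inside the VM; returns non-zero if the metadata
# service is unreachable or answers with an HTTP error.
fetch_public_key() {
    curl --silent --fail --max-time 5 "$(public_key_url "${1:-0}")"
}
```

Running `fetch_public_key` inside the instance and comparing the output with the key pair's public key would show whether the metadata service side is working at all.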

If I configure the VM via user-data I can see in the journal that enabling cloudinit is skipped ("enable-oem-cloudinit.service: Skipped due to 'exec-condition'."), probably because there is already user-data. I couldn't read the journal of a VM that was not configured with user-data though, so I can only guess what happened.

In enable-oem-cloudinit.service the ExecConditions should match (.userConfigProvided is probably false if the VM was not configured with user-data). So oem-cloudinit.service will most likely be enabled. But in oem-cloudinit.service there is an ExecCondition that checks if the OEM_ID is one from the list (aws gcp rackspace-onmetal azure cloudsigma packet vmware digitalocean) and openstack is not in it. So cloudinit is not run. But we would need it to fetch the public key.
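The effect of that list check can be illustrated with a small shell sketch. This is not the unit's actual ExecCondition, only a membership test over the OEM IDs listed above; the function and variable names are made up for illustration.

```shell
# OEM IDs as listed in the issue text; the real unit may express the
# check differently.
ALLOWED_OEM_IDS="aws gcp rackspace-onmetal azure cloudsigma packet vmware digitalocean"

# Return 0 if the given OEM ID is in the list, 1 otherwise.
oem_id_allowed() {
    for id in $ALLOWED_OEM_IDS; do
        [ "$id" = "$1" ] && return 0
    done
    return 1
}
```

With this list, `oem_id_allowed openstack` returns non-zero, which corresponds to the service being skipped on OpenStack.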

In older releases there was a default ignition file that created and enabled some oem-cloudinit.service that didn't have that ExecCondition, so cloudinit was run (maybe again only if user-data wasn't available).

@bpetermannS11 bpetermannS11 added the kind/bug Something isn't working label Jul 27, 2022
bpetermannS11 added a commit to bpetermannS11/init that referenced this issue Jul 27, 2022
Allow oem-cloudinit.service for openstack too.
Fixes [1], an issue where ssh keys were not fetched if no user-data
was provisioned, but only a keypair.

[1] flatcar/Flatcar#817
@jepio
Member

jepio commented Jul 27, 2022

Can you log in via console to probe whether enable-oem-cloudinit.service ran but oem-cloudinit.service didn't? Or do you at least have access to a console log that you could upload?

We need to add a conversion for openstack in oem-cloudinit.service. Unfortunately we don't have openstack in our CI.

Btw: we recently started publishing gzipped images specifically so that glance can directly import them: https://stable.release.flatcar-linux.net/amd64-usr/current/flatcar_production_openstack_image.img.gz

@bpetermannS11
Author

I couldn't log in via console because I don't know the password. Is there a default password for "core" (or root)?

@jepio
Member

jepio commented Jul 27, 2022

There isn't a default password. Before uploading the image, you would need to modify the OEM partition (partition 6), following these instructions: https://flatcar-linux.org/docs/latest/installing/customizing-the-image/customize-the-image/#mounting-a-partition-for-customization. You would need to extend grub.cfg in that partition to contain the line:

set linux_append="flatcar.autologin"

then you will have access to the console without a password.
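Once the OEM partition is mounted, adding that line could look roughly like the sketch below. The function name is hypothetical and the grub.cfg path is passed in by the caller, since the mount point depends on how the partition was mounted. The sketch only appends the line if it is missing; it does not merge with an existing linux_append setting.

```shell
# Append the autologin kernel argument to a grub.cfg unless the exact line
# is already there. The path is supplied by the caller; where the OEM
# partition is mounted is left to the reader.
add_autologin() {
    grub_cfg="$1"
    line='set linux_append="flatcar.autologin"'
    # -x: whole-line match, -F: fixed string, -q: quiet
    if ! grep -qxF "$line" "$grub_cfg" 2>/dev/null; then
        printf '%s\n' "$line" >> "$grub_cfg"
    fi
}
```

For example, with the partition mounted at a hypothetical /mnt/oem, the call would be `add_autologin /mnt/oem/grub.cfg`.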

@bpetermannS11
Author

I created an image with autologin and checked the journal: oem-cloudinit.service is activated, but when it tries to start it says "oem-cloudinit.service: Skipped due to 'exec-condition'."
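For anyone repeating this check, a small filter can pull the skipped units out of the journal. This is only a sketch: the function name is made up, and the pattern assumes the message format quoted in this issue.

```shell
# Read journal text on stdin and print the name of each unit that was
# skipped because its ExecCondition did not match.
skipped_units() {
    grep -o "[^[:space:]]*\.service: Skipped due to 'exec-condition'" | cut -d: -f1
}
```

On the VM console this could be used as `journalctl -b --no-pager | skipped_units`.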

@till

till commented Jul 27, 2022

I can confirm this even with older releases. I never bothered to check or make a ticket here (mea culpa). I noticed it in the very beginning of using Flatcar. It only works with ignition-supplied user data, not with the key itself provided via OpenStack.

@jepio
Member

jepio commented Jul 27, 2022

@till could you check if enabling coreos-metadata-sshkeys@.service makes that work? Afterburn is responsible for reading the ssh keys from the metadata service; it should support openstack but is not currently enabled for openstack.

@till

till commented Jul 27, 2022

@jepio You mean, roll ignition with the service enabled? And then test if that makes the supplied key work?

@jepio
Member

jepio commented Jul 27, 2022

Yes. You might also be able to test it from the command line, just systemctl enable --now coreos-metadata-sshkeys@.service

@tormath1 tormath1 reopened this Aug 3, 2022
tormath1 pushed a commit to flatcar/init that referenced this issue Aug 3, 2022
Allow oem-cloudinit.service for openstack too.
Fixes [1], an issue where ssh keys were not fetched if no user-data
was provisioned, but only a keypair.

[1] flatcar/Flatcar#817
@tormath1
Contributor

tormath1 commented Aug 3, 2022

@jepio I just tested with Flatcar deployed on OpenStack:

systemctl enable --now coreos-metadata-sshkeys@.service

works fine - SSH keys are provisioned from metadata.

coreos-metadata-sshkeys@.service is enabled by a base ignition configuration, which should be added for openstack in the same way as here: https://github.com/flatcar-linux/coreos-overlay/blob/9ac91d73a735834c38334c899376528b6354ea6d/coreos-base/oem-ec2-compat/oem-ec2-compat-0.1.2-r2.ebuild#L63-L65
