1423.0.0 AWS no instance ssh key #1981

Closed
monder opened this Issue May 29, 2017 · 6 comments

monder commented May 29, 2017

Issue Report

Bug

Container Linux Version

1423.0.0

Environment

AWS ami-a8c2d6ce

Expected Behavior

I can ssh to the instance using the key specified in AWS.

Actual Behavior

Instance asks for a password.

Reproduction Steps

Create a machine with an Ignition config (in my experiments, its contents do not matter):

{
  "ignition": {
    "version": "2.0.0",
    "config": {}
  },
  "storage": {
    "files": []
  },
  "systemd": {
    "units": []
  },
  "networkd": {},
  "passwd": {}
}

Try to ssh.

Other Information

lucab May 31, 2017

Member

@monder good catch, thanks for the report! I can confirm this is a regression in 1423.0.0 due to coreos-metadata 0.10.0.

For reference, the service responsible for populating ssh keys is coreos-metadata-sshkeys@core.service.

Here is the log from a failed instance:

May 31 14:04:31 localhost systemd[1]: Starting CoreOS Metadata Agent (SSH Keys)...
May 31 14:04:31 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/meta-data/instance-id": Attempt #1
May 31 14:04:32 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/meta-data/public-ipv4": Attempt #1
May 31 14:04:32 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/meta-data/local-ipv4": Attempt #1
May 31 14:04:32 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/meta-data/hostname": Attempt #1
May 31 14:04:32 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/meta-data/placement/availability-zone": Attempt #1
May 31 14:04:32 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/dynamic/instance-identity/document": Attempt #1
May 31 14:04:32 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/public-keys": Attempt #1

Paging @dgonyeo to have a look into this.

From my perspective there are a bunch of issues here:

  • the last "public-keys" URL is wrong: it is missing the "meta-data" fragment
  • the service is doing superfluous fetches to endpoints
  • the service returns with success even though the HTTP endpoint returns a 404
  • for some reason this is only triggered when passing an Ignition config, but the same error occurs on every new alpha instance. I guess in the other case cloud-init still manages to populate the ssh key.
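
To illustrate the first point: every other request in the log goes through the "meta-data" subtree, and the key request should as well. Below is a minimal Go sketch of how such a URL could be assembled. The helper name `metadataURL` and the constants are hypothetical, chosen for illustration, and are not taken from the coreos-metadata source.

```go
package main

import (
	"fmt"
	"strings"
)

// Base address and API version as they appear in the log above.
const (
	metadataBase = "http://169.254.169.254"
	apiVersion   = "2009-04-04"
)

// metadataURL is a hypothetical helper (not part of coreos-metadata) that
// places an endpoint path under the "meta-data" subtree, so the fragment
// cannot be forgotten for a single endpoint.
func metadataURL(endpoint string) string {
	return metadataBase + "/" + apiVersion + "/meta-data/" + strings.TrimPrefix(endpoint, "/")
}

func main() {
	// Prints: http://169.254.169.254/2009-04-04/meta-data/public-keys
	fmt.Println(metadataURL("public-keys"))
}
```

Routing all "meta-data" requests through one helper is only one possible design; the point is that the prefix is written once rather than per endpoint.
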
monder May 31, 2017

Also, shouldn't that service depend on network-online.target, since it requires the network?

dgonyeo May 31, 2017

@lucab I made a PR to fix this, and on your comments:

the service is doing superfluous fetches to endpoints

I'm not sure which fetches in here are unnecessary. Each piece of metadata lives at a different endpoint in the metadata service. I can remove unnecessary calls if you can explain a little more.

the service returns with success even though the HTTP endpoints returns a 404

A 404 is expected if the user has not configured an SSH key, so this error is swallowed. I'll see if I can find a way to differentiate between "no ssh keys" and "there's a legitimate error".
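
As a rough sketch of what that distinction could look like, the Go snippet below maps an HTTP status to one of three outcomes instead of swallowing everything. The names `ErrNoKeys` and `checkStatus` are hypothetical and not part of coreos-metadata. Note the limitation: a 404 caused by a wrong URL still looks identical to a 404 caused by a missing key, so the most this approach allows is logging the 404 case visibly rather than staying silent.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// ErrNoKeys is a hypothetical sentinel for "the endpoint answered 404",
// which is assumed here to mean that no SSH key was configured.
var ErrNoKeys = errors.New("metadata service returned 404: no SSH keys configured")

// checkStatus maps an HTTP status code to nil (keys can be read),
// ErrNoKeys (tolerated, but worth logging), or a real error.
func checkStatus(status int) error {
	switch status {
	case http.StatusOK:
		return nil
	case http.StatusNotFound:
		return ErrNoKeys
	default:
		return fmt.Errorf("unexpected HTTP status %d from metadata service", status)
	}
}

func main() {
	for _, status := range []int{200, 404, 500} {
		err := checkStatus(status)
		switch {
		case err == nil:
			fmt.Println(status, "keys available")
		case errors.Is(err, ErrNoKeys):
			fmt.Println(status, "no keys: log a warning, exit successfully")
		default:
			fmt.Println(status, "fail the unit:", err)
		}
	}
}
```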

lucab May 31, 2017

Member

I'm not sure which fetches in here are unnecessary. Each piece of metadata lives at a different endpoint in the metadata service. I can remove unnecessary calls if you can explain a little more

But this specific flag only takes care of the ssh key, so why re-fetch everything? Or maybe I misunderstood how this was designed to work.

A 404 is expected if the user has not configured an SSH key, so this error is swallowed.

Ah, having a single status for both an empty and a missing endpoint is unfortunate.

bgilbert Jun 1, 2017

Member

We've released Container Linux 1430.0.0 to the alpha channel. It works around this issue by reverting the coreos-metadata change.

dgonyeo Jun 2, 2017

Oh, I see what you're talking about with the extra fetches. Fixing that would require rearchitecting coreos-metadata, since there's a single FetchMetadata function for every provider, which is definitely outside the scope of this issue. Something worth doing once I have the time, I guess.
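
For illustration only, one possible shape for such a rearchitecture is a provider interface with one method per kind of metadata, so the SSH key path touches only the key endpoint. Everything in this Go sketch is hypothetical: the `Provider` interface, the `fakeEC2` type, and its recorded requests are stand-ins, not coreos-metadata's actual API, and no network access is performed.

```go
package main

import "fmt"

// Provider is a hypothetical interface that splits a single
// "fetch everything" call into one method per kind of metadata.
type Provider interface {
	FetchAttributes() (map[string]string, error)
	FetchSSHKeys() ([]string, error)
}

// fakeEC2 records which paths would be requested instead of doing HTTP.
type fakeEC2 struct {
	requests []string
}

// get stands in for an HTTP GET and remembers the requested path.
func (p *fakeEC2) get(path string) string {
	p.requests = append(p.requests, path)
	return "value-of-" + path
}

// FetchAttributes touches only the attribute endpoints.
func (p *fakeEC2) FetchAttributes() (map[string]string, error) {
	attrs := map[string]string{}
	for _, path := range []string{"meta-data/instance-id", "meta-data/hostname"} {
		attrs[path] = p.get(path)
	}
	return attrs, nil
}

// FetchSSHKeys touches only the key endpoint.
func (p *fakeEC2) FetchSSHKeys() ([]string, error) {
	return []string{p.get("meta-data/public-keys")}, nil
}

func main() {
	ec2 := &fakeEC2{}
	var provider Provider = ec2
	if _, err := provider.FetchSSHKeys(); err != nil {
		fmt.Println("error:", err)
		return
	}
	// Prints: [meta-data/public-keys]
	fmt.Println(ec2.requests)
}
```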
