This repository has been archived by the owner. It is now read-only.

1423.0.0 AWS no instance ssh key #1981

Closed
monder opened this issue May 29, 2017 · 6 comments
@monder monder commented May 29, 2017

Issue Report

Bug

Container Linux Version

1423.0.0

Environment

AWS ami-a8c2d6ce

Expected Behavior

I can ssh to instance using the key specified in AWS

Actual Behavior

Instance asks for a password.

Reproduction Steps

Create a machine with an Ignition config (from my experiments, it does not matter what's inside):

{
    "ignition": {
      "version": "2.0.0",
      "config": {}
    },
    "storage": {
      "files": [
      ]
    },
    "systemd": {
      "units": [
      ]
    },
    "networkd": {},
    "passwd": {}
  }

Try to ssh.

Other Information

Member

@lucab lucab commented May 31, 2017

@monder good catch, thanks for the report! I can confirm this is a regression in 1423.0.0 due to coreos-metadata 0.10.0.

For reference, the service responsible for populating ssh keys is coreos-metadata-sshkeys@core.service.

Here is the log from a failed instance:

May 31 14:04:31 localhost systemd[1]: Starting CoreOS Metadata Agent (SSH Keys)...
May 31 14:04:31 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/meta-data/instance-id": Attempt #1
May 31 14:04:32 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/meta-data/public-ipv4": Attempt #1
May 31 14:04:32 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/meta-data/local-ipv4": Attempt #1
May 31 14:04:32 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/meta-data/hostname": Attempt #1
May 31 14:04:32 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/meta-data/placement/availability-zone": Attempt #1
May 31 14:04:32 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/dynamic/instance-identity/document": Attempt #1
May 31 14:04:32 localhost coreos-metadata[681]: Fetching "http://169.254.169.254/2009-04-04/public-keys": Attempt #1

Paging @dgonyeo to have a look into this.

From my perspective there are a bunch of issues here:

  • the last "public-keys" URL is wrong: it is missing the "meta-data" path segment
  • the service is doing superfluous fetches to endpoints
  • the service returns with success even though the HTTP endpoint returns a 404
  • for some reason this is only triggered when passing an Ignition config, but the same error occurs on every new alpha instance. I guess in the other case cloud-init still manages to populate the ssh key.
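
To illustrate the first point, here is a minimal Go sketch of how the key URLs could be built under the "meta-data" tree. The base URL is taken from the log above. `metadataURL` and `parseKeyIndex` are hypothetical helper names, not functions from coreos-metadata, and the "index=name" listing format is an assumption based on how the EC2 metadata service lists public keys.

```go
package main

import (
	"fmt"
	"strings"
)

// Base URL as seen in the log above.
const metadataBase = "http://169.254.169.254/2009-04-04/"

// metadataURL (hypothetical helper) builds a URL under the "meta-data" tree,
// so callers cannot forget that path segment.
func metadataURL(path string) string {
	return metadataBase + "meta-data/" + strings.TrimPrefix(path, "/")
}

// parseKeyIndex (hypothetical helper) parses a public-keys listing, assumed to
// contain one "index=name" entry per line, into a map from index to key name.
func parseKeyIndex(body string) map[string]string {
	keys := make(map[string]string)
	for _, line := range strings.Split(strings.TrimSpace(body), "\n") {
		parts := strings.SplitN(line, "=", 2)
		if len(parts) != 2 {
			continue // skip empty or malformed lines
		}
		keys[parts[0]] = parts[1]
	}
	return keys
}

func main() {
	// The listing endpoint, with the "meta-data" segment in place.
	fmt.Println(metadataURL("public-keys"))

	// For each listed key, the key material is assumed to live one level deeper.
	for idx := range parseKeyIndex("0=my-key\n") {
		fmt.Println(metadataURL("public-keys/" + idx + "/openssh-key"))
	}
}
```

Routing every path through one helper is only one possible design; the point is that the "meta-data" prefix is added in a single place.
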
Author

@monder monder commented May 31, 2017

Also, shouldn't that service depend on network-online.target, since it does require the network?


@dgonyeo dgonyeo commented May 31, 2017

@lucab I made a PR to fix this, and on your comments:

the service is doing superfluous fetches to endpoints

I'm not sure which fetches in here are unnecessary. Each piece of metadata lives at a different endpoint in the metadata service. I can remove unnecessary calls if you can explain a little more.

the service returns with success even though the HTTP endpoint returns a 404

A 404 is expected if the user has not configured an SSH key, so this error is swallowed. I'll see if I can find a way to differentiate between "no ssh keys" and "there's a legitimate error".

Member

@lucab lucab commented May 31, 2017

I'm not sure which fetches in here are unnecessary. Each piece of metadata lives at a different endpoint in the metadata service. I can remove unnecessary calls if you can explain a little more.

But this specific flag only takes care of the ssh key, so why re-fetch everything? Or maybe not, and I misunderstood how this was designed to work.

A 404 is expected if the user has not configured an SSH key, so this error is swallowed.

Ah, having a single status for both an empty and a missing endpoint is unfortunate.

Member

@bgilbert bgilbert commented Jun 1, 2017

We've released Container Linux 1430.0.0 to the alpha channel. It works around this issue by reverting the coreos-metadata change.


@dgonyeo dgonyeo commented Jun 2, 2017

Oh I see what you're talking about with the extra fetches. Fixing that would require rearchitecting coreos-metadata, since there's the single FetchMetadata function for every provider, which is definitely outside the scope of this issue. Something worth doing once I have the time, I guess.
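
As a rough idea of what such a rearchitecture might look like (purely a sketch, with every name below hypothetical and no relation to the real coreos-metadata types): split the single fetch entry point into narrower interfaces, so the ssh-keys path only depends on, and only calls, the key endpoint.

```go
package main

import "fmt"

// SSHKeyProvider (hypothetical) is all the ssh-keys code path needs.
type SSHKeyProvider interface {
	FetchSSHKeys() ([]string, error)
}

// AttributeProvider (hypothetical) covers the remaining metadata.
type AttributeProvider interface {
	FetchAttributes() (map[string]string, error)
}

// fakeEC2 is a stand-in provider that records which endpoints it would hit.
type fakeEC2 struct {
	calls []string
}

func (p *fakeEC2) FetchSSHKeys() ([]string, error) {
	p.calls = append(p.calls, "meta-data/public-keys")
	return []string{"ssh-ed25519 AAAA... example"}, nil
}

func (p *fakeEC2) FetchAttributes() (map[string]string, error) {
	p.calls = append(p.calls, "meta-data/instance-id", "meta-data/hostname")
	return map[string]string{"hostname": "example"}, nil
}

// collectSSHKeys accepts only the narrow interface, so it cannot trigger
// the attribute fetches even by accident.
func collectSSHKeys(p SSHKeyProvider) ([]string, error) {
	return p.FetchSSHKeys()
}

func main() {
	p := &fakeEC2{}
	keys, err := collectSSHKeys(p)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("keys:", len(keys))
	fmt.Println("endpoints touched:", p.calls)
}
```
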
