Container Linux by CoreOS stable (1409.2.0) - OpenStack fails to boot when Ignition config is provided #2014

Closed
pteichner opened this Issue Jun 21, 2017 · 10 comments

pteichner commented Jun 21, 2017

Issue Report

Bug

Container Linux by CoreOS stable (1409.2.0) - OpenStack fails to boot when Ignition config is provided but incorrect

Container Linux Version

$ cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1409.2.0
VERSION_ID=1409.2.0
BUILD_ID=2017-06-19-2321
PRETTY_NAME="Container Linux by CoreOS 1409.2.0 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

OpenStack

Expected Behavior

The OS should at least boot.

Actual Behavior

It goes into a boot loop or drops to a recovery shell on the console.

Reproduction Steps

  1. Using Ansible, provide an Ignition config in the user data field
  2. Create the instance

Other Information

Ansible / OpenStack seems to format the JSON file incorrectly, and journalctl therefore reports the file as invalid. However, it seems wrong for the OS to fail to start as a result.

Contributor

euank commented Jun 21, 2017

This is operating as intended. Ignition will intentionally drop to an emergency shell if the config is syntactically or logically invalid.
The thought process is roughly that it's better for the user of Container Linux to know something's wrong than for the machine to boot, but be in an inconsistent unknown state which might be more difficult to observe.

I skimmed our docs, and we really don't make this clear enough though. @dgonyeo, am I missing something? The most glaring omission seems to be that troubleshooting says nothing about it.
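
Since a syntactically invalid config is enough to send the machine to the emergency shell, a quick local parse before handing the user data to OpenStack can catch the simplest class of errors. The following is a minimal sketch in Python. The function name `check_user_data` is hypothetical, and it only checks that the text is valid JSON with a top-level `ignition.version` string; it does not replicate Ignition's own validation.

```python
import json


def check_user_data(text):
    """Return a list of problems found in an Ignition config string.

    Hypothetical helper: checks JSON syntax and the presence of
    ignition.version only. Ignition itself validates far more.
    """
    try:
        config = json.loads(text)
    except json.JSONDecodeError as err:
        return ["invalid JSON at line %d, column %d: %s"
                % (err.lineno, err.colno, err.msg)]
    if not isinstance(config, dict):
        return ["top-level value must be a JSON object"]
    ignition = config.get("ignition")
    version = ignition.get("version") if isinstance(ignition, dict) else None
    if not isinstance(version, str):
        return ["missing ignition.version"]
    return []
```

An empty list means the text passed these two checks; anything else is worth fixing before the instance is created.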

Member

crawford commented Jun 21, 2017

@euank it's mentioned in the second paragraph of https://coreos.com/ignition/docs/0.14.0/getting-started.html. We could probably make that more obvious though.

pteichner commented Jun 21, 2017

I know this is not the right forum to ask, but I am seeing some confusing behaviour on OpenStack with CoreOS when I specify a very basic Ignition config:

  1. Ansible is mangling the format of the Ignition config, so the OS won't start. That's one issue.

  2. Using the Mirantis UI, I pasted the same Ignition config, and now I can't log in to the server. I suspect that the keys have not been applied, even though I read somewhere that the automatic metadata config via cloud init should still happen.

{
  "ignition": {
    "version": "2.0.0",
    "config": {}
  },
  "storage": {},
  "systemd": {
    "units": [
      {
        "name": "coreos-metadata.service",
        "dropins": [
          {
            "name": "20-clct-provider-override.conf",
            "contents": "[Service]\nEnvironment=COREOS_METADATA_OPT_PROVIDER=--provider=openstack-metadata"
          }
        ]
      },
      {
        "name": "docker-tcp.socket",
        "enable": true,
        "contents": "[Unit]\nDescription=Docker Socket for the API\n\n[Socket]\nListenStream=2375\nBindIPv6Only=both\nService=docker.service\n\n[Install]\nWantedBy=sockets.target\n"
      }
    ]
  },
  "networkd": {},
  "passwd": {}
}

Member

crawford commented Jun 21, 2017

even though I read somewhere that the automatic metadata config via cloud init should still happen

coreos-cloudinit will not run if you specify an Ignition config. Instead coreos-metadata has taken over this role. On OpenStack, you'll need to specify the source of truth for coreos-metadata (currently, only the network metadata service is supported). We recommend using CT, as detailed in https://coreos.com/os/docs/latest/provisioning.html, for creating these configs. CT will automatically validate your configs and throw in any of the platform-specific configs you need.

pteichner commented Jun 21, 2017

@crawford Just so I understand it fully: the automatic configuration will not happen unless I explicitly include it?

I've used the latest CT from GitHub, and the config above is what it spat out for me. Do I need to include the key location in the YAML file?

systemd:
  units:
    - name: docker-tcp.socket
      enable: true
      contents: |
        [Unit]
        Description=Docker Socket for the API

        [Socket]
        ListenStream=2375
        BindIPv6Only=both
        Service=docker.service

        [Install]
        WantedBy=sockets.target

Member

crawford commented Jun 21, 2017

OpenStack is a very special case in this setup. The reason it is special is, unlike most other environments, the source of truth is configurable. In some installations, the network metadata service is available. In others, the config-drive is available. Still more, there are some with both. From experience, we've seen that the config-drive has issues with dynamic data and because of that, we've gone the route of explicitly specifying the source of truth instead of implicitly discovering it. That being said, Ignition is unable to receive any direction about which source of truth to use, so it will look in both places for a little while. This is generally okay because Ignition only ever runs once. We didn't want to take the same performance hit for coreos-metadata, so we made it explicit (--provider=openstack-metadata or (in the future) --provider=openstack-configdrive). By default, coreos-metadata just reuses the coreos.oem.id from the kernel command line, but in the case of OpenStack, this doesn't work since the ID is openstack (which doesn't match one of the above).

You will need to include one more option in your Ignition config:

{
  "systemd": {
    "units": [{
      "name": "coreos-metadata.service",
      "dropins": [{
        "name": "20-clct-provider-override.conf",
        "contents": "[Service]\nEnvironment=COREOS_METADATA_OPT_PROVIDER=--provider=openstack-metadata"
      }]
    }]
  }
}

As I mentioned before, CT will do this automatically if you pass it --platform=openstack-metadata. Using your example config, I get the following:

{
  "ignition": {
    "version": "2.0.0",
    "config": {}
  },
  "storage": {},
  "systemd": {
    "units": [
      {
        "name": "coreos-metadata.service",
        "dropins": [
          {
            "name": "20-clct-provider-override.conf",
            "contents": "[Service]\nEnvironment=COREOS_METADATA_OPT_PROVIDER=--provider=openstack-metadata"
          }
        ]
      },
      {
        "name": "docker-tcp.socket",
        "enable": true,
        "contents": "[Unit]\nDescription=Docker Socket for the API\n\n[Socket]\nListenStream=2375\nBindIPv6Only=both\nService=docker.service\n\n[Install]\nWantedBy=sockets.target\n"
      }
    ]
  },
  "networkd": {},
  "passwd": {}
}
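
For anyone assembling the config with a script rather than CT, the drop-in can be merged into an existing config programmatically, which avoids hand-editing the JSON. This is a rough sketch in Python, assuming the config has already been parsed into a dict. The helper name `add_provider_dropin` is hypothetical; the unit name, drop-in name, and contents are the ones shown above.

```python
import json

# Drop-in taken from the config above; it selects the network metadata service.
PROVIDER_DROPIN = {
    "name": "20-clct-provider-override.conf",
    "contents": "[Service]\nEnvironment=COREOS_METADATA_OPT_PROVIDER=--provider=openstack-metadata",
}


def add_provider_dropin(config):
    """Add the coreos-metadata provider drop-in to an Ignition config dict.

    Hypothetical helper: mutates and returns `config`, and does nothing if a
    drop-in with the same name is already present.
    """
    units = config.setdefault("systemd", {}).setdefault("units", [])
    unit = next((u for u in units if u.get("name") == "coreos-metadata.service"), None)
    if unit is None:
        unit = {"name": "coreos-metadata.service"}
        units.insert(0, unit)
    dropins = unit.setdefault("dropins", [])
    if not any(d.get("name") == PROVIDER_DROPIN["name"] for d in dropins):
        dropins.append(dict(PROVIDER_DROPIN))
    return config


if __name__ == "__main__":
    base = {"ignition": {"version": "2.0.0"}, "systemd": {}}
    print(json.dumps(add_provider_dropin(base), indent=2))
```

Serializing with `json.dumps` also guarantees that the result is syntactically valid JSON.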

Member

crawford commented Jun 21, 2017

I just double-checked our code to make sure that this is all plumbed correctly and discovered that it is not. As a workaround, you'll want to run one more program to get the keys (/usr/bin/coreos-metadata --provider=openstack-metadata --ssh-keys=core).

You can use the following to make that happen:

systemd:
  units:
    - name: coreos-metadata-sshkeys-workaround.service
      enable: true
      contents: |
        [Unit]
        Description=CoreOS Metadata Agent Workaround (SSH Keys)

        [Service]
        Type=oneshot
        ExecStart=/usr/bin/coreos-metadata --provider=openstack-metadata --ssh-keys=core

        [Install]
        RequiredBy=multi-user.target

/cc @dgonyeo can you fix this?
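
The workaround unit above is written in the YAML form that CT consumes. If the Ignition JSON is being produced directly instead, the same unit can be expressed as an entry in `systemd.units`, using the `name`, `enable`, and `contents` keys seen in the JSON configs earlier in this thread. The sketch below is illustrative only: `unit_entry` is a hypothetical helper, and the resulting config contains nothing but this one unit.

```python
import json

# Unit file text copied from the workaround above.
UNIT_TEXT = """\
[Unit]
Description=CoreOS Metadata Agent Workaround (SSH Keys)

[Service]
Type=oneshot
ExecStart=/usr/bin/coreos-metadata --provider=openstack-metadata --ssh-keys=core

[Install]
RequiredBy=multi-user.target
"""


def unit_entry(name, contents, enable=True):
    """Build one entry for the systemd.units list of an Ignition config."""
    return {"name": name, "enable": enable, "contents": contents}


config = {
    "ignition": {"version": "2.0.0"},
    "systemd": {
        "units": [unit_entry("coreos-metadata-sshkeys-workaround.service", UNIT_TEXT)]
    },
}

# json.dumps escapes the newlines in the unit text, so the output is valid JSON.
rendered = json.dumps(config, indent=2)
```

Letting the serializer handle the newline escaping in `contents` avoids the formatting mistakes that hand-written JSON strings invite.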

dgonyeo commented Jun 21, 2017

dgonyeo commented Jun 21, 2017

And the relevant ct PR to use the change: coreos/container-linux-config-transpiler#86

pteichner commented Jun 21, 2017

Thanks for the fix in CT. When I generated my config, it wasn't present in the output.

