Failure to boot with torcx remotes #2562

Open
ajonkisz opened this Issue Mar 5, 2019 · 1 comment

ajonkisz commented Mar 5, 2019

Issue Report

Bug

The system fails to boot when custom Torcx remotes are specified. I followed the documentation in remotes.md and two not-yet-merged PRs, "hosting custom remotes" and "using custom remotes".

Container Linux Version

$ cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=2023.4.0
VERSION_ID=2023.4.0
BUILD_ID=2019-02-26-0032
PRETTY_NAME="Container Linux by CoreOS 2023.4.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

Container Linux runs under QEMU/KVM and is provisioned with an Ignition config.

Expected Behavior

The specified Torcx remote package is downloaded and installed. If that fails, the system should still boot, so that a user can inspect the logs and see what went wrong.

Actual Behavior

The system fails to boot. It halts after the following console output:

[    2.300286] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[    2.335825]  vda: vda1 vda2 vda3 vda4 vda6 vda7 vda9
[    2.387172] EXT4-fs (vda6): mounted filesystem with ordered data mode. Opts: (null)
[    2.558956] EXT4-fs (vda9): mounted filesystem with ordered data mode. Opts: (null)
[    2.573363] EXT4-fs (dm-0): mounted filesystem without journal. Opts: (null)

My server logs show that no requests reach the server. I have tried both HTTP and HTTPS, although with HTTPS I am concerned that the certificates might not be updated in time for fetching Torcx remotes. I couldn't find any documentation on how to enable HTTP.
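One way to double-check that no request ever arrives, independent of the real web server's logging, is to point the remote's base_url at a throwaway listener that records every request it sees. The following is a minimal sketch, not part of Torcx: the names `RecordingHandler`, `start_server` and `requests_seen` are made up for illustration, and the listener answers 404 to everything because it only exists to show whether traffic arrives at all. The demo at the bottom probes itself on the loopback interface; to test a real node the host and port would have to match the base_url in remote.json.

```python
import http.client
import http.server
import threading

# (client_ip, method, path) for every request the listener receives.
requests_seen = []


class RecordingHandler(http.server.BaseHTTPRequestHandler):
    """Record each request and answer 404; no files are served."""

    def _record(self):
        requests_seen.append((self.client_address[0], self.command, self.path))
        self.send_response(404)
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_GET = _record
    do_HEAD = _record


def start_server(host="127.0.0.1", port=0):
    """Start the listener in a background thread; port 0 picks a free port."""
    server = http.server.ThreadingHTTPServer((host, port), RecordingHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


# Self-check: send one request to the listener and show what it recorded.
server = start_server()
conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
conn.request("GET", "/probe")
conn.getresponse().read()
conn.close()
server.shutdown()
server.server_close()
print(requests_seen)
```

If the list stays empty while the node is stuck at boot, the request is not leaving the node (or not reaching this host), which narrows the problem to the node side rather than the server.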

The main issue here is that the system fails to boot, so it is difficult for me to tell what is actually going wrong.

Following torcx-profile-populate.service, I verified on a healthy system (one booted without the remote in the profile manifest) that the package downloads successfully when I modify the manifest and run /usr/lib/coreos/torcx profile populate -v=debug.

Reproduction Steps

  1. Add a user profile manifest with a remote reference using an Ignition config. The system fails to boot.
  2. Remove "remote": "com.example.my-remote" from the manifest. The system starts up but, as expected, no remotes are downloaded, so my package cannot be installed.

Other Information

Ignition config with unescaped strings for readability:

{
  "ignition": {
    "version": "2.2.0"
  },
  "storage": {
    "files": [
      {
        "filesystem": "root",
        "path": "/etc/ssl/certs/ca.pem",
        "mode": 640,
        "contents": {
          "source": "data:text/plain;charset=utf-8;base64,..."
        }
      },
      {
        "filesystem": "root",
        "path": "/etc/ssl/certs/client.pem",
        "mode": 640,
        "contents": {
          "source": "data:text/plain;charset=utf-8;base64,..."
        }
      },
      {
        "filesystem": "root",
        "path": "/etc/torcx/remotes/com.example.my-remote/remote.json",
        "mode": 640,
        "contents": {
          "source": "data:text/plain;charset=utf-8,{"kind":"remote-manifest-v0","value":{"base_url":"http://192.168.83.104/","keys":[{"armored_keyring":"B9231241A9415806A8969EDFFC52EB831521A606.pgp.asc"}]}}"
        }
      },
      {
        "filesystem": "root",
        "path": "/etc/torcx/remotes/com.example.my-remote/B9231241A9415806A8969EDFFC52EB831521A606.pgp.asc",
        "mode": 640,
        "contents": {
          "source": "data:text/plain;charset=utf-8;base64,..."
        }
      },
      {
        "filesystem": "root",
        "path": "/etc/torcx/next-profile",
        "mode": 640,
        "contents": {
          "source": "data:,some_manifest"
        }
      },
      {
        "filesystem": "root",
        "path": "/etc/torcx/profiles/some_manifest.json",
        "mode": 640,
        "contents": {
          "source": "data:text/plain;charset=utf-8;base64,
{
    "kind": "profile-manifest-v1",
    "value": {
        "images": [
            {
                "name": "cri-containerd",
                "reference": "1.2.4",
                "remote": "com.example.my-remote"
            }
        ]
    }
}
"
        }
      }
    ]
  }
}
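Because the config above is shown with unescaped strings, it is not valid JSON as printed. The sketch below shows one way the escaped form could be generated, assuming base64 data URLs are used for every file. The helpers `data_url` and `ignition_file` are hypothetical names, not part of Ignition or Torcx, and the certificate and keyring entries are left out because their contents are elided above. Note that the Ignition documentation describes `mode` as a decimal value, so the sketch passes an octal literal and lets `json.dumps` write 416 for 0640; whether the literal 640 in the config above plays any role in the hang is not established here.

```python
import base64
import json


def data_url(text: str, media_type: str = "text/plain;charset=utf-8") -> str:
    """Return an RFC 2397 data URL carrying `text` as base64."""
    encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


def ignition_file(path: str, text: str, mode: int = 0o640) -> dict:
    """Build one storage.files entry; `mode` is serialized as a decimal integer."""
    return {
        "filesystem": "root",
        "path": path,
        "mode": mode,
        "contents": {"source": data_url(text)},
    }


# Manifests copied from the report above.
remote_manifest = {
    "kind": "remote-manifest-v0",
    "value": {
        "base_url": "http://192.168.83.104/",
        "keys": [
            {"armored_keyring": "B9231241A9415806A8969EDFFC52EB831521A606.pgp.asc"}
        ],
    },
}
profile_manifest = {
    "kind": "profile-manifest-v1",
    "value": {
        "images": [
            {
                "name": "cri-containerd",
                "reference": "1.2.4",
                "remote": "com.example.my-remote",
            }
        ]
    },
}

files = [
    ignition_file(
        "/etc/torcx/remotes/com.example.my-remote/remote.json",
        json.dumps(remote_manifest),
    ),
    ignition_file("/etc/torcx/next-profile", "some_manifest"),
    ignition_file(
        "/etc/torcx/profiles/some_manifest.json",
        json.dumps(profile_manifest, indent=4),
    ),
]
config = {"ignition": {"version": "2.2.0"}, "storage": {"files": files}}
print(json.dumps(config, indent=2))
```

Encoding everything as base64 avoids having to escape the quotes of a JSON document nested inside a JSON string.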

Member

lucab commented Mar 12, 2019

I can't say for sure, but it looks like the boot may be stuck due to connectivity issues. Do you see any DNS/HTTP traffic going out from that node?

Can you try to set TORCX_VERBOSE=debug via /etc/environment and see if it provides more insights?
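Since the affected node never finishes booting, the variable would probably have to be written by Ignition rather than by hand. Here is a rough sketch of generating such a file entry, assuming the same Ignition 2.2.0 file layout used in the report. The helper name `environment_file_entry` is hypothetical, the mode 0644 is an arbitrary choice, and the entry would replace any existing /etc/environment, so existing variables would need to be merged in.

```python
import json
import urllib.parse


def environment_file_entry(variables: dict) -> dict:
    """Build a storage.files entry that writes KEY=value lines to /etc/environment."""
    body = "".join(f"{key}={value}\n" for key, value in variables.items())
    return {
        "filesystem": "root",
        "path": "/etc/environment",
        "mode": 0o644,  # serialized by json.dumps as decimal 420
        # Percent-encode the body so it is a valid non-base64 data URL.
        "contents": {"source": "data:," + urllib.parse.quote(body)},
    }


entry = environment_file_entry({"TORCX_VERBOSE": "debug"})
print(json.dumps(entry, indent=2))
```

The resulting object can be appended to the "files" array of the Ignition config before booting the node again.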
