Container Linux frequently hangs early during boot #2593

Open
pspiller opened this issue Jun 19, 2019 · 1 comment

@pspiller commented Jun 19, 2019

Issue Report

Bug

Container Linux usually gets stuck early during the boot process, at the following point:

  Booting 'CoreOS default'

Early console in extract_kernel
input_data: 0x00000000025483b4
input_len: 0x000000000292baf4
output: 0x0000000001000000
kernel_total_size: 0x0000000003a57000
booted via startup_32()
Physical KASLR using RDRAND RDTSC...
Virtual KASLR using RDRAND RDTSC...

Decompressing Linux... Parsing ELF... Performing relocations... done.
Booting the kernel.

I've left it for minutes at this point without any activity.

The server does successfully boot sometimes, but it often takes 5-10 restarts before this happens.

Container Linux Version

$ cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=2079.6.0
VERSION_ID=2079.6.0
BUILD_ID=2019-06-11-0821
PRETTY_NAME="Container Linux by CoreOS 2079.6.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

We've also seen this behaviour on older versions, going back to at least 1855.4.0.

Environment

Bare metal, MSI Cubi 2
Intel Celeron CPU 3865U @ 1.80GHz
16GB RAM

Expected Behavior

Container Linux proceeds through the boot process as normal.

Actual Behavior

Container Linux gets stuck early during the boot process.

Reproduction Steps

No specific steps required. Rebooting the server a few times will trigger the issue.

Other Information

I've tried adding debug=vc to the kernel command line, but didn't get any extra output. I also tried nokaslr in case kernel ASLR was somehow causing an issue, but it still got stuck.

I believe this started happening some time after I first started using Container Linux on these machines approximately a year ago. Unfortunately I don't have it running on any other hardware to compare behaviour.

If anybody can suggest anything I can do to debug further, that'd be very helpful.

@ajeddeloh commented Jun 25, 2019

Thanks for the report! My gut says this is a grub bug, considering there's nothing printed by the kernel. Are you UEFI booting or BIOS booting?
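
On a boot that does complete, one way to answer the UEFI-versus-BIOS question is to check whether the kernel exposes /sys/firmware/efi, which exists only when the system was started through EFI firmware. A minimal sketch; the boot_mode function name and its optional sysfs-root argument are made up for illustration:

```shell
# Prints "UEFI" if the running kernel was started via EFI firmware, else "BIOS".
# The optional first argument (an alternate sysfs root) is a hypothetical
# parameter added here so the check can be exercised against a test directory.
boot_mode() {
    sysfs_root="${1:-/sys}"
    if [ -d "$sysfs_root/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

# Report the mode of the currently running system.
boot_mode
```

This only describes the boot that succeeded, but the firmware mode should not change between attempts unless the firmware settings do.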

You might try playing around with various grub debugging levels. You can set them with set debug=<keyword> (see the grub docs). Our fork of grub is at https://github.com/coreos/grub; you can grep through it for grub_dprintf() to find the available keywords.
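
The grep through the grub source could look roughly like the sketch below. It assumes the keyword is the first string-literal argument of each grub_dprintf() call; the list_grub_debug_keywords function name is made up for illustration, and calls whose first argument is not a plain string literal on the same line will be missed.

```shell
# Lists the distinct debug keywords used in grub_dprintf() calls under a
# source directory. Assumes the keyword is the first argument and is written
# as a string literal on the same line as the call.
list_grub_debug_keywords() {
    src_dir="$1"
    # -r: recurse, -h: omit file names, -o: print only the matched text
    grep -rhoE 'grub_dprintf[[:space:]]*\([[:space:]]*"[^"]+"' "$src_dir" \
        | sed -E 's/.*"([^"]+)"$/\1/' \
        | sort -u
}

# Example usage (needs network access, so it is left commented out):
#   git clone https://github.com/coreos/grub
#   list_grub_debug_keywords grub
```

Any keyword it prints can then be tried with set debug=<keyword> at the grub prompt. The grub manual also documents set debug=all, which turns on every facility at once and may be a quicker first attempt.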

Unfortunately we haven't seen any of this on our tests or hardware, which makes it hard to debug from our side.
