Ignition times out creating FS on large disk #2026

Closed
drcrallen opened this Issue Jun 28, 2017 · 4 comments

drcrallen commented Jun 28, 2017

Issue Report

Bug

On large drives, filesystem creation can take a long time, so Ignition startup can time out depending on whether the filesystem is a fast-formatting one (like XFS) or a slower one (like ext4).

Container Linux Version

$ cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1409.5.0+2017-06-25-1823
VERSION_ID=1409.5.0
BUILD_ID=2017-06-25-1823
PRETTY_NAME="Container Linux by CoreOS 1409.5.0+2017-06-25-1823 (Ladybug)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

AWS EC2 - X1.32xlarge + sc1 EBS

Expected Behavior

Timeouts for filesystem creation should be tunable or configurable for cases where the default of 90s is insufficient.

Actual Behavior

The start job for dev-mapper-usr.device times out, causing the system to fail to boot.

[*     ] (1 of 2) A start job is running for…-mapper-usr.device (7s / 1min 30s)
...SNIP....
[ TIME ] Timed out waiting for device dev-mapper-usr.device.
[  100.990617] systemd[1]: dev-mapper-usr.device: Job dev-mapper-usr.device/start timed out.
[DEPEND[  100.998658] systemd[1]: Timed out waiting for device dev-mapper-usr.device.

The systemd docs say the default timeout for *.device units is 90 seconds.
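For context, systemd has a per-unit `JobTimeoutSec=` setting in the `[Unit]` section, and device units accept drop-in files like other units. Below is a minimal Python sketch that renders such a drop-in for the unit named in the log. The helper name `render_device_timeout_dropin`, the file name `10-timeout.conf`, and the 300-second value are illustrative assumptions, not anything taken from this report. The device wait shown above happens early in boot, so whether a drop-in placed on the root filesystem would be read in time is also an open assumption.

```python
from pathlib import PurePosixPath
from typing import Tuple


def render_device_timeout_dropin(unit: str, timeout_sec: int) -> Tuple[str, str]:
    """Return (path, contents) for a drop-in that raises a unit's job timeout."""
    if timeout_sec <= 0:
        raise ValueError("timeout_sec must be positive")
    # Drop-ins live in a "<unit>.d" directory; the file name is a hypothetical choice.
    path = PurePosixPath("/etc/systemd/system") / f"{unit}.d" / "10-timeout.conf"
    # JobTimeoutSec= is a [Unit] setting; 300 below is only an example value.
    contents = f"[Unit]\nJobTimeoutSec={timeout_sec}\n"
    return str(path), contents


if __name__ == "__main__":
    path, contents = render_device_timeout_dropin("dev-mapper-usr.device", 300)
    print(path)
    print(contents, end="")
```

The sketch only builds the text, it does not write to disk, so it can be inspected or adapted before being placed anywhere.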

This happens only with ext4. With XFS, the step finishes within the timeout and the boot continues.

Reproduction Steps

  1. Create an EBS-backed instance with a large array of big disks (16 × 10 TB sc1 volumes in our case).
  2. Set up the disks as an md stripe array.
  3. Try to boot, specifying ext4 as the filesystem on the md array.
  4. The instance fails to boot and restarts constantly.
  5. Terminate the instance and change the config to xfs; the boot then succeeds.
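The storage setup in the steps above could be expressed roughly as follows. This is a hedged Python sketch that builds an Ignition-style config with one md stripe array and one filesystem on it. The reporter's actual config is not included in this issue, so the field layout follows my reading of the Ignition 2.0.0 config format, and the array name `data`, the device names `/dev/xvdb` through `/dev/xvdq`, and the helper `build_ignition_config` are all hypothetical.

```python
import json


def build_ignition_config(devices, fs_format="ext4"):
    """Build an Ignition-style config: one md stripe array plus one filesystem."""
    if fs_format not in ("ext4", "xfs"):
        raise ValueError("this sketch only covers the two formats from the report")
    return {
        "ignition": {"version": "2.0.0"},
        "storage": {
            # "data" is a hypothetical array name.
            "raid": [{"name": "data", "level": "stripe", "devices": list(devices)}],
            "filesystems": [{
                "name": "data",
                "mount": {
                    "device": "/dev/md/data",
                    "format": fs_format,
                    "create": {"force": True},
                },
            }],
        },
    }


if __name__ == "__main__":
    # 16 hypothetical EBS device names: /dev/xvdb through /dev/xvdq.
    devices = [f"/dev/xvd{chr(ord('b') + i)}" for i in range(16)]
    print(json.dumps(build_ignition_config(devices, "ext4"), indent=2))
```

Switching the second argument from `"ext4"` to `"xfs"` corresponds to the change described in step 5.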

Other Information

Only fails when using ext4. XFS works fine.

himadrisingh001 commented Jul 12, 2017

It's failing for XFS too.

Member

crawford commented Jul 12, 2017

It looks like we need to increase the device timeout for AWS images. I have no idea what a reasonable amount would be. @euank do you have any thoughts? 5 minutes? 3 years?

Member

bgilbert commented Jul 19, 2017

For the record, uninit_bg and lazy_itable_init appear to be enabled by default, so we can't use them to speed up mkfs.ext4.

Member

bgilbert commented Jul 26, 2017

This is fixed in coreos/bootengine#125, which should be included in 1492.0.0. Thanks for the report!

bgilbert closed this Jul 26, 2017