-
Notifications
You must be signed in to change notification settings - Fork 30
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flatcar fails to boot on AWS m4.* instance types #665
Comments
Releases with 5.15 fail differently, they can't find the root block device. xen-blkfront module is missing from initramfs, which would explain that. But also no sign of DHCP lease being acquired. |
This is the full list of modules that were present in Stable but are not in Alpha: |
I've isolated this to a kernel change, here's the bisect log so far:
so something between 5.15.10 and 5.15.11 is responsible. Im testing on the 5.15 kernel because I believe this will lead to the answer of what is wrong in Flatcar 3033.2.1. |
Full bisect points towards d8888cdabedf353ab9b5a6af75f70bf341a3e7df (or torvalds/linux@83dbf89 upstream):
|
In case helps, I have also observed this behavior on c4 instances while t2 instances seem like are unaffected by this issue ( instances boot normally ) |
@wincus c4 instances would have the same issue because they use the same Intel VF for enhanced networking. t2 instances are unaffected because they use Xen networking. I've already confirmed that reverting that commit fixes the issue in 5.15 and 5.10 based flatcar releases, so we will be reverting it in all channels while we work upstream to figure out what the actual bug is. |
At least one other report confirms the bisect: https://lore.kernel.org/lkml/Ydh5OCudJKz5Y7jc@arighi-desktop/. |
This PR is needed to fix the stable release: Beta/Alpha need: |
The fixes are present in all branches (3033/3139/3165/main), and will be part of all releases that we do next week. |
wow amazing work! thanks @jepio |
Description
Flatcar stable 3033.2.0 works, 3033.2.1 doesn't. The symptoms are no DHCP address being acquired on the network interface, so fetching instance metadata fails. Boot log shows more or less this:
Impact
[ 1 sentence detailing the impact this bug is creating for you ]
Environment and steps to reproduce
a. [ requested the start of a new pod or container ]
b. [ container image downloaded ]
Expected behavior
[ describe what you expected to happen at 4. above but instead got an error ]
Additional information
Please add any information here that does not fit the above format.
The text was updated successfully, but these errors were encountered: