networkd-test.py test_transient_hostname is flaky #4753
Comments
This test fails sometimes but is hard to reproduce, so we need more information about what happens. Set the journal log level to "debug" for the entirety of the test script, and show networkd's and hostnamed's journals on failure of test_transient_hostname(). This should help with tracking down issue systemd#4753.
This test fails sometimes but is hard to reproduce, so we need more information about what happens. Set the journal log level to "debug" for the entirety of networkd-test.py, and show networkd's and hostnamed's journals on failure of the two test_transient_hostname* tests. This should help with tracking down issue systemd#4753.
This test fails sometimes but is hard to reproduce, so we need more information about what happens. Set the journal log level to "debug" for the entirety of networkd-test.py, and show networkd's and hostnamed's journals and the DHCP server log on failure of the two test_transient_hostname* tests. Also sync the journal before querying it to get more precise output. This should help with tracking down issue systemd#4753.
This test fails sometimes but is hard to reproduce, so we need more information about what happens. Set the journal log level to "debug" for the entirety of networkd-test.py, and show networkd's and hostnamed's journals and the DHCP server log on failure of the two test_transient_hostname* tests. Also sync the journal before querying it to get more precise output. This should help with tracking down issue #4753.
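For readers who want to see what "sync the journal, then show a unit's journal on failure" could look like, here is a minimal sketch. It is not the code from the commit: the helper names `journal_commands` and `show_journal` and the injectable `run` parameter are assumptions made for illustration. Only the `journalctl` options used (`--sync`, `-b`, `--no-pager`, `--quiet`, `-u`) are real.

```python
import subprocess
import sys


def journal_commands(unit):
    """Return the commands that flush the journal and then dump one unit's log.

    Hypothetical helper: `journalctl --sync` asks journald to write pending
    entries to disk first, so the following query sees the latest messages.
    """
    return [
        ['journalctl', '--sync'],
        ['journalctl', '-b', '--no-pager', '--quiet', '-u', unit],
    ]


def show_journal(unit, run=subprocess.call, out=sys.stdout):
    """Print a header and the current boot's journal for `unit`.

    `run` is injectable (an assumption of this sketch) so the helper can be
    exercised on machines without systemd.
    """
    out.write('---- %s ----\n' % unit)
    out.flush()
    for cmd in journal_commands(unit):
        run(cmd)
```

In a test, the assertions could be wrapped in `try`/`except AssertionError`, calling `show_journal('systemd-networkd.service')` and `show_journal('systemd-hostnamed.service')` before re-raising, so the logs only appear when the test fails.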
PR #4754 has landed, so the next time we see this failure we should get some clue about what goes wrong.
@martinpitt ,
yay, hit from PR #4694, the s390x one from @evverx above and the amd64 one too. Unfortunately the networkd journal is definitely cut short in both logs, and the amd64 one even has a truncated hostnamed log. But at least we know that in both cases dnsmasq actually handed out the expected IP and host name, and in the s390x log hostnamed confirmed the

So as the next step I could add a timeout loop (say, 5 seconds?) which waits for hostnamectl to show the expected transient hostname. We would then at least see whether this is just a race condition (we look too early) or a permanently wrong hostname. What I'm not sure about is whether the retry loop is actually justified, or would just hide some bug.
Sometimes setting the transient hostname does not happen synchronously, so retry up to five times. It is not yet clear whether this is legitimate behaviour or an underlying bug, but this will at least show whether the wrong transient hostname is just a race condition or permanently wrong. See issue systemd#4753
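The retry idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code that went into networkd-test.py: the function names, the one-second delay, and the injectable `read` and `sleep` parameters are hypothetical. `hostnamectl --transient` is a real invocation that prints only the transient hostname.

```python
import subprocess
import time


def read_transient_hostname():
    """Ask hostnamed for the transient hostname via hostnamectl."""
    out = subprocess.check_output(['hostnamectl', '--transient'])
    return out.decode().strip()


def wait_for_hostname(expected, read=read_transient_hostname,
                      attempts=5, delay=1.0, sleep=time.sleep):
    """Poll until the transient hostname equals `expected`.

    Returns the number of attempts needed, which tells a race (success
    after a few tries) apart from a permanently wrong name (failure).
    """
    last = None
    for attempt in range(1, attempts + 1):
        last = read()
        if last == expected:
            return attempt
        if attempt < attempts:
            sleep(delay)  # give hostnamed time to apply the change
    raise AssertionError('transient hostname is %r, expected %r after %d attempts'
                         % (last, expected, attempts))
```

Returning the attempt count rather than a plain boolean keeps the diagnostic value the comment asks for: a result greater than 1 points at a timing issue, while the `AssertionError` reports the name that stayed wrong.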
@martinpitt I think this
means that the
means that the hostname was changed at So, yes
makes sense
Do you know if this fix has been backported to I'm using Most of the time, everything works, but not always.
Sometimes the recently introduced test for setting the transient host name fails:
full log, also seen on other architectures.
In the above case the transient host name did get set, but is wrong (it's the real host name).
Debugging this requires more information, as it is hard to reproduce. It could be a bug in the test or a race condition between networkd and hostnamed. As a start, I'll make the test show journal output on failure.