assertion failure in route_handler() #7797
Comments
Oops, forgot to include the error:
cc @ssahani
Could you please paste the backtrace?
sorry
It's Arch Linux; getting a (useful) backtrace is a royal PITA.
If you base that on me not responding immediately then please know that I build from git, but used the same options as the package which includes a release build. But hey, sure let's make fun of distros for no reason at all. Just because users are dicks and mock systemd whenever they get a chance, you don't have to be. BTT:
I am trying to reproduce it, but no luck so far. Could you paste your config or a reproducer?
I have a couple configs, so here are all of them: https://paste.xinu.at/m-9JI/
In case you want to download the configs: https://paste.xinu.at/m-9JI/tar
I've tried to reduce the test case and came up with the following simple setup having only one real device and one bridge: https://paste.xinu.at/m-mSz/ (tarball: https://paste.xinu.at/m-mSz/tar) It works if either, br0 does not exist (
I will try it tomorrow. Sorry, it's late now. Thank you :)
No problem. Thanks for looking into it.
I have reproduced it and am working on a fix. We don't expect the link state to be configured when entering route_handler.
@Bluewind don't take my comment as offensive: it was a simple statement of fact. Getting a usable backtrace on Arch is more complicated, when using distro packages, because debug symbols are not readily available. E.g. in Fedora I can do
Fair enough. We are working on improving that, but I can't tell you a timeframe.
Now we don't update the link state while the carrier is lost. For example, if we are setting the routes and the carrier goes down, the callback route_handler thinks that everything is OK and updates the state. When the carrier comes back, we find ourselves in LINK_STATE_CONFIGURED, which is not the right state. Closes systemd#7797
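To illustrate the idea in that commit message, here is a minimal toy model in C. It is not the networkd source: the `CarrierLink` struct, the reduced state enum, and every `toy_*` function are hypothetical names made up for this sketch. It only shows how a reply callback can refuse to advance the state while the carrier is down.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced set of link states (not networkd's real enum). */
typedef enum {
    LINK_STATE_SETTING_ROUTES,
    LINK_STATE_CONFIGURED,
} LinkState;

/* Hypothetical stand-in for a link object. */
typedef struct {
    LinkState state;
    bool has_carrier;
    unsigned route_messages; /* route replies still outstanding */
} CarrierLink;

static void toy_carrier_lost(CarrierLink *l) {
    assert(l);
    l->has_carrier = false;
}

static void toy_carrier_gained(CarrierLink *l, unsigned n_routes) {
    assert(l);
    l->has_carrier = true;
    l->state = LINK_STATE_SETTING_ROUTES; /* configuration starts over */
    l->route_messages = n_routes;
}

/* Called once per route reply. Returns true if the link became configured. */
static bool toy_route_handler(CarrierLink *l) {
    assert(l);
    if (l->route_messages > 0)
        l->route_messages--;

    /* Guard: a reply that arrives while the carrier is down must not
     * move the link to CONFIGURED. */
    if (!l->has_carrier)
        return false;

    if (l->route_messages == 0 && l->state == LINK_STATE_SETTING_ROUTES) {
        l->state = LINK_STATE_CONFIGURED;
        return true;
    }
    return false;
}
```

In this model, a late reply after `toy_carrier_lost()` leaves the state untouched, so the next `toy_carrier_gained()` starts from a state the handler expects.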
Looks like that solves the problem. Thanks!
commit 7715629 (networkd: Fix race condition in [RoutingPolicyRule] handling (systemd#7615)) does not fix the race. There is still a race in the bridge case, because the bridge goes down and up. Calling route_configure, then link_set_routing_policy_rule, and then link_check_ready creates a race between routing_policy_rule_messages and route_messages. When the bridge comes up and we call route_configure again, the callback function finds itself in LINK_STATE_CONFIGURED and networkd dies. Let's handle routing policy rules first, then route_configure. This fixes the crash. Closes systemd#7797
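The race between the two message counters can be pictured with another small toy model. This is a sketch only, not the actual patch: `ToyLink` and the `toy_*` functions are hypothetical, and only the two counter names are borrowed from the commit message. Rules are queued first, then routes, and the link is marked configured only when both counters have drained, whatever order the replies arrive in.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a link with two kinds of outstanding replies. */
typedef struct {
    unsigned routing_policy_rule_messages;
    unsigned route_messages;
    bool configured;
} ToyLink;

/* Single place where readiness is decided. */
static void toy_check_ready(ToyLink *l) {
    assert(l);
    if (l->routing_policy_rule_messages == 0 && l->route_messages == 0)
        l->configured = true;
}

/* Queue routing policy rules first, then routes. */
static void toy_configure(ToyLink *l, unsigned n_rules, unsigned n_routes) {
    assert(l);
    l->configured = false;
    l->routing_policy_rule_messages = n_rules;
    l->route_messages = n_routes;
    toy_check_ready(l); /* covers the case where nothing was queued */
}

static void toy_rule_reply(ToyLink *l) {
    assert(l);
    assert(l->routing_policy_rule_messages > 0);
    l->routing_policy_rule_messages--;
    toy_check_ready(l);
}

static void toy_route_reply(ToyLink *l) {
    assert(l);
    assert(l->route_messages > 0);
    l->route_messages--;
    toy_check_ready(l);
}
```

With one rule and two routes queued, the link in this model stays unconfigured after both route replies and only flips once the rule reply arrives.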
…stemd#7815) Same commit message as the commit above. Closes systemd#7797 (cherry picked from commit 27c34f7)
Submission type
systemd version the issue has been seen with
236 and current git master
235 worked fine and downgrading fixes the issue
Used distribution
Arch Linux
In case of bug report: Expected behaviour you didn't see
network working after suspend/resume
In case of bug report: Unexpected behaviour you saw
networkd crashes after suspend and continues to crash leaving the network mostly down
In case of bug report: Steps to reproduce the problem
suspend machine with networkd or simply start networkd once the machine is running
I've bisected it to commit 7715629 (networkd: Fix race condition in [RoutingPolicyRule] handling (#7615)).
I'm not sure what's wrong with the assertion and I don't know which configuration files may affect this. If you want, I can upload all my networkd configs, but I have a couple (including vlans and bridges).
The problem does not appear on system boot, but only once the system has been suspended. I didn't check if it also crashes if networkd is restarted prior to suspend.