New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CLOCAL not re-set upon disconnection #6
Comments
One note regarding the patch.... I left the: |
I don't think setting CLOCAL unconditionally on disconnection is the right thing to do for all circumstances. pppd should be restoring the termios settings back to the way they were when pppd started. That happens in restore_tty, called from finish_tty, called from cleanup_tty, which is the tty channels cleanup function, called as (*the_channel->cleanup)(). Is that not happening properly? Or is it that when NM launches pppd, it (or MM) has already cleared CLOCAL? |
NM launches pppd although it doesn't play with the tty beforehand. It is MM the one playing with the tty, and MM always sets CLOCAL. The thing here is that pppd would unset CLOCAL at startup, but never re-sets it if pppd exits via e.g. SIGTERM. Current flow is as follows:
While, IMO, the flow should be:
|
OK, then as I said the place where the original CLOCAL setting should be getting restored is in restore_tty(), and the question is why that isn't working. Can you do some investigation here? Maybe put some logging in restore_tty to see if it is ever getting called, for instance? If it is getting called, what is the value of inittermios.c_cflag? |
I'll check that. Now that I see the code flow you're right, the inittermios setting should be recovered unless there's an error somewhere. I'll try to look at that. But, still, isn't it a bit strange that the: |
Ah, wait, it does make sense that it only runs if there is a disconnect script; you don't want control lines acting while the disconnect script is running; same for the connection sequence if there is a connector or initializer script... right? |
Ok, so I think I now understood why I was seeing the issue... So, the problem is not pppd. The problem is that NetworkManager assumes that it is safe to start using the tty again when "PHASE_DISCONNECT" state is reached, and therefore, upon this state, it will tell ModemManager to disconnect and recover control of the port. Now, upon PHASE_DISCONNECT, pppd still didn't recover the original termios settings; and actually it will do that more than 1 full second after the disconnect phase (due to some sleep(1) during cleanup). This means that it is likely that ModemManager may end up reaching a state in which it starts processing input from the tty while CLOCAL is still unset. Actually... this will only happen in ModemManager 0.6.x, which is where I saw the issue, not in the latest stables. But anyway, NetworkManager should be modified to only pass control of the tty to ModemManager once pppd has really gone to the DEAD phase or even really exited. I'll close the bug, and reopen if I find any other related issue. |
If no disconnection script provided and pppd being halted with SIGTERM, CLOCAL is not re-set during disconnect_tty(). This leads to some nasty race conditions during disconnection of mobile broadband modems in NetworkManager/ModemManager.
See patch at:
https://aleksander.es/patches/0001-pppd-ensure-CLOCAL-is-set-upon-disconnection.patch
Some more background....
pppd will by default unset CLOCAL in the tty termios when connected, in order to use modem control lines. But when pppd is stopped with SIGTERM, CLOCAL is not being reset in the tty.
NetworkManager and ModemManager use pppd to setup the mobile broadband modem connections. Once MM sets the tty in connected mode, NM launches pppd in the tty and the session is established. When the user requests to stop the connection, NM sends a SIGTERM to pppd to get disconnected. Once pppd performs its own disconnection sequence it exits, and once pppd exits, NM will tell MM to handle the disconnection.
Now, if pppd doesn't re-set CLOCAL before exiting, there will be some time (between pppd exit and MM re-setting CLOCAL) where CLOCAL is unset and modem control lines are in effect. If the kernel gets the DCD input control line to low during that time, it may end up launching a full hungup of the device, see Linux kernel's cdc-acm driver snippet:
So, rather than exiting with CLOCAL unset, make sure it is always re-set. It currently was only being re-set in non-hungup cases if there was a disconnection script provided.
Is this the correct approach to solve the issue? Or am I missing something?
Cheers!
The text was updated successfully, but these errors were encountered: