Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CLOCAL not re-set upon disconnection #6

Closed
aleksander0m opened this issue Apr 23, 2014 · 7 comments
Closed

CLOCAL not re-set upon disconnection #6

aleksander0m opened this issue Apr 23, 2014 · 7 comments

Comments

@aleksander0m
Copy link

If no disconnection script provided and pppd being halted with SIGTERM, CLOCAL is not re-set during disconnect_tty(). This leads to some nasty race conditions during disconnection of mobile broadband modems in NetworkManager/ModemManager.

See patch at:
https://aleksander.es/patches/0001-pppd-ensure-CLOCAL-is-set-upon-disconnection.patch

Some more background....
pppd will by default unset CLOCAL in the tty termios when connected, in order to use modem control lines. But when pppd is stopped with SIGTERM, CLOCAL is not being reset in the tty.

NetworkManager and ModemManager use pppd to setup the mobile broadband modem connections. Once MM sets the tty in connected mode, NM launches pppd in the tty and the session is established. When the user requests to stop the connection, NM sends a SIGTERM to pppd to get disconnected. Once pppd performs its own disconnection sequence it exits, and once pppd exits, NM will tell MM to handle the disconnection.

Now, if pppd doesn't re-set CLOCAL before exiting, there will be some time (between pppd exit and MM re-setting CLOCAL) where CLOCAL is unset and modem control lines are in effect. If the kernel gets the DCD input control line to low during that time, it may end up launching a full hungup of the device, see Linux kernel's cdc-acm driver snippet:

case USB_CDC_NOTIFY_SERIAL_STATE:
    newctrl = get_unaligned_le16(data);

    if (!acm->clocal && (acm->ctrlin & ~newctrl & ACM_CTRL_DCD)) {
        dev_dbg(&acm->control->dev, "%s - calling hangup\n", __func__);
        tty_port_tty_hangup(&acm->port, false);
    }

So, rather than exiting with CLOCAL unset, make sure it is always re-set. It currently was only being re-set in non-hungup cases if there was a disconnection script provided.

Is this the correct approach to solve the issue? Or am I missing something?

Cheers!

@aleksander0m
Copy link
Author

One note regarding the patch.... I left the:
stop_charshunt(NULL, 0);
call also to be run only if disconnect_script != NULL, not sure if that should also be changed so that it is executed regardless of whether we have a disconnect script or not.

@aleksander0m aleksander0m changed the title CLOCAL not unset upon disconnection CLOCAL not re-set upon disconnection Apr 23, 2014
@paulusmack
Copy link
Collaborator

I don't think setting CLOCAL unconditionally on disconnection is the right thing to do for all circumstances.

pppd should be restoring the termios settings back to the way they were when pppd started. That happens in restore_tty, called from finish_tty, called from cleanup_tty, which is the tty channels cleanup function, called as (*the_channel->cleanup)().

Is that not happening properly? Or is it that when NM launches pppd, it (or MM) has already cleared CLOCAL?

@aleksander0m
Copy link
Author

NM launches pppd although it doesn't play with the tty beforehand. It is MM the one playing with the tty, and MM always sets CLOCAL. The thing here is that pppd would unset CLOCAL at startup, but never re-sets it if pppd exits via e.g. SIGTERM. Current flow is as follows:

  • MM sets CLOCAL in tty. CLOCAL=1: control lines ignored
  • MM connects tty
  • NM launches pppd in tty
  • pppd unsets CLOCAL in tty. CLOCAL=0: control lines in effect
  • killall -TERM pppd
  • pppd exits without restoring CLOCAL
  • DCD control line to LOW
  • kernel detects DCD low and issues port Hangup
  • ModemManager detects Hangup in port, discards port

While, IMO, the flow should be:

  • MM sets CLOCAL in tty. CLOCAL=1: control lines ignored
  • MM connects tty
  • NM launches pppd in tty
  • pppd unsets CLOCAL in tty. CLOCAL=0: control lines in effect
  • killall -TERM pppd
  • pppd restores CLOCAL in tty.CLOCAL=1: control lines ignored
  • pppd exits
  • DCD control line to LOW
  • kernel detects DCD low but ignores it (as CLOCAL=1)

@paulusmack
Copy link
Collaborator

OK, then as I said the place where the original CLOCAL setting should be getting restored is in restore_tty(), and the question is why that isn't working.

Can you do some investigation here? Maybe put some logging in restore_tty to see if it is ever getting called, for instance? If it is getting called, what is the value of inittermios.c_cflag?

@aleksander0m
Copy link
Author

I'll check that. Now that I see the code flow you're right, the inittermios setting should be recovered unless there's an error somewhere. I'll try to look at that.

But, still, isn't it a bit strange that the:
if (real_ttyfd >= 0)
set_up_tty(real_ttyfd, 1);
lines in disconnect_tty() are only run if disconnect_script != NULL? Or is that on purpose?

@aleksander0m
Copy link
Author

Ah, wait, it does make sense that it only runs if there is a disconnect script; you don't want control lines acting while the disconnect script is running; same for the connection sequence if there is a connector or initializer script... right?

@aleksander0m
Copy link
Author

Ok, so I think I now understood why I was seeing the issue...

So, the problem is not pppd. The problem is that NetworkManager assumes that it is safe to start using the tty again when "PHASE_DISCONNECT" state is reached, and therefore, upon this state, it will tell ModemManager to disconnect and recover control of the port. Now, upon PHASE_DISCONNECT, pppd still didn't recover the original termios settings; and actually it will do that more than 1 full second after the disconnect phase (due to some sleep(1) during cleanup).

This means that it is likely that ModemManager may end up reaching a state in which it starts processing input from the tty while CLOCAL is still unset. Actually... this will only happen in ModemManager 0.6.x, which is where I saw the issue, not in the latest stables. But anyway, NetworkManager should be modified to only pass control of the tty to ModemManager once pppd has really gone to the DEAD phase or even really exited.

I'll close the bug, and reopen if I find any other related issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants