Skip to content
This repository has been archived by the owner on Jan 17, 2020. It is now read-only.

tokio2: Disconnect after 1 minute #76

Closed
dbrgn opened this issue Feb 4, 2018 · 13 comments
Closed

tokio2: Disconnect after 1 minute #76

dbrgn opened this issue Feb 4, 2018 · 13 comments

Comments

@dbrgn
Copy link

dbrgn commented Feb 4, 2018

I'm experiencing a disconnect from the server after a bit more than 1 minute.

Here's a wireshark trace:

screenshot

(If required I can also send more details or a pcap file)

The ping messages are sent at t = 25, 35 (+10), 50 (+15), 60 (+10), 75 (+15). First of all, it seems a bit strange that the intervals are not consistent. Then, after sending the ping request at t=75, the connection is immediately closed by the server.

In the ping message itself I cannot find anything suspicious. There's no counter that could go out of sync or something like that, is there?

img

@tekjar
Copy link

tekjar commented Feb 5, 2018

@dbrgn What is your keep alive time?. Can you send rumqtt debug logs

@dbrgn
Copy link
Author

dbrgn commented Feb 5, 2018

Keep alive time is currently 10. (I'll probably increase that, it was mostly for testing.)

    let client_options = MqttOptions::new(client_id, "eu.thethings.network:1883".to_string())
            .unwrap_or_else(|e| {
                println!("Could not initialize MqttOptions: {}", e);
                exit(3);
            })
            .set_keep_alive(10)
            .set_reconnect_opts(ReconnectOptions::AfterFirstSuccess(10))
            .set_security_opts(SecurityOptions::UsernamePassword((
                conf.ttn_app_id.clone(),
                conf.ttn_access_key.clone(),
            )));

And here's the log:

 INFO 2018-02-05T07:58:24Z: rumqtt::client::connection: Address resolved to Some(V4(52.169.76.255:1883))
 INFO 2018-02-05T07:58:24Z: rumqtt::client::connection: mqtt connection successful
DEBUG 2018-02-05T07:58:24Z: rumqtt::client::connection: Sending packet. Subscribe(Subscribe { pid: PacketIdentifier(1), topics: [SubscribeTopic { topic_path: "+/devices/+/activations", qos: AtMostOnce }, SubscribeTopic { topic_path: "+/devices/+/up", qos: AtMostOnce }] })
DEBUG 2018-02-05T07:58:24Z: rumqtt::client::connection: Received packet. "Suback(Suback { pid: PacketIdentifier(1), return_codes: [Success(AtMostOnce), Success(AtMostOnce)] })"
DEBUG 2018-02-05T07:58:29Z: rumqtt::client::state: Last outgoing before 5 seconds. Last incoming packet before 4 seconds
DEBUG 2018-02-05T07:58:34Z: rumqtt::client::state: Last outgoing before 10 seconds. Last incoming packet before 9 seconds
DEBUG 2018-02-05T07:58:34Z: rumqtt::client::connection: Sending packet. Pingreq
DEBUG 2018-02-05T07:58:34Z: rumqtt::client::connection: Received packet. "Pingresp"
DEBUG 2018-02-05T07:58:39Z: rumqtt::client::state: Last outgoing before 5 seconds. Last incoming packet before 4 seconds
DEBUG 2018-02-05T07:58:44Z: rumqtt::client::state: Last outgoing before 10 seconds. Last incoming packet before 9 seconds
DEBUG 2018-02-05T07:58:44Z: rumqtt::client::connection: Sending packet. Pingreq
DEBUG 2018-02-05T07:58:44Z: rumqtt::client::connection: Received packet. "Pingresp"
DEBUG 2018-02-05T07:58:49Z: rumqtt::client::state: Last outgoing before 4 seconds. Last incoming packet before 4 seconds
DEBUG 2018-02-05T07:58:54Z: rumqtt::client::state: Last outgoing before 10 seconds. Last incoming packet before 9 seconds
DEBUG 2018-02-05T07:58:54Z: rumqtt::client::connection: Sending packet. Pingreq
DEBUG 2018-02-05T07:58:54Z: rumqtt::client::connection: Received packet. "Pingresp"
DEBUG 2018-02-05T07:58:59Z: rumqtt::client::state: Last outgoing before 4 seconds. Last incoming packet before 4 seconds
DEBUG 2018-02-05T07:59:04Z: rumqtt::client::state: Last outgoing before 9 seconds. Last incoming packet before 9 seconds
DEBUG 2018-02-05T07:59:09Z: rumqtt::client::state: Last outgoing before 14 seconds. Last incoming packet before 14 seconds
DEBUG 2018-02-05T07:59:09Z: rumqtt::client::connection: Sending packet. Pingreq
DEBUG 2018-02-05T07:59:09Z: rumqtt::client::connection: Received packet. "Pingresp"
DEBUG 2018-02-05T07:59:14Z: rumqtt::client::state: Last outgoing before 5 seconds. Last incoming packet before 4 seconds
DEBUG 2018-02-05T07:59:19Z: rumqtt::client::state: Last outgoing before 10 seconds. Last incoming packet before 9 seconds
DEBUG 2018-02-05T07:59:19Z: rumqtt::client::connection: Sending packet. Pingreq
DEBUG 2018-02-05T07:59:19Z: rumqtt::client::connection: Received packet. "Pingresp"
DEBUG 2018-02-05T07:59:24Z: rumqtt::client::state: Last outgoing before 5 seconds. Last incoming packet before 4 seconds
DEBUG 2018-02-05T07:59:29Z: rumqtt::client::state: Last outgoing before 9 seconds. Last incoming packet before 9 seconds
DEBUG 2018-02-05T07:59:34Z: rumqtt::client::state: Last outgoing before 15 seconds. Last incoming packet before 14 seconds
DEBUG 2018-02-05T07:59:34Z: rumqtt::client::connection: Sending packet. Pingreq
ERROR 2018-02-05T07:59:34Z: rumqtt::client::connection: Reactor stopped. v = ()
 INFO 2018-02-05T07:59:34Z: rumqtt::client: Will sleep for Duration { secs: 10, nanos: 0 } seconds before reconnecting

(Btw, now I understand why the interval is not consistent - you're polling at 5s intervals :) That's probably fine though.)

@tekjar
Copy link

tekjar commented Feb 5, 2018

https://github.com/AtherEnergy/rumqtt/blob/tokio2/src/client/connection.rs#L181

Btw, now I understand why the interval is not consistent

Yeah, Interval to check if ping is required should be less than half the keepalive time to make sure that broker gets pingreq in 1.5 * keepalive time. I think this is right on the edge. Maybe I'll check at keepalive/3.

Also I'm thinking not to allow keepalive less than 10 secs. But need some consensus on this

@tekjar
Copy link

tekjar commented Feb 5, 2018

I've tested with keep_alive = 30 and it works well :)

@dbrgn
Copy link
Author

dbrgn commented Feb 5, 2018

When setting the keepalive to 60, I don't get those disconnects anymore.

Regarding the intervals, since you use tokio, can't you use something like https://github.com/tokio-rs/tokio-timer to schedule the next ping message, instead of checking every 5s?

@tekjar
Copy link

tekjar commented Feb 5, 2018

@dbrgn I'm using reactor's Interval to do this which is almost same as tokio-timer. It's just that the next is_ping_required check most of the times end up crossing the keepalive boundary and we need to optimize the interval time in such a way that you don't cross 1.5 times keepalive time that the broker expects.

In your case, when keep_alive = 60, we do is_ping_required every 10 seconds. So the worst case ping would happen at 70 seconds, which is much less than 1.5 * 60 = 95 which is ok

1 similar comment
@tekjar
Copy link

tekjar commented Feb 5, 2018

@dbrgn I'm using reactor's Interval to do this which is almost same as tokio-timer. It's just that the next is_ping_required check most of the times end up crossing the keepalive boundary and we need to optimize the interval time in such a way that you don't cross 1.5 times keepalive time that the broker expects.

In your case, when keep_alive = 60, we do is_ping_required every 10 seconds. So the worst case ping would happen at 70 seconds, which is much less than 1.5 * 60 = 95 which is ok

@dbrgn
Copy link
Author

dbrgn commented Feb 5, 2018

we need to optimize the interval time in such a way that you don't cross 1.5 times keepalive time that the broker expects

Ah, I wasn't aware that the broker has certain keepalive expectations. I looked at the spec yesterday, but in the area where the ping messages are specified there is no mention of that 🙂

Can't you just do an interval with the exact keepalive duration, and use .and_then to trigger the ping directly?

@tekjar
Copy link

tekjar commented Feb 5, 2018

@dbgrn Maybe we can use this to timeout when there is no activity

@dbrgn
Copy link
Author

dbrgn commented Feb 5, 2018

Hm, isn't that to do asynchronous write/read operations with a timeout (must complete within a certain duration)? Would that help in this case?

@tekjar
Copy link

tekjar commented Feb 5, 2018

@dbrgn I think we can use Timeout error to do pings

@dbrgn
Copy link
Author

dbrgn commented Feb 5, 2018

Yeah, but even if you wrap a timeout around it, the reactor tries to complete the future as soon as possible, right? The timeout is only relevant if the future is blocking longer than the specified duration.

@tekjar
Copy link

tekjar commented Nov 16, 2018

Hi. Can you please test this with master branch and report back if it's still happening? Closing this for now

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants