This repository has been archived by the owner on Jan 17, 2020. It is now read-only.

Feature: Connection status notifications and notifications drop handling #143

Closed · wants to merge 1 commit

Conversation

@flxo (Contributor) commented Mar 13, 2019

This is a follow-up on #142 without any change to the reconnection behaviour.

Check the channel state before sending notifications and distribute connection state changes. The notification Receiver is wrapped into a struct that contains a oneshot Receiver, which allows the client to check whether the notification struct has been dropped or not. Upon disconnection the corresponding Error is sent.
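
A minimal sketch of that shape, assuming crossbeam-channel and futures 0.1; the names here are illustrative, not necessarily the exact ones in this PR:

    use crossbeam_channel::Receiver;
    use futures::sync::oneshot;

    pub enum Notification { Connected /* Publish, Puback, Disconnected(_), ... */ }

    /// Handed to the user. Dropping it drops `_alive`, so the paired
    /// oneshot Receiver held by the event loop resolves with
    /// Err(oneshot::Canceled) and the loop knows nobody is listening.
    pub struct Notifications {
        rx: Receiver<Notification>,
        _alive: oneshot::Sender<()>,
    }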

@flxo (Contributor, Author) commented Mar 13, 2019

Let's think about merging ConnectError and NetworkError. I think that makes life easier for users and obsoletes ConnectError::NetworkError(_), which I needed in order to have a distinct type for Notification::Disconnected(_).

@flxo (Contributor, Author) commented Mar 13, 2019

The test story here is bad - I just added a very basic one.

Any thoughts about some kind of integration test with a "real" broker? Especially for testing the connection state changes, this is definitely more convenient than mocked unit tests...

@flxo (Contributor, Author) commented Mar 13, 2019

One more note: I didn't increment the version, which is needed because of the API change. This can be done when this branch is eventually merged.

    Err(false) => break 'reconnection,
    Ok(_v) => continue 'reconnection,
    }
    Err(reconnect) => {

Do we need this handling? I intended to abstract all the reconnection behaviour in the mqtt_io method, and it tells whether to connect or not.


Did you observe any bug with that?

@flxo (Author):

Ah. Probably not. I picked the commit from pr-notifications and didn't notice that this is not needed.

@manifest (Contributor):

@flxo Thanks for working on that. We've just run into that issue, and here there's already a solution :-)

@manifest (Contributor):

@tekjar Could we please merge this one and bump the crate version? This feature is quite important for our case.

@tekjar commented Mar 16, 2019

Sorry for the delay on this. I'll try to close this today.

    handle_notification(notification, &notification_tx);
    future::ok(reply)
    let mut mqtt_state = mqtt_state.borrow_mut();
    mqtt_state.handle_incoming_mqtt_packet(packet)

Any reason why you've moved the linear and_then to a nested and_then? IMO chaining combinators one after the other is much more readable than nesting them, which can lead to callback-hell sort of code.

@flxo (Author):

These two lines just remove the explicit future::ok(). The Result<> returned by handle_incoming_mqtt_packet can be used directly and converts because of the IntoFuture impl for Result. Not sure what you mean by linear and nested here.
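
A self-contained sketch of that conversion under futures 0.1:

    use futures::{stream, Future, Stream};

    fn main() {
        // `and_then` accepts any IntoFuture, and Result<T, E> implements
        // IntoFuture<Item = T, Error = E>, so the closure can return the
        // Result directly instead of wrapping it in future::ok().
        let doubled = stream::iter_ok::<_, String>(vec![1u32, 2, 3])
            .and_then(|n| if n < 10 { Ok(n * 2) } else { Err("too big".into()) })
            .collect()
            .wait();
        assert_eq!(doubled, Ok(vec![2, 4, 6]));
    }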

@tekjar (Mar 18, 2019):

Oh. Got it.

What I meant was that method 2 below will keep combinators like and_then one after the other instead of one inside the other:

    .and_then(move |packet| {
        debug!("Incoming packet = {:?}", packet_info(&packet));
        let mut mqtt_state = mqtt_state.borrow_mut();
        mqtt_state.handle_incoming_mqtt_packet(packet)
            .and_then(|(notification, reply)| {
                handle_notification(notification);
                Ok(reply)
            })
    })
    .filter(should_forward_packet);

vs

    .and_then(move |packet| {
        debug!("Incoming packet = {:?}", packet_info(&packet));
        let mut mqtt_state = mqtt_state.borrow_mut();
        let o = mqtt_state.handle_incoming_mqtt_packet(packet);
        future::result(o)
    })
    .map(|(notification, reply)| {
        handle_notification(notification);
        reply
    })
    .filter(should_forward_packet);

    let mut mqtt_state = mqtt_state.borrow_mut();
    mqtt_state.handle_incoming_mqtt_packet(packet)
    .and_then(|(notification, reply)| {
    match notification {

Nit: Can we move this to a method? It helps newcomers understand the flow, as well as me when I revisit the code after a break.

@flxo (Author):

Sure.

@flxo (Author):

Ah, I got what you mean and updated the PR. I did not move it to a method since the signature would be quite weird, and with the non-nested and_then I think it's readable. I also remember why I did the nested and_then: to avoid the double clone of the Rc<MqttState>.

Feel free to refactor this.
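
A runnable sketch of that double-clone point (hypothetical MqttState, futures 0.1):

    use futures::{stream, Future, Stream};
    use std::cell::RefCell;
    use std::rc::Rc;

    #[derive(Default)]
    struct MqttState { handled: u32 }

    fn main() {
        let mqtt_state = Rc::new(RefCell::new(MqttState::default()));

        // Linear combinators: each `move` closure needs its own clone.
        let s1 = mqtt_state.clone();
        let s2 = mqtt_state.clone();
        stream::iter_ok::<_, ()>(vec![1u32, 2])
            .and_then(move |p| { s1.borrow_mut().handled += 1; Ok(p) })
            .map(move |p| { s2.borrow_mut().handled += 1; p })
            .collect().wait().unwrap();

        // Nested: one closure, one capture - a single Rc is enough.
        stream::iter_ok::<_, ()>(vec![1u32, 2])
            .and_then(move |p| {
                let mut state = mqtt_state.borrow_mut();
                state.handled += 1;
                Ok(p) // the inner chain stays inside this one borrow
            })
            .collect().wait().unwrap();
    }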

@tekjar commented Mar 18, 2019

Left some small nits. So AFAIU you are using a oneshot channel along with a crossbeam channel for synchronization. Given we anyway lose the convenience of crossbeam select with the wrapper, why not just use a futures channel?

@tekjar commented Mar 18, 2019

Coming to your questions:

> Let's think about merging ConnectError and NetworkError

We'll do this in the next PR

> Any thoughts about some kind of integration test with a "real" broker? Especially for testing the connection state changes, this is definitely more convenient than mocked unit tests

Integration tests with a real broker would be good, but I thought we should also simulate bad networks, disconnections and half-open connections to continuously test corner cases. But we can start with any available broker and later work on rumqttd to simulate those conditions. I gave toxiproxy a shot, but it had too many bugs.

> I didn't increment the version, which is needed because of the API change. This can be done when this branch is eventually merged

I'll increment it when I merge this to master :)

@flxo (Contributor, Author) commented Mar 18, 2019

> Left some small nits. So AFAIU you are using a oneshot channel along with a crossbeam channel for synchronization. Given we anyway lose the convenience of crossbeam select with the wrapper, why not just use a futures channel?

I don't see any drawback in using the futures channel, since in most of my projects I'm in a "future" context anyway. You can still use the crossbeam select stuff through the impl Deref for Notifications, which gives access to the crossbeam part. We could also consider making this pub.
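
A sketch of that Deref, reusing the hypothetical field names from the earlier sketch:

    use crossbeam_channel::Receiver;
    use futures::sync::oneshot;
    use std::ops::Deref;

    pub enum Notification { Connected /* ... */ }

    pub struct Notifications {
        rx: Receiver<Notification>,
        _alive: oneshot::Sender<()>,
    }

    impl Deref for Notifications {
        type Target = Receiver<Notification>;

        fn deref(&self) -> &Self::Target {
            &self.rx
        }
    }

    // Auto-deref exposes the crossbeam API, e.g. notifications.recv(),
    // and select! still works via recv(&*notifications).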

@tekjar commented Mar 18, 2019

Cool. Let's merge this after those small changes. I can make those changes myself after merging if you are busy.

    match notification {
    Notification::None => (),
    // Ignore error on notification_tx send, since the receiver can be dropped at any time
    _ if mqtt_state.send_notifications() => drop(notification_tx.send(notification)),
@tekjar (Mar 18, 2019):

Looks like this will block the event loop. What happens when the receiver is not dropped and the channel is full?

@flxo (Author):

Yes, this blocks the loop. As far as I got it, that's ok in order to backpressure the broker.
An option would be to use an unbounded channel - but that's probably not what we want here.

I'm not fluent in the MQTT spec: what is a broker allowed to do with clients that apply backpressure?
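
For context, a sketch of the two send flavours on a bounded crossbeam channel (illustrative, not this PR's code):

    use crossbeam_channel::{bounded, TrySendError};

    fn main() {
        let (tx, _rx) = bounded::<u32>(1);

        tx.send(1).unwrap();    // fits, returns immediately
        // tx.send(2).unwrap(); // would block this thread until a recv();
                                // on the reactor thread that stalls pings too

        // try_send never blocks; a full queue is reported instead.
        match tx.try_send(2) {
            Ok(()) => {}
            Err(TrySendError::Full(_n)) => { /* discard or count the drop */ }
            Err(TrySendError::Disconnected(_n)) => { /* receiver gone */ }
        }
    }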


Blocking the event loop is not ok just because the receiver is not able to catch up. Pings should still continue to happen to prevent the broker from disconnecting the client, and the publisher should not be slow just because the receiver is doing some heavy computation.

This is getting a little trickier than I anticipated. I'll give it proper thought and ping you in a few days.

@flxo (Author):

Hm, you're right. I forgot the pings... sorry. Will think about that too.


Cool :). Sorry for the back-and-forth on this.

@flxo (Author):

Options regarding the pings:

  1. Accept that an overloaded client will stop sending pings. The broker will probably disconnect in such a situation (or has to, according to the spec?). The timeouts are relaxed enough to tolerate a slightly delayed ping. This is the behaviour on notification_refactor (this PR).
  2. An unbounded notification queue: sorry - bad idea...
  3. Discard notifications when the notification queue is full. This is the behaviour on master. I'd really prefer option 1 over 3, since at least you see what happened.
  4. Another idea is to split the notification queue into several ones, giving the client more flexibility in what to handle or which of the Receivers to drop, e.g. Notification::Puback could be less interesting than Notification::Connected or Notification::Disconnected(_). The decision whether a Notification can be discarded still has to be made, though.
  5. A configuration option to choose between options 1 and 3 (see the sketch after this list). It should not be a feature flag, but it could be added to the config struct.

What do you think?
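
A hypothetical sketch of option 5; NotificationPolicy is an invented name, not part of the rumqtt API:

    use crossbeam_channel::Sender;

    pub enum Notification { Connected /* ... */ }

    /// Invented config knob: option 1 (block / backpressure) vs option 3 (discard).
    pub enum NotificationPolicy {
        Block,
        Discard,
    }

    fn forward(tx: &Sender<Notification>, n: Notification, policy: &NotificationPolicy) {
        match policy {
            // Option 1: backpressure; may delay pings when the queue is full.
            NotificationPolicy::Block => drop(tx.send(n)),
            // Option 3: never block; silently drop when the queue is full.
            NotificationPolicy::Discard => drop(tx.try_send(n)),
        }
    }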


Hi @flxo. Sorry for the delay on this. Coming to the options that you've mentioned:

  1. Blocking the event loop, in general, isn't a good idea. It's not just about pings and disconnects. If someone is publishing from a different thread and the receiver is doing heavy computation, the publishers won't make progress. Blocking the publisher just because the crossbeam channel is full doesn't make any sense. Also, the eventloop block will keep interfering with any future features we might need on the eventloop.

  2. Agreed :)

  3. Sounds good

Personally, I feel we should provide an option for futures channels alongside the crossbeam channel for notifications. Futures channels won't have this problem of blocking the eventloop while doing a blocking send, and we won't lose data silently. If people choose the crossbeam channel, they get the flexibility of select but might miss incoming data.

I'm open to discussion on this. Maybe on a new issue :)

Also, I'll be busy this month and won't be able to do a lot of rumqtt work. But we can try to reach a conclusion this month and implement it in the next.
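
A sketch of the futures (0.1) channel alternative: Sink::send returns a future, so a full buffer parks the task instead of blocking the OS thread, and nothing is silently dropped:

    use futures::sync::mpsc;
    use futures::{Future, Sink, Stream};

    fn main() {
        // Bounded futures channel with a small buffer.
        let (tx, rx) = mpsc::channel::<u32>(8);

        // Sending is itself a future: when the buffer is full, the task
        // yields and is woken later instead of blocking the event loop.
        let producer = tx.send(42).map(|_tx| ()).map_err(|_| ());

        // The Receiver is a Stream and can be polled on the same loop.
        let consumer = rx.take(1).collect().map(|v| assert_eq!(v, vec![42]));

        producer.wait().unwrap();
        consumer.wait().unwrap();
    }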

Check channel state before sending notifications and distribute
connection state changes. The notification Receiver is wrapped into a
struct that contains a oneshot Receiver that allows the client to check
whether the notification struct is dropped or not.
@TotalKrill (Contributor) commented May 13, 2019

Might I inquire about any news on this? After skimming through this PR, it seems like it would give users such as me (with devices on bad networks) a chance to react to network changes.

@tekjar commented May 14, 2019

@TotalKrill Couldn't spend any time on this crate right now. Will most probably resume the work in June.

@flxo (Contributor, Author) commented May 15, 2019

@TotalKrill As you might have noticed, there are a couple of open points that are not super easy to solve or decide. I'm using this branch in an experimental project, and so far it works in that specific scenario.
Comments are welcome!

@TotalKrill (Contributor):

@flxo Alright, I might have to take a look into that. A semi-related question though: how are topic subscriptions handled during disconnect/connect?

Right now I have noticed that long-running connections stop receiving subscription notifications after a while. I suspect this is due to a disconnect/reconnect. My preferred behavior would be to resubscribe to topics on reconnect as well.

@flxo (Contributor, Author) commented May 20, 2019

As far as I know, resubscription upon reconnect is an open topic. See #85. For now you probably have to do that on your own.

@TotalKrill (Contributor):

clean_session(false) seems to solve it, as long as the broker doesn't die.

@tekjar commented May 22, 2019

@TotalKrill As @flxo mentioned, I'll need some consensus on automatic resubscription. This particular pull request isn't related to that issue. Can you subscribe to issue #85 so that you don't miss any progress on it?

@TotalKrill (Contributor) commented Jun 19, 2019

I tried wrapping my head around this a bit. The problem is that the eventloop handles publish requests, and if we are handling other events during that time, a publish might stall, since other events from the broker (such as received mqtt messages) might be in the way; client-generated pings and events can also get lost. And all of these problems are due to the client being overloaded?

Could we not return errors when trying to publish to an overloaded eventloop, or send an error notification when a ping failed to send because the eventloop was overloaded? Or is this impossible due to the problem of blocking the eventloop?

@marcotuna:
Any updates on this subject? I am having trouble with needing to resubscribe when the broker crashes.

@tekjar commented Jul 18, 2019

@marcotuna I've implemented reconnection events in the master branch. But I'll still have to add automatic resubscription to the eventloop (based on user configuration, of course).

You can use the reconnection events to resubscribe manually, but the subscription will only happen after the publishes in the queue have been transmitted by the eventloop.

@tekjar commented Jul 18, 2019

@flxo Reconnection notifications and notification drop handling are part of master now. Please feel free to comment on the issue if you don't feel the current behavior solves your usecase.

@tekjar closed this Jul 18, 2019
@tekjar commented Jul 18, 2019

@marcotuna Can you please head over to issue #85? I'll expose the resubscription option today.
