Client Timing Out After 2 Minutes #106

Closed
MattyK14 opened this issue May 23, 2017 · 10 comments

Comments

@MattyK14

MattyK14 commented May 23, 2017

I'm currently using the react-native fork to connect to AWS IoT. After every 2 minutes of connection, the client times out, even if messages are being received or published. In the connectionLost handler, if the error type is a timeout I force a reconnect, but after 3 more reconnections the error becomes Error: AMQJS0007E Socket error: Unknown socket error.
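
For readers trying the same workaround, a forced reconnect is often spaced out with a capped exponential backoff rather than retried immediately. The following is only a minimal sketch under that assumption: `reconnectDelayMs`, `scheduleReconnect`, the `connect` callback and the default delays are hypothetical names and values chosen for illustration, not part of the Paho client or the react-native fork.

```javascript
// Hypothetical helper: capped exponential backoff for reconnect attempts.
// attempt is 0 for the first retry, 1 for the second, and so on.
function reconnectDelayMs(attempt, baseMs = 1000, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Hypothetical wrapper: `connect` stands in for whatever function re-opens
// the MQTT connection. `setTimer` is injectable so the logic can be tested
// without waiting for real timers.
function scheduleReconnect(connect, attempt, setTimer = setTimeout) {
  return setTimer(connect, reconnectDelayMs(attempt));
}
```

Calling `scheduleReconnect(connect, attempt)` from a connectionLost handler, and incrementing `attempt` on each failure, would wait longer after each failed attempt instead of reconnecting in a tight loop.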

@rh389

@jpwsutton
Contributor

Can you recreate this using the pure client library?

@robhogan

Just on my phone so can't properly investigate right now but this could well be an issue with my fork as I haven't done much testing with the keep-alive mechanisms. Do you have a keepAliveInterval configured on the client, and the equivalent on the server?

Failing to reconnect after 3 timeouts is odd in any case though.

@MattyK14
Author

MattyK14 commented May 24, 2017

I was on the default keepAliveInterval of 60 seconds. After experimenting with it myself, it seems that no matter what I set it to, every second keepAlive ping causes a timeout. So if I set it to 5 seconds, it times out after 10.

The failure to reconnect is odd. If I set the keepAliveInterval to 15 seconds, it will fail to reconnect on the 10th try. It seemed consistent yesterday on 60 seconds to fail on the 5th attempt to reconnect, but now the last two times it has failed on the 3rd attempt.

@robhogan

@MattyK14 - I've just spotted that a fork of my fork by @clshortfuse has what looks like a better timer implementation (looks like he found a bug or two as well as using background timers). Perhaps you could give his fork a try?

@clshortfuse - mind if I pull in your commits if all goes well?

(PS: I've now enabled issues on my fork so any future issues specific to my RN version can be opened directly there)

@clshortfuse

Go for it. I didn't have time to wrap it up, but it's working fine on my end so far. I did change the ping system IIRC, because I was having issues with the dual pinger system, so you have to watch for breaking changes.

The specific reason for changing to native timers was that the JS stack seems to pause when the activity is paused. On ChromeOS, the JS timers wouldn't fire unless the activity had window focus.

@robhogan

robhogan commented May 30, 2017

Cheers. I think the rationale behind two timers is:

  1. By the spec, the client is responsible for sending a control packet (ping or otherwise) at least every keepAliveInterval seconds; otherwise the server disconnects us (after 1.5 * keepAliveInterval). So we use sendPinger to track when we've been quiet long enough that we must ping the server.

  2. If the server has been quiet for a while, we choose to use the receivePinger to make sure it's still there. If we don't get a response to our ping we disconnect. As far as I can tell, this is additional to the spec. The only use I can think of is to ensure that we close the socket (and so prepare for a reconnect) in cases where the socket is still functional but the server isn't all there.
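
The two roles described in the list above can be sketched as a small bookkeeping object. This is an illustration only, not the Paho implementation: `createKeepAlive` and its method names are hypothetical, time is passed in explicitly as milliseconds, and the sketch assumes that any inbound packet counts as proof that the server is alive.

```javascript
// Hypothetical keep-alive bookkeeping; names are illustrative, not the Paho API.
function createKeepAlive(keepAliveSeconds) {
  const intervalMs = keepAliveSeconds * 1000;
  let lastSentMs = 0;          // last time we sent any control packet
  let pingSentAtMs = null;     // set while a ping is awaiting a reply

  return {
    // Call whenever the client sends any packet (publish, subscribe, ...).
    onPacketSent(nowMs) { lastSentMs = nowMs; },
    // Call whenever a ping request is sent.
    onPingSent(nowMs) { lastSentMs = nowMs; pingSentAtMs = nowMs; },
    // Call whenever anything arrives from the server (assumption: any
    // inbound packet clears the outstanding ping).
    onPacketReceived() { pingSentAtMs = null; },
    // Role 1 ("send" timer): we have been quiet long enough to owe a ping.
    mustPing(nowMs) { return nowMs - lastSentMs >= intervalMs; },
    // Role 2 ("receive" timer): our ping went unanswered for a full interval.
    serverLooksDead(nowMs) {
      return pingSentAtMs !== null && nowMs - pingSentAtMs >= intervalMs;
    },
    // The server's view: it may drop us after 1.5x the interval of silence.
    serverWouldDisconnect(nowMs) { return nowMs - lastSentMs > 1.5 * intervalMs; }
  };
}
```

Keeping "when did we last send" separate from "is a ping still unanswered" makes it explicit that the first check protects against the server's disconnect rule, while the second is an optional client-side liveness check.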

Right now though it looks like the receivePinger is never reset, either in my fork or in the eclipse original. Hence clshortfuse@b94ef43#diff-1983c3869382e68a08044cf44a806a41 presumably. But even then, we're not acting on a lack of response.
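
As an illustration of what acting on a lack of response could look like, here is a minimal sketch. `createPinger`, `sendPing` and `onTimeout` are hypothetical names, not the library's code; `tick()` is assumed to be called once per keepAliveInterval and `reset()` when a ping response arrives.

```javascript
// Hypothetical pinger: tick() runs once per keep-alive interval,
// reset() runs when the server answers a ping.
function createPinger(sendPing, onTimeout) {
  let awaitingResponse = false;
  return {
    tick() {
      if (awaitingResponse) {
        // The previous ping was never answered: treat it as a timeout.
        onTimeout();
        return;
      }
      awaitingResponse = true;
      sendPing();
    },
    reset() { awaitingResponse = false; }
  };
}
```

In a design like this, if `reset()` is never invoked the timeout fires on the second tick, that is, after twice the interval. That resembles the symptom reported earlier in the thread, although nothing here confirms it as the actual cause.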

None of this really explains the problem @MattyK14 is seeing though ;). What server are you using @MattyK14? Are the messages getting through (both ways?) until the disconnect?

Edit: The receivePinger is never reset explicitly, but of course it resets itself every time it sends a ping, so in effect it looks like it's just a continuous pinger, which makes the sendPinger redundant.

@jpwsutton - any idea what the intention was here? It doesn't make sense to me.

@clshortfuse

clshortfuse commented May 30, 2017

I'm actually using the react native fork by @rh389 for the same reason @MattyK14 is, namely, for Amazon IoT.

I'm more interested in the actual error code returned by connectionLost. I know that if you try to perform an action not permitted by your Amazon IoT Policy, it will boot you off the MQTT connection. Usually, though, you'll get a specific error code.

Also, are you sure you are randomizing your Client IDs properly? I believe Amazon will also boot an Mqtt session if somebody else connects with the same Client ID. My bet is on that.

Just for reference, this is the configuration I have working (with a few minor code changes):

var clientId = 'myappname-client-' + (Math.floor((Math.random() * 1000000) + 1));
console.log('Connecting to MQTT with client', clientId);
var client = new MqttClient({
  uri: url,
  clientId: clientId,
  storage: storage
});

client.on('connectionLost', (error) => {
  switch (error.errorCode){
    case 0:
      return;
    default:
    case 1:
    case 4:
    case 8:
      console.log('###CONNECTION LOST###', error);
      this.disconnectMqttClient(connectionId).then(() => {
        this.events.emit('mqttClientConnectionLost', connectionId);
      });
  }
});

client.on('messageReceived', (msg) => {
  console.log('message received', msg.destinationName);
  console.log(this.mqttClients[connectionId].callbacks);
  var array = this.mqttClients[connectionId].callbacks[msg.destinationName];
  if (Array.isArray(array)) {
    array.forEach(cb => cb(msg));
  }
});

let options = {
  useSSL: true,
  mqttVersion: 4,
  cleanSession: true,
  keepAliveInterval: 15,
  timeout: 15000
};

console.log('Performing MQTT Connect');
return client.connect(options)
  .then(() => {
    console.log('Connected MQTT');
  });

Edit: As for the pinger changes, I can see a possible flaw: if a client receives a steady stream of incoming packets but never sends anything itself, the server could conclude that the client is no longer there. The client knows the connection is alive, but the server does not.
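
The flaw described above can be made concrete with a tiny simulation. This is a sketch under stated assumptions, not a model of either fork: `maxOutboundGapMs` and its parameters are hypothetical, the client sends nothing except pings, and the only question is how long the server goes without hearing from the client.

```javascript
// Hypothetical simulation: returns the longest stretch (in ms) during which
// the client sent nothing, given a single ping timer.
function maxOutboundGapMs({ keepAliveMs, durationMs, inboundEveryMs, resetOnInbound }) {
  const stepMs = 100;
  let lastOutbound = 0;   // the initial connect counts as outbound at t = 0
  let lastActivity = 0;   // what the ping timer measures silence against
  let maxGap = 0;
  for (let t = stepMs; t <= durationMs; t += stepMs) {
    // Inbound traffic arrives at a fixed rate; optionally it resets the timer.
    if (resetOnInbound && t % inboundEveryMs === 0) lastActivity = t;
    if (t - lastActivity >= keepAliveMs) {
      // Timer expired: send a ping, which is outbound traffic.
      maxGap = Math.max(maxGap, t - lastOutbound);
      lastOutbound = t;
      lastActivity = t;
    }
  }
  return Math.max(maxGap, durationMs - lastOutbound);
}

const base = { keepAliveMs: 15000, durationMs: 60000, inboundEveryMs: 1000 };
const gapIfInboundResets = maxOutboundGapMs({ ...base, resetOnInbound: true });
const gapIfOutboundOnly = maxOutboundGapMs({ ...base, resetOnInbound: false });
console.log(gapIfInboundResets, gapIfOutboundOnly); // 60000 15000
```

With a 15 second keep-alive, a server applying the 1.5x rule would tolerate 22500 ms of silence. When inbound packets reset the timer, the client never pings and stays silent for the full 60 seconds; when only outbound traffic resets it, the gap stays at 15 seconds.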

@MattyK14
Author

MattyK14 commented May 30, 2017

@clshortfuse I'm currently generating a random UUID for the Client ID. It times out at 2x the keepAliveInterval. When I get the timeout error I make it reconnect, and after a few more timeouts I get Error: AMQJS0007E Socket error: Unknown socket error. It seems to be client side and not caused by IoT. Messages are successfully being published to topics and received using the AWS console.

@rh389 Yes messages are successfully going both ways until the disconnect!

I will try the fork hopefully later this week. Thanks for the input guys.

@MattyK14
Author

After using @clshortfuse's fork I'm not getting timeouts anymore, but AWS IoT is closing the socket after 1.5x the keepAliveInterval, as described in the documentation, whenever no publish is made within that window.

I haven't had a chance to pick through the source code, but I guess it's not sending the ping messages?

@robhogan

robhogan commented Sep 1, 2017

This can be closed, I'm pretty sure it was just a RN fork issue - it's covered by robhogan#4 and now fixed.


4 participants