MQTT infinite reconnection #78

@cupostv

Confirm by changing [ ] to [x] below:

Known Issue

  • I'm using ATS data type endpoint: the endpoint should look like <prefix>-ats.iot.<region>.amazonaws.com

Platform/OS/Hardware/Device
What are you running the SDK on? Primarily a Raspberry Pi. For testing purposes, Windows 10.

Describe the question
I am creating an MQTT connection like this:

    const client_bootstrap = new io.ClientBootstrap();

    let config_builder = iot.AwsIotMqttConnectionConfigBuilder.new_mtls_builder_from_path(config.aws.cert_path, config.aws.key_path);

    config_builder.with_certificate_authority_from_path(undefined, config.aws.root_ca_path);

    config_builder.with_clean_session(false);
    config_builder.with_client_id(config.aws.thing_name);
    config_builder.with_endpoint(config.aws.iot_endpoint);

    const mqttConfig = config_builder.build();

    const client = new mqtt.MqttClient(client_bootstrap);
    connection = client.new_connection(mqttConfig);

    shadow = new iotshadow.IotShadowClient(connection);

    thingName = config.aws.thing_name;

    return connection.connect();

Everything works perfectly while there is a connection. If I disconnect from the internet, after some time (20-30 seconds) I get an interrupt event with the following error:

CrtError: aws-c-io: AWS_IO_SOCKET_CLOSED, socket is closed.
    at MqttClientConnection._on_connection_interrupted (/home/mstupar/device/node_modules/aws-crt/dist/native/mqtt.js:334:32)
    at /home/mstupar/device/node_modules/aws-crt/dist/native/mqtt.js:114:113 {
  error: 1051,
  error_code: 1051,
  error_name: 'AWS_IO_SOCKET_CLOSED'
}
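For reference, here is a minimal sketch of how the interruption above can be observed from application code. It assumes the connection object behaves like a Node.js EventEmitter that emits `interrupt` with an error and `resume` with a return code and a session-present flag; the `resume` event name and its arguments are an assumption, as are the hypothetical helper `watchConnection` and its `state` object. A plain `EventEmitter` stands in for the real connection so the snippet runs without the SDK or a network.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper (not part of the SDK): records connection state
// from 'interrupt' and 'resume' events on an EventEmitter-like object.
function watchConnection(connection, log = console.log) {
    const state = { online: true, interruptions: 0, lastError: null };

    // Assumed to fire when the socket is lost, e.g. AWS_IO_SOCKET_CLOSED.
    connection.on('interrupt', (error) => {
        state.online = false;
        state.interruptions += 1;
        state.lastError = error;
        log(`connection interrupted: ${error}`);
    });

    // Assumed to fire once the connection has been re-established.
    connection.on('resume', (returnCode, sessionPresent) => {
        state.online = true;
        log(`connection resumed: return_code=${returnCode} session_present=${sessionPresent}`);
    });

    return state;
}

// Stand-in for the real connection, used only to exercise the handlers.
const fakeConnection = new EventEmitter();
const state = watchConnection(fakeConnection);
fakeConnection.emit('interrupt', new Error('AWS_IO_SOCKET_CLOSED'));
fakeConnection.emit('resume', 0, true);
```

In the real setup the `connection` returned by `client.new_connection(mqttConfig)` would be passed in place of `fakeConnection`, so that other modules in the same process could check `state.online` before publishing.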

If I reconnect to the internet after this interrupt, I get an unhandled exception with the following stack trace:

################################################################################
Resolved stacktrace:
################################################################################
0x00007fc3193b781b: ?? ??:0
0x00000000000b1383: s_print_stack_trace at module.c:?
0x00000000000128a0: __restore_rt at ??:?
0x00007fc31c4bef47: ?? ??:0
0x00007fc31c4c08b1: ?? ??:0
node() [0x95c589]
node(napi_acquire_threadsafe_function+0x27) [0x9cf1e7]
0x00007fc3192f1931: ?? ??:0
0x00000000000b4306: s_on_connection_resumed at mqtt_client_connection.c:?
0x00000000000de34c: s_packet_handler_connack at client_channel_handler.c:?
0x00000000000dee9e: s_process_read_message at client_channel_handler.c:?
0x000000000011269d: s_s2n_handler_process_read_message at s2n_tls_channel_handler.c:?
0x0000000000113bbe: s_do_read at socket_channel_handler.c:?
0x00000000001141f2: s_on_readable_notification at socket_channel_handler.c:?
0x0000000000110a5a: s_on_socket_io_event at socket.c:?
0x000000000010bb9a: s_main_loop at epoll_event_loop.c:?
0x0000000000177c1b: thread_fn at thread.c:?
0x00000000000076db: start_thread at ??:?
0x00007fc31c5a1a3f: ?? ??:0
################################################################################
Raw stacktrace:
################################################################################
/home/mstupar/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(aws_backtrace_print+0x4b) [0x7fc3193b781b]
/home/mstupar/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0xb1383) [0x7fc3192f1383]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x128a0) [0x7fc31c8928a0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0xc7) [0x7fc31c4bef47]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x141) [0x7fc31c4c08b1]
node() [0x95c589]
node(napi_acquire_threadsafe_function+0x27) [0x9cf1e7]
/home/mstupar/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(aws_napi_queue_threadsafe_function+0x11) [0x7fc3192f1931]
/home/mstupar/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0xb4306) [0x7fc3192f4306]
/home/mstupar/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0xde34c) [0x7fc31931e34c]
/home/mstupar/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0xdee9e) [0x7fc31931ee9e]
/home/mstupar/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0x11269d) [0x7fc31935269d]
/home/mstupar/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0x113bbe) [0x7fc319353bbe]
/home/mstupar/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0x1141f2) [0x7fc3193541f2]
/home/mstupar/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0x110a5a) [0x7fc319350a5a]
/home/mstupar/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0x10bb9a) [0x7fc31934bb9a]
/home/mstupar/device/node_modules/aws-crt/dist/bin/linux-x64/aws-crt-nodejs.node(+0x177c1b) [0x7fc3193b7c1b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db) [0x7fc31c8876db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f) [0x7fc31c5a1a3f]

Is it possible to avoid AWS_IO_SOCKET_CLOSED with additional configuration?
How should I properly reconnect? Is there an automatic way, like socket.io has?
The device should be able to run for at least one day without a restart, and multiple connection losses may occur during that day. Restarting the process with systemd is not a good solution in this case, because I am running other modules in the same process that should not be affected by a connection loss.

Is any additional configuration needed for MqttClientConnection?
Should I call some function from the interrupt handler to resume the connection manually?

Thanks in advance,
Mladen Stupar

Labels

guidance: Question that needs advice or information.
