Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rascal doesn't reconnect when connection with the broker is dropped #232

Closed
Zoddo opened this issue Apr 10, 2024 · 6 comments
Closed

Rascal doesn't reconnect when connection with the broker is dropped #232

Zoddo opened this issue Apr 10, 2024 · 6 comments

Comments

@Zoddo
Copy link

Zoddo commented Apr 10, 2024

Description

When the connection with the broker is closed unexpectedly (for example, when the broker is restarted), Rascal never attempts to reconnect to the broker.

Expected Behavior

Rascal try to reconnect to the broker.

Actual Behavior

Rascal log a connection error in debug mode, but never reconnect.

Possible Fix

Steps to Reproduce

  1. Initialize a Rascal instance like below:
const broker = await Rascal.createBrokerAsPromised(
  Rascal.withDefaultConfig({
    vhosts: {
      '/': {
        connection: {
          url: process.env.AMQP_URI,
          socketOptions: {
            clientProperties: { connection_name: hostname(), service: 'Worker' },
          },
        },
        queues: {
          'test.queue': {
            options: {
              durable: false,
              messageTtl: 60000,
              maxPriority: 1,
            },
          },
        },
      },
    },
  }),
);

const subscription = await this.broker.subscribe('/test.queue', {
  contentType: 'application/json',
});

subscription
  .on('message', async (_message, _content, ackOrNack) => {
    ackOrNack();
  })
  .on('invalid_content', (err, _message, ackOrNack) => {
    ackOrNack(err);
  });
  1. Start the application with DEBUG='rascal:*'
  2. Once connected, restart the broker.
  3. The following lines are logged, but Rascal never attempts to reconnect:
rascal:SubscriberSession Removing channel: 7d7e11bf-bcda-411e-98ac-84e112ce8bd2 from session +11s
rascal:Vhost Handling connection error: Connection closed: 320 (CONNECTION-FORCED) with message "CONNECTION_FORCED - broker forced connection closure with reason 'shutdown'" initially from connection: fdf8dae3-d74e-49c0-831a-615bb402f4e5, amqp://127.0.0.1:5672?heartbeat=10&connection_timeout=10000&channelMax=100 +11s

Context

I also tried to explicitly define the retry option on the connection, without success.

Your Environment

  • Version used: 18.0.0
  • Environment name and version (e.g. Chrome 39, node.js 5.4): Node 20.12.0
@Zoddo Zoddo added the bug label Apr 10, 2024
@cressie176 cressie176 removed the bug label Apr 10, 2024
@cressie176
Copy link
Collaborator

Hi @Zoddo. Have you added error handlers as described in the very important section about error handling?

@cressie176
Copy link
Collaborator

I added the following code to your example, and the app restarts the connection as expected

  broker.on('error', (err) => {
    console.error({ err })
  });

However I'm confused as to why the application doesn't crash without this. The error event documentation still say

If an EventEmitter does not have at least one listener registered for the 'error' event, and an 'error' event is emitted, the error is thrown, a stack trace is printed, and the Node.js process exits.

I'll continue to investigate

@Zoddo
Copy link
Author

Zoddo commented Apr 11, 2024

Oh, I have an uncaughtException handler, that's why my app didn't crash (and why I didn't explicitly register an error listener).

I believe you check if an error listener is attached to the broker instance and don't automatically reconnect if none are attached?

I confirm explicitly attaching an error listener to the broker instance fix the issue.

@cressie176
Copy link
Collaborator

cressie176 commented Apr 11, 2024

Hi @Zoddo,

Rascal doesn't check whether an error listener is attached. The code simply emits an error then schedules the retry. I don't think that uncaughtException handlers catch error events, so it should have crashed your application.

@cressie176 cressie176 changed the title BUG: Rascal doesn't reconnect when connection with the broker is dropped Rascal doesn't reconnect when connection with the broker is dropped Apr 11, 2024
@cressie176
Copy link
Collaborator

I worked out what's going on.

amqplib is catching the error in what it calls it's main accept loop and [re-emitting]((https://github.com/amqp-node/amqplib/blob/bbe579e467866a40ff1c0ae2428c416111340364/lib/connection.js#L429) it as a frame error. This is handled by the socket error code, which expects a socket close due to the connection having been dropped, and does nothing.

I'll change Rascal to emit the error event (and schedule the reconnection) from within a setImmediate. This will cause the error bypasses amqplib's main accept loop and bubble up to the process as expected.

I've tested locally and it is caught by the uncaughtException handler

cressie176 added a commit that referenced this issue Apr 12, 2024
@cressie176
Copy link
Collaborator

Hi @Zoddo

Rascal@19.0.0 emits the error events via setImmediate. I issued a major release in case there are unintended side effects, but I can't see a reason why there would be. You do need to add the explicit error handlers, when I tested using a global uncaughtException handler, it caught the errors but prevented the reconnection / resubscription code from running.

In making the change, I spotted that I had been handling channel close and connection error events incorrectly - the code was shared with the channel close and connection error, and expected an error event. Consequently it threw an exception that was being swallowed by the aforementioned main accept loop. In practice, I'm not sure this mattered though, as connection close and channel close events are only emitted without errors during a client initiated shutdown.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants