Create a listener connection supervisor so we can recover in case of database connection failure #74

Merged
merged 8 commits into master from listener-supervisor on Jun 17, 2021

Conversation

@diogob (Owner) commented Apr 21, 2021

For more information on what this feature was designed to address, see issue #71.

Goal

The user should be able to configure a supervisor thread for the listener connection. When the supervisor detects that the listener connection is down, it will open a new listener connection and reconnect the multiplexer.

Configuration

We can use a variable such as PGWS_CHECK_LISTENER_INTERVAL, where the user configures the number of milliseconds between checks, and keep the current system behaviour by setting it to 0. The default would be 0, so the feature is backwards compatible with the current behaviour.

The configured interval is roughly the maximum amount of time during which messages can be lost due to the lack of a listener.
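As a minimal sketch of how this setting could be read (the helper name and parsing are assumptions for illustration, not the project's actual code), following the 0-disables default proposed above:

    import System.Environment (lookupEnv)
    import Text.Read (readMaybe)
    import Data.Maybe (fromMaybe)

    -- Read PGWS_CHECK_LISTENER_INTERVAL in milliseconds.
    -- 0 (also the default when the variable is unset or unparsable)
    -- keeps the current behaviour, i.e. no supervisor checks.
    checkListenerIntervalMs :: IO Int
    checkListenerIntervalMs =
      fromMaybe 0 . (>>= readMaybe) <$> lookupEnv "PGWS_CHECK_LISTENER_INTERVAL"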

Implementation details

We will have to ensure that the producer thread for the Multiplexer is dead, to avoid leaving any dangling resources. The supervisor and the producer termination behaviour seem to be orthogonal concerns, but to avoid additional complexity the new setting will bypass the existing shut-down-the-whole-server behaviour and simply re-spawn the producer.

In order to cancel the old thread, the supervisor needs access to the thread id of the open producer. Reusing the same openProducer function might also simplify the code. One way to achieve both is to keep the openProducer function and the producer threadId in the Multiplexer, and to add a function such as superviseMultiplexer :: Multiplexer -> IO () which takes a multiplexer and launches the supervisor. We could also bake the supervisor into either newHasqlBroadcasterForChannel or newMultiplexer, but those functions are complex enough as they are.
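A sketch of that shape, with heavy caveats: the record fields (producerThreadId, openProducer, listenerIsAlive, checkIntervalMs) and the liveness probe are invented here for illustration and will not match the real Multiplexer exactly:

    import Control.Concurrent (ThreadId, forkIO, killThread, threadDelay)
    import Control.Concurrent.MVar (MVar, modifyMVar_)
    import Control.Monad (forever, unless, void)

    -- Hypothetical record shape; the real Multiplexer differs.
    data Multiplexer = Multiplexer
      { producerThreadId :: MVar ThreadId -- current producer thread
      , openProducer     :: IO ThreadId   -- spawns a fresh listener/producer
      , listenerIsAlive  :: IO Bool       -- assumed connection probe
      , checkIntervalMs  :: Int           -- PGWS_CHECK_LISTENER_INTERVAL
      }

    -- Launch the supervisor: poll the listener and, when it is gone,
    -- kill the old producer (no dangling resources) before re-spawning.
    superviseMultiplexer :: Multiplexer -> IO ()
    superviseMultiplexer mux =
      void . forkIO . forever $ do
        threadDelay (checkIntervalMs mux * 1000) -- threadDelay is in microseconds
        alive <- listenerIsAlive mux
        unless alive $
          modifyMVar_ (producerThreadId mux) $ \old -> do
            killThread old
            openProducer mux

Keeping the ThreadId in an MVar lets the supervisor swap in the new producer atomically, so at most one producer is alive at a time.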

TODO:

  • Write a good description here of how the new feature is supposed to behave
  • Use that info to adjust the README so we have a clear target

diogob added 5 commits May 27, 2021 12:13
…eed to replace the producer thread. Also that thread id was not in use.
…nnel so we can use that configuration to configure a supervisor in the multiplexer producer thread
…will eventually be represented as JSON, it does not make much sense to keep them as ByteString around.
@diogob force-pushed the listener-supervisor branch from 2ac092c to 1b7f8ac on May 27, 2021 16:13
…ending on the value of PGWS_CHECK_LISTENER_INTERVAL
@diogob (Owner, Author) commented May 27, 2021

Instead of using the value 0 to encode the disabled state, I just used an optional parameter. So when PGWS_CHECK_LISTENER_INTERVAL is not set, the server behaves as in the previous version.
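Continuing the earlier sketch, the optional parameter could gate the supervisor like this (startSupervisor is a hypothetical name, not the released code):

    -- Nothing (variable unset) preserves the previous behaviour:
    -- no supervisor thread is started at all.
    startSupervisor :: Multiplexer -> Maybe Int -> IO ()
    startSupervisor mux mInterval = case mInterval of
      Nothing -> pure ()
      Just ms -> superviseMultiplexer (mux { checkIntervalMs = ms })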

… failure and about new configuration options
@diogob marked this pull request as ready for review on May 27, 2021 22:14
@diogob (Owner, Author) commented May 27, 2021

Good news @W1M0R, I have finally finished the feature 😄
I understand that you are using an external supervisor as a workaround, but the new feature should make the database connection recovery seamless by keeping all websocket connections intact.

I'll just let the code soak for a couple of days, but I should release a new version very soon. If you compile from the branch and test it out, please let me know the results.

@W1M0R commented May 30, 2021

Thanks @diogob!

@diogob merged commit 040cdde into master on Jun 17, 2021
@diogob deleted the listener-supervisor branch on June 17, 2021 01:57