Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async reconnections #400

Merged
merged 7 commits into from
Mar 22, 2024
Merged

Async reconnections #400

merged 7 commits into from
Mar 22, 2024

Conversation

rageshkrishna
Copy link
Contributor

@rageshkrishna rageshkrishna commented Mar 18, 2024

The async client and builder can now attempt to reconnect to the server on connection errors.

Fixes #291

Copy link
Owner

@1c3t3a 1c3t3a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for the PR! Really appreciate you working on that :) I left a few comments, generally this looks good.

socketio/src/asynchronous/client/builder.rs Outdated Show resolved Hide resolved
socketio/src/asynchronous/client/builder.rs Outdated Show resolved Hide resolved
socketio/src/asynchronous/client/builder.rs Show resolved Hide resolved
socketio/src/asynchronous/client/client.rs Outdated Show resolved Hide resolved
socketio/src/asynchronous/client/client.rs Outdated Show resolved Hide resolved
socketio/src/asynchronous/client/client.rs Outdated Show resolved Hide resolved
socketio/src/asynchronous/client/client.rs Outdated Show resolved Hide resolved
@rageshkrishna rageshkrishna force-pushed the feat/async-reconnect branch 3 times, most recently from afb2e91 to fab1dab Compare March 19, 2024 07:58
@rageshkrishna
Copy link
Contributor Author

@1c3t3a I've incorporated all the suggestions. While adding the test I realized I hadn't made Client::socket an Arc. This is necessary because the reconnect thread is working on a clone and we need to be able to share the inner socket instance with all the clones. That's the only big change other than what's in the PR comments.

I made quite a few commits while trying to fix that, but I've squashed everything into just one so it's a clean update. Please take another look and let me know if I've missed something. Thanks!

@rageshkrishna rageshkrishna marked this pull request as ready for review March 19, 2024 08:03
@rageshkrishna
Copy link
Contributor Author

There's one more problem that I really think should be solved with reconnects. The async client today has no way to update the auth field once it's set. This is a problem because, in my case, I'm using short-lived tokens to authenticate the connection. It's very unlikely that this token will be valid when the time comes to reconnect, so we need some way for the user to provide new auth data.

This might be beyond the scope of this PR, though. Maybe I should open another issue for it? Let me know what you think.

Copy link
Owner

@1c3t3a 1c3t3a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

I hadn't made Client::socket an Arc

Yes, that's one necessary step!

auth field

Maybe better to do this separately! :)

socketio/src/asynchronous/client/client.rs Outdated Show resolved Hide resolved
socketio/src/asynchronous/client/client.rs Outdated Show resolved Hide resolved
socketio/src/asynchronous/client/client.rs Outdated Show resolved Hide resolved
@rageshkrishna rageshkrishna force-pushed the feat/async-reconnect branch 2 times, most recently from 167edd4 to 3dcf5ee Compare March 19, 2024 10:16
Copy link
Collaborator

@Colin1860 Colin1860 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @1c3t3a asked me to take a look at this. These are my first thoughts.

socketio/src/asynchronous/client/client.rs Show resolved Hide resolved
socketio/src/asynchronous/client/client.rs Outdated Show resolved Hide resolved
@1c3t3a
Copy link
Owner

1c3t3a commented Mar 21, 2024

Lets run the CI!

@1c3t3a
Copy link
Owner

1c3t3a commented Mar 21, 2024

I fixed the tokio CI issue in main, feel free to rebase :) Then we see the "real" tests of this PR.

The builder can now be configured with the following properties to
control automatic reconnections on network errors or server disconnects:

- reconnect
- reconnect_on_disconnect
- reconnect_delay (min and max)
- max_reconnect_attempts
The async client can now be configured to automatically reconnect on
network errors or server disconnections
The enum replaces the need for multiple `AtomicBool`'s to maintain the
disconnection reason. This makes the code easier to read and more
ergonomic to maintain the state.
@rageshkrishna
Copy link
Contributor Author

Thanks @1c3t3a. Rebased just now. Let's see how it goes! 😄

@rageshkrishna
Copy link
Contributor Author

Well, that was dumb of me. We can't do a blocking_read inside as_stream because it blocks the runtime. I've changed as_stream to be async. I've pushed the update now with all tests passing locally.

I somehow forgot to run the tests after making that change yesterday. Sorry about that!

@@ -854,7 +847,7 @@ mod test {
.await?;

// open packet
let mut socket_stream = socket.as_stream();
let mut socket_stream = socket.as_stream().await;
let _ = socket_stream.next().await.unwrap()?;

println!("Here12");
Copy link
Owner

@1c3t3a 1c3t3a Mar 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just saw this now: Where does that come from? :) (The println)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@rageshkrishna
Copy link
Contributor Author

Is the current failure a flaky test? It's coming from a sync client test, but I'm getting 100% pass locally.

@1c3t3a
Copy link
Owner

1c3t3a commented Mar 22, 2024

Is the current failure a flaky test?

I never saw a flaky test in this repo.. that is weird. How do you execute the servers locally? Do you use the devcontainer?

@1c3t3a
Copy link
Owner

1c3t3a commented Mar 22, 2024

Does the server side test have any state? E.g. you first execute the async test and then it doesn't work anymore?
Can you try executing both tests after each other locally?

@rageshkrishna
Copy link
Contributor Author

I never saw a flaky test in this repo.. that is weird. How do you execute the servers locally? Do you use the devcontainer?

No, I fired up the docker container for the websocket server and ran cargo test like it's written in the ci readme.

@rageshkrishna
Copy link
Contributor Author

Ahh, I see what I was missing now... I wasn't passing --features async to the test. I'm able to repro this failure on my branch. I'll try to get to the bottom of this.

Tests that rely on the reconnect socket server cannot run in parallel
because they ask the server to disconnect and restart as part of the
test. This can cause other tests running in parallel to fail in
mysterious ways.

By using the `serial_test` module and applying it selectively to the
affected tests, we can ensure that they are never executed concurrently.
@rageshkrishna
Copy link
Contributor Author

It was because the new test I added in the async client uses the same crate::test::socket_io_restart_server() that the existing test in the sync client uses. Since these tests execute in parallel, they clobber each other's expectations of when the server will connect/disconnect their connections.

My solution to this is adding a new dev-dependency on the serial_test crate, and then marking both of these reconnect tests with the [serial_test(reconnect)] attribute. reconnect is just a key to identify which tests must be executed serially, so this doesn't affect the other tests in any way.

With this, cargo test --features async is all green. Please take a look and let me know what you think.

@rageshkrishna
Copy link
Contributor Author

Here's the commit that adds it: e2eea62

Copy link

codecov bot commented Mar 22, 2024

Codecov Report

Attention: Patch coverage is 91.09948% with 17 lines in your changes are missing coverage. Please review.

Project coverage is 92.05%. Comparing base (4324d7b) to head (cb7311f).

Files Patch % Lines
socketio/src/asynchronous/client/client.rs 91.61% 13 Missing ⚠️
socketio/src/asynchronous/client/builder.rs 88.57% 4 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #400      +/-   ##
==========================================
+ Coverage   91.97%   92.05%   +0.07%     
==========================================
  Files          36       36              
  Lines        4872     5007     +135     
==========================================
+ Hits         4481     4609     +128     
- Misses        391      398       +7     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Owner

@1c3t3a 1c3t3a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! This was a big one, thanks a lot @rageshkrishna for implementing this, really appreciated :)

@1c3t3a 1c3t3a merged commit 5ae3d4d into 1c3t3a:main Mar 22, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Async ClientBuilder dose not impl reconnect
3 participants