feat(phoenix-channel): fail on missing heartbeat after 5s #4296

Merged
6 commits merged into main from feat/phoenix-channel/introduce-heartbeat-timeout
Mar 25, 2024

Conversation

thomaseizinger
Member

This PR fixes a bug and adds a missing feature to phoenix-channel.

  1. Previously, we used to erroneously reset the heartbeat state on all sorts of empty replies, not just the specific one from the heartbeat.
  2. We only failed on missing heartbeats when it was time to send the next one.

With this PR, we correct the first bug and add a dedicated timeout of 5s for the heartbeat reply.
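The two changes described above can be sketched roughly as follows. This is a minimal illustration, not the actual phoenix-channel code: the `Heartbeat` struct layout, the `MissedHeartbeat` error, and the method names `on_sent`, `on_reply`, and `check_timeout` are all hypothetical. Only the 5s timeout and the idea of matching the reply against the specific heartbeat request come from the description.

```rust
use std::time::{Duration, Instant};

/// How long we wait for the heartbeat reply (5s, per the PR description).
const HEARTBEAT_TIMEOUT: Duration = Duration::from_secs(5);

/// Hypothetical heartbeat tracker; not the real phoenix-channel type.
#[derive(Debug, Default)]
struct Heartbeat {
    /// Request id and send time of the heartbeat we are still waiting on.
    pending: Option<(u64, Instant)>,
}

/// Hypothetical error returned when the reply did not arrive in time.
#[derive(Debug, PartialEq)]
struct MissedHeartbeat;

impl Heartbeat {
    /// Records that a heartbeat with the given request id was just sent.
    fn on_sent(&mut self, id: u64, now: Instant) {
        self.pending = Some((id, now));
    }

    /// Clears the pending state only if `reply_id` belongs to our heartbeat.
    /// Returns `true` if the reply was the heartbeat reply.
    fn on_reply(&mut self, reply_id: u64) -> bool {
        match self.pending {
            Some((id, _)) if id == reply_id => {
                self.pending = None;
                true
            }
            // Any other (empty) reply must not reset the heartbeat state.
            _ => false,
        }
    }

    /// Fails as soon as a pending heartbeat is older than the timeout,
    /// instead of waiting until the next heartbeat is due.
    fn check_timeout(&self, now: Instant) -> Result<(), MissedHeartbeat> {
        match self.pending {
            Some((_, sent_at)) if now.duration_since(sent_at) >= HEARTBEAT_TIMEOUT => {
                Err(MissedHeartbeat)
            }
            _ => Ok(()),
        }
    }
}

fn main() {
    let start = Instant::now();
    let later = start + Duration::from_secs(6);

    let mut heartbeat = Heartbeat::default();
    heartbeat.on_sent(7, start);

    // An unrelated reply (id 3) leaves the heartbeat pending, so it times out.
    assert!(!heartbeat.on_reply(3));
    assert_eq!(heartbeat.check_timeout(later), Err(MissedHeartbeat));

    // The matching reply clears the state and the timeout no longer fires.
    assert!(heartbeat.on_reply(7));
    assert_eq!(heartbeat.check_timeout(later), Ok(()));
}
```

Passing `now` in explicitly keeps the sketch testable without sleeping; a real implementation would more likely drive the timeout from an async timer.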



Terraform Cloud Plan Output

Plan: 9 to add, 8 to change, 9 to destroy.

Terraform Cloud Plan


Performance Test Results

TCP

| Test Name | Received/s | Sent/s | Retransmits |
|---|---|---|---|
| direct-tcp-client2server | 209.4 MiB (-7%) | 212.1 MiB (-6%) | 270 (-6%) |
| direct-tcp-server2client | 230.7 MiB (-1%) | 232.0 MiB (-1%) | 391 (+27%) |
| relayed-tcp-client2server | 134.8 MiB (-10%) | 135.5 MiB (-10%) | 163 (+3%) |
| relayed-tcp-server2client | 150.7 MiB (-3%) | 151.1 MiB (-3%) | 206 (+34%) |

UDP

| Test Name | Total/s | Jitter | Lost |
|---|---|---|---|
| direct-udp-client2server | 50.0 MiB (-0%) | 0.29ms (+6%) | 0.00% (NaN%) |
| direct-udp-server2client | 50.0 MiB (+0%) | 0.01ms (+19%) | 0.00% (NaN%) |
| relayed-udp-client2server | 50.0 MiB (-0%) | 0.20ms (+124%) | 0.00% (NaN%) |
| relayed-udp-server2client | 50.0 MiB (+0%) | 0.04ms (-19%) | 0.00% (NaN%) |

Member

@jamilbk jamilbk left a comment

The change description LGTM, but I'll defer to @conectado and @ReactorScram for the Rust changes.

@jamilbk jamilbk enabled auto-merge March 25, 2024 20:03
@jamilbk jamilbk disabled auto-merge March 25, 2024 20:03
@ReactorScram
Collaborator

"erroneously reset the heartbeat state on all sorts of empty replies, not just the specific one from the heartbeat."

Meaning we would not send our heartbeat, and the peer would disconnect since we weren't sending anything?

Collaborator

@conectado conectado left a comment

Looks good. One note: phoenix-channel seems to need a heartbeat only if there were no other messages: https://github.com/phoenixframework/phoenix/blob/main/guides/howto/writing_a_channels_client.md?plain=1#L57

interval.set_missed_tick_behavior(MissedTickBehavior::Skip);
let next_id = self
    .next_request_id
    .fetch_add(1, std::sync::atomic::Ordering::SeqCst);
Collaborator

Why use an atomic value here, since you already have `&mut self`?
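For context on the question, here is a hedged sketch of one general situation where an atomic counter is useful even though each method takes `&mut self`: two components each hold a handle to the same counter and draw ids from it independently. The `Channel` and `Heartbeat` layouts below are hypothetical and only illustrate the trade-off; they are not a claim about the actual phoenix-channel design.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical component that draws request ids from a shared counter.
struct Heartbeat {
    next_request_id: Arc<AtomicU64>,
}

impl Heartbeat {
    fn next_id(&mut self) -> u64 {
        // `fetch_add` returns the value before the increment.
        self.next_request_id.fetch_add(1, Ordering::SeqCst)
    }
}

/// Hypothetical owner that also needs ids for its own requests.
struct Channel {
    next_request_id: Arc<AtomicU64>,
    heartbeat: Heartbeat,
}

impl Channel {
    fn new() -> Self {
        let counter = Arc::new(AtomicU64::new(0));
        Self {
            next_request_id: Arc::clone(&counter),
            heartbeat: Heartbeat {
                next_request_id: counter,
            },
        }
    }

    fn next_id(&mut self) -> u64 {
        self.next_request_id.fetch_add(1, Ordering::SeqCst)
    }
}

fn main() {
    let mut channel = Channel::new();

    // Both components draw from one sequence, so ids never collide.
    assert_eq!(channel.next_id(), 0);
    assert_eq!(channel.heartbeat.next_id(), 1);
    assert_eq!(channel.next_id(), 2);
}
```

If only a single struct ever allocates ids, a plain `u64` field incremented through `&mut self` does the same job without atomics.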

@jamilbk jamilbk added this pull request to the merge queue Mar 25, 2024
Merged via the queue into main with commit ecce024 Mar 25, 2024
138 checks passed
@jamilbk jamilbk deleted the feat/phoenix-channel/introduce-heartbeat-timeout branch March 25, 2024 23:23
@thomaseizinger
Member Author

Because I wanted the Heartbeat struct to inde

"erroneously reset the heartbeat state on all sorts of empty replies, not just the specific one from the heartbeat."

Meaning we would not send our heartbeat, and the peer would disconnect since we weren't sending anything?

No, I think the resulting behaviour was that we would not detect missed heartbeats because the state was cleared too early. Not fatal, I think, because any reply basically fulfills the role of the heartbeat, but it is still semantically incorrect.

ReactorScram added a commit that referenced this pull request Mar 27, 2024
This was introduced in #4296 and I'm guessing it shouldn't be there because we
are standardized on `tracing::*` and this goes straight to stderr, can't be filtered out, etc.
github-merge-queue bot pushed a commit that referenced this pull request Mar 27, 2024
This was introduced in #4296 and I'm guessing it shouldn't be there
because we are standardized on `tracing::*` and this goes straight to
stderr, can't be filtered out, etc.
4 participants