fix(emqx_channel): do not log stale sock_close event as error #11975

Merged
zmstone merged 2 commits into emqx:release-53 from 1119-fix-socket-close-race-condition on Nov 21, 2023

Conversation

zmstone
Member

@zmstone zmstone commented Nov 19, 2023

Fixes Issue #11812 and EMQX-11369
In some cases, EMQX may decide to close the socket and mark the connection as 'disconnected', for example when a DISCONNECT packet is received or when writing data to the socket fails. However, by the time EMQX decides to close the socket, the peer might have already closed it, and the tcp_closed event may already be in the process mailbox, causing EMQX to handle the sock_close event in the 'disconnected' state.

This PR removes the error level log.

Summary

馃[deprecated] Generated by Copilot at eff79b3

Refactor and fix bugs in channel modules for MQTT and STOMP gateways. Remove error log for socket close race condition in emqx_channel.erl and update emqx_mqtt_channel.erl and emqx_stomp_channel.erl to use the new channel API.

PR Checklist

Please convert it to a draft if any of the following conditions are not met. Reviewers may skip over until all the items are checked:

  • Added tests for the changes
  • Added property-based tests for code which performs user input validation
  • Changed lines covered in coverage report
  • Change log has been added to changes/(ce|ee)/(feat|perf|fix|breaking)-<PR-id>.en.md files
  • For internal contributor: there is a jira ticket to track this change
  • Created PR to emqx-docs if documentation update is required, or link to a follow-up jira ticket
  • Schema changes are backward compatible

Checklist for CI (.github/workflows) changes

  • If changed package build workflow, pass this action (manual trigger)
  • Change log has been added to changes/ dir for user-facing artifacts update

@zmstone zmstone requested review from lafirest and a team as code owners November 19, 2023 21:23
@zmstone zmstone force-pushed the 1119-fix-socket-close-race-condition branch from eff79b3 to ce84e83 Compare November 19, 2023 21:26
In some cases, EMQX may decide to close the socket and mark the
connection as 'disconnected', for example when a DISCONNECT packet
is received or when writing data to the socket fails.
However, by the time EMQX decides to close the socket, the peer
might have already closed it, and the `tcp_closed` event may
already be in the process mailbox, causing EMQX to handle the
sock_close event in the 'disconnected' state.
@zmstone zmstone force-pushed the 1119-fix-socket-close-race-condition branch from ce84e83 to e73bf71 Compare November 19, 2023 21:27
@@ -1246,8 +1246,10 @@ handle_info(
             {ok, Channel3} -> {ok, ?REPLY_EVENT(disconnected), Channel3};
         Shutdown -> Shutdown
     end;
-handle_info({sock_closed, Reason}, Channel = #channel{conn_state = disconnected}) ->
-    ?SLOG(error, #{msg => "unexpected_sock_close", reason => Reason}),
+handle_info({sock_closed, _Reason}, Channel = #channel{conn_state = disconnected}) ->
Contributor

I remember we also translate MQTT.disconnect to sock_closed somewhere.

This needs a double check.

Member Author

@zmstone zmstone Nov 20, 2023

We do; it's in emqx_connection.erl.
It's one of the causes of this race:

  1. receive DISCONNECT (being processed), and tcp_closed (in the mailbox)
  2. handle DISCONNECT as: a) close socket, b) mark conn_state as disconnected
  3. handle tcp_closed at disconnected state.
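The three steps above can be reproduced with a toy receive loop. This is only a minimal sketch, not EMQX code: the module name `sock_close_race`, the `{incoming, disconnect}` message shape, and the trace tuples are all hypothetical, and the real channel state is reduced to a single `conn_state` atom. It queues both messages before any of them is handled, then shows that the socket-closed event is necessarily processed after the state has already become `disconnected`.

```erlang
-module(sock_close_race).
-export([run/0]).

%% Hypothetical, simplified stand-in for a channel process (NOT EMQX code).
%% Returns a trace of how each queued message was handled.
run() ->
    %% Step 1: both messages are already in the mailbox before either
    %% of them is handled (DISCONNECT first, then the peer's close).
    self() ! {incoming, disconnect},
    self() ! {sock_closed, tcp_closed},
    loop(connected, []).

loop(ConnState, Trace) ->
    receive
        {incoming, disconnect} ->
            %% Step 2: a) close the socket (omitted in this sketch),
            %%         b) mark the connection state as disconnected.
            loop(disconnected, [{handled_disconnect, ConnState} | Trace]);
        {sock_closed, _Reason} when ConnState =:= disconnected ->
            %% Step 3: stale event that arrives after we already
            %% disconnected; ignore it quietly instead of logging an error.
            lists:reverse([{sock_closed, stale_ignored} | Trace]);
        {sock_closed, Reason} ->
            %% Socket closed while still connected: a genuine close.
            lists:reverse([{sock_closed, Reason} | Trace])
    after 0 ->
        %% Nothing relevant left in the mailbox.
        lists:reverse(Trace)
    end.
```

Calling `sock_close_race:run()` returns `[{handled_disconnect, connected}, {sock_closed, stale_ignored}]`: the close event is handled in the `disconnected` state purely because of message ordering, which is why treating it as an error is misleading.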

@zmstone zmstone merged commit fa91bac into emqx:release-53 Nov 21, 2023
156 of 157 checks passed
@zmstone zmstone deleted the 1119-fix-socket-close-race-condition branch November 21, 2023 15:47