fix(emqx_mgmt): catch OOM shutdown exits properly when calling a conn process #12814

@SergeTupchiy SergeTupchiy commented Mar 29, 2024

  1. The exit reason is expected to include the gen_server Location:
    {{shutdown, OOMInfo}, Location}.
  2. Handle timeouts of calls to the client process.
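
For illustration only, the two points above can be sketched as follows. This is not the code merged in this PR: the module name `conn_call_sketch`, the function names, and the returned error tuples are hypothetical. It relies on the documented behaviour that `gen_server:call/3` exits with `{Reason, {gen_server, call, Args}}` when the callee terminates or the call times out, so the shutdown tuple sits one level deeper than a bare exit reason.

```erlang
-module(conn_call_sketch).
-export([safe_call/3, classify_exit/1]).

%% Hypothetical helper: call a connection process and turn the exit
%% reasons discussed in this PR into error tuples instead of crashing.
safe_call(Pid, Req, Timeout) ->
    try
        gen_server:call(Pid, Req, Timeout)
    catch
        exit:Reason ->
            classify_exit(Reason)
    end.

%% The second tuple element is the gen_server call location,
%% i.e. {gen_server, call, [Pid, Req, Timeout]}.
classify_exit({{shutdown, Info}, _Location}) ->
    %% The callee shut down (for example, on OOM) while the call was pending.
    {error, {shutdown, Info}};
classify_exit({timeout, _Location}) ->
    %% No reply arrived within Timeout milliseconds.
    {error, timeout};
classify_exit({noproc, _Location}) ->
    %% The process did not exist when the call was made.
    {error, not_found};
classify_exit(Other) ->
    %% Anything else is unexpected here, so re-raise it.
    exit(Other).
```

Matching only on `{shutdown, Info}` without the surrounding location tuple would miss these exits, which is the case the first point describes.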

Fixes EMQX-12124

Release version: v/e5.6.1

Summary

PR Checklist

Please convert it to a draft if any of the following conditions are not met. Reviewers may skip over until all the items are checked:

  • Added tests for the changes
  • Added property-based tests for code which performs user input validation
  • Changed lines covered in coverage report
  • Change log has been added to changes/(ce|ee)/(feat|perf|fix|breaking)-<PR-id>.en.md files
  • For internal contributor: there is a jira ticket to track this change
  • Created PR to emqx-docs if documentation update is required, or link to a follow-up jira ticket
  • Schema changes are backward compatible

Checklist for CI (.github/workflows) changes

  • If changed package build workflow, pass this action (manual trigger)
  • Change log has been added to changes/ dir for user-facing artifacts update

…nt conn process

The exit reason is expected to include gen_server `Location`:
  `{{shutdown, OOMInfo}, Location}`.
@SergeTupchiy SergeTupchiy requested review from lafirest and a team as code owners March 29, 2024 11:12
@@ -0,0 +1,4 @@
Handle several errors in `/clients/{clientid}/mqueue_messages` and `/clients/{clientid}/inflight_messages` APIs:

- Internal timeout, which means that EMQX failed to get the list of Inflight/Mqueue messages within the default timeout of 5 s. This error may occur when the system is under a heavy load. The API will return 500 `{"code":"INTERNAL_ERROR","message":"timeout"}` response and log additional details.
Member

Could be that the client process is stuck. Maybe even worth logging the current stacktrace from process_info

Contributor Author

Added a client proc stacktrace to the log event.
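
A minimal sketch of fetching such a stacktrace, for illustration only: the module and function names are hypothetical and this is not the logging code added in the PR. It uses `erlang:process_info/2` with the `current_stacktrace` item, which returns `undefined` when the process is no longer alive and only accepts local pids.

```erlang
-module(proc_trace_sketch).
-export([stacktrace/1]).

%% Return the current stacktrace of a local process,
%% or an empty list if the process has already exited.
stacktrace(Pid) when is_pid(Pid) ->
    case erlang:process_info(Pid, current_stacktrace) of
        {current_stacktrace, Trace} -> Trace;
        undefined -> []
    end.
```

The resulting list can be attached to a log event alongside the timeout error to show where a stuck client process is blocked.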

@SergeTupchiy SergeTupchiy force-pushed the EMQX-12124-fix-msgs-api-client-shutdown branch from 898895a to 6cdf876 Compare March 29, 2024 21:03
@SergeTupchiy SergeTupchiy merged commit dd6f65f into emqx:release-56 Apr 1, 2024
166 checks passed
3 participants