Fix rebalance issues #10967

Merged
savonarola merged 3 commits into emqx:release-51 from the 0607-rebalance-fixes branch on Jun 8, 2023


Contributor

@savonarola savonarola commented Jun 7, 2023

Fixes:

We fix error message formatting in the rebalance API: previously, error messages could be displayed as iolist dumps (an internal Erlang structure).
We add a wait_health_check option to the node evacuation CLI and API. It sets a time interval during which the node reports an "unhealthy" status without starting the actual evacuation. This gives a load balancer (if any) time to remove the node from balancing, so that it does not forward (re)connecting clients to the node being evacuated.
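
The timing that wait_health_check introduces can be sketched as a toy model. This is only an illustration in Python, not the Erlang code from this PR: the class `EvacuationModel` and its methods are hypothetical names, and the only behavior taken from the description above is that the node reports unhealthy immediately while eviction is delayed by the configured interval.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvacuationModel:
    """Toy model of the wait_health_check timing; not EMQX's actual code."""

    wait_health_check: float  # seconds to report unhealthy before evicting
    started_at: Optional[float] = None  # None means no evacuation requested

    def start(self, now: float) -> None:
        # Record when the evacuation was requested.
        self.started_at = now

    def is_healthy(self) -> bool:
        # The node reports unhealthy as soon as evacuation is requested.
        return self.started_at is None

    def is_evicting(self, now: float) -> bool:
        # Clients are disconnected only after the grace interval has passed.
        if self.started_at is None:
            return False
        return now - self.started_at >= self.wait_health_check


node = EvacuationModel(wait_health_check=60.0)
node.start(now=0.0)
print(node.is_healthy(), node.is_evicting(10.0))  # False False
print(node.is_healthy(), node.is_evicting(60.0))  # False True
```

During the first 60 seconds in this model, a load balancer polling the health status sees the node as unhealthy and can stop routing to it, while existing connections are left alone.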

emqx/emqx-docs#1925

Summary

🤖 Generated by Copilot at d32ca8b

This pull request enhances the node rebalance and evacuation features of emqx by adding a wait_health_check option to the API and CLI, improving the error message formatting, and adjusting the timeouts and options for better performance and reliability. It also updates the tests and the documentation to reflect the changes.

PR Checklist

Please convert it to a draft if any of the following conditions are not met. Reviewers may skip over until all the items are checked:

  • Added tests for the changes
  • Changed lines covered in coverage report
  • Change log has been added to changes/{ce,ee}/(feat|perf|fix)-<PR-id>.en.md files
  • For internal contributor: there is a jira ticket to track this change
  • If there should be document changes, a PR to emqx-docs.git is sent, or a jira ticket is created to follow up
  • Schema changes are backward compatible

@savonarola savonarola force-pushed the 0607-rebalance-fixes branch 2 times, most recently from d32ca8b to a18e6bd June 7, 2023 17:45
@savonarola savonarola marked this pull request as ready for review June 7, 2023 18:16
@savonarola savonarola requested a review from a team as a code owner June 7, 2023 18:16
Review thread on apps/emqx_node_rebalance/src/emqx_node_rebalance.erl (outdated, resolved)
Review thread on changes/ee/fix-10967.en.md (outdated, resolved)
savonarola and others added 2 commits June 7, 2023 21:37
Co-authored-by: Thales Macedo Garitezi <thalesmg@gmail.com>
@savonarola savonarola merged commit b9f1a70 into emqx:release-51 Jun 8, 2023
110 checks passed
Collaborator

yanzhiemq commented Jun 13, 2023

Bug Fixes

  • Fixed error message formatting in the rebalance API: previously, error messages could be displayed as unclear dumps of internal Erlang structures.

    Added the wait_health_check option to the node evacuation CLI and API. It sets a time interval during which the node reports an "unhealthy" status without starting the actual evacuation. This gives a load balancer (if any) time to remove the node from balancing, so that it does not forward (re)connecting clients to the node being evacuated.
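
To illustrate what a "dump of an internal structure" looks like next to a readable message, here is a rough Python analogue of an Erlang iolist. It is a simplified sketch: the helper `flatten_iolist` and the sample message are hypothetical, and real Erlang iolists have details (such as improper lists) that this model ignores.

```python
from typing import List, Union

# An iolist-like value: bytes, a byte-sized int, or a nested list of these.
IoList = Union[bytes, int, List["IoList"]]


def flatten_iolist(data: IoList) -> bytes:
    """Recursively join a nested iolist-like value into one bytes object."""
    if isinstance(data, bytes):
        return data
    if isinstance(data, int):
        return bytes([data])  # a single byte, e.g. 58 is ":"
    return b"".join(flatten_iolist(part) for part in data)


# Hypothetical error message, built the way iolists are often assembled.
raw = [b"invalid ", [b"conn_evict_rate", 58, 32], [b"must be ", [b"positive"]]]

print(repr(raw))  # the nested structure, hard to read in an API response
print(flatten_iolist(raw).decode("utf-8"))
# invalid conn_evict_rate: must be positive
```

Printing the nested value directly exposes the structure rather than the text, which is the kind of output the fix replaces with a flattened, readable string.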

Collaborator

yanzhiemq commented Jun 13, 2023

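Fixes

  • Fixed the formatting of error messages in the rebalance API: previously, they could be displayed as unclear dumps of internal Erlang structures.

    Added the wait_health_check option to the node evacuation CLI and API. It is a time interval during which the node reports an "unhealthy" status but does not start the actual evacuation. This option is needed to let a load balancer (if any) remove the evacuated node from balancing and stop forwarding (re)connecting clients to the evacuated node.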


3 participants