[v23.3.x] Distribute health report collection #17360

mmaslankaprv
Member

Backport of PR #17158

@mmaslankaprv mmaslankaprv added this to the v23.3.x-next milestone Mar 25, 2024
@mmaslankaprv mmaslankaprv linked an issue Mar 25, 2024 that may be closed by this pull request
@mmaslankaprv
Member Author

/dt

@mmaslankaprv mmaslankaprv marked this pull request as ready for review March 25, 2024 16:20
@mmaslankaprv
Member Author

/ci-repeat 1

The partition size contained in a node health report does not have to
be equal on all nodes. Change the health monitor test to account for
that fact.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 4e64c0c)
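
As a rough illustration of the test change described above, the sketch below compares two views of the same partition while ignoring the size field. The dict-based report layout and the names `without_sizes`, `reports_agree`, and `size_bytes` are hypothetical and chosen only for this example; they are not Redpanda's actual types or test code.

```python
from copy import deepcopy


def without_sizes(report):
    """Return a copy of a report with per-partition size fields removed.

    The report layout (a dict with a "partitions" list) is an assumption
    made for this sketch only.
    """
    stripped = deepcopy(report)
    for partition in stripped["partitions"]:
        partition.pop("size_bytes", None)
    return stripped


def reports_agree(a, b):
    """Compare two reports on everything except partition sizes."""
    return without_sizes(a) == without_sizes(b)


# Two hypothetical views of the same partition, differing only in size.
view_a = {"partitions": [{"topic": "t", "partition": 0, "size_bytes": 1024}]}
view_b = {"partitions": [{"topic": "t", "partition": 0, "size_bytes": 2048}]}

print(reports_agree(view_a, view_b))
```

A strict equality check on the two views would fail, while the size-tolerant comparison passes, which is the kind of relaxation the commit message describes.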
Change the health monitor to distribute health report collection.
Previously, all the nodes queried the cluster health from the
`redpanda/controller/0` partition leader. This put additional
pressure on that node, as it had to handle serialization of node
reports.

Health report collection now has every node query every other node to
collect its health report statistics. This way the overhead of
serializing and handling health report requests is evenly
distributed among all the nodes in the cluster.

Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 7517e9c)
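
To make the load argument in the commit message above concrete, here is a minimal sketch under a deliberately simplified cost model: the only cost counted is the number of node reports each node serializes per collection round. The function names and the cost model are assumptions for illustration, not a description of Redpanda's implementation.

```python
def centralized_load(nodes, leader):
    """Node reports serialized per node when every node asks the leader.

    Assumed model: each non-leader sends its own report to the leader once,
    and the leader answers every other node with a full cluster report
    (one node report per cluster member).
    """
    load = {node: 0 for node in nodes}
    for node in nodes:
        if node != leader:
            load[node] += 1              # own report sent to the leader
            load[leader] += len(nodes)   # full cluster report sent back
    return load


def distributed_load(nodes):
    """Node reports serialized per node when every node queries every other."""
    load = {node: 0 for node in nodes}
    for requester in nodes:
        for target in nodes:
            if target != requester:
                load[target] += 1        # target serializes only its own report
    return load


nodes = list(range(5))
print(centralized_load(nodes, leader=0))
print(distributed_load(nodes))
```

Under this model the leader's work grows roughly with the square of the cluster size in the centralized scheme, while in the distributed scheme every node does the same amount of work, growing linearly with cluster size.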
Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit df0c94b)
Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 086c032)
Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 977b21b)
Signed-off-by: Michal Maslanka <michal@redpanda.com>
(cherry picked from commit 37b7f24)
@piyushredpanda piyushredpanda modified the milestones: v23.3.x-next, v23.3.10 Mar 26, 2024
@mmaslankaprv mmaslankaprv force-pushed the vbotbuildovich/backport-17158-v23.3.x-47 branch from 6cd9b9e to 855e868 Compare March 26, 2024 16:52
@CLAassistant

CLAassistant commented Mar 26, 2024

CLA assistant check
All committers have signed the CLA.

@mmaslankaprv mmaslankaprv force-pushed the vbotbuildovich/backport-17158-v23.3.x-47 branch from 855e868 to b43abf0 Compare March 27, 2024 06:53
Signed-off-by: Michał Maślanka <michal@redpanda.com>
@mmaslankaprv mmaslankaprv force-pushed the vbotbuildovich/backport-17158-v23.3.x-47 branch from b43abf0 to 6282000 Compare March 27, 2024 07:14
@bharathv
Contributor

Failures: known issues

#14646
#17247

@mmaslankaprv mmaslankaprv merged commit 8bb01d0 into redpanda-data:v23.3.x Mar 27, 2024
13 of 17 checks passed
@mmaslankaprv mmaslankaprv deleted the vbotbuildovich/backport-17158-v23.3.x-47 branch March 27, 2024 18:05
Development

Successfully merging this pull request may close these issues.

[v23.3.x] Distribute health report collection
4 participants