fix: dn doesn't have chance to send a heartbeat to the new leader #2471

Merged
merged 6 commits into from
Sep 26, 2023

Conversation

WenyXu
Member

@WenyXu WenyXu commented Sep 22, 2023

I hereby agree to the terms of the GreptimeDB CLA

What's changed and what's your intention?

If the metasrv leader dies, the datanode has only a small window, Datanode Lease - (<Heartbeat Interval + Timeout> + <Meta Lease> + <Ask Leader Timeout + Connect Timeout>), in which to resend its region stats to the new leader. This window must be greater than 0s. After this PR, the datanode has a window of about 5.5s (13s - 3.5s - 3s - 1s) to resend the heartbeat.
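The window arithmetic can be checked with a small sketch. This is only an illustration of the formula in the description, not code from the PR: the `FailoverBudget` type and its field names are hypothetical, and the durations are the aggregated figures quoted above (13s, 3.5s, 3s, 1s).

```rust
use std::time::Duration;

/// Hypothetical container for the timing terms named in the PR description.
pub struct FailoverBudget {
    pub datanode_lease: Duration,
    pub heartbeat_interval_plus_timeout: Duration,
    pub meta_lease: Duration,
    pub ask_leader_plus_connect_timeout: Duration,
}

impl FailoverBudget {
    /// The figures quoted in the description for the state after this PR.
    pub fn after_pr() -> Self {
        FailoverBudget {
            datanode_lease: Duration::from_secs(13),
            heartbeat_interval_plus_timeout: Duration::from_millis(3500),
            meta_lease: Duration::from_secs(3),
            ask_leader_plus_connect_timeout: Duration::from_secs(1),
        }
    }

    /// Time left for the datanode to resend its heartbeat to the new leader.
    /// Returns `None` when the overheads use up the whole lease, i.e. when
    /// the window is not strictly greater than zero.
    pub fn resend_window(&self) -> Option<Duration> {
        self.datanode_lease
            .checked_sub(self.heartbeat_interval_plus_timeout)?
            .checked_sub(self.meta_lease)?
            .checked_sub(self.ask_leader_plus_connect_timeout)
            .filter(|window| !window.is_zero())
    }
}
```

With the quoted figures, `FailoverBudget::after_pr().resend_window()` yields 5.5s. Returning `Option` makes the "must be greater than 0s" requirement explicit: any combination of settings that leaves no window shows up as `None` rather than as a zero or wrapped duration.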

  1. refactor: set meta leader lease secs to 3s (originally 5s)
  2. fix: correct default heartbeat interval
  3. refactor: ask the metasrv leader in parallel
  4. feat: configure heartbeat client timeout to 500ms
  5. fix: trigger to send heartbeat immediately after fail
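To illustrate why asking in parallel (item 3) shortens failover, here is a minimal sketch using only the standard library. It is not the PR's implementation: `ask_leader_in_parallel` and the `ask` callback are hypothetical names, and the real client presumably uses async RPC rather than OS threads. The idea shown is that every peer is queried concurrently, so one dead or slow peer costs at most one timeout instead of one timeout per peer.

```rust
use std::sync::{mpsc, Arc};
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical helper: query all peers at once and return the first leader
/// address that any of them reports before `timeout` elapses.
/// `ask` stands in for the RPC that asks a single peer who the leader is.
pub fn ask_leader_in_parallel<F>(peers: &[String], timeout: Duration, ask: F) -> Option<String>
where
    F: Fn(&str) -> Option<String> + Send + Sync + 'static,
{
    let ask = Arc::new(ask);
    let (tx, rx) = mpsc::channel();

    for peer in peers {
        let (tx, ask, peer) = (tx.clone(), Arc::clone(&ask), peer.clone());
        thread::spawn(move || {
            // A send error only means the caller has already stopped waiting.
            let _ = tx.send((*ask)(peer.as_str()));
        });
    }
    // Drop the original sender so the channel closes once all workers finish.
    drop(tx);

    let deadline = Instant::now() + timeout;
    loop {
        let remaining = deadline.saturating_duration_since(Instant::now());
        match rx.recv_timeout(remaining) {
            Ok(Some(leader)) => return Some(leader),
            // This peer did not know the leader; keep waiting for the others.
            Ok(None) => continue,
            // Timed out, or every peer answered without naming a leader.
            Err(_) => return None,
        }
    }
}
```

A caller that receives `None` could retry right away rather than wait for the next heartbeat tick, which is the spirit of item 5; the retry policy itself is left out of this sketch.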

Checklist

  • I have written the necessary rustdoc comments.
  • I have added the necessary unit tests and integration tests.

Refer to a related PR or issue link (optional)

@WenyXu WenyXu added the O-chaos Found by chaos tests label Sep 22, 2023
@WenyXu WenyXu changed the title from "fix: fix datanode heartbeat" to "fix dn doesn't have chance to send a heartbeat to the new leader" Sep 22, 2023
@codecov

codecov bot commented Sep 22, 2023

Codecov Report

Merging #2471 (099ca82) into develop (aef9e7b) will decrease coverage by 0.38%.
Report is 1 commits behind head on develop.
The diff coverage is 70.45%.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2471      +/-   ##
===========================================
- Coverage    84.88%   84.50%   -0.38%     
===========================================
  Files          725      725              
  Lines       115186   115212      +26     
===========================================
- Hits         97780    97365     -415     
- Misses       17406    17847     +441     

@WenyXu WenyXu changed the title from "fix dn doesn't have chance to send a heartbeat to the new leader" to "fix: dn doesn't have chance to send a heartbeat to the new leader" Sep 22, 2023
Collaborator

@fengjiachun fengjiachun left a comment

LGTM

@MichaelScofield MichaelScofield added this pull request to the merge queue Sep 26, 2023
Merged via the queue into GreptimeTeam:develop with commit 230a302 Sep 26, 2023
32 of 33 checks passed
paomian pushed a commit to paomian/greptimedb that referenced this pull request Oct 19, 2023
…eptimeTeam#2471)

* refactor: set meta leader lease secs to 3s

* fix: correct default heartbeat interval

* refactor: ask meta leader in parallel

* feat: configure heartbeat client timeout to 500ms

* fix: trigger to send heartbeat immediately after fail

* fix: fix clippy
Labels
O-chaos Found by chaos tests
3 participants