Skip to content

Conversation

@hpatro
Copy link
Contributor

@hpatro hpatro commented Apr 26, 2025

This fail logic was added in #1191, we should also clear the
pfail flag in this case.

Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
@codecov
Copy link

codecov bot commented Apr 26, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 71.00%. Comparing base (0b94ca6) to head (72ce517).
Report is 79 commits behind head on unstable.

Additional details and impacted files
@@             Coverage Diff              @@
##           unstable    #2012      +/-   ##
============================================
- Coverage     71.01%   71.00%   -0.01%     
============================================
  Files           123      123              
  Lines         66033    66115      +82     
============================================
+ Hits          46892    46945      +53     
- Misses        19141    19170      +29     
Files with missing lines Coverage Δ
src/cluster_legacy.c 86.41% <100.00%> (+0.32%) ⬆️

... and 25 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Copy link
Contributor

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

Can this difference be observed by users at all?

@hpatro
Copy link
Contributor Author

hpatro commented Apr 28, 2025

Why?

Can this difference be observed by users at all?

As far as I understand, the cluster node state transition from PFAIL to FAIL and both of these are mutually exclusive and shouldn't exist together. So, I think finding the below response under CLUSTER NODES is incorrect.

> ./valkey-cli cluster nodes | grep fail?
511676acd16696d5b8767aa6c4a8ba54c69b70a0 :0@0 master,fail?,fail,noaddr - 1745372213541 0 1993 disconnected

@sarthakaggarwal97
Copy link
Contributor

looks like we do a similar change here in markNodeAsFailingIfNeeded. It is called in clusterCron and through cluster gossip.

@hpatro hpatro requested a review from madolson April 29, 2025 15:38
Copy link
Contributor

@zuiderkwast zuiderkwast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, makes sense to me.

Copy link
Member

@enjoy-binbin enjoy-binbin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. btw, how do you reproduce it?

@hwware hwware merged commit 689a3b1 into valkey-io:unstable May 27, 2025
51 checks passed
@zuiderkwast
Copy link
Contributor

zuiderkwast commented May 27, 2025

Backport this one? Is it a follow-up of #1191 which is included in 8.1.

chzhoo pushed a commit to chzhoo/valkey that referenced this pull request Jun 12, 2025
This fail logic was added in valkey-io#1191, we should also clear the
pfail flag in this case.

Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Signed-off-by: chzhoo <czawyx@163.com>
shanwan1 pushed a commit to shanwan1/valkey that referenced this pull request Jun 13, 2025
This fail logic was added in valkey-io#1191, we should also clear the
pfail flag in this case.

Signed-off-by: Harkrishn Patro <bunty.hari@gmail.com>
Signed-off-by: shanwan1 <shanwan1@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants