Contributor

@fraidev fraidev commented Apr 8, 2025

No description provided.

@fraidev fraidev force-pushed the store_file_replica_error branch from b136d2e to cdf320d Compare April 8, 2025 23:04
@fraidev fraidev requested review from nacardin and sehz April 8, 2025 23:37
@fraidev fraidev marked this pull request as ready for review April 8, 2025 23:38
Contributor

@nacardin nacardin left a comment

This would help in discovering the initial error from the latest log entry, but we still need to solve the issue of error logs being spammed in a tight loop.
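
One possible way to keep a tight retry loop from flooding the log is to throttle the error message. The following is only a minimal sketch under stated assumptions: `LogThrottle` and `should_log` are hypothetical names, not part of this PR or of the SPU code, and the caller is assumed to pass in the current time and do the actual logging itself.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper (not from this PR): lets at most one log line
/// through per `interval` and counts how many were suppressed in between.
pub struct LogThrottle {
    interval: Duration,
    last_emit: Option<Instant>,
    suppressed: u64,
}

impl LogThrottle {
    pub fn new(interval: Duration) -> Self {
        Self {
            interval,
            last_emit: None,
            suppressed: 0,
        }
    }

    /// Returns `Some(n)` when the caller should log now, where `n` is the
    /// number of messages suppressed since the last emitted one.
    /// Returns `None` when the message should be dropped.
    pub fn should_log(&mut self, now: Instant) -> Option<u64> {
        match self.last_emit {
            // Still inside the quiet window: drop the message and count it.
            Some(last) if now.duration_since(last) < self.interval => {
                self.suppressed += 1;
                None
            }
            // First message, or the window has elapsed: emit and reset.
            _ => {
                self.last_emit = Some(now);
                let skipped = self.suppressed;
                self.suppressed = 0;
                Some(skipped)
            }
        }
    }
}
```

With something like this, the first failure is still logged immediately, and later failures within the interval only bump a counter that can be reported with the next emitted line.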

cleaner: Arc<Cleaner>,
size: Arc<ReplicaSize>,
append_failure: bool, // if this is true, last append failed, should not append again
error_msg: Option<String>,

The error message can be recovered from the log, so there is no need to hold on to it here. Besides, which error would this be?

Same comment as @nacardin. In the SPU, if a certain number of errors occur, the entire replica should be disabled and return offline status.
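
The idea of disabling a replica after a number of errors could look roughly like the sketch below. It is only an illustration under assumptions: `FailureTracker` and `ReplicaStatus` are hypothetical names, not the SPU's real types, and the sketch counts consecutive failures and resets on success, which the comment above does not specify.

```rust
/// Hypothetical status used only for this illustration.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReplicaStatus {
    Online,
    Offline,
}

/// Hypothetical tracker: marks a replica offline once `max_failures`
/// consecutive failures have been recorded (assumed policy).
pub struct FailureTracker {
    max_failures: u32,
    consecutive_failures: u32,
}

impl FailureTracker {
    pub fn new(max_failures: u32) -> Self {
        Self {
            max_failures,
            consecutive_failures: 0,
        }
    }

    /// A successful operation clears the failure streak.
    pub fn record_success(&mut self) {
        self.consecutive_failures = 0;
    }

    /// Records one failure and returns the resulting status.
    pub fn record_failure(&mut self) -> ReplicaStatus {
        self.consecutive_failures = self.consecutive_failures.saturating_add(1);
        self.status()
    }

    /// Offline once the streak reaches the threshold, online otherwise.
    pub fn status(&self) -> ReplicaStatus {
        if self.consecutive_failures >= self.max_failures {
            ReplicaStatus::Offline
        } else {
            ReplicaStatus::Online
        }
    }
}
```

Once the tracker reports `Offline`, the caller would stop retrying instead of looping, which also addresses the log spam concern raised earlier.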

@fraidev fraidev closed this May 10, 2025
Contributor Author

fraidev commented May 10, 2025

Fixed it in #4521 and #4527

