Fix error reporting when R\W path has errors #6102

Merged
nimrod-becker merged 1 commit into noobaa:master on Jul 27, 2020

dannyzaken

Explain the changes

  1. In nodes_monitor, report_error_on_node_blocks:
    • Each error reported from the I/O path caused the node with the error to be disconnected.
    • Each disconnection caused more errors, which in turn caused more disconnections.
    • Changed the function to disconnect at most once every 2 minutes.
    • We may need to consider whether the disconnect is always necessary or whether it should be done selectively.
  2. Some log messages in block_store_client referenced properties of undefined objects. Fixed those messages.
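
For readers who want to see the idea in item 1 in isolation, here is a minimal sketch of a grace-period guard around a disconnect call. It is not the actual NodesMonitor code: the `DisconnectThrottle` class, the `last_disconnect_time` field, and the injected clock are hypothetical names used only for illustration, and tracking the timestamp per node item is an assumption. Only the 2-minute `DISCONNECT_GRACE_PERIOD` value comes from this PR.

```javascript
// Hypothetical sketch of a disconnect grace period; not the real NodesMonitor code.
const DISCONNECT_GRACE_PERIOD = 2 * 60 * 1000; // 2 minutes, the value used in this PR

class DisconnectThrottle {
    constructor(disconnect_fn, now_fn = Date.now) {
        this._disconnect = disconnect_fn; // callback that performs the real disconnect
        this._now = now_fn; // injectable clock, so the logic can be tested without waiting
    }

    // Called for every I/O error reported on a node item.
    // Returns true if a disconnect was issued, false if it was suppressed.
    report_error(item) {
        const now = this._now();
        const last = item.last_disconnect_time; // hypothetical field, undefined before the first disconnect
        if (last !== undefined && now - last < DISCONNECT_GRACE_PERIOD) {
            return false; // still inside the grace period, do not amplify the errors
        }
        item.last_disconnect_time = now;
        this._disconnect(item);
        return true;
    }
}

// Usage with a fake clock: three errors, the second one falls inside the grace period.
let clock = 0;
const disconnected = [];
const throttle = new DisconnectThrottle(item => disconnected.push(item.name), () => clock);
const item = { name: 'node-1' };

const first = throttle.report_error(item); // disconnects
clock += 30 * 1000;
const second = throttle.report_error(item); // suppressed, only 30 seconds passed
clock += 2 * 60 * 1000;
const third = throttle.report_error(item); // disconnects again, grace period has passed
```

With this shape, a burst of I/O errors on one node results in a single disconnect per grace period instead of one disconnect per error.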

Issues: Fixed #xxx / Gap #xxx

  1. Fixes https://bugzilla.redhat.com/show_bug.cgi?id=1847099

Testing Instructions:

Signed-off-by: Danny Zaken <dannyzaken@gmail.com>
@dannyzaken

I created an image based on 5.5 (dannyzaken/noobaa-core:test-s3-fix). Waiting for @MeridianExplorer to run the failing test on it.

@Neon-White

I tried the dannyzaken/noobaa-core:test-s3-fix image and have not been able to reproduce the bug since. It seems this PR fixes the issue properly.

@dannyzaken changed the title from "[WIP] Fix error reporting when R\W path has errors" to "Fix error reporting when R\W path has errors" on Jul 26, 2020
@@ -3494,7 +3495,12 @@ class NodesMonitor extends EventEmitter {
'issues_report', item.node.issues_report,
'block_report', block_report);
// disconnect from the node to force reconnect
this._disconnect_node(item);
// only disconnect if enough time passed since last disconnect to avoid amplification of errors in R\W flows
const DISCONNECT_GRACE_PERIOD = 2 * 60 * 1000; // 2 minutes grace before another disconnect

Maybe add this to config.js?


It’s something very specific to this flow. In these cases I usually prefer to keep the constant close to where it is used.

@nimrod-becker nimrod-becker merged commit c7188fe into noobaa:master Jul 27, 2020

4 participants