HDDS-3117. Recon throws InterruptedException while getting new snapshot from OM. #648

Merged: 8 commits, Mar 11, 2020

avijayanhwx
Contributor

What changes were proposed in this pull request?

On a cluster where an Ozone client is continuously pushing data into OM, the OM delta updates request can time out or fail with an unexpected error. In that case, Recon is expected to fall back to requesting the whole snapshot from OM. The bug was that we interrupted the thread when the delta updates query threw an exception, which caused the sync process to stop. This PR removes the interrupt call.
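To illustrate the failure mode described above, here is a minimal sketch, not the actual Recon code. The `OmSource` interface, its method names, and the `syncOnce` helper are hypothetical stand-ins, and the full snapshot fetch is simulated with `Thread.sleep`, which throws `InterruptedException` when the calling thread's interrupt flag is already set. Under those assumptions, interrupting the thread after a failed delta request makes the fallback snapshot fetch fail, while leaving the flag alone lets the fallback proceed.

```java
import java.io.IOException;

final class DeltaFallbackSketch {

    /** Hypothetical stand-in for the OM calls made by Recon; not the real Ozone API. */
    interface OmSource {
        void getDeltaUpdates() throws Exception;
        void getFullSnapshot() throws Exception;
    }

    /**
     * Tries delta updates first and falls back to a full snapshot on failure.
     * The flag switches between the old behavior (interrupt on failure)
     * and the behavior after the fix (no interrupt).
     */
    static String syncOnce(OmSource om, boolean interruptOnDeltaFailure) {
        try {
            om.getDeltaUpdates();
            return "delta";
        } catch (Exception e) {
            if (interruptOnDeltaFailure) {
                // Old behavior: sets the interrupt flag on the sync thread.
                Thread.currentThread().interrupt();
            }
        }
        try {
            om.getFullSnapshot();
            return "snapshot";
        } catch (InterruptedException e) {
            // The pending interrupt aborts the blocking snapshot fetch.
            return "interrupted";
        } catch (Exception e) {
            return "failed";
        }
    }

    /** A source whose delta request always fails and whose snapshot fetch blocks briefly. */
    static OmSource failingDeltaSource() {
        return new OmSource() {
            @Override
            public void getDeltaUpdates() throws Exception {
                throw new IOException("simulated delta updates timeout");
            }

            @Override
            public void getFullSnapshot() throws Exception {
                // Simulated blocking call; throws if the interrupt flag is set.
                Thread.sleep(1);
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(syncOnce(failingDeltaSource(), true));
        System.out.println(syncOnce(failingDeltaSource(), false));
    }
}
```

In this sketch the only difference between the two calls is the `interrupt()` call, which mirrors the one-line nature of the fix.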

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-3117

How was this patch tested?

Added acceptance tests for Recon's OM-related APIs.
Manually tested by creating 50 million keys on OM, and verified that Recon OM DB sync works as expected.

@avijayanhwx avijayanhwx self-assigned this Mar 6, 2020
@avijayanhwx
Contributor Author

@swagle / @vivekratnavel Please review.

@vivekratnavel
Contributor

Acceptance test failure is related to this change:

[{"fileSize":1024,"count":24},{"fileSize":2048,"count":0},{"fileSize":4096,"count":0},{"fileSize":8192,"count":0},{"fileSize":16384,"count":927},{"fileSize":32768,"count":27},{"fileSize":65536,"count":0},{"fileSize":131072,"count":0},{"fileSize":262144,"count":0},{"fileSize":524288,"count":0},{"fileSize":1048576,"count":0},{"fileSize":2097152,"count":0},{"fileSize":4194304,"count":0},{"fileSize":8388608,"count":6},{"fileSize":16777216,"count":2},{"fileSize":33554432,"count":2},{"fileSize":67108864,"count":0},{"fileSize":134217728,"count":0},{"fileSize":268435456,"count":0},{"fileSize":536870912,"count":0},{"fileSize":1073741824,"count":0},{"fileSize":2147483648,"count":0},{"fileSize":4294967296,"count":0},{"fileSize":8589934592,"count":0},{"fileSize":17179869184,"count":0},{"fileSize":34359738368,"count":0},{"fileSize":68719476736,"count":0},{"fileSize":137438953472,"count":0},{"fileSize":274877906944,"count":0},{"fileSize":549755813888,"count":0},{"fileSize":1099511627776,"count":0},{"fileSize":2199023255552,"count":0},{"fileSize":4398046511104,"count":0},{"fileSize":8796093022208,"count":0},{"fileSize":17592186044416,"count":0},{"fileSize":35184372088832,"count":0},{"fileSize":70368744177664,"count":0},{"fileSize":140737488355328,"count":0},{"fileSize":281474976710656,"count":0},{"fileSize":562949953421312,"count":0},{"fileSize":1125899906842624,"count":0},{"fileSize":9223372036854775807,"count":0}]' does not contain '"fileSize":16384,"count":10'

The file size counts are spread across many buckets because of the previous tests. We might need to change the file size used when running freon.
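As a rough illustration of why the count is shared, the sketch below maps a key size to one of the bucket upper bounds seen in the output above (powers of two from 1024 up to 1125899906842624, then 9223372036854775807). The class and method names are hypothetical, and the boundary rule (a size lands in the smallest bound greater than or equal to it) is an assumption; Recon's real rule at exact powers of two may differ.

```java
final class FileSizeBuckets {

    // Smallest and largest power-of-two bounds that appear in the test output.
    static final long MIN_BOUND = 1024L;          // 2^10
    static final long MAX_POW2_BOUND = 1L << 50;  // 1125899906842624

    /**
     * Returns the upper bound of the bucket that a key of the given size
     * would fall into, assuming inclusive upper bounds (an assumption).
     */
    static long bucketUpperBound(long sizeBytes) {
        if (sizeBytes < 0) {
            throw new IllegalArgumentException("size must be non-negative");
        }
        if (sizeBytes > MAX_POW2_BOUND) {
            // Everything larger goes into the catch-all last bucket.
            return Long.MAX_VALUE;
        }
        long bound = MIN_BOUND;
        while (bound < sizeBytes) {
            bound <<= 1;
        }
        return bound;
    }

    public static void main(String[] args) {
        // Keys of 10 KB and 12 KB share one bucket under this rule.
        System.out.println(bucketUpperBound(10 * 1024));
        System.out.println(bucketUpperBound(12 * 1024));
        // A size that maps to a bucket with count 0 in the output above.
        System.out.println(bucketUpperBound(50_000));
    }
}
```

Under this assumption, any keys written by earlier tests with sizes in the same range inflate the 16384 bucket, so picking a freon key size that maps to a bucket no other test touches would make an exact-count assertion more stable.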

Contributor

@swagle swagle left a comment

+1 LGTM

Contributor

@vivekratnavel vivekratnavel left a comment

LGTM +1

Contributor

@adoroszlai adoroszlai left a comment

Thanks @avijayanhwx for working on this. Can you please check if the acceptance test delay can be improved?

@adoroszlai
Contributor

Thanks @avijayanhwx for the fix, @swagle and @vivekratnavel for the reviews.

@adoroszlai adoroszlai merged commit 0627520 into apache:master Mar 11, 2020
4 participants