RATIS-1375. Handle bad storage dir due to disk failures. #477

Merged
szetszwo merged 1 commit into apache:master from guihecheng:RATIS-1375
May 12, 2021

@guihecheng

What changes were proposed in this pull request?

Handle bad storage directories caused by disk failures.
This also includes a small fix for how entries are removed from the directories list.

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/RATIS-1375

How was this patch tested?

manual tests

@guihecheng
Author

Hello @szetszwo @runzhiwang, I'm new to Ratis. Could you help review this PR when you are free? Thanks~

Contributor

@szetszwo szetszwo left a comment

@guihecheng, thanks a lot for working on this. Some comments inlined.

@guihecheng
Author

@szetszwo updated, thanks~

Contributor

@szetszwo szetszwo left a comment

+1 the change looks good.

Will wait for the Jenkins build results.

@guihecheng
Author

guihecheng commented May 11, 2021

@szetszwo oh, sorry for the style problem; I will fix it.
As for the unit test case, I should handle the exception path properly.
Originally, the test would get an OverlappingFileLockException wrapped in an IOException. Since these do not match 'AccessDeniedException', they were thrown from inside the loop.
Now the second IOException is caught and handled, and a new IOException is thrown outside the loop.
I tried to keep the test logic and still throw the specific exception.
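
For readers following the exception path described above, here is a minimal sketch of the general pattern, not the actual Ratis code. The class `StorageDirSelection`, the interface `DirInitializer`, and the method `chooseDir` are hypothetical names introduced only for illustration. The sketch assumes that a directory whose initialization fails with an IOException is treated as bad and skipped, and that a single IOException is thrown outside the loop once no candidates remain, with the last failure kept as the cause so that a test can still inspect the specific exception.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names and structure are assumptions, not Ratis APIs.
final class StorageDirSelection {
  /** Hypothetical stand-in for per-directory initialization (for example, locking the dir). */
  interface DirInitializer {
    void init(String dir) throws IOException;
  }

  /**
   * Try each candidate directory in turn. A directory that fails with an
   * IOException is removed from the remaining candidates and the loop continues.
   * Only when no candidate is left is an IOException thrown, outside the loop,
   * carrying the last failure as its cause.
   */
  static String chooseDir(List<String> candidates, DirInitializer initializer)
      throws IOException {
    final List<String> remaining = new ArrayList<>(candidates);
    IOException lastFailure = null;
    while (!remaining.isEmpty()) {
      final String dir = remaining.get(0);
      try {
        initializer.init(dir);
        return dir; // first healthy directory wins
      } catch (IOException e) {
        lastFailure = e;       // handled here instead of escaping the loop
        remaining.remove(dir); // List<String>.remove(Object): drops the bad dir
      }
    }
    throw new IOException("No healthy storage dir among " + candidates, lastFailure);
  }
}
```

With this shape, a test that expects an OverlappingFileLockException could walk the cause chain of the final IOException rather than expecting the exception to escape from inside the loop.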

@szetszwo szetszwo merged commit 363dd07 into apache:master May 12, 2021
@guihecheng guihecheng deleted the RATIS-1375 branch May 12, 2021 03:30
alluxio-bot referenced this pull request in Alluxio/alluxio Sep 14, 2021
### What changes are proposed in this pull request?
Fix the job server service hanging when
alluxio.master.journal.folder is set to a path without sufficient privileges.

### Why are the changes needed?
The job server service hangs when
alluxio.master.journal.folder is set to a path without sufficient privileges.
This is due to a bug in Ratis. See this PR:
`https://github.com/apache/ratis/pull/477`.

### Does this PR introduce any user facing changes?
No

pr-link: #13906
change-id: cid-628bfd3a1c912650ae874406d0a3e9a6bdba3be8
symious pushed a commit to symious/ratis that referenced this pull request Feb 20, 2024