HDDS-7279. Snapshot Create requires Double Buffer Flush thread to split the commit batch. #3958
…it the commit batch
Hi @GeorgeJahad, if you are interested, you can also take a look at this draft, as we discussed earlier.
…notify in OzoneManagerDoubleBuffer
…eateSnapshot request as barrier
…dyFutureQueue before cache cleaning
Looks like there are a bunch of failures in the newly added test: https://github.com/apache/ozone/actions/runs/3510796319/jobs/5880950816
Yes, it worked fine on Mac, but it seems to be failing due to some race condition in the workflow. I thought it would be fixed if we disabled the daemon thread and enabled it again after adding responses to the currentBuffer, but it is still failing. I'm going to extract the flushing logic and test it independently.
smengcl left a comment
Thanks @hemanthboyina for the patch. Overall looks good. I have some minor comments inline.
...e/ozone-manager/src/main/java/org/apache/hadoop/ozone/om/ratis/OzoneManagerDoubleBuffer.java
// New queue gets created in three conditions:
// 1. It is first element in the response,
// 2. Current request is createSnapshot request.
// 3. Previous request was createSnapshot request.
if (response.isEmpty() ||
    omResponse.getCreateSnapshotResponse() != null ||
    (previousOmResponse != null &&
        previousOmResponse.getCreateSnapshotResponse() != null)) {
Yes, I think it is because CreateSnapshotResponse is an optional attribute.
prashantpogde left a comment
Thanks for the patch @hemantk-12. LGTM
Thanks @hemantk-12 for the patch, @GeorgeJahad and @prashantpogde for reviewing this.

What changes were proposed in this pull request?
The OmRequest double buffer flush thread flushes the entire buffer as a single batch. Since follower OMs will flush batches with different contents, snapshots can't stay consistent between the leader and the followers.
This change makes the double buffer flush snapshot-aware: the batch is split at each snapshot create request, so that all transactions queued before the request are flushed before the snapshot is created.
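The splitting described above can be sketched in isolation. The following is a minimal, self-contained illustration and not the code from this patch: `BarrierSplitSketch`, `Entry`, and `splitAtSnapshot` are hypothetical names, and a plain boolean flag stands in for the check on whether a response is a snapshot create response. It applies the same three conditions quoted in the review thread: start a new batch for the first entry, for a snapshot create entry, and for the entry that follows one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BarrierSplitSketch {

  /** Hypothetical stand-in for one buffered response. */
  public static final class Entry {
    public final String name;
    public final boolean createSnapshot;

    public Entry(String name, boolean createSnapshot) {
      this.name = name;
      this.createSnapshot = createSnapshot;
    }
  }

  /**
   * Splits the ready buffer into batches so that every snapshot create
   * entry sits alone in its own batch, preserving the original order.
   */
  public static List<List<Entry>> splitAtSnapshot(List<Entry> readyBuffer) {
    List<List<Entry>> batches = new ArrayList<>();
    Entry previous = null;
    for (Entry entry : readyBuffer) {
      // A new batch starts when this is the first entry, when the current
      // entry is a snapshot create, or when the previous entry was one.
      if (batches.isEmpty()
          || entry.createSnapshot
          || (previous != null && previous.createSnapshot)) {
        batches.add(new ArrayList<>());
      }
      batches.get(batches.size() - 1).add(entry);
      previous = entry;
    }
    return batches;
  }

  public static void main(String[] args) {
    List<Entry> buffer = Arrays.asList(
        new Entry("createKey1", false),
        new Entry("createKey2", false),
        new Entry("createSnapshot", true),
        new Entry("createKey3", false));
    for (List<Entry> batch : splitAtSnapshot(buffer)) {
      StringBuilder line = new StringBuilder();
      for (Entry e : batch) {
        line.append(e.name).append(' ');
      }
      System.out.println(line.toString().trim());
    }
  }
}
```

With the four entries in `main`, the sketch produces three batches: the two key writes together, the snapshot create on its own, and the trailing key write on its own. Each batch would then be committed separately, which is what lets the leader and the followers take the snapshot at the same point in the transaction order.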
What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-7279
How was this patch tested?