Skip to content

Conversation

@adoroszlai
Copy link
Contributor

@adoroszlai adoroszlai commented May 5, 2023

What changes were proposed in this pull request?

Upgrade Ratis to 2.5.1.

Fix for compilation error in tests due to interface change was provided by @szetszwo. When merging this, please keep the following detail in the commit message:

Co-authored-by: Tsz-Wo Nicholas Sze <szetszwo@apache.org>

https://issues.apache.org/jira/browse/HDDS-8382

How was this patch tested?

CI:
https://github.com/adoroszlai/hadoop-ozone/actions/runs/4891132333

@adoroszlai adoroszlai self-assigned this May 5, 2023
@adoroszlai adoroszlai added the dependencies Pull requests that update a dependency file label May 5, 2023
@adoroszlai adoroszlai marked this pull request as draft May 6, 2023 07:26
@adoroszlai
Copy link
Contributor Author

adoroszlai commented May 6, 2023

TestDeleteWithSlowFollower is flaky (HDDS-8430), but it seems to be failing more frequently with new Ratis (passed only 13 / 100 runs).

@adoroszlai
Copy link
Contributor Author

With Ratis 2.5.1, block is deleted contrary to test expectations. This results in two kinds of failures, depending on timing:

Assert.assertTrue(
containerData.getNumPendingDeletionBlocks() > numPendingDeletionBlocks);

// make sure the chunk was never deleted on the leader even though
// deleteBlock handler is invoked
try {
for (ContainerProtos.ChunkInfo chunkInfo : blockData.getChunks()) {
keyValueHandler.getChunkManager()
.readChunk(container, blockID, ChunkInfo.getFromProtoBuf(chunkInfo),
null);
}
} catch (IOException ioe) {
Assert.fail("Exception should not be thrown.");
}

When block deletion reaches the following condition:

if (minReplicatedIndex >= 0 && minReplicatedIndex < containerBCSID) {
LOG.warn("Close Container log Index {} is not replicated across all"
+ " the servers in the pipeline {} as the min replicated "
+ "index is {}. Deletion is not allowed in this container "
+ "yet.", containerBCSID,
containerData.getOriginPipelineId(), minReplicatedIndex);
return false;
} else {
return true;
}

with Ratis 2.5.1 minReplicatedIndex is always -1 (I guess due to RATIS-1767, but could be wrong), so deletion is allowed.

I think the minReplicatedIndex >= 0 condition is wrong. If minReplicatedIndex == -1, then we should not allow deleting the blocks, so I removed that condition. Test passes locally. Pushed new commit, waiting for CI in fork.

@adoroszlai adoroszlai marked this pull request as ready for review May 6, 2023 20:43
@adoroszlai adoroszlai requested review from duongkame and smengcl May 8, 2023 17:06
Comment on lines 220 to -223
long minReplicatedIndex =
ratisServer.getMinReplicatedIndex(pipelineID);
long containerBCSID = containerData.getBlockCommitSequenceId();
if (minReplicatedIndex >= 0 && minReplicatedIndex < containerBCSID) {
Copy link
Contributor

@smengcl smengcl May 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this intended? minReplicatedIndex could be -1, thus getting rid of the minReplicatedIndex >= 0 condition here changes the logic slightly. oh I should've read the comment above first: #4664 (comment)

cc @szetszwo

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it is correct to remove minReplicatedIndex >= 0.

Copy link
Contributor

@smengcl smengcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. but I'd prefer another eye on this.

@adoroszlai adoroszlai requested a review from szetszwo May 9, 2023 06:39
Copy link
Contributor

@szetszwo szetszwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 the change looks good.

Comment on lines 220 to -223
long minReplicatedIndex =
ratisServer.getMinReplicatedIndex(pipelineID);
long containerBCSID = containerData.getBlockCommitSequenceId();
if (minReplicatedIndex >= 0 && minReplicatedIndex < containerBCSID) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it is correct to remove minReplicatedIndex >= 0.

@szetszwo szetszwo merged commit 5e8b806 into apache:master May 9, 2023
@adoroszlai adoroszlai deleted the HDDS-8382 branch May 9, 2023 16:46
@adoroszlai
Copy link
Contributor Author

Thanks @smengcl for the review, @szetszwo for helping out with the state machine tests, reviewing and merging this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

dependencies Pull requests that update a dependency file

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants