Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-8535. ReplicationManager: Unhealthy containers could block EC recovery in small clusters #4756

Merged
merged 2 commits into from May 25, 2023

Conversation

siddhantsangwan
Copy link
Contributor

What changes were proposed in this pull request?

With EC containers, if there is a small cluster of say 6 nodes with EC-3-2, a container will require 5 nodes. If 2 containers become unhealthy, reconstruction will be required to recover the 2 containers, but there is only 1 spare node.
This means one will get recovered, and we will have 4 "good" containers and 2 UNHEALTHY and the container will remain stuck like this because UNHEALTHY containers are only removed once the container has no over or under replication.
A similar problem was resolved previously where an EC container with both over and under replication can meet the same problem, where under replication cannot proceed due to insufficient spare nodes. In that case, the solution was to check for this case, and call the over-replication handler to clear up the excess replicas.

This PR is still in draft state for some early reviews while I write tests and think about edge cases. Here, we try to delete an UNHEALTHY replica in the same handler to free up a DN. Then we throw the exception so that this container gets queued again in the under replication queue. Perhaps it's better to throw first if over replication handling is invoked, so we don't delete multiple replicas in one go.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8535

How was this patch tested?

Wrote one UT.


// remove replicas that aren't on IN_SERVICE and HEALTHY DNs
// the leftover replicas will be eligible for deletion
Iterator<ContainerReplica> iterator = replicaCount.getReplicas().iterator();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its not clear to me why we are removing the replicas from the iterator - this is the list which is internal to replicaCount, so we are modifying it, as it does not return a copy or an unModifiable list.

We don't see to use the replicaCount.getReplicas() again in the rest of the method - we just form the closed list and use it. Do we the iterator.remove() calls?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, this area has incorrect logic. My goal is to consider only those replicas that are on in service nodes. Will correct this this.

@siddhantsangwan
Copy link
Contributor Author

@sodonnel Thanks for the review. I've addressed the buggy code so that there's no removal using the iterator now. I just add the needed replicas to the sets. Also added another unit test in the latest commit.

@siddhantsangwan siddhantsangwan marked this pull request as ready for review May 25, 2023 13:47
Copy link
Contributor

@sodonnel sodonnel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@sodonnel sodonnel merged commit 6f17f98 into apache:master May 25, 2023
36 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants