HDDS-7092. EC Offline Recovery with simultaneous Over Replication #3667

Closed
wants to merge 4 commits

Conversation

swamirishi
Contributor

…& Under Replication

What changes were proposed in this pull request?

When over-replication and under-replication happen together, it can turn out that all nodes are excluded during under-replication handling. As a result, the under-replication handler fails continuously and over-replication handling never runs.
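
For illustration, here is a minimal, self-contained sketch (hypothetical class and node names, not Ozone code) of how a small cluster can end up with no placement candidates when a container is over- and under-replicated at the same time:

// Hypothetical illustration: on a 5-node cluster, an EC 3-2 container with
// one duplicated index already occupies every node, so excluding all nodes
// that hold a replica leaves no candidates for the missing index.
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class PlacementStarvationExample {
  public static void main(String[] args) {
    List<String> clusterNodes = List.of("dn1", "dn2", "dn3", "dn4", "dn5");
    // Replica layout: index 1 on dn1 and dn5 (over-replicated), index 4
    // missing (under-replicated), indexes 2, 3, 5 on dn2, dn3, dn4.
    Set<String> nodesWithReplicas = Set.of("dn1", "dn2", "dn3", "dn4", "dn5");

    // Placement must exclude nodes that already hold a replica of this
    // container, which in this case is every node in the cluster.
    List<String> candidates = clusterNodes.stream()
        .filter(node -> !nodesWithReplicas.contains(node))
        .collect(Collectors.toList());

    // Prints an empty list: reconstruction of index 4 cannot be scheduled,
    // so the under-replication handler keeps failing, while the extra copy
    // of index 1 is never removed because over-replication handling is not
    // reached for this container.
    System.out.println("Placement candidates: " + candidates);
  }
}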

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7092

How was this patch tested?

Unit Tests, Integration Tests

@swamirishi changed the title to HDDS-7092. EC Offline Recovery with simultaneous Over Replication on Aug 10, 2022
Contributor

@umamaheswararao left a comment

Thanks @swamirishi for working on this JIRA. I have reviewed the patch.

}

// No issues detected, so return healthy.
return new ContainerHealthResult.HealthyResult(container);
+ // If No issues detected, return healthy.
Contributor

Nit typo: No -> no

LOG.info("The container {} state changed and it's not in over"
    + " replication any more. Current state is: {}",
    container.getContainerID(), currentUnderRepRes);
Contributor

Are there tabs here?

@@ -342,6 +342,10 @@ public List<Integer> overReplicatedIndexes(boolean includePendingDelete) {
return indexes;
}

Contributor

Right now you are only using it for its size, so why does it return a set? Are there other plans to use it? Would just a healthy index count work?
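
For reference, a rough sketch of the alternative hinted at here, using illustrative names rather than the actual method in the patch: returning only a count keeps the API smaller if callers never need the individual indexes.

import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class ReplicaIndexStatsSketch {
  // Hypothetical data: replica index -> number of healthy copies.
  private final Map<Integer, Integer> healthyCopiesPerIndex;

  ReplicaIndexStatsSketch(Map<Integer, Integer> healthyCopiesPerIndex) {
    this.healthyCopiesPerIndex = healthyCopiesPerIndex;
  }

  // Shape being questioned: return the whole set of healthy indexes.
  Set<Integer> healthyIndexes() {
    return healthyCopiesPerIndex.entrySet().stream()
        .filter(e -> e.getValue() > 0)
        .map(Map.Entry::getKey)
        .collect(Collectors.toSet());
  }

  // Shape suggested in the review: return only the count, if that is all
  // the caller needs.
  long healthyIndexCount() {
    return healthyCopiesPerIndex.values().stream()
        .filter(copies -> copies > 0)
        .count();
  }
}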

@sodonnel
Contributor

I'm not sure about this change. I feel it adds quite a bit of complexity to the model to handle what is really an edge case for a small cluster. I'd like a container to be handled for only a single state at a time as we could get into a position where we try reconstruction only to have a replica we are reading from removed by over-replication handling at the same time. It also makes things easier to think about, rather than the container having many states.

A few thoughts come to mind:

  1. What if we return over-replication health first, before under-replication? That way we can fix this with a small change, by just changing the order in the ECContainerHealthCheck (a rough sketch of this ordering follows this list). The negative is that we have to wait longer to fix an under-replication, but the change here would be small and the general flow stays the same.

  2. Or, inside the under-replication handler, we have access to the methods in ECContainerReplicaCount which can tell us if there are over-replicated indexes. If, and only if, we get a failure in the UnderRep handler due to not finding enough nodes, we can somehow switch to over-replication handling and remove the excess replicas. However, that starts to muddle the over- and under-replication logic together, which I was trying to avoid.
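
A minimal sketch of option 1, with hypothetical names (this is not the actual ECContainerHealthCheck code): the first check that reports a problem decides which state gets handled, so listing the over-replication check before the under-replication check gives the ordering described in option 1.

import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class OrderedHealthCheckSketch<C, R> {
  // Each check inspects a container and returns a result only when it
  // detects a problem (empty means healthy from that check's point of view).
  private final List<Function<C, Optional<R>>> checks;

  OrderedHealthCheckSketch(List<Function<C, Optional<R>>> checks) {
    this.checks = checks;
  }

  // Runs the checks in order and returns the first non-healthy result, so
  // whichever check is listed first decides the state for a container that
  // is both over- and under-replicated.
  Optional<R> evaluate(C container) {
    return checks.stream()
        .map(check -> check.apply(container))
        .filter(Optional::isPresent)
        .map(Optional::get)
        .findFirst();
  }
}

The trade-off is the one noted above: with the over-replication check listed first, excess copies would be removed before reconstruction is attempted, so fixing under-replication can take longer.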

I probably need to think about it a bit more.

@umamaheswararao
Contributor

we could get into a position where we try reconstruction only to have a replica we are reading from removed by over-replication handling at the same time.

I am not sure this patch adds much more complexity than what we already have, but the case you pointed out is a good reason to rethink this patch.

I agree this should be a corner case that is only possible on smaller clusters. However, we should still fix it, as some test clusters can get into this situation and we should have a solution for it.
I think delaying under-replication is not a good idea. Or should we check the cluster size against the over-replication, and only when they are at the same size return over-replication first? I know this would introduce one more ugly check though.

@sodonnel
Contributor

I feel we should handle this in the under-replication handler. In normal circumstances the container can be under-replicated and it might be over-replicated too, but if we can create enough new replicas to fix the under-replication, we don't care about the over-replication at this stage.

However, if we find we cannot place enough new replicas, we can check for over-replication. If the container is over-replicated too, we can refactor the code so we can call into the over-replication handler and schedule the delete container commands.

This would avoid removing replicas we may depend on for reconstruction or copy. It also avoids the race condition where we queue both an under- and an over-replication, and the under-replication would keep failing until the over-replication gets processed and completed.

If we handle it in the under-replication handler, the health check code still only returns a single state (healthy, under- or over-replicated). When we integrate the Ratis code, where it is not possible for a container to be both under- and over-replicated, this keeps the code consistent between the two flows, which will be helpful for understanding it in the future.
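
A rough sketch of that fallback flow, with illustrative interfaces and names rather than the real handler and placement classes: the under-replication handler only reaches into over-replication handling when placement cannot find enough target nodes.

import java.util.List;

class UnderReplicationFallbackSketch {

  interface TargetChooser {
    // Returns the chosen target nodes; fewer than 'required' means
    // placement could not find enough nodes outside the excluded set.
    List<String> chooseTargets(int required, List<String> excludedNodes);
  }

  interface OverReplicationHandling {
    void scheduleExcessReplicaDeletes();
  }

  private final TargetChooser placement;
  private final OverReplicationHandling overReplicationHandling;

  UnderReplicationFallbackSketch(TargetChooser placement,
      OverReplicationHandling overReplicationHandling) {
    this.placement = placement;
    this.overReplicationHandling = overReplicationHandling;
  }

  void handleUnderReplication(int missingIndexCount,
      List<String> nodesWithReplicas, boolean alsoOverReplicated) {
    List<String> targets =
        placement.chooseTargets(missingIndexCount, nodesWithReplicas);
    if (targets.size() >= missingIndexCount) {
      // Normal path: enough room to reconstruct, so any over-replication
      // is ignored at this stage.
      scheduleReconstruction(targets);
      return;
    }
    if (alsoOverReplicated) {
      // Fallback path: placement failed and there are excess copies, so
      // delete them first; a later pass can retry the reconstruction once
      // nodes are freed up.
      overReplicationHandling.scheduleExcessReplicaDeletes();
    }
  }

  private void scheduleReconstruction(List<String> targets) {
    // Placeholder: issue the EC reconstruction command for the targets.
  }
}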

What do you think?

@adoroszlai adoroszlai added the EC label Nov 18, 2022
@sodonnel
Contributor

Fixed with an alternative approach in #3984

@sodonnel sodonnel closed this Nov 23, 2022