HDDS-7092. EC Offline Recovery with simultaneous Over Replication #3667

Closed
wants to merge 4 commits

Conversation

swamirishi
Contributor

…& Under Replication

What changes were proposed in this pull request?

When over-replication and under-replication happen together, it can turn out that all nodes are excluded during under-replication handling. As a result, the under-replication handler fails continuously and over-replication handling never runs.
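
For illustration, here is a minimal, self-contained sketch (hypothetical class and node names, not Ozone code) of how a small cluster can end up with no placement candidates when a container is over- and under-replicated at the same time:

// Hypothetical illustration: on a 5-node cluster, an EC 3-2 container with
// one duplicated index already occupies every node, so excluding all nodes
// that hold a replica leaves no candidates for the missing index.
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class PlacementStarvationExample {
  public static void main(String[] args) {
    List<String> clusterNodes = List.of("dn1", "dn2", "dn3", "dn4", "dn5");
    // Replica layout: index 1 on dn1 and dn5 (over-replicated), index 4
    // missing (under-replicated), indexes 2, 3, 5 on dn2, dn3, dn4.
    Set<String> nodesWithReplicas = Set.of("dn1", "dn2", "dn3", "dn4", "dn5");

    // Placement must exclude nodes that already hold a replica of this
    // container, which in this case is every node in the cluster.
    List<String> candidates = clusterNodes.stream()
        .filter(node -> !nodesWithReplicas.contains(node))
        .collect(Collectors.toList());

    // Prints an empty list: reconstruction of index 4 cannot be scheduled,
    // so the under-replication handler keeps failing, while the extra copy
    // of index 1 is never removed because over-replication handling is not
    // reached for this container.
    System.out.println("Placement candidates: " + candidates);
  }
}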

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-7092

How was this patch tested?

Unit Tests, Integration Tests

@swamirishi changed the title to HDDS-7092. EC Offline Recovery with simultaneous Over Replication on Aug 10, 2022
Contributor

@umamaheswararao left a comment

Thanks @swamirishi for working on this JIRA. I have reviewed the patch.

}

// No issues detected, so return healthy.
return new ContainerHealthResult.HealthyResult(container);
+ // If No issues detected, return healthy.
Contributor

Nit typo: No -> no

LOG.info("The container {} state changed and it's not in over"
    + " replication any more. Current state is: {}",
    container.getContainerID(), currentUnderRepRes);
Contributor

Are there tabs here?

@@ -342,6 +342,10 @@ public List<Integer> overReplicatedIndexes(boolean includePendingDelete) {
return indexes;
}

Contributor

Right now you are only using it for its size, so why does it return a set? Are there other plans to use it? Would just a healthy index count work?
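
For reference, a rough sketch of the alternative hinted at here, using illustrative names rather than the actual method in the patch: returning only a count keeps the API smaller if callers never need the individual indexes.

import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

class ReplicaIndexStatsSketch {
  // Hypothetical data: replica index -> number of healthy copies.
  private final Map<Integer, Integer> healthyCopiesPerIndex;

  ReplicaIndexStatsSketch(Map<Integer, Integer> healthyCopiesPerIndex) {
    this.healthyCopiesPerIndex = healthyCopiesPerIndex;
  }

  // Shape being questioned: return the whole set of healthy indexes.
  Set<Integer> healthyIndexes() {
    return healthyCopiesPerIndex.entrySet().stream()
        .filter(e -> e.getValue() > 0)
        .map(Map.Entry::getKey)
        .collect(Collectors.toSet());
  }

  // Shape suggested in the review: return only the count, if that is all
  // the caller needs.
  long healthyIndexCount() {
    return healthyCopiesPerIndex.values().stream()
        .filter(copies -> copies > 0)
        .count();
  }
}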

@sodonnel
Contributor

I'm not sure about this change. I feel it adds quite a bit of complexity to the model to handle what is really an edge case for a small cluster. I'd like a container to be handled for only a single state at a time as we could get into a position where we try reconstruction only to have a replica we are reading from removed by over-replication handling at the same time. It also makes things easier to think about, rather than the container having many states.

A few thoughts come to mind:

  1. What if we return over-replication health first, before under-replication? That way we can fix this with a small change, by just changing the order in the ECContainerHealthCheck (a rough sketch of this ordering follows this list). The negative is that we have to wait longer to fix an under-replication, but the change here would be small and the general flow stays the same.

  2. Or, inside the under-replication handler, we have access to the methods in ECContainerReplicaCount which can tell us if there are over-replicated indexes. If, and only if, we get a failure in the UnderRep handler due to not finding enough nodes, we can somehow switch to over-replication handling and remove the excess replicas. However, that starts to muddle the over- and under-replication logic together, which I was trying to avoid.
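
A minimal sketch of option 1, with hypothetical names (this is not the actual ECContainerHealthCheck code): the first check that reports a problem decides which state gets handled, so listing the over-replication check before the under-replication check gives the ordering described in option 1.

import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class OrderedHealthCheckSketch<C, R> {
  // Each check inspects a container and returns a result only when it
  // detects a problem (empty means healthy from that check's point of view).
  private final List<Function<C, Optional<R>>> checks;

  OrderedHealthCheckSketch(List<Function<C, Optional<R>>> checks) {
    this.checks = checks;
  }

  // Runs the checks in order and returns the first non-healthy result, so
  // whichever check is listed first decides the state for a container that
  // is both over- and under-replicated.
  Optional<R> evaluate(C container) {
    return checks.stream()
        .map(check -> check.apply(container))
        .filter(Optional::isPresent)
        .map(Optional::get)
        .findFirst();
  }
}

The trade-off is the one noted above: with the over-replication check listed first, excess copies would be removed before reconstruction is attempted, so fixing under-replication can take longer.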

I probably need to think about it a bit more.

@umamaheswararao
Contributor

we could get into a position where we try reconstruction only to have a replica we are reading from removed by over-replication handling at the same time.

I am not sure this patch adds much more complexity than what we already have, but the case you pointed out is a good reason to rethink this patch.

I agree this should be a corner case that is only possible on smaller clusters. However, we should still fix it, as some test clusters can get into this situation and we should have a solution for it.
I think delaying under-replication is not a good idea. Or should we check the cluster size against the over-replication, and only when they are at the same size return over-replication first? I know this would introduce one more ugly check though.

@sodonnel
Contributor

I feel we should handle this in the under-replication handler. In normal circumstances the container can be under-replicated and it might be over-replicated too, but if we can create enough new replicas to fix the under-replication, we don't care about the over-replication at this stage.

However, if we find we cannot place enough new replicas, we can check for over-replication. If the container is over-replicated too, we can refactor the code so we can call into the over-replication handler and schedule the delete container commands.

This would avoid removing replicas we may depend on for reconstruction or copy. It also avoids the race condition where we queue both an under- and an over-replication, and the under-replication would keep failing until the over-replication gets processed and completed.

If we handle it in the under-replication handler, the health check code still only returns a single state (healthy, under- or over-replicated). When we integrate the Ratis code, where it is not possible for a container to be both under- and over-replicated, this keeps the code consistent between the two flows, which will be helpful for understanding it in the future.
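
A rough sketch of that fallback flow, with illustrative interfaces and names rather than the real handler and placement classes: the under-replication handler only reaches into over-replication handling when placement cannot find enough target nodes.

import java.util.List;

class UnderReplicationFallbackSketch {

  interface TargetChooser {
    // Returns the chosen target nodes; fewer than 'required' means
    // placement could not find enough nodes outside the excluded set.
    List<String> chooseTargets(int required, List<String> excludedNodes);
  }

  interface OverReplicationHandling {
    void scheduleExcessReplicaDeletes();
  }

  private final TargetChooser placement;
  private final OverReplicationHandling overReplicationHandling;

  UnderReplicationFallbackSketch(TargetChooser placement,
      OverReplicationHandling overReplicationHandling) {
    this.placement = placement;
    this.overReplicationHandling = overReplicationHandling;
  }

  void handleUnderReplication(int missingIndexCount,
      List<String> nodesWithReplicas, boolean alsoOverReplicated) {
    List<String> targets =
        placement.chooseTargets(missingIndexCount, nodesWithReplicas);
    if (targets.size() >= missingIndexCount) {
      // Normal path: enough room to reconstruct, so any over-replication
      // is ignored at this stage.
      scheduleReconstruction(targets);
      return;
    }
    if (alsoOverReplicated) {
      // Fallback path: placement failed and there are excess copies, so
      // delete them first; a later pass can retry the reconstruction once
      // nodes are freed up.
      overReplicationHandling.scheduleExcessReplicaDeletes();
    }
  }

  private void scheduleReconstruction(List<String> targets) {
    // Placeholder: issue the EC reconstruction command for the targets.
  }
}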

What do you think?

@adoroszlai adoroszlai added the EC label Nov 18, 2022
@sodonnel
Contributor

Fixed with an alternative approach in #3984

@sodonnel sodonnel closed this Nov 23, 2022