HDDS-5277. Recon shows operational status as "DECOMMISSIONING" for "DECOMMISSIONED" DNs #2286

Merged
merged 3 commits into apache:master on Jun 2, 2021

Conversation

avijayanhwx
Contributor

What changes were proposed in this pull request?

Recon relies on DN heartbeats to learn a node's operational state. If a node goes down before it reports itself as DECOMMISSIONED, that information is lost on the Recon side. SCM moves the node from DECOMMISSIONING to DECOMMISSIONED first; the Datanode then persists the state change locally and subsequently heartbeats the new state to SCM and Recon. If the DN is shut down before it can heartbeat the state change to Recon, Recon is left with the stale information.

This patch adds a DeadNodeHandler for Recon that updates the node operational status information from SCM if needed. It also adds a step in the Recon SCM Sync Task that updates the node operational status from SCM for every dead node as seen from Recon.
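
For illustration, here is a minimal sketch of the dead node handler idea. All identifiers apart from the HDDS event and protocol types (ReconDeadNodeHandler, StorageContainerServiceProvider#getNodeOperationalState, ReconNodeManager#setNodeOperationalState) are assumptions for the sketch, not necessarily what the patch uses:

import java.io.IOException;

import org.apache.hadoop.hdds.protocol.DatanodeDetails;
import org.apache.hadoop.hdds.protocol.proto.HddsProtos.NodeOperationalState;
import org.apache.hadoop.hdds.server.events.EventHandler;
import org.apache.hadoop.hdds.server.events.EventPublisher;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
 * Sketch only: reacts to a datanode going dead by pulling the node's
 * authoritative operational state from SCM, since a dead node can no
 * longer heartbeat its own state change to Recon.
 */
public class ReconDeadNodeHandler implements EventHandler<DatanodeDetails> {

  private static final Logger LOG =
      LoggerFactory.getLogger(ReconDeadNodeHandler.class);

  private final ReconNodeManager nodeManager;        // Recon's node view
  private final StorageContainerServiceProvider scm; // proxy to SCM (assumed)

  public ReconDeadNodeHandler(ReconNodeManager nodeManager,
      StorageContainerServiceProvider scm) {
    this.nodeManager = nodeManager;
    this.scm = scm;
  }

  @Override
  public void onMessage(DatanodeDetails deadNode, EventPublisher publisher) {
    try {
      // Ask SCM for its view of the node's operational state and overwrite
      // Recon's potentially stale copy, e.g. a node that died between
      // DECOMMISSIONING and DECOMMISSIONED.
      NodeOperationalState scmState =
          scm.getNodeOperationalState(deadNode);     // assumed RPC
      nodeManager.setNodeOperationalState(deadNode, scmState);
    } catch (IOException e) {
      LOG.warn("Could not refresh operational state of dead node {} from SCM.",
          deadNode, e);
    }
  }
}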

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-5277

How was this patch tested?

Manually tested using the Ozone decommissioning CLI.

Aravindan Vijayan added 2 commits May 26, 2021 14:22
avijayanhwx requested a review from sodonnel May 26, 2021 23:32
avijayanhwx requested a review from smengcl May 27, 2021 16:16
@sodonnel
Contributor

sodonnel commented Jun 1, 2021

I think there is another scenario this change won't catch.

  1. Node is put to maintenance with a timeout value.
  2. Node goes to IN_MAINTENANCE and is then stopped and goes dead.
  3. Node does not come back in time, and SCM automatically expires the maintenance, moving the host to IN_SERVICE + DEAD.

In this case the dead node handler would fire once on Recon, and populate the dead nodes array, then the state would be checked and the array cleared. The later transition to IN_SERVICE + DEAD would never get seen by Recon.

I wonder if it would be simpler to forget about the dead node handler on Recon, and just sync up all nodes on SCM with the nodes on Recon on each run (once every 5 minutes, I think). Even with a few thousand nodes, this should not add much overhead.

In saying that, using the dead node handler is a nice optimisation.

We could ask SCM for a list of nodes which are only in the DEAD state and sync up that group. There is an existing API for that in SCMNodeManager:

  /**
   * Returns all datanodes that are in the given states. Passing null for one
   * of the states acts like a wildcard for that state. This function works by
   * taking a snapshot of the current collection and then returning the list
   * from that collection. This means the real map might have changed by the
   * time we return this list.
   *
   * @param opState The operational state of the node
   * @param health The health of the node
   * @return List of Datanodes that are known to SCM in the requested states.
   */
  @Override
  public List<DatanodeDetails> getNodes(
      NodeOperationalState opState, NodeState health) {
    return nodeStateManager.getNodes(opState, health)
        .stream()
        .map(node -> (DatanodeDetails) node)
        .collect(Collectors.toList());
  }

I think that would cover both scenarios.
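
For example, the sync step could then be as simple as the following sketch. The names scmNodeClient and reconNodeManager are placeholders for whatever proxy and node manager Recon would use, and it assumes getPersistedOpState() on the returned DatanodeDetails reflects SCM's view:

// Fetch only the nodes SCM considers DEAD; per the javadoc above, null
// acts as a wildcard for the operational state.
List<DatanodeDetails> deadOnScm =
    scmNodeClient.getNodes(null, NodeState.DEAD);
for (DatanodeDetails dn : deadOnScm) {
  // Overwrite Recon's copy with the operational state SCM reports.
  reconNodeManager.setNodeOperationalState(dn, dn.getPersistedOpState());
}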

@avijayanhwx
Contributor Author

> I think there is another scenario this change won't catch. […] I think that would cover both scenarios.

Thanks for the review @sodonnel. The current patch does a general sync through the PipelineSyncTask as well. Doesn't that handle that case? To be fair, the DeadNodeHandler logic that I have added is redundant here.

@sodonnel
Contributor

sodonnel commented Jun 1, 2021

I think you are correct and I misunderstood how the change works. I will check again in the morning.

@sodonnel
Contributor

sodonnel commented Jun 2, 2021

I understand this better now. The DeadNodeHandler will catch the case where a node is stopped before it reports its own status to Recon.

The sync in PipelineSyncTask would catch the maintenance timing out scenario too.

This change LGTM, +1.

avijayanhwx merged commit f3f258a into apache:master on Jun 2, 2021
@avijayanhwx
Contributor Author

Thanks for the review @sodonnel.
