HDDS-6567. Store datanode command queue counts from heartbeat in DatanodeInfo in SCM#3329
HDDS-6567. Store datanode command queue counts from heartbeat in DatanodeInfo in SCM#3329sodonnel merged 2 commits intoapache:masterfrom
Conversation
umamaheswararao
left a comment
There was a problem hiding this comment.
@sodonnel patch almost looks good to me. I have few questions to get clarity though, thanks
| @@ -49,6 +57,7 @@ public class DatanodeInfo extends DatanodeDetails { | |||
| private List<StorageReportProto> storageReports; | |||
| private List<MetadataStorageReportProto> metadataStorageReports; | |||
| private LayoutVersionProto lastKnownLayoutVersion; | |||
There was a problem hiding this comment.
@sodonnel I am wondering what happens to this DatanodeInfo when it expire due to lack of HB from that node? is this object around or destroyed. I am trying to figure out that particular code part. I have not found so far. Please point me where we remove this nodeInfo object when node expires. Thanks
There was a problem hiding this comment.
NodeStateManager.checkNodesHealth is what notices the lost heartbeats and triggers events based on that.
The DeadNodeHandler is triggered when the node goes dead (there is also a StaleNodeHandler), and clears out its pipelines etc. Perhaps we should reset the command counts when this happens, or perhaps it is valid to leave them as the last known value. The datanodeInfo object is not removed AFAIK, as it holds the DN service state (in_service, decommissioning, healthy, stale, dead etc). If the DN comes back, it will be reset by the heartbeat processing. If it never comes back, the datanodedetails and datanodeinfo stick around in SCM until it is restarted.
I am not sure if the command counts remaining is a big issue, as we should avoid scheduling commands on dead (and maybe stale) nodes anyway. Eg before scheduling a command for a node, need to check it is HEALTHY, as otherwise the commands will be queued in SCM and never taken by a DN. If something in SCM keeps scheduling commands for dead nodes, it will slowly fill up the SCM memory on the command queue.
There was a problem hiding this comment.
Yeah, On thinking a bit, I think it's ok to leave the counts as well ( anyway we will not assign tasks to dead DN). when node rejoined, we should receive new HB and counts should get refreshed.
This above questions was just for my clarity what happens to DN object. Thanks for the details
| /** | ||
| * Retrieve the number of queued commands of the given type, as reported by | ||
| * the datanode at the last heartbeat. | ||
| * @param cmd The command for which to receive the queued command count |
There was a problem hiding this comment.
if it's -1, we should wait to assign any tasks to this node as we don;t know the actual state?
There was a problem hiding this comment.
-1 means we have not received any data yet. In the case of an upgrade adding a new command (eg SCM upgraded with a new command, but some DNs not upgraded) those DNs will always show a -1 for the new command.
I am not sure how we should handle this - possibly we need a fallback position in any code that uses these counts. If it is "-1" then we need to use some other way of limiting the commands sent. The upgrade scenario should be short lived, and then DNs should only have -1 until their first heartbeat.
I just thought it was a good idea to include -1 as a different state than zero, so we can tell the difference between the two.
| /** | ||
| * Get the number of commands of the given type queued on the datanode at the | ||
| * last heartbeat. If the Datanode has not reported information for the given | ||
| * command type, -1 wil be returned. |
| datanodeInfo.setCommandCounts(commandQueueReportProto); | ||
| metrics.incNumNodeCommandQueueReportProcessed(); | ||
| } | ||
| } catch (NodeNotFoundException e) { |
There was a problem hiding this comment.
Is this metric a "report failed"? or just unknown node report? I am not sure about the definition of this metric here
There was a problem hiding this comment.
These metrics are copying what is already there for other commands, eg see processNodeReport() and processHeartbeat() - I basically copied this methods structure from there to keep it consistent. In both those cases, the metric is "failedProcessing" but the only failure handled is nodeNotFound, so the name is a bit misleading.
There was a problem hiding this comment.
ok. Yeah name is bit misleading a bit.
| @@ -49,6 +57,7 @@ public class DatanodeInfo extends DatanodeDetails { | |||
| private List<StorageReportProto> storageReports; | |||
| private List<MetadataStorageReportProto> metadataStorageReports; | |||
| private LayoutVersionProto lastKnownLayoutVersion; | |||
There was a problem hiding this comment.
Yeah, On thinking a bit, I think it's ok to leave the counts as well ( anyway we will not assign tasks to dead DN). when node rejoined, we should receive new HB and counts should get refreshed.
This above questions was just for my clarity what happens to DN object. Thanks for the details
| /** | ||
| * Retrieve the number of queued commands of the given type, as reported by | ||
| * the datanode at the last heartbeat. | ||
| * @param cmd The command for which to receive the queued command count |
| datanodeInfo.setCommandCounts(commandQueueReportProto); | ||
| metrics.incNumNodeCommandQueueReportProcessed(); | ||
| } | ||
| } catch (NodeNotFoundException e) { |
There was a problem hiding this comment.
ok. Yeah name is bit misleading a bit.
What changes were proposed in this pull request?
HDDS-6554 added the current command counts for all commands queued on a Datanode to the datanode heartbeat. This Jira will process that information on SCM and store it inside the DatanodeInfo object so other parts of SCM can reference it.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-6567
How was this patch tested?
New Unit Tests