HDDS-8835. ReplicationManager: Fix metrics to work with new RM #4882

Merged
merged 6 commits into from Jun 14, 2023

Conversation

sodonnel

What changes were proposed in this pull request?

This change tidies up the ReplicationManager metrics. Some of the legacy metrics are no longer valid in the new RM, and the naming was inconsistent between the old Ratis metrics and the EC metrics.

The new metrics after this change look like the following:

"name" : "Hadoop:service=StorageContainerManager,name=ReplicationManagerMetrics",
    "modelerType" : "ReplicationManagerMetrics",
    "tag.Hostname" : "835ec174c7ef",
    "InflightReplication" : 0,
    "InflightDeletion" : 0,
    "InflightEcReplication" : 0,
    "InflightEcDeletion" : 2,
    "UnderReplicatedQueueSize" : 0,
    "OverReplicatedQueueSize" : 0,
    "OpenContainers" : 0,
    "ClosingContainers" : 0,
    "QuasiClosedContainers" : 0,
    "ClosedContainers" : 14,
    "DeletingContainers" : 1,
    "DeletedContainers" : 0,
    "RecoveringContainers" : 0,
    "UnderReplicatedContainers" : 0,
    "MisReplicatedContainers" : 0,
    "OverReplicatedContainers" : 0,
    "MissingContainers" : 0,
    "UnhealthyContainers" : 0,
    "EmptyContainers" : 0,
    "OpenUnhealthyContainers" : 0,
    "StuckQuasiClosedContainers" : 0,
    "ReplicationCmdsSentTotal" : 0,
    "ReplicasCreatedTotal" : 0,
    "ReplicaCreateTimeoutTotal" : 0,
    "DeletionCmdsSentTotal" : 0,
    "ReplicasDeletedTotal" : 0,
    "ReplicaDeleteTimeoutTotal" : 0,
    "EcReplicationCmdsSentTotal" : 0,
    "EcDeletionCmdsSentTotal" : 62,
    "EcReplicasCreatedTotal" : 5,
    "EcReplicasDeletedTotal" : 0,
    "EcReconstructionCmdsSentTotal" : 15,
    "EcReplicaCreateTimeoutTotal" : 10,
    "EcReplicasDeletedTotal" : 0,
    "EcReplicaDeleteTimeoutTotal" : 60,
    "EcReconstructionCmdsDeferredTotal" : 0,
    "DeleteContainerCmdsDeferredTotal" : 0,
    "ReplicateContainerCmdsDeferredTotal" : 0
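
To make the naming pattern in the listing above easier to see, here is a minimal sketch that classifies metric names by it. It assumes, based only on the names shown, that a `Total` suffix marks a cumulative counter, that everything else is a point-in-time gauge, and that EC-specific metrics carry an `Ec` marker either as a prefix or directly after `Inflight`. The class `RmMetricNames` and its methods are hypothetical helpers for illustration and are not part of Ozone.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: classifies ReplicationManager metric names by naming pattern. */
public class RmMetricNames {

    enum Kind { GAUGE, COUNTER }

    /** Assumption: names ending in "Total" are cumulative counters, the rest are gauges. */
    static Kind kindOf(String name) {
        return name.endsWith("Total") ? Kind.COUNTER : Kind.GAUGE;
    }

    /** Assumption: EC metrics start with "Ec" or with "InflightEc". */
    static boolean isEc(String name) {
        return name.startsWith("Ec") || name.startsWith("InflightEc");
    }

    public static void main(String[] args) {
        // A few name/value pairs copied from the listing above.
        Map<String, Long> sample = new LinkedHashMap<>();
        sample.put("InflightDeletion", 0L);
        sample.put("InflightEcDeletion", 2L);
        sample.put("ClosedContainers", 14L);
        sample.put("DeletionCmdsSentTotal", 0L);
        sample.put("EcDeletionCmdsSentTotal", 62L);
        sample.put("EcReplicaDeleteTimeoutTotal", 60L);

        for (Map.Entry<String, Long> e : sample.entrySet()) {
            String name = e.getKey();
            System.out.printf("%-30s %-8s %-6s %d%n",
                name, kindOf(name), isEc(name) ? "EC" : "other", e.getValue());
        }
    }
}
```

Under these assumptions, each non-EC replication or deletion metric has an EC counterpart that differs only by the `Ec` marker, for example `DeletionCmdsSentTotal` and `EcDeletionCmdsSentTotal`.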

Renaming the metrics also meant renaming some getter and setter methods, which makes the diff quite large. The change is split into separate commits to make it easier to review.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8835

How was this patch tested?

Some tests were modified, but this was mostly validated manually to produce the results above, by creating containers and stopping and starting some nodes.


@adoroszlai adoroszlai left a comment


Thanks @sodonnel for the patch.

@adoroszlai adoroszlai merged commit f44f7f2 into apache:master Jun 14, 2023
28 checks passed