Skip to content

HDDS-8840. Add metrics to Container Balancer#4998

Merged
adoroszlai merged 1 commit intoapache:masterfrom
siddhantsangwan:HDDS-8840
Jun 29, 2023
Merged

HDDS-8840. Add metrics to Container Balancer#4998
adoroszlai merged 1 commit intoapache:masterfrom
siddhantsangwan:HDDS-8840

Conversation

@siddhantsangwan
Copy link
Contributor

What changes were proposed in this pull request?

Added metrics for number of moves that were scheduled in the latest iteration and in total. Included all MoveResult failure cases under existing metric for failures, numContainerMovesFailed. Except REPLICATION_FAIL_TIME_OUT and DELETION_FAIL_TIME_OUT, which are considered time outs and counted in numContainerMovesTimeout.

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8840

How was this patch tested?

Added assertions to an existing unit test.

@sodonnel
Copy link
Contributor

Changes LGTM. I had originally thought we should expose a metrics for the number of entries pending in the moveManager. I guess these new perIteration metrics cover it, as each iteration should start with everything from the last timed out or completed. There may be a small chance that something goes wrong and allows entries to build up in the moveManager pending table, but perhaps adding a metric for that is not needed. What do you think?

@adoroszlai adoroszlai merged commit 3ec7c5f into apache:master Jun 29, 2023
@adoroszlai
Copy link
Contributor

Thanks @siddhantsangwan for the patch, @sodonnel for the review.

There may be a small chance that something goes wrong and allows entries to build up in the moveManager pending table, but perhaps adding a metric for that is not needed. What do you think?

We can add more metrics in another task, if needed.

@siddhantsangwan
Copy link
Contributor Author

I guess these new perIteration metrics cover it, as each iteration should start with everything from the last timed out or completed.

Yes, scheduled moves should equal timed out + completed + failed, so we don't really need one for pending moves. But we can have something for the number of replicate and delete commands that were sent by Move Manager.

vtutrinov pushed a commit to Cyrill/ozone that referenced this pull request Jul 3, 2023
vtutrinov pushed a commit to Cyrill/ozone that referenced this pull request Jul 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants

Comments