Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HDDS-8801. ReplicationManager: Add metric to count how often replication is throttled #4864

Merged
merged 3 commits into from Jun 9, 2023

Conversation

sodonnel
Copy link
Contributor

@sodonnel sodonnel commented Jun 9, 2023

What changes were proposed in this pull request?

Adding a metric for the number of containers where we fail to send a delete, replication or reconstruction due to the throttling. This will give some visibility into how often throttling is occurring.

New metrics are:

ecReconstructionCmdsDeferredTotal
deleteContainerCmdsDeferredTotal
replicateContainerCmdsDeferredTotal

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8801

How was this patch tested?

Adapted tests to check metric is getting incremented

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sodonnel for the patch. LGTM, and integration test failures do not seem to be related. However, Findbugs emits a failure about "inconsistent synchronization", which we'll need to get rid of.

@adoroszlai adoroszlai merged commit d77df14 into apache:master Jun 9, 2023
28 checks passed
errose28 added a commit to errose28/ozone that referenced this pull request Jun 10, 2023
* master: (73 commits)
  HDDS-8587. Test that CertificateClient can store multiple rootCA certificates (apache#4852)
  HDDS-8801. ReplicationManager: Add metric to count how often replication is throttled (apache#4864)
  HDDS-8477. Unit test for Snapdiff using tombstone entries (apache#4678)
  HDDS-7507. [Snapshot] Implement List Snapshot API Pagination (apache#4065) (apache#4861)
  HDDS-8373. Document that setquota doesn't accept decimals (apache#4856)
  HDDS-8779. Recon - Expose flag for enable/disable of heatmap. (apache#4845)
  HDDS-8677. Ozone admin OM CLI command for block tokens (apache#4760)
  HDDS-8164. Authorize secret key APIs (apache#4597)
  HDDS-7945. Integrate secret keys to SCM snapshot (apache#4549)
  HDDS-8003. E2E integration test cases for block tokens (apache#4547)
  HDDS-7831. Use symmetric secret key to sign and verify token (apache#4417)
  HDDS-7830. SCM API for OM and Datanode to get secret keys (apache#4345)
  HDDS-7734. Implement symmetric SecretKeys lifescycle management in SCM (apache#4194)
  HDDS-8679. Add dedicated, configurable thread pool for OM gRPC server (apache#4771)
  HDDS-8790. Split EC acceptance tests (apache#4855)
  HDDS-8714. TestScmHAFinalization: mark testFinalizationWithRestart as flaky, enable other test cases
  HDDS-8787. Reduce ozone sh calls in robot tests (apache#4854)
  HDDS-8774. Log allocation stack trace for leaked CodecBuffer (apache#4840)
  HDDS-8729. Add metric for count of blocks pending deletion on datanode (apache#4800)
  HDDS-8780. Leak of ManagedChannel in HASecurityUtils (apache#4850)
  ...
errose28 added a commit to errose28/ozone that referenced this pull request Jun 10, 2023
* tmp-dir-refactor: (73 commits)
  HDDS-8587. Test that CertificateClient can store multiple rootCA certificates (apache#4852)
  HDDS-8801. ReplicationManager: Add metric to count how often replication is throttled (apache#4864)
  HDDS-8477. Unit test for Snapdiff using tombstone entries (apache#4678)
  HDDS-7507. [Snapshot] Implement List Snapshot API Pagination (apache#4065) (apache#4861)
  HDDS-8373. Document that setquota doesn't accept decimals (apache#4856)
  HDDS-8779. Recon - Expose flag for enable/disable of heatmap. (apache#4845)
  HDDS-8677. Ozone admin OM CLI command for block tokens (apache#4760)
  HDDS-8164. Authorize secret key APIs (apache#4597)
  HDDS-7945. Integrate secret keys to SCM snapshot (apache#4549)
  HDDS-8003. E2E integration test cases for block tokens (apache#4547)
  HDDS-7831. Use symmetric secret key to sign and verify token (apache#4417)
  HDDS-7830. SCM API for OM and Datanode to get secret keys (apache#4345)
  HDDS-7734. Implement symmetric SecretKeys lifescycle management in SCM (apache#4194)
  HDDS-8679. Add dedicated, configurable thread pool for OM gRPC server (apache#4771)
  HDDS-8790. Split EC acceptance tests (apache#4855)
  HDDS-8714. TestScmHAFinalization: mark testFinalizationWithRestart as flaky, enable other test cases
  HDDS-8787. Reduce ozone sh calls in robot tests (apache#4854)
  HDDS-8774. Log allocation stack trace for leaked CodecBuffer (apache#4840)
  HDDS-8729. Add metric for count of blocks pending deletion on datanode (apache#4800)
  HDDS-8780. Leak of ManagedChannel in HASecurityUtils (apache#4850)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants