HDDS-8774. Log allocation stack trace for leaked CodecBuffer #4840

Merged: 1 commit, Jun 8, 2023

@adoroszlai adoroszlai commented Jun 7, 2023

What changes were proposed in this pull request?

Log the allocation stack trace of leaked CodecBuffer instances when the log level is DEBUG or more verbose, and set that log level for integration tests.

https://issues.apache.org/jira/browse/HDDS-8774
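
For readers unfamiliar with the technique, here is a minimal sketch of capturing the allocation stack trace only when debug logging is on, then including it in the leak warning. It is not the code from this patch: `TrackedBuffer`, its `debugEnabled` parameter, and `leakMessage()` are hypothetical names chosen for illustration, and the real change reads the level from the logger and reports from `finalize`.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a reference-counted buffer; not the actual CodecBuffer. */
final class TrackedBuffer {
  private static final AtomicInteger LEAK_COUNT = new AtomicInteger();

  private final int capacity;
  // Null when tracking is off, so the normal path pays no capture cost.
  private final StackTraceElement[] allocation;
  private int refCnt = 1;

  private TrackedBuffer(int capacity, StackTraceElement[] allocation) {
    this.capacity = capacity;
    this.allocation = allocation;
  }

  /** Records the caller's stack only when debug is enabled (assumed flag). */
  static TrackedBuffer allocate(int capacity, boolean debugEnabled) {
    StackTraceElement[] trace =
        debugEnabled ? new Throwable().getStackTrace() : null;
    return new TrackedBuffer(capacity, trace);
  }

  void release() {
    if (refCnt > 0) {
      refCnt--;
    }
  }

  boolean isLeaked() {
    return refCnt > 0;
  }

  /**
   * Builds the warning a finalizer or Cleaner could log for an unreleased
   * buffer. Each call increments the shared leak counter.
   */
  String leakMessage() {
    StringBuilder sb = new StringBuilder("LEAK ")
        .append(LEAK_COUNT.incrementAndGet())
        .append(": refCnt=").append(refCnt)
        .append(", capacity=").append(capacity);
    if (allocation != null) {
      sb.append(" allocation:");
      for (StackTraceElement frame : allocation) {
        sb.append(System.lineSeparator()).append(frame);
      }
    }
    return sb.toString();
  }
}
```

Keeping the trace optional matters because capturing a stack on every allocation is expensive, which is presumably why the patch ties it to the DEBUG level.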

How was this patch tested?

Introduced a leak by randomly returning from release() without actually releasing the buffer, then ran TestOmMetrics and verified the log:

[Finalizer] WARN  db.CodecBuffer (CodecBuffer.java:finalize(129)) - LEAK 1: org.apache.hadoop.hdds.utils.db.CodecBuffer@26701f00, refCnt=1, capacity=24 allocation:
org.apache.hadoop.hdds.utils.db.CodecBuffer.allocate(CodecBuffer.java:74)
org.apache.hadoop.hdds.utils.db.CodecBuffer.allocateDirect(CodecBuffer.java:82)
org.apache.hadoop.hdds.utils.db.StringCodecBase.toCodecBuffer(StringCodecBase.java:175)
org.apache.hadoop.hdds.utils.db.StringCodec.toCodecBuffer(StringCodec.java:28)
org.apache.hadoop.hdds.utils.db.StringCodecBase.toCodecBuffer(StringCodecBase.java:43)
org.apache.hadoop.hdds.utils.db.Codec.toDirectCodecBuffer(Codec.java:75)
org.apache.hadoop.hdds.utils.db.TypedTable.getFromTable(TypedTable.java:358)
org.apache.hadoop.hdds.utils.db.TypedTable.getFromTable(TypedTable.java:337)
org.apache.hadoop.hdds.utils.db.TypedTable.get(TypedTable.java:230)
org.apache.hadoop.hdds.scm.ha.SequenceIdGenerator.upgradeToSequenceId(SequenceIdGenerator.java:353)
org.apache.hadoop.hdds.scm.server.StorageContainerManager.initializeSystemManagers(StorageContainerManager.java:654)
org.apache.hadoop.hdds.scm.server.StorageContainerManager.<init>(StorageContainerManager.java:389)
org.apache.hadoop.hdds.scm.server.StorageContainerManager.createSCM(StorageContainerManager.java:572)
org.apache.hadoop.hdds.scm.HddsTestUtils.getScmSimple(HddsTestUtils.java:598)
org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.createSCM(MiniOzoneClusterImpl.java:735)
org.apache.hadoop.ozone.MiniOzoneClusterImpl$Builder.build(MiniOzoneClusterImpl.java:587)
org.apache.hadoop.ozone.om.TestOmMetrics.startCluster(TestOmMetrics.java:109)
org.apache.hadoop.ozone.om.TestOmMetrics.testBucketOps(TestOmMetrics.java:201)

Temporarily removed the log level setting and verified that the log has no stack trace:

[Finalizer] WARN  db.CodecBuffer (CodecBuffer.java:finalize(129)) - LEAK 1: org.apache.hadoop.hdds.utils.db.CodecBuffer@26701f00, refCnt=1, capacity=51
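
The leak used for the verification above came from a release() that randomly returns early. A rough sketch of that kind of fault injection follows; `FlakyBuffer` and `skipProbability` are hypothetical names, and the temporary change actually made to CodecBuffer during testing is not shown in this pull request.

```java
import java.util.Random;

/** Hypothetical buffer with a deliberately faulty release(), for testing only. */
final class FlakyBuffer {
  private final Random random;
  private final double skipProbability;
  private int refCnt = 1;

  FlakyBuffer(Random random, double skipProbability) {
    this.random = random;
    this.skipProbability = skipProbability;
  }

  /** Returns true if the buffer was actually released. */
  boolean release() {
    // Injected fault: sometimes return without dropping the reference,
    // so the buffer stays "in use" and the leak detector should report it.
    if (random.nextDouble() < skipProbability) {
      return false;
    }
    refCnt = 0;
    return true;
  }

  int refCnt() {
    return refCnt;
  }
}
```

A fixed seed for `Random` makes such a run repeatable, and the injected fault should be reverted once the leak log has been checked.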

@adoroszlai adoroszlai self-assigned this Jun 7, 2023
@adoroszlai
Contributor Author

@duongkame @szetszwo please review

Contributor

@szetszwo szetszwo left a comment

+1 the change looks good.

@adoroszlai adoroszlai merged commit 9d7df93 into apache:master Jun 8, 2023
27 checks passed
@adoroszlai adoroszlai deleted the HDDS-8774 branch June 8, 2023 13:16
@adoroszlai
Contributor Author

Thanks @szetszwo for the review.

errose28 added a commit to errose28/ozone that referenced this pull request Jun 10, 2023
* master: (73 commits)
  HDDS-8587. Test that CertificateClient can store multiple rootCA certificates (apache#4852)
  HDDS-8801. ReplicationManager: Add metric to count how often replication is throttled (apache#4864)
  HDDS-8477. Unit test for Snapdiff using tombstone entries (apache#4678)
  HDDS-7507. [Snapshot] Implement List Snapshot API Pagination (apache#4065) (apache#4861)
  HDDS-8373. Document that setquota doesn't accept decimals (apache#4856)
  HDDS-8779. Recon - Expose flag for enable/disable of heatmap. (apache#4845)
  HDDS-8677. Ozone admin OM CLI command for block tokens (apache#4760)
  HDDS-8164. Authorize secret key APIs (apache#4597)
  HDDS-7945. Integrate secret keys to SCM snapshot (apache#4549)
  HDDS-8003. E2E integration test cases for block tokens (apache#4547)
  HDDS-7831. Use symmetric secret key to sign and verify token (apache#4417)
  HDDS-7830. SCM API for OM and Datanode to get secret keys (apache#4345)
  HDDS-7734. Implement symmetric SecretKeys lifescycle management in SCM (apache#4194)
  HDDS-8679. Add dedicated, configurable thread pool for OM gRPC server (apache#4771)
  HDDS-8790. Split EC acceptance tests (apache#4855)
  HDDS-8714. TestScmHAFinalization: mark testFinalizationWithRestart as flaky, enable other test cases
  HDDS-8787. Reduce ozone sh calls in robot tests (apache#4854)
  HDDS-8774. Log allocation stack trace for leaked CodecBuffer (apache#4840)
  HDDS-8729. Add metric for count of blocks pending deletion on datanode (apache#4800)
  HDDS-8780. Leak of ManagedChannel in HASecurityUtils (apache#4850)
  ...
errose28 added a commit to errose28/ozone that referenced this pull request Jun 10, 2023
* tmp-dir-refactor: (73 commits)
  HDDS-8587. Test that CertificateClient can store multiple rootCA certificates (apache#4852)
  HDDS-8801. ReplicationManager: Add metric to count how often replication is throttled (apache#4864)
  HDDS-8477. Unit test for Snapdiff using tombstone entries (apache#4678)
  HDDS-7507. [Snapshot] Implement List Snapshot API Pagination (apache#4065) (apache#4861)
  HDDS-8373. Document that setquota doesn't accept decimals (apache#4856)
  HDDS-8779. Recon - Expose flag for enable/disable of heatmap. (apache#4845)
  HDDS-8677. Ozone admin OM CLI command for block tokens (apache#4760)
  HDDS-8164. Authorize secret key APIs (apache#4597)
  HDDS-7945. Integrate secret keys to SCM snapshot (apache#4549)
  HDDS-8003. E2E integration test cases for block tokens (apache#4547)
  HDDS-7831. Use symmetric secret key to sign and verify token (apache#4417)
  HDDS-7830. SCM API for OM and Datanode to get secret keys (apache#4345)
  HDDS-7734. Implement symmetric SecretKeys lifescycle management in SCM (apache#4194)
  HDDS-8679. Add dedicated, configurable thread pool for OM gRPC server (apache#4771)
  HDDS-8790. Split EC acceptance tests (apache#4855)
  HDDS-8714. TestScmHAFinalization: mark testFinalizationWithRestart as flaky, enable other test cases
  HDDS-8787. Reduce ozone sh calls in robot tests (apache#4854)
  HDDS-8774. Log allocation stack trace for leaked CodecBuffer (apache#4840)
  HDDS-8729. Add metric for count of blocks pending deletion on datanode (apache#4800)
  HDDS-8780. Leak of ManagedChannel in HASecurityUtils (apache#4850)
  ...