Conversation
Greptile Summary

This PR adds an owning `make_tensor` path for DLPack tensors. Previous review threads raised three concerns that remain open in this revision:

- redundant shape/stride arrays in `make_tensor`
- `managed_ptr` is allocated before the deleter assertion, so it leaks on exception
- the shared_ptr deleter never calls `delete managed_ptr`

Confidence Score: 3/5

Merge is blocked by the unresolved `managed_ptr` memory leak (every owning tensor creation leaks the heap-copied `DLManagedTensor` struct), which was raised in previous review threads but is not addressed here. The dangling-pointer bug from the first review round is fixed by heap-copying the struct, and the scoped-lifetime test correctly exercises that fix. However, two memory-safety issues from prior threads are still open: the allocation-before-assertion path leaks `managed_ptr` on exception, and the shared_ptr deleter never calls `delete`. See `include/matx/core/make_tensor.h` lines 927–936: `managed_ptr` allocation, assertion order, and missing `delete` in the shared_ptr deleter lambda.

Important Files Changed
Sequence Diagram

```mermaid
sequenceDiagram
    participant Caller
    participant make_tensor
    participant HeapCopy as "DLManagedTensor (heap)"
    participant SharedPtr as "shared_ptr deleter"
    participant DLDeleter as "DLPack deleter"
    Caller->>make_tensor: make_tensor(tensor, dlp, owning=true)
    make_tensor->>HeapCopy: new DLManagedTensor{dlp}
    make_tensor->>SharedPtr: shared_ptr(data_ptr, lambda[managed_ptr])
    make_tensor->>Caller: tensor.Shallow(tmp)
    Note over Caller: original dlp may go out of scope
    Caller->>SharedPtr: tensor destroyed (refcount → 0)
    SharedPtr->>DLDeleter: managed_ptr->deleter(managed_ptr)
    DLDeleter-->>SharedPtr: frees CUDA memory, shape, strides, ctx
    Note over HeapCopy: managed_ptr struct itself is NOT deleted (leak)
```
Reviews (3): Last reviewed commit: "Redundant arrays"
```cpp
index_t strides[TensorType::Rank()];
index_t shape[TensorType::Rank()];
cuda::std::array<index_t, TensorType::Rank()> shape_arr{};
cuda::std::array<index_t, TensorType::Rank()> strides_arr{};

for (int r = 0; r < TensorType::Rank(); r++) {
  strides[r] = dt.strides[r];
  shape[r] = dt.shape[r];
  strides_arr[r] = strides[r];
  shape_arr[r] = shape[r];
}
```
`strides[]`/`shape[]` (raw arrays) and `strides_arr`/`shape_arr` (`cuda::std::array`) hold identical data. The raw arrays are consumed only by the non-owning branch (lines 922–923) and the `cuda::std::array`s only by the owning branch (line 928). You can simplify by populating just one set per branch, e.g., initialize `shape_arr`/`strides_arr` once and use them for both paths, dropping the duplicate raw arrays.
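A minimal sketch of the suggested simplification, populating one pair of arrays and using it for both branches. `std::array` stands in for `cuda::std::array`, and `DLTensorView` is a hypothetical stand-in for the DLPack struct fields used here:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using index_t = int64_t;

// Hypothetical stand-in for the two DLPack tensor fields this sketch needs.
struct DLTensorView {
  const index_t* shape;
  const index_t* strides;
};

// Fill shape_arr/strides_arr once; both the owning and non-owning branches
// can then consume them (via .data() where a raw pointer is required),
// making the duplicate index_t[] copies unnecessary.
template <int Rank>
void copy_extents(const DLTensorView& dt,
                  std::array<index_t, Rank>& shape_arr,
                  std::array<index_t, Rank>& strides_arr) {
  for (int r = 0; r < Rank; r++) {
    shape_arr[r] = dt.shape[r];
    strides_arr[r] = dt.strides[r];
  }
}
```

Where the non-owning branch previously read `shape`/`strides`, it can read `shape_arr.data()`/`strides_arr.data()` instead.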
```cpp
auto managed_ptr = new DLManagedTensor{dlp_tensor};
MATX_ASSERT_STR(managed_ptr->deleter != nullptr, matxInvalidParameter,
                "Owning DLPack tensor requires a non-null deleter");
```
**Memory leak of `managed_ptr` on assertion failure**

`managed_ptr` is heap-allocated before the null check. In debug builds where `MATX_THROW` raises an exception, if `managed_ptr->deleter == nullptr` the exception unwinds the stack and `managed_ptr` is never freed. Moving the check before the allocation avoids the leak entirely.
Suggested change:

```diff
-auto managed_ptr = new DLManagedTensor{dlp_tensor};
-MATX_ASSERT_STR(managed_ptr->deleter != nullptr, matxInvalidParameter,
-                "Owning DLPack tensor requires a non-null deleter");
+MATX_ASSERT_STR(dlp_tensor.deleter != nullptr, matxInvalidParameter,
+                "Owning DLPack tensor requires a non-null deleter");
+auto managed_ptr = new DLManagedTensor{dlp_tensor};
```
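An alternative that stays leak-free even if more validation is added after the allocation is to hold the copy in a `std::unique_ptr` until ownership is handed off. This is a hedged sketch, not the PR's code: it uses a minimal stand-in for dlpack's `DLManagedTensor` and a plain `std::invalid_argument` in place of `MATX_ASSERT_STR`:

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

// Minimal stand-in for dlpack's DLManagedTensor; only the field the
// sketch needs.
struct DLManagedTensor {
  void (*deleter)(DLManagedTensor*);
};

// Heap-copy the producer's struct, then validate. If validation throws,
// the unique_ptr destructor frees the copy during stack unwinding, so
// the ordering of allocation vs. check no longer matters for leaks.
std::unique_ptr<DLManagedTensor> copy_checked(const DLManagedTensor& dlp) {
  auto managed = std::make_unique<DLManagedTensor>(dlp);
  if (managed->deleter == nullptr) {  // mirrors the assertion in the PR
    throw std::invalid_argument(
        "Owning DLPack tensor requires a non-null deleter");
  }
  return managed;  // caller release()s it into the shared_ptr deleter
}
```

The caller would `release()` the pointer when constructing the shared_ptr deleter, keeping the existing ownership flow intact.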
```cpp
[managed_ptr](typename TensorType::value_type *) {
  if (managed_ptr->deleter != nullptr) {
    managed_ptr->deleter(managed_ptr);
  }
}),
```
**`managed_ptr` is never freed: persistent memory leak**

The shared-pointer deleter calls `managed_ptr->deleter(managed_ptr)` but never `delete managed_ptr`. For DLPack producers whose deleters only clean up `manager_ctx` and nested allocations (e.g. PyTorch's `ATenDLMTensor` pattern, or the test's `DlpackOwnershipDeleter`), the heap-allocated copy created on line 930 is never returned to the allocator. The test itself demonstrates the problem: `DlpackOwnershipDeleter` frees `shape`, `strides`, and `manager_ctx` but does not `delete self`, so `managed_ptr` leaks on every owning tensor lifetime. To prevent this without requiring every downstream producer to call `delete self`, wrap the copy in a struct that invokes the deleter from its destructor, then let the lambda delete the wrapper:
```cpp
struct OwnedDLMTensor {
  DLManagedTensor tensor;
  ~OwnedDLMTensor() {
    if (tensor.deleter) tensor.deleter(&tensor);
  }
};

auto owned = new OwnedDLMTensor{.tensor = dlp_tensor};
auto storage = make_storage_from_shared_ptr<...>(
    std::shared_ptr<...>(data_ptr,
        [owned](...) { delete owned; }),
    desc.TotalSize());
```

This guarantees the heap allocation is freed regardless of what the producer's deleter does with `self`.
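To see that the wrapper pattern runs the producer's deleter exactly once and also frees the heap copy, here is a self-contained sketch. A minimal stand-in replaces dlpack's `DLManagedTensor`, and `producer_deleter` imitates a deleter that, like the `DlpackOwnershipDeleter` described above, cleans up `manager_ctx` but not `self`:

```cpp
#include <cassert>

// Minimal stand-in for dlpack's DLManagedTensor.
struct DLManagedTensor {
  void* manager_ctx;
  void (*deleter)(DLManagedTensor*);
};

// Wrapper from the suggestion: the destructor invokes the producer's
// deleter, so a single `delete` on the wrapper both runs the deleter
// and frees the heap-allocated copy.
struct OwnedDLMTensor {
  DLManagedTensor tensor;
  ~OwnedDLMTensor() {
    if (tensor.deleter) tensor.deleter(&tensor);
  }
};

int deleter_calls = 0;

// Producer-style deleter that cleans up manager_ctx but never deletes
// `self`, which is exactly the case that leaks without the wrapper.
void producer_deleter(DLManagedTensor* self) {
  ++deleter_calls;
  self->manager_ctx = nullptr;
}
```

With the wrapper, the shared_ptr lambda's single `delete owned;` replaces both the deleter call and the missing `delete managed_ptr`.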
No description provided.