
Added owning flag to DLPack tensors#1156

Closed
cliffburdick wants to merge 3 commits into main from owning_dlpack

Conversation

@cliffburdick
Collaborator

No description provided.

@copy-pr-bot

copy-pr-bot Bot commented Apr 16, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@greptile-apps
Contributor

greptile-apps Bot commented Apr 16, 2026

Greptile Summary

This PR adds an owning flag to the make_tensor DLManagedTensor overload. When owning=true, the function heap-copies the DLManagedTensor struct and captures that copy in a shared_ptr custom deleter, so the DLPack deleter is invoked when the last tensor reference is dropped — even if the original DLManagedTensor has already gone out of scope. Two tests verify the non-owning/owning split and the scoped-lifetime case.

Previous review threads raised three concerns that remain open in this revision: managed_ptr is heap-allocated before the null-deleter assertion (leak on exception), managed_ptr itself is never freed after the DLPack deleter runs (persistent leak per call), and the suggestion to use a wrapper struct to close both gaps. Those threads are tracked separately.

Confidence Score: 3/5

Merge is blocked by the unresolved managed_ptr memory leak (every owning tensor creation leaks the heap-copied DLManagedTensor struct), which was raised in previous review threads but is not addressed here.

The dangling-pointer bug from the first review round is fixed by heap-copying the struct, and the scoped-lifetime test correctly exercises that fix. However, two memory-safety issues from prior threads are still open: the allocation-before-assertion path leaks managed_ptr on exception, and the shared_ptr deleter never calls delete managed_ptr, so the struct leaks on every owning tensor destruction. These are correctness issues (a memory leak per API call), not theoretical concerns. Additionally, the new P2 finding (a non-const lvalue reference that rejects const and temporary inputs) is minor but still an API regression. The score of 3 reflects the open correctness concerns carried over from prior rounds.

include/matx/core/make_tensor.h lines 927–936: managed_ptr allocation, assertion order, and missing delete in the shared_ptr deleter lambda.

Important Files Changed

Filename Overview
include/matx/core/make_tensor.h Adds owning DLPack path: heap-copies DLManagedTensor, captures it in shared_ptr deleter; fixes dangling-pointer from prior review but still has unresolved managed_ptr leak and pre-assertion allocation issues noted in previous threads. API signature is unnecessarily restrictive (non-const lvalue ref).
test/00_tensor/BasicTensorTests.cu Adds DLPackOwning and DLPackOwningScopedInputLifetime tests with a helper deleter; correctly verifies CUDA memory is freed only when expected, and that ownership extends across DLManagedTensor's scope. DlpackOwnershipDeleter does not delete self, consistent with the (unresolved) managed_ptr leak noted in prior threads.
.gitignore Widens build directory glob from build to build*/ to cover variant build directories (e.g. build-debug, build-release). No issues.

Sequence Diagram

```mermaid
sequenceDiagram
    participant Caller
    participant make_tensor
    participant HeapCopy as "DLManagedTensor (heap)"
    participant SharedPtr as "shared_ptr deleter"
    participant DLDeleter as "DLPack deleter"

    Caller->>make_tensor: make_tensor(tensor, dlp, owning=true)
    make_tensor->>HeapCopy: new DLManagedTensor{dlp}
    make_tensor->>SharedPtr: shared_ptr(data_ptr, lambda[managed_ptr])
    make_tensor->>Caller: tensor.Shallow(tmp)

    Note over Caller: original dlp may go out of scope

    Caller->>SharedPtr: tensor destroyed (refcount → 0)
    SharedPtr->>DLDeleter: managed_ptr->deleter(managed_ptr)
    DLDeleter-->>SharedPtr: frees CUDA memory, shape, strides, ctx
    Note over HeapCopy: managed_ptr struct itself is NOT deleted (leak)
```

Reviews (3): Last reviewed commit: "Redundant arrays"

Comment thread on include/matx/core/make_tensor.h (Outdated)
Comment thread on include/matx/core/make_tensor.h (Outdated)
Comment on lines +909 to +919
```cpp
index_t strides[TensorType::Rank()];
index_t shape[TensorType::Rank()];
cuda::std::array<index_t, TensorType::Rank()> shape_arr{};
cuda::std::array<index_t, TensorType::Rank()> strides_arr{};

for (int r = 0; r < TensorType::Rank(); r++) {
  strides[r] = dt.strides[r];
  shape[r] = dt.shape[r];
  strides_arr[r] = strides[r];
  shape_arr[r] = shape[r];
}
```
Contributor


P2 Redundant loop variables

strides[]/shape[] (raw arrays) and strides_arr/shape_arr (cuda::std::array) hold identical data. The raw arrays are only consumed by the non-owning branch (line 922–923) and the cuda::std::arrays only by the owning branch (line 928). You can simplify by keeping a single set: initialize shape_arr/strides_arr once and use them for both paths (via .data() where a raw pointer is needed), dropping the duplicate raw arrays.

Comment on lines +930 to +932
```cpp
auto managed_ptr = new DLManagedTensor{dlp_tensor};
MATX_ASSERT_STR(managed_ptr->deleter != nullptr, matxInvalidParameter,
                "Owning DLPack tensor requires a non-null deleter");
```
Contributor


P1 Memory leak of managed_ptr on assertion failure

managed_ptr is heap-allocated before the null check. In debug builds where MATX_THROW raises an exception, if managed_ptr->deleter == nullptr, the exception unwinds the stack and managed_ptr is never freed. Moving the check before the allocation avoids the leak entirely.

Suggested change

```diff
-auto managed_ptr = new DLManagedTensor{dlp_tensor};
-MATX_ASSERT_STR(managed_ptr->deleter != nullptr, matxInvalidParameter,
-                "Owning DLPack tensor requires a non-null deleter");
+MATX_ASSERT_STR(dlp_tensor.deleter != nullptr, matxInvalidParameter,
+                "Owning DLPack tensor requires a non-null deleter");
+auto managed_ptr = new DLManagedTensor{dlp_tensor};
```

Comment on lines +936 to +940
```cpp
[managed_ptr](typename TensorType::value_type *) {
  if (managed_ptr->deleter != nullptr) {
    managed_ptr->deleter(managed_ptr);
  }
}),
```
Contributor


P1 managed_ptr is never freed — persistent memory leak

The shared-pointer deleter calls managed_ptr->deleter(managed_ptr) but never delete managed_ptr. For DLPack producers whose deleters only clean up manager_ctx and nested allocations (e.g. PyTorch's ATenDLMTensor pattern, the test's DlpackOwnershipDeleter), the heap-allocated copy created on line 930 is never returned to the allocator. The test itself demonstrates the problem: DlpackOwnershipDeleter frees shape, strides, and manager_ctx but does not delete self, so managed_ptr leaks on every owning tensor lifetime. To prevent this without requiring every downstream producer to call delete self, wrap the copy in a struct that invokes the deleter from its destructor, then let the lambda delete the wrapper:

```cpp
struct OwnedDLMTensor {
    DLManagedTensor tensor;
    ~OwnedDLMTensor() {
        if (tensor.deleter) tensor.deleter(&tensor);
    }
};
auto owned = new OwnedDLMTensor{.tensor = dlp_tensor};
auto storage = make_storage_from_shared_ptr<...>(
    std::shared_ptr<...>(data_ptr,
        [owned](...) { delete owned; }),
    desc.TotalSize());
```

This guarantees the heap allocation is freed regardless of what the producer's deleter does with self.

