Make DALI array_interface memory writable #4800

Merged: 1 commit merged into NVIDIA:main from modifable_arra_intf on Apr 24, 2023

Conversation

@JanuszL (Contributor) commented Apr 21, 2023

Category:

Other / Breaking change (I hope not)

Description:

  • Marking the DALI memory shared via array_interface as read-only prevents it
    from being shared directly with Torch and possibly other libraries.
    This PR makes it writable (see the sketch below).
  • The memory can be shared anyway, but that requires an intermediate step
    through cupy, which makes the memory writable.
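For context, a minimal sketch of the direct sharing this enables (illustrative: `dali_tensor` stands for any DALI GPU tensor exposing `__cuda_array_interface__`, and it assumes a PyTorch build that consumes that interface in `torch.as_tensor`):

```python
import torch

# With the read-only flag cleared in __cuda_array_interface__, Torch can
# wrap the DALI buffer directly (zero-copy), with no intermediate library:
view = torch.as_tensor(dali_tensor, device="cuda")
```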

Additional information:

Affected modules and functionalities:

  • tensor cuda_array_interface and array_interface

Key points relevant for the review:

  • NA

Tests:

  • Existing tests apply
    • adjusted test_array_interface_tensor_cpu and test_cuda_array_interface_tensor_gpu
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

@JanuszL (Contributor, Author) commented Apr 21, 2023

!build

```diff
@@ -63,7 +63,7 @@ def test_dlpack_tensor_list_gpu_to_cpu():
 
 
 def check_dlpack_types_gpu(t):
-    arr = torch.tensor([[-0.39, 1.5], [-1.5, 0.33]], device="cuda", dtype=t)
+    arr = torch.tensor([[0.39, 1.5], [1.5, 0.33]], device="cuda", dtype=t)
```
@JanuszL (Contributor, Author) commented:

The latest torch complains that the value cannot be converted to type uint8 without overflow.
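For reference, a minimal repro of that complaint (assuming a recent PyTorch):

```python
import torch

# Negative values cannot be represented in uint8; recent PyTorch rejects the
# conversion instead of silently wrapping around:
torch.tensor([[-0.39, 1.5], [-1.5, 0.33]], dtype=torch.uint8)
# RuntimeError: value cannot be converted to type uint8 without overflow
```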

```diff
@@ -149,7 +149,7 @@ def create_tmp(idx):
 
 
 def check_dlpack_types_cpu(t):
-    arr = torch.tensor([[-0.39, 1.5], [-1.5, 0.33]], device="cpu", dtype=t)
+    arr = torch.tensor([[0.39, 1.5], [1.5, 0.33]], device="cpu", dtype=t)
```
@JanuszL (Contributor, Author) commented:

The latest torch complains that the value cannot be converted to type uint8 without overflow.

@dali-automaton (Collaborator) commented:

CI MESSAGE: [8039230]: BUILD STARTED

@dali-automaton (Collaborator) commented:

CI MESSAGE: [8039230]: BUILD PASSED

@mzient (Contributor) commented Apr 24, 2023

Shouldn't it be "writable" in the title?

@mzient self-assigned this Apr 24, 2023
- Marking DALI memory shared by array_interface read-only prevents it
  from being shared directly with Torch and possibly other libraries.
  This PR makes it writable.
- The memory can be shared anyway, but it requires an intermediate step
  through cupy, which makes the memory writable.

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@JanuszL changed the title from "Make DALI array_interface memory readable" to "Make DALI array_interface memory writable" on Apr 24, 2023
@JanuszL (Contributor, Author) commented Apr 24, 2023

> Shouldn't it be "writable" in the title?

Fixed.

@klecki (Contributor) commented Apr 24, 2023

Just a question: when converting from one tensor type to another, like when we do TensorList -> Tensor, we typically pack the shared_ptr to the memory somewhere. Here, I don't think there is even a Python object reference to the original tensor. Wouldn't it crash in an iteration or two in most cases?

@JanuszL (Contributor, Author) commented Apr 24, 2023

> Just a question: when converting from one tensor type to another, like when we do TensorList -> Tensor, we typically pack the shared_ptr to the memory somewhere. Here, I don't think there is even a Python object reference to the original tensor. Wouldn't it crash in an iteration or two in most cases?

I don't think the pipeline exposes a non-contiguous tensor as the output. In the general case it may not work, but in most cases we just expose a pointer to the underlying contiguous memory, so the tensor itself doesn't have to live long.
The problem you describe is really about the conversion from TensorList to Tensor itself; any problems with the array_interface are just a derivative of it.

@klecki (Contributor) commented Apr 24, 2023

The problem I'm referring to is specifically how long the allocation is valid. But thinking about it, the array interface is just a property of the object, and accessing it doesn't affect the lifetime of the original object. So the user of the array interface must take lifetimes into account and keep the original object alive for as long as they use the raw pointer they obtain. I think we should be fine, unless someone keeps this pointer for too long, thinking its contents won't change. I don't know what PyTorch may do with such memory.
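For illustration, a hypothetical usage pattern that respects those lifetimes (the trivial pipeline below is made up for the sketch and is not part of this PR):

```python
import torch
from nvidia.dali import pipeline_def, fn

@pipeline_def(batch_size=1, num_threads=1, device_id=0)
def make_pipe():
    # Produce a small random GPU batch just to have something to share.
    return fn.random.uniform(range=[0.0, 1.0], shape=[2, 2], device="gpu")

pipe = make_pipe()
pipe.build()
(out,) = pipe.run()

dali_tensor = out.as_tensor()  # contiguous GPU buffer, still owned by DALI
view = torch.as_tensor(dali_tensor, device="cuda")  # zero-copy wrap

# ... use `view` only while `dali_tensor` (and the pipeline) stay alive ...

del view         # drop the Torch view first
del dali_tensor  # only now is it safe to let DALI recycle the buffer
```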

@JanuszL (Contributor, Author) commented Apr 24, 2023

> The problem I'm referring to is specifically how long the allocation is valid. But thinking about it, the array interface is just a property of the object, and accessing it doesn't affect the lifetime of the original object. So the user of the array interface must take lifetimes into account and keep the original object alive for as long as they use the raw pointer they obtain. I think we should be fine, unless someone keeps this pointer for too long, thinking its contents won't change. I don't know what PyTorch may do with such memory.

Yes, that is what I had in mind. I think PyTorch just wraps the memory up and does nothing with it. After all, it was already possible, just with cupy in the middle (see the sketch below).
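For reference, a sketch of that cupy-in-the-middle route (again with an illustrative `dali_tensor`):

```python
import cupy
import torch

# cupy accepts the read-only interface and re-exposes the same buffer as
# writable, which Torch can then wrap without copying:
intermediate = cupy.asarray(dali_tensor)
view = torch.as_tensor(intermediate, device="cuda")
```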

@klecki (Contributor) commented Apr 24, 2023

Also, reading the docs of array_interface:

> A reference to the object exposing the array interface must be stored by the new object if the memory area is to be secured.

I think that resolves my concern.
BTW @JanuszL, maybe you want to follow up with CUDA Array Interface v3? https://numba.readthedocs.io/en/stable/cuda/cuda_array_interface.html
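For reference, v3 of that spec mainly adds an optional `stream` entry for synchronization; a sketch of a v3 dict (all values illustrative):

```python
# Illustrative __cuda_array_interface__ v3 dict, per the Numba spec linked above:
{
    "shape": (2, 2),
    "typestr": "<f4",                 # little-endian float32
    "data": (0x7F0000000000, False),  # (device pointer, read_only flag)
    "version": 3,
    "stream": 1,                      # v3 addition: producer stream (1 = legacy default)
}
```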

@JanuszL (Contributor, Author) commented Apr 24, 2023

I guess we will want to use the new interface at some point.

@JanuszL merged commit cabb4fd into NVIDIA:main on Apr 24, 2023 (3 of 4 checks passed)
@JanuszL deleted the modifable_arra_intf branch on Apr 24, 2023 at 12:30