Add DLPack input support to the ExternalSource operator #2023

Merged

JanuszL merged 4 commits into NVIDIA:master from dlpack_input on Jun 26, 2020

Conversation

@JanuszL
Contributor

JanuszL commented Jun 15, 2020

  • adds the ability to pass DLPack objects to the ExternalSource operator

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>

Why do we need this PR?

  • It adds DLPack input support to the ExternalSource operator

What happened in this PR?

  • What solution was applied:
    adds DLPack input support to the ExternalSource operator
  • Affected modules and functionalities:
    backend, pipeline, external source
  • Key points relevant for the review:
    NA
  • Validation and testing:
    new tests added
  • Documentation (including examples):
    docs updated

JIRA TASK: [DALI-1465]
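
A minimal usage sketch of what this PR enables, based on the test snippets further down in this conversation. The module path, the pipe object, and the name-based feed_input call are assumptions and may differ in your DALI version:

import torch
from torch.utils.dlpack import to_dlpack
from nvidia.dali.backend import TensorListCPU

arr = torch.rand(4, 100, 100, 3)                     # a batch of 4 HWC images
tensor_list = TensorListCPU(to_dlpack(arr), "NHWC")  # wrap the DLPack capsule, no copy
pipe.feed_input("data", tensor_list)                 # "data" names an ExternalSource node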

@review-notebook-app

Check out this pull request on ReviewNB.

@JanuszL
Contributor Author

JanuszL commented Jun 15, 2020

!build

@JanuszL
Contributor Author

JanuszL commented Jun 15, 2020

It goes after #1997

@dali-automaton
Collaborator

CI MESSAGE: [1395708]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1395708]: BUILD FAILED

@JanuszL force-pushed the dlpack_input branch 2 times, most recently from 184b06e to 7461691 on June 17, 2020 19:10
@JanuszL
Contributor Author

JanuszL commented Jun 17, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1403482]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1403482]: BUILD FAILED

@JanuszL
Contributor Author

JanuszL commented Jun 18, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1405230]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1405230]: BUILD FAILED

@JanuszL
Contributor Author

JanuszL commented Jun 18, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1405462]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1405462]: BUILD PASSED

 template <typename TensorType>
-TensorShape<> FillTensorData(const py::object object, TensorType *t, int device_id, string layout) {
+void FillTensorCudaArrayInterfaceData(const py::object object, TensorType *batch,
+int device_id, string layout) {
Contributor

nitpick: align

Contributor Author

Done

auto dlm_tensor_ptr = DLMTensorRawPtrFromCapsule(capsule, false);
const auto &dl_tensor = dlm_tensor_ptr->dl_tensor;
list.append(dl_tensor.ctx.device_type == kDLGPU);
if (dl_tensor.ctx.device_type != kDLGPU && dl_tensor.ctx.device_type != kDLCPU) {
Contributor

How about:

list.append(dl_tensor.ctx.device_type == kDLGPU || dl_tensor.ctx.device_type == kDLCPU);
list.append(dl_tensor.ctx.device_type == kDLGPU);

Contributor Author

Done

@@ -292,10 +399,26 @@ void ExposeTensor(py::module &m) {
)code");

py::class_<Tensor<GPUBackend>>(m, "TensorGPU")
.def(py::init([](py::capsule &capsule, string layout = "") {
auto t = new Tensor<GPUBackend>;
FillTensorDlpackInterfaceData(capsule, t, layout);
Contributor

I suggest we make a shorter name FillTensorFromDlPack

Contributor Author

Done

 .def(py::init([](const py::object object, string layout = "", int device_id = -1) {
 auto t = new Tensor<GPUBackend>;
-auto shape = FillTensorData(object, t, device_id, layout);
-t->Resize(shape);
+FillTensorCudaArrayInterfaceData(object, t, device_id, layout);
Contributor

FillTensorFromCudaArray?

Contributor Author

Done
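
For illustration, the binding above lets TensorGPU wrap any object that exposes the CUDA array interface. A hedged sketch; the import path and constructor signature are assumptions based on this PR's bindings:

import cupy as cp
from nvidia.dali.backend import TensorGPU

arr = cp.random.rand(100, 100, 3).astype(cp.float32)  # object with __cuda_array_interface__
t = TensorGPU(arr, "HWC")                             # zero-copy wrap of GPU memory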

tensor_list = TensorListCPU(to_dlpack(arr), "NHWC")
dali_torch_tensor = convert_to_torch(tensor_list, device=arr.device, dtype=arr.dtype)
assert(torch.all(arr.eq(dali_torch_tensor)))
test_dlpack_tensor_list_cpu_direct_creation()
Contributor

remove?

Contributor Author

Done

* persist while it is in use by the Tensor.
*/
DLL_PUBLIC inline void ShareData(void *ptr, size_t bytes) {
ShareData(shared_ptr<void>(ptr, [](void *) {}), bytes, uniform_list_shape(1, {0}));
Contributor

Why this particular shape? What's wrong with TensorListShape<>{}?

Contributor Author

Done

@@ -219,14 +217,14 @@ class DLL_PUBLIC TensorList : public Buffer<Backend> {
* the user to manage the lifetime of the allocation such that it
* persist while it is in use by the Tensor.
*/
-DLL_PUBLIC inline void ShareData(void *ptr, size_t bytes) {
+inline void ShareData(const shared_ptr<void> &ptr, size_t bytes, const TensorListShape<> &shape) {
Contributor

For completeness - maybe there should also be a type?

Suggested change
inline void ShareData(const shared_ptr<void> &ptr, size_t bytes, const TensorListShape<> &shape) {
inline void ShareData(const shared_ptr<void> &ptr, size_t bytes, const TensorListShape<> &shape, const TypeInfo &type = {}) {

Contributor

...this would be in line with the new Resize.

Contributor Author

I wanted to align the API with the Tensor API. It doesn't accept type in the ShareData function.

Contributor Author

Done

@JanuszL
Contributor Author

JanuszL commented Jun 19, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1409136]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1409136]: BUILD PASSED

@JanuszL
Contributor Author

JanuszL commented Jun 25, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1423002]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1423002]: BUILD FAILED

Comment on lines +203 to +204
sample_dim, layout_str, pipe_handle->copy_stream, true,
is_pinned);
Contributor

Nitpick: indentation.

@JanuszL
Contributor Author

JanuszL commented Jun 25, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1423041]: BUILD STARTED

- adds an ability to pass DLPack object in the ExternalSource operator
- sorts CPU work on ExternalSource by size

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [1423041]: BUILD FAILED

@@ -219,14 +217,15 @@ class DLL_PUBLIC TensorList : public Buffer<Backend> {
* the user to manage the lifetime of the allocation such that it
* persist while it is in use by the Tensor.
*/
-DLL_PUBLIC inline void ShareData(void *ptr, size_t bytes) {
+inline void ShareData(const shared_ptr<void> &ptr, size_t bytes, const TensorListShape<> &shape,
+                      const TypeInfo &type = TypeInfo::Create<NoType>()) {
Contributor

Isn't that sufficient?

Suggested change
const TypeInfo &type = TypeInfo::Create<NoType>()) {
const TypeInfo &type = {}) {

Contributor Author

Done

@@ -57,6 +57,10 @@ class CachingList {
return full_data_.empty();
}

T &PeakFront() {
Contributor

Suggested change
T &PeakFront() {
T &PeekFront() {

Contributor Author

Done

Comment on lines 229 to 230
output_desc[0].shape = tl_data_.PeakFront()->shape();
output_desc[0].type = tl_data_.PeakFront()->type();
Contributor

Suggested change
output_desc[0].shape = tl_data_.PeakFront()->shape();
output_desc[0].type = tl_data_.PeakFront()->type();
output_desc[0].shape = tl_data_.PeekFront()->shape();
output_desc[0].type = tl_data_.PeekFront()->type();

Contributor Author

Done

Comment on lines 130 to 131
void CheckStrides(TStrides &strides, TShape &shape, size_t type_size,
size_t strides_size, size_t shape_size) {
Contributor

Suggested change
void CheckStrides(TStrides &strides, TShape &shape, size_t type_size,
size_t strides_size, size_t shape_size) {
void CheckContiguousTensor(const TStrides &strides, size_t num_strides, const TShape &shape,
size_t num_extents, size_t element_size) {

...and add an overload:

void CheckContiguousTensor(const TStrides &strides, const TShape &shape, size_t element_size) {
  CheckContiguousTensor(strides, dali::size(strides), shape, dali::size(shape), element_size);
}

Contributor Author

Done
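
A Python analogue of the check being renamed here, as a sketch of the intent rather than DALI's actual code: the byte strides must describe a densely packed, row-major tensor.

def check_contiguous_tensor(strides, shape, element_size):
    # Walk from the innermost dimension outward, accumulating the stride a
    # densely packed layout would have, and compare with the actual stride.
    expected = element_size
    for extent, stride in zip(reversed(shape), reversed(strides)):
        assert stride == expected, "non-contiguous: stride %d, expected %d" % (stride, expected)
        expected *= extent

check_contiguous_tensor((300, 3, 1), (100, 100, 3), 1)  # a contiguous uint8 HWC image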

Comment on lines 117 to 121
std::vector<Index> tensor_shape(shape.size()-1);
for (int i = 1; i < shape.size(); ++i) {
tensor_shape[i-1] = shape[i];
}
return uniform_list_shape(shape[0], tensor_shape);
Contributor

Suggested change
std::vector<Index> tensor_shape(shape.size()-1);
for (int i = 1; i < shape.size(); ++i) {
tensor_shape[i-1] = shape[i];
}
return uniform_list_shape(shape[0], tensor_shape);
return uniform_list_shape(shape[0], shape.last(shape.size()-1));

Contributor Author

Done
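
The conversion under discussion, sketched in Python: a dense array of shape [N, d1, ..., dk] is reinterpreted as N samples of shape [d1, ..., dk], which is what the one-line uniform_list_shape call produces.

def batch_to_sample_shapes(shape):
    # The outermost extent becomes the number of samples; the remaining
    # extents form the per-sample shape, uniform across the batch.
    return [list(shape[1:])] * shape[0]

assert batch_to_sample_shapes([4, 100, 100, 3]) == [[100, 100, 3]] * 4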

}

template<typename SrcBackend>
TensorShape<> CreateShape(TensorShape<> &shape, Tensor<SrcBackend>*) {
Contributor

Suggested change
TensorShape<> CreateShape(TensorShape<> &shape, Tensor<SrcBackend>*) {
const TensorShape<> &ConvertShape(const TensorShape<> &shape, Tensor<SrcBackend>*) {

Contributor Author

Done

Comment on lines 587 to 588
CheckStrides(info.strides, info.shape, info.itemsize, info.strides.size(),
info.shape.size());
Contributor

Suggested change
CheckStrides(info.strides, info.shape, info.itemsize, info.strides.size(),
info.shape.size());
CheckStrides(info.strides, info.shape, info.itemsize);

The function should take care of obtaining the sizes.

Contributor Author

Done

Comment on lines 347 to 348
CheckStrides(info.strides, info.shape, info.itemsize, info.strides.size(),
info.shape.size());
Contributor

Suggested change
CheckStrides(info.strides, info.shape, info.itemsize, info.strides.size(),
info.shape.size());
CheckStrides(info.strides, info.shape, info.itemsize);

Contributor Author

Done

" whereas densely packed data of this shape would have a stride ", stride_from_shape));
stride_from_shape *= shape[i];
}
CheckStrides(strides, shape, type.size(), strides.size(), shape.size());
Contributor

Suggested change
CheckStrides(strides, shape, type.size(), strides.size(), shape.size());
CheckStrides(strides, shape, type.size());

Contributor Author

Done

Comment on lines 283 to 284
It returns a two element tuple, if this is a valid DLPack object, and if data
resides on the GPU.
Contributor

Suggested change
It returns a two element tuple, if this is a valid DLPack object, and if data
resides on the GPU.
It returns a tuple of two boolean values: one indicating if this is a valid DLPack object, and the other if the data
resides on the GPU.

Contributor Author

Done
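
Hypothetical usage of the check documented above. The helper name dlpack_check is illustrative, not the PR's actual API, and the constructor signatures are assumptions based on the bindings in this PR:

is_dlpack, is_gpu = dlpack_check(capsule)
if is_dlpack:
    # Route the capsule to the matching backend without copying.
    tl = TensorListGPU(capsule, "NHWC") if is_gpu else TensorListCPU(capsule, "NHWC")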

@JanuszL
Contributor Author

JanuszL commented Jun 25, 2020

!build

@@ -504,17 +523,13 @@ def define_graph(self):

def iter_setup(self):
if use_list:
-batch_data = [random_array([100, 100, 3]) for _ in range(self.batch_size)]
+batch_data = [cast_to(random_array([100, 100, 3]), datapy.uint8) for _ in range(self.batch_size)]
Contributor

Suggested change
batch_data = [cast_to(random_array([100, 100, 3]), datapy.uint8) for _ in range(self.batch_size)]
batch_data = [cast_to(random_array([100, 100, 3])*255, datapy.uint8) for _ in range(self.batch_size)]

 else:
-batch_data = random_array([self.batch_size, 100, 100, 3])
+batch_data = cast_to(random_array([self.batch_size, 100, 100, 3]), datapy.uint8)
 self.feed_input(self.batch, batch_data)
Contributor

@mzient commented Jun 25, 2020

Suggested change
batch_data = cast_to(random_array([self.batch_size, 100, 100, 3]), datapy.uint8)
batch_data = cast_to(random_array([self.batch_size, 100, 100, 3])*256, datapy.uint8)
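
The reason for the scaling in both suggestions, assuming random_array returns floats in [0, 1): a direct uint8 cast truncates nearly every value to 0, so the test data would be degenerate. For example:

import numpy as np

arr = np.random.random([100, 100, 3])    # values in [0, 1)
assert arr.astype(np.uint8).max() == 0   # truncation zeroes everything
scaled = (arr * 255).astype(np.uint8)    # spans the uint8 range as intended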

@dali-automaton
Collaborator

CI MESSAGE: [1423418]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1423418]: BUILD FAILED

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [1424173]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1424173]: BUILD PASSED

Signed-off-by: Janusz Lisiecki <jlisiecki@nvidia.com>
@dali-automaton
Collaborator

CI MESSAGE: [1425810]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1425810]: BUILD PASSED

@JanuszL merged commit 933636d into NVIDIA:master on Jun 26, 2020
@JanuszL deleted the dlpack_input branch on June 26, 2020 13:48