
Constant operator and Python wrapper. #1699

Merged
merged 6 commits into NVIDIA:master on Feb 4, 2020

Conversation

mzient
Contributor

@mzient mzient commented Jan 28, 2020

Add constant operator and unify it in Python with types.Constant.

Signed-off-by: Michal Zientkiewicz michalz@nvidia.com

Why do we need this PR?

  • It adds a new feature that makes operators like Slice, as well as writing tests, easier and more productive.

What happened in this PR?

  • What solution was applied:
    • Added an operator which returns the same data every time, without copying.
    • Renamed dali.types.Constant to dali.types.ScalarConstant and added a new dali.types.Constant function which can produce the new Constant node (see the usage sketch below).
  • Affected modules and functionalities:
    • Python wrapper (ops, types)
  • Key points relevant for the review:
    • Mostly the Python wrapper
  • Validation and testing:
    • Python tests *
  • Documentation (including examples):
    • Not yet *

JIRA TASK: N/A
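
A minimal usage sketch of the unified API described in the list above; it is only an illustration built from the names visible in this PR (ScalarConstant, and the new types.Constant with dtype/shape/layout/device parameters), not code taken from the PR:

import numpy as np
import nvidia.dali.types as types

# Scalar constants used for argument promotion keep the old behaviour,
# now under the ScalarConstant name:
half = types.ScalarConstant(0.5)

# Inside a pipeline definition, types.Constant can instead produce a
# Constant operator node that returns the same data every iteration:
node = types.Constant(np.full((2, 3), 42, dtype=np.int32),
                      layout="HW", device="cpu")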

@mzient mzient requested review from Kh4L, klecki, ptrendx and a team January 28, 2020 17:04
@mzient mzient changed the title Add constant operator and unify it in Python with types.Constant. Constant operator and Python wrapper. Jan 28, 2020
@mzient
Contributor Author

mzient commented Jan 28, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1095600]: BUILD STARTED

@@ -21,7 +21,7 @@
from nvidia.dali import backend as b
from nvidia.dali.types import _type_name_convert_to_string, _type_convert_value, \
_vector_element_type, _bool_types, _int_types, _int_like_types, _float_types, \
DALIDataType, CUDAStream, Constant
DALIDataType, CUDAStream, ScalarConstant as _Constant
Contributor Author

...actually, we should add more underscores and import aliases here, because it pollutes the ops module.

@dali-automaton
Collaborator

CI MESSAGE: [1095600]: BUILD FAILED

arg_inp = kwargs[k]
if arg_inp is None:
continue
if type(arg_inp) is _Constant:
Contributor

You could write this for consistency:

Suggested change
if type(arg_inp) is _Constant:
if isinstance(arg_inp, _Constant):

Contributor Author

ok

@@ -291,6 +291,9 @@ def __init__(self):
def id(self):
return self._id

def _instantiate_constant_node(constant):
return Constant(value = [constant.value], dtype = constant.dtype)
Contributor

Suggested change
return Constant(value = [constant.value], dtype = constant.dtype)
return _Constant(value = [constant.value], dtype = constant.dtype)

Contributor Author

@mzient mzient Jan 29, 2020

That's intended (it should be ops.Constant, but I don't think a module can refer to itself in Python). I renamed _Constant to _ScalarConstant to avoid confusion.

Contributor Author

Logic changed. Now I use the types.Constant function.

def _numpy_to_dali_type(t):
if t is None:
return None
import numpy as np
Contributor

Do we need this import here?

Contributor Author

Not really, it's already there in the outer scope.

@@ -21,7 +21,7 @@
from nvidia.dali import backend as b
from nvidia.dali.types import _type_name_convert_to_string, _type_convert_value, \
_vector_element_type, _bool_types, _int_types, _int_like_types, _float_types, \
DALIDataType, CUDAStream, Constant
DALIDataType, CUDAStream, ScalarConstant as _Constant
Contributor

Suggested change
DALIDataType, CUDAStream, ScalarConstant as _Constant
DALIDataType, CUDAStream, Constant as _Constant

Contributor Author

No, it needs to be exactly the ScalarConstant. I can use _ScalarConstant instead.

Contributor

@JanuszL JanuszL left a comment

You forgot to add the operator itself.

@mzient
Contributor Author

mzient commented Jan 29, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1097209]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1097209]: BUILD PASSED

@mzient mzient requested a review from Kh4L January 29, 2020 13:23
if arg_inp is None:
continue
if isinstance(type(arg_inp), _ScalarConstant):
arg_inp = instantiate_constant_node(arg_inp)
Contributor

I can only see _instantiate_constant_node (starting with underscore). Is this codepath tested?

Contributor Author

Bug. Fixed.

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
@@ -314,14 +317,19 @@ def __init__(self, inputs, op, **kwargs):
# Argument inputs
for k in sorted(kwargs.keys()):
if k not in ["name"]:
if not isinstance(kwargs[k], _EdgeReference):
arg_inp = kwargs[k]
Contributor

Maybe loop over (key, value) pairs here? But I don't know what's up with the sorting.

Contributor Author

I don't know if I can overwrite this reference in a loop when enumerating like that. It would certainly be quite unexpected.

Contributor Author

Sorting is necessary to properly match inputs (which are numbered) to arguments. I'd leave it like this.
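
For context, a simplified sketch of the loop being discussed; the names (_ScalarConstant, _instantiate_constant_node) come from this PR's diff, but the surrounding checks are condensed:

# Sorted iteration gives a deterministic order, so the named argument inputs
# line up with the numbered inputs expected by the backend operator.
for k in sorted(kwargs.keys()):
    if k == "name":
        continue
    arg_inp = kwargs[k]
    if arg_inp is None:
        continue
    # Promote a Python-side scalar constant to a real Constant node before
    # wiring it in as an argument input; rebinding arg_inp here does not
    # touch the kwargs dict itself.
    if isinstance(arg_inp, _ScalarConstant):
        arg_inp = _instantiate_constant_node(arg_inp)
    # ... arg_inp is treated as a graph edge from this point on.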

dali/python/nvidia/dali/ops.py
.AddOptionalArg<int>("shape", "The desired shape of the output. "
"If not set, the data is assumed to be 1D",
std::vector<int>())
.AddOptionalArg<float>("fdata", "Contents of the constant produced (for floating point types).",
Contributor

Shouldn't it be prohibited to pass both fdata and idata to one node? I think we should raise an error in such a case.

Contributor Author

We do?

Contributor

Can you also warn the user in the doc string?
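
A standalone illustration of the mutual-exclusion rule being discussed; this is not the PR's actual validation code, just the shape of such a check:

#include <stdexcept>
#include <vector>

// Reject ambiguous input: the constant payload must come from exactly one
// of the two arguments, never both.
void ValidateConstantArgs(const std::vector<float> &fdata,
                          const std::vector<int> &idata) {
  if (!fdata.empty() && !idata.empty())
    throw std::invalid_argument(
        "Constant: `fdata` and `idata` cannot both be specified.");
}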

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
@mzient
Contributor Author

mzient commented Jan 29, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1097734]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1097734]: BUILD PASSED

@mzient mzient requested review from JanuszL and klecki January 30, 2020 07:53
Contributor

@jantonguirao jantonguirao left a comment

LGTM, but please check my comments


The floating point input data should be placed in `fdata` argument and integer data in `idata`.
The data is a flat vector of values or a single scalar. The data is then reshaped according
to the `shape` argument. If the data is scalar, it will be broadcast to fill the entire shape.
Contributor

Suggested change
to the `shape` argument. If the data is scalar, it will be broadcast to fill the entire shape.
to the `shape` argument. If the data is scalar, it will be broadcasted to fill the entire shape.

Contributor Author

According to a dictionary:
to broadcast; p. broadcast, occas. broadcasted.

.NumInput(0)
.NumOutput(1)
.AddOptionalArg<int>("shape", "The desired shape of the output. "
"If not set, the data is assumed to be 1D",
Contributor

nitpick: messy indentation here

assert(!idata_.empty());
FillTensorList<type>(output_, output_shape_, idata_, ws.stream());
}
), (DALI_FAIL("Unsupported type"))); // NOLINT
Contributor

print the type

#include "dali/core/static_switch.h"

#define CONSTANT_OP_SUPPORTED_TYPES \
(bool, int8_t, uint8_t, int16_t, uint16_t, int32_t, uint32_t, int64_t, uint64_t, float)
Contributor

Wondering, how about float16? It might be useful if we want to make a quick synthetic pipeline producing constant data, just to measure the training speed without preprocessing.

Contributor Author

Would need special treatment for GPU code to compile, but yeah, I can try.

"explicitly before casting to builtin `float`.").format(_float_types))

def __str__(self):
return "{}:{}".format(self.value, self.dtype)

def __repr__(self):
return "{}".format(self.value)

def _is_scalar_shape(shape):
return shape is None or shape == 1 or shape == [1]
Contributor

Suggested change
return shape is None or shape == 1 or shape == [1]
return shape is None or shape == 1 or shape == [1] or shape == (1,)

Does it make sense?

if value.dtype == np.uint64:
value = value.astype(np.uint32)

def _numpy_to_dali_type(t):
Contributor

this function seems useful enough to be exposed
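
A rough sketch of what exposing such a helper could look like; the DALIDataType member names below are assumptions based on the public enum, and the private helper in this PR may cover a different set of types:

import numpy as np
from nvidia.dali.types import DALIDataType

# Hypothetical public counterpart of the private helper discussed above.
_NP_TO_DALI = {
    np.dtype(np.bool_):   DALIDataType.BOOL,
    np.dtype(np.uint8):   DALIDataType.UINT8,
    np.dtype(np.int32):   DALIDataType.INT32,
    np.dtype(np.int64):   DALIDataType.INT64,
    np.dtype(np.float32): DALIDataType.FLOAT,
    np.dtype(np.float64): DALIDataType.FLOAT64,
}

def numpy_to_dali_type(t):
    if t is None:
        return None
    return _NP_TO_DALI[np.dtype(t)]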

@@ -0,0 +1,144 @@
# Copyright (c) 2019, NVIDIA CORPORATION. All rights reserved.
Contributor

nitpick: 2020

Comment on lines +80 to +86
bool SetupImpl(std::vector<OutputDesc> &output_desc, const Workspace &ws) override {
output_desc.resize(1);
if (output_shape_.empty()) {
int batch_size = this->spec_.template GetArgument<int>("batch_size");
output_shape_ = uniform_list_shape(batch_size, shape_arg_);
}
output_desc[0] = { output_shape_, TypeTable::GetTypeInfo(output_type_) };
Contributor

If we do not want the executor to allocate for us, why do we bother with this function?

Contributor Author

@mzient mzient Jan 30, 2020

True. That is, the function can be there, but filling the output_desc is not necessary.


out.Reset();
out.ShareData(&output_);
out.Resize(output_shape_);
Contributor

Do you need to call Resize after ShareData?

Contributor

No need, the metadata (without the layout) is copied.


int n = tmp.size() * sizeof(Dst);
for (int i = 0; i < shape.num_samples(); i++)
cudaMemcpyAsync(dst.mutable_tensor<Dst>(i), tmp.data(), n, cudaMemcpyHostToDevice, stream);
Contributor

I know this happens only once (or rather batch size times), but someone complained to me once about invoking a lot of cudaMemcpys.

You probably can do the Host -> Device copy once and then do Device -> Device copies for the rest.
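
A sketch of the suggested pattern (illustrative names, not the PR's code): one Host to Device copy for the first sample, then Device to Device copies for the rest.

#include <cuda_runtime.h>
#include <vector>

// Replicate one host buffer across many device buffers: a single H2D copy,
// followed by D2D copies that stay on the GPU.
void ReplicateConstant(const std::vector<void *> &dst_ptrs,
                       const void *host_src, size_t nbytes,
                       cudaStream_t stream) {
  if (dst_ptrs.empty()) return;
  cudaMemcpyAsync(dst_ptrs[0], host_src, nbytes,
                  cudaMemcpyHostToDevice, stream);
  for (size_t i = 1; i < dst_ptrs.size(); i++)
    cudaMemcpyAsync(dst_ptrs[i], dst_ptrs[0], nbytes,
                    cudaMemcpyDeviceToDevice, stream);
}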

void FillTensorVector(
TensorVector<CPUBackend> &dst, const TensorListShape<> &shape, const std::vector<Src> &src) {
dst.SetContiguous(false);
dst.Resize(shape);
Contributor

This Resize + mutable_data<Dst> later will actually do a batch size's worth of allocations.

If you really want to save that memory, you probably should use a single Tensor and share data with it.

Do we really want to be that clever, or should we just go with one big allocation and have an actual TensorList underneath here?

It was supposed to get optimized in some MakeContiguous scenarios.

Contributor Author

If the constant node is supposed to be copied to the GPU, then it should have been created with device="gpu" in the first place.

Contributor Author

As for the first point: I traced it in the debugger; it doesn't allocate any memory, because at the point of the resize the underlying tensors don't have a type yet.

Contributor

@klecki klecki left a comment

shape = value.shape
data = value.flatten().tolist()
else:
def _type_from_value_or_list(v):
Contributor

What about bool?

Contributor Author

Fixed.

shape = shape, dtype = dtype, layout = layout)
return op()

def Constant(value, dtype = None, shape = None, layout = None, device = None):
Contributor

Maybe pass additional kwargs to the ops.Constant?
I'm thinking about naming the operator, for example.

Contributor Author

Done.
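
A hedged sketch of the forwarding that "Done" refers to, based on the signature visible in the diff above; the _split_value helper is hypothetical and stands in for the real value-to-fdata/idata conversion:

import nvidia.dali.ops as ops

def Constant(value, dtype=None, shape=None, layout=None, device=None, **kwargs):
    # Hypothetical helper: split `value` into a float or an int payload,
    # standing in for how the real implementation fills fdata/idata.
    def _split_value(v):
        flat = list(v) if hasattr(v, "__iter__") else [v]
        if any(isinstance(x, float) for x in flat):
            return [float(x) for x in flat], None
        return None, [int(x) for x in flat]

    fdata, idata = _split_value(value)
    args = dict(device=device or "cpu", dtype=dtype, shape=shape, layout=layout)
    if fdata is not None:
        args["fdata"] = fdata
    else:
        args["idata"] = idata
    # Extra keyword arguments (e.g. name="my_constant") are forwarded to the
    # operator instance, which is what this review comment asked for.
    args.update(kwargs)
    return ops.Constant(**args)()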

yield _test_scalar_constant_promotion, "cpu"
yield _test_scalar_constant_promotion, "gpu"

def main():
Contributor

Why not just let nose run the tests instead of duplicating that? Do we have to maintain both the main() and the test case definitions?

Contributor Author

It's much easier to filter it here for debugging purposes.
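
A minimal sketch of the kind of main() the author means; the filtering logic is illustrative, only the test names come from the snippet above:

import sys

def main():
    # Run the yield-style tests directly, optionally filtered by a substring
    # given on the command line; nose still discovers them normally in CI.
    tests = [
        (_test_scalar_constant_promotion, "cpu"),
        (_test_scalar_constant_promotion, "gpu"),
    ]
    pattern = sys.argv[1] if len(sys.argv) > 1 else ""
    for fn, device in tests:
        if pattern in fn.__name__ or pattern in device:
            fn(device)

if __name__ == "__main__":
    main()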

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
@mzient
Contributor Author

mzient commented Jan 31, 2020

!build

@mzient mzient requested a review from klecki January 31, 2020 11:16
@dali-automaton
Collaborator

CI MESSAGE: [1101517]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1101517]: BUILD PASSED

@mzient
Contributor Author

mzient commented Jan 31, 2020

@klecki Regarding docs:
I agree, we should update them, and we should also provide an example. But, as you said, the code is indeed backward-compatible, and we can adjust the docs in a separate PR. This one has grown big enough.

@mzient mzient merged commit ab36ff5 into NVIDIA:master Feb 4, 2020