
Generalized handling of operator arguments. #2393

Merged (6 commits) on Oct 27, 2020

Conversation

mzient
Contributor

@mzient mzient commented Oct 23, 2020

Signed-off-by: Michał Zientkiewicz mzient@gmail.com

Why we need this PR?


  • It adds a new feature: it makes passing non-scalar constants to argument inputs easier

What happened in this PR?


  • What solution was applied:
    • Try to call Constant on unrecognized arguments in operator's __call__
    • Add a more elaborate distinction between call and init arguments in fn wrappers.
    • Handle numpy scalars in types.Constant
    • Promote non-data-nodes to constant nodes on pipeline output
    • Add a function to separate keyword arguments to call/init args, use it in __init__, __call__ and fn wrappers.
    • Add Compose function :)
  • Affected modules and functionalities:
    • ops, types, fn, pipeline
  • Key points relevant for the review:
    • N/A
  • Validation and testing:
    • Unit test in python + existing tests treated as regression test
  • Documentation (including examples):
    • N/A

JIRA TASK: N/A
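The kwargs-splitting idea from the summary can be sketched as follows. This is a hypothetical illustration only: the real DALI helper derives the set of call-time argument names from the operator schema, while here it is passed in explicitly, and `separate_kwargs` is an invented name.

```python
def separate_kwargs(kwargs, call_arg_names):
    """Split kwargs into (init_args, call_args) by name.

    Hypothetical sketch of the PR's idea: arguments the schema marks as
    call-time (tensor) arguments go to call_args; the rest configure the
    operator instance at construction time.
    """
    init_args, call_args = {}, {}
    for name, value in kwargs.items():
        (call_args if name in call_arg_names else init_args)[name] = value
    return init_args, call_args

init_args, call_args = separate_kwargs(
    {"device": "cpu", "angle": 30.0}, call_arg_names={"angle"})
```

With this split, `__init__`, `__call__`, and the fn wrappers can share one routing function instead of each guessing where a keyword argument belongs.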

@mzient mzient requested a review from a team October 23, 2020 18:02
@mzient
Contributor Author

mzient commented Oct 23, 2020

!build

@mzient mzient force-pushed the ArgInputConstantPromotion branch 3 times, most recently from 1bf39ba to 2fdec26 on October 23, 2020 18:05
@dali-automaton
Collaborator

CI MESSAGE: [1729172]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1729172]: BUILD FAILED

@mzient mzient changed the title Promote non-scalar constants to DataNodes in named arguments. Scalar and constant related improvements. Oct 24, 2020
@mzient
Contributor Author

mzient commented Oct 24, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1731279]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1731279]: BUILD PASSED

    return True
if value is None:
    return False
if isinstance(value, (bool, int, float, str, list, tuple, nvidia.dali.types.ScalarConstant)):
Contributor

Are these all the types we need to check? I am thinking of things like np.int64, etc.

Contributor Author

Good point.
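The reviewer's point can be shown directly: on Python 3, a numpy integer scalar is not an instance of the built-in `int`, so the isinstance check above would miss it without extra handling.

```python
import numpy as np

# On Python 3, numpy scalar types do not subclass the built-in numeric
# types, so a plain isinstance(value, int) check misses them.
v = np.int64(5)
print(isinstance(v, int))         # False
print(isinstance(v, np.integer))  # True: numpy's own scalar base class
```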

if value_dtype is not None:
    dali_type = to_dali_type(value.dtype)
    if dali_type in _int_types:
        value = int(value)
Contributor

Is this int32 or int64? Are we OK with changing types?

Contributor Author

Python's int is 64-bit, so it should cover everything except UINT64 values above 2^63 - I think we can live with it.

@@ -1707,3 +1707,12 @@ def get_output():
out = get_output()[0].at(0)
assert out[0] == -0.5 and out[1] == 1.25

def test_return_constants():
    pipe = dali.pipeline.Pipeline(1, 1, 0)
Contributor

Nitpick, no GPU is used in the test anyway:

Suggested change
pipe = dali.pipeline.Pipeline(1, 1, 0)
pipe = dali.pipeline.Pipeline(1, 1, None)

assert np.array_equal(a.at(0), np.array([[1,2],[3,4]]))
assert b.at(0) == 10
assert c.at(0) == 15
assert c.at(0).dtype == np.uint8
Contributor

Can you check other types as well?

Contributor Author

I think so.

def _is_mxnet_array(value):
    return 'mxnet.ndarray.ndarray.NDArray' in str(type(value))

def _is_torch_tensor(value):
    return 'torch.Tensor' in str(type(value))

def _is_numpy_array(value):
-    return 'numpy.ndarray' in str(type(value))
+    type_name = str(type(value))
+    return 'numpy.ndarray' in type_name or \
Contributor

How about long, short, byte and the other types numpy supports?

Contributor Author

It doesn't. It's always intN, uintN or floatN. Numpy supports arrays of bool, but the element is an ordinary Python bool.
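A quick check supports this: numpy's C-style names are just aliases of the fixed-width scalar types, so handling intN/uintN/floatN covers them.

```python
import numpy as np

# np.byte and np.short are aliases of the fixed-width scalar types,
# not separate types, so the intN/uintN/floatN handling covers them.
print(np.byte is np.int8)    # True
print(np.short is np.int16)  # True
```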

@@ -393,6 +393,8 @@ def _prepare_graph(self, define_graph = None):
if isinstance(outputs[i], types.ScalarConstant):
    import nvidia.dali.ops
    outputs[i] = nvidia.dali.ops._instantiate_constant_node("cpu", outputs[i])
elif not isinstance(outputs[i], DataNode):
    outputs[i] = types.Constant(outputs[i], device="cpu")
Contributor

Can we support cupy and GPU constants as well?

Contributor Author

Not at present.
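The promotion logic in the hunk above can be sketched with stand-ins. `DataNode` and `ScalarConstant` below are simplified placeholder classes, not the real DALI ones, and a tagged tuple mimics what `_instantiate_constant_node` / `types.Constant` produce.

```python
# Simplified stand-ins for the real DALI classes, for illustration only.
class DataNode:
    pass

class ScalarConstant:
    def __init__(self, value):
        self.value = value

def promote_outputs(outputs):
    """Mimic the pipeline hunk: anything that is not already a DataNode
    is promoted to a CPU constant node (tagged tuples stand in for the
    real constant-node instantiation)."""
    promoted = []
    for out in outputs:
        if isinstance(out, ScalarConstant):
            promoted.append(("constant_node", "cpu", out.value))
        elif not isinstance(out, DataNode):
            promoted.append(("constant_node", "cpu", out))
        else:
            promoted.append(out)
    return promoted
```

The effect is that a pipeline's `define_graph` may return plain Python or numpy values alongside DataNodes, and they come out as constant batches.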

@mzient mzient changed the title Scalar and constant related improvements. Generalized handling of operator arguments. Oct 26, 2020
@lgtm-com
Contributor

lgtm-com bot commented Oct 26, 2020

This pull request introduces 1 alert when merging a1a816bc6637c38c016612cfa136358249262723 into 8112549 - view on LGTM.com

new alerts:

  • 1 for Unused import

@@ -1707,3 +1707,14 @@ def get_output():
out = get_output()[0].at(0)
assert out[0] == -0.5 and out[1] == 1.25

def test_return_constants():
    pipe = dali.pipeline.Pipeline(1, 1, None)
    types = [bool, np.int8, np.uint8, np.int16, np.uint16, np.int32, np.uint32, np.float32]
Contributor Author

NOTE: Constant operator doesn't support int64 or double at native level. We should probably fix it at some point somehow, but I don't have a good idea of how to do that at present.

@mzient
Contributor Author

mzient commented Oct 26, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1733940]: BUILD STARTED

@@ -259,6 +259,9 @@ def __init__(self, source = None, num_outputs = None, *, cycle = None, layout =
self._cuda_stream = cuda_stream
self._use_copy_kernel = use_copy_kernel

import nvidia.dali.ops
Contributor

move this import to the top of the file?

Contributor Author

This is a circular import, it won't work on top of the file.
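This is the standard pattern for breaking an import cycle: defer the import into the function body so it runs at call time, after both modules have finished initializing. A generic sketch, with the stdlib `math` standing in for `nvidia.dali.ops`:

```python
# At module import time nothing extra is imported; the dependency is
# resolved only on the first call, which is what breaks cycles such as
# pipeline -> ops -> pipeline.
def make_constant_node(value):
    import math  # stand-in for the deferred "import nvidia.dali.ops"
    return value * math.cos(0.0)
```

Module-level imports run while the importing module is still half-initialized, so a top-of-file import here would fail; the function-local import only pays a dictionary lookup after the first call.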


def test_compose():
    batch_size = 3
    pipe = Pipeline(batch_size,1,None)
Contributor

Suggested change
pipe = Pipeline(batch_size,1,None)
pipe = Pipeline(batch_size, 1, None)

nitpick

@@ -3,7 +3,16 @@
import numpy as np

def test_cat_numpy_array():
pipe = dali.pipeline.Pipeline(1,1,0)
pipe = dali.pipeline.Pipeline(1,1,None)
Contributor

Suggested change
pipe = dali.pipeline.Pipeline(1,1,None)
pipe = dali.pipeline.Pipeline(1, 1, None)

nitpick


    return inputs[0] if len(inputs) == 1 else inputs

def Compose(op_list):
Contributor

Is this visible in our supported operations table? Can you paste a screenshot?

Contributor

I think it may deserve a bigger and more complete example.

Contributor Author

For now it's marked as experimental. I think the next step would be to send it to the proponent of the whole Compose affair to play with. Once we get some feedback and suggestions for improvements, we can improve the functionality and provide comprehensive documentation - or, conversely, remove it if we decide it's not necessary after all.

Contributor
@JanuszL JanuszL Oct 26, 2020

It is not listed in Support Table.
I think you need

global _cpu_ops
_cpu_ops = _cpu_ops.union({'DLTensorPythonFunction'})
global _gpu_ops
_gpu_ops = _gpu_ops.union({'DLTensorPythonFunction'})

Contributor

Also, how does it handle CPU to GPU memory transfer?

Contributor Author

I added the handling of transfers.
As for the support table, it's a bit harder, since this is not an operator in the native sense and doesn't have a schema.

Contributor

Do we need a schema to make it a part of the table?

Contributor Author

Unless I hardcode it into the table generator - then yes. When I tried just putting the operator there using set union, Sphinx failed with No schema registered for operator Compose.
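Functionally, Compose chains operator instances so that each one consumes the previous outputs. A plain-function sketch of that behavior (not the actual DALI implementation; ordinary callables stand in for operator instances):

```python
def compose(op_list):
    """Return a callable that applies each op in op_list in sequence,
    feeding the outputs of one op as the inputs of the next."""
    def composed(*inputs):
        for op in op_list:
            out = op(*inputs)
            inputs = out if isinstance(out, tuple) else (out,)
        return inputs[0] if len(inputs) == 1 else inputs
    return composed

pipeline_fn = compose([lambda x: x + 1, lambda x: x * 2])
print(pipeline_fn(3))  # 8
```

This also shows why it has no schema: the composition is pure Python glue, not a registered native operator.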

@dali-automaton
Collaborator

CI MESSAGE: [1733940]: BUILD PASSED

@mzient
Contributor Author

mzient commented Oct 26, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1734773]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1734773]: BUILD FAILED

@dali-automaton
Collaborator

CI MESSAGE: [1734773]: BUILD PASSED

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
…d from the pipeline.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Added Compose function to combine multiple operator instances.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Extended documentation and tests.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@mzient
Contributor Author

mzient commented Oct 27, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1736521]: BUILD STARTED

supports_seq = '|v|' if schema.AllowsSequences() or schema.IsSequenceOperator() else ''
volumetric = '|v|' if schema.SupportsVolumetric() else ''
try:
    schema = b.GetSchema(op)
Contributor

I think we can expose TryGetSchema in backend_impl.cc and use it here instead.
Other than that LGTM.
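The two options being discussed can be contrasted in a small sketch. The `_registry`, `get_schema` and `try_get_schema` names below are hypothetical stand-ins for the backend bindings, not real DALI APIs.

```python
# Hypothetical schema registry standing in for the native backend.
_registry = {"Rotate": {"supports_volumetric": True}}

def get_schema(op):
    # Raises KeyError for unknown operators, like the current binding,
    # which forces the caller into a try/except.
    return _registry[op]

def try_get_schema(op):
    # Returns None instead of raising, so the table generator can skip
    # schema-less entries (e.g. Compose) with a simple None check.
    return _registry.get(op)

schema = try_get_schema("Compose")
if schema is None:
    pass  # no native schema: leave this entry out of the support table
```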

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
@mzient
Contributor Author

mzient commented Oct 27, 2020

!build

@dali-automaton
Collaborator

CI MESSAGE: [1736871]: BUILD STARTED

@dali-automaton
Collaborator

CI MESSAGE: [1736521]: BUILD PASSED

@dali-automaton
Collaborator

CI MESSAGE: [1736871]: BUILD PASSED

4 participants