
Add eager mode stateful operators #4016

Merged: 6 commits merged into NVIDIA:main on Jun 30, 2022

Conversation

@ksztenderski (Contributor) commented Jun 28, 2022

Category:

New feature (non-breaking change which adds functionality)

Description:

Adds experimental exposure of eager.rng_state. All operators that depend on a state (excluding readers) are exposed as methods of an eager.rng_state instance.
Example usage:

import numpy as np
import nvidia.dali.experimental.eager as eager
from nvidia.dali.tensors import TensorListCPU

eager_state = eager.rng_state(seed=27)
tl = TensorListCPU(np.zeros((8, 320, 320, 3), dtype=np.uint8))
out = eager_state.noise.gaussian(tl)
print(out)

Additionally, this adds a function for exposing eager operators as objects, in case we ever want to switch to an ops-like API (see the hypothetical sketch below).
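
For illustration only, a purely hypothetical sketch of what such an ops-like eager API could look like if it were ever switched to (the object factory added in this PR is internal and unused; the names below, in particular eager.ops.noise.Gaussian, are not part of this PR):

import numpy as np
import nvidia.dali.experimental.eager as eager
from nvidia.dali.tensors import TensorListCPU

tl = TensorListCPU(np.zeros((8, 320, 320, 3), dtype=np.uint8))
gaussian = eager.ops.noise.Gaussian(stddev=30.0)  # hypothetical object-style wrapper
out = gaussian(tl)                                # would run eagerly on the TensorList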

Additional information:

Affected modules and functionalities:

Eager mode

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
@ksztenderski (Contributor Author):

!build

@dali-automaton (Collaborator):

CI MESSAGE: [5199224]: BUILD STARTED

@dali-automaton (Collaborator):

CI MESSAGE: [5199224]: BUILD PASSED

@klecki (Contributor) left a comment:

Mostly nitpick and docstring requests. Otherwise looks ok.

if not isinstance(out_eager, (tuple, list)):
    out_eager = (out_eager,)
def prep_stateful_operators(op_path):
    seed = rng.integers(2048)

Contributor:

Please add a comment that we are replicating the seed that would be used internally by the eager op for the baseline op in the pipeline.

Contributor Author:

done
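
A small sketch of the comment being requested (the surrounding test code is assumed, not copied from the PR):

# Replicate the seed that the eager operator would otherwise generate
# internally, and pass the same value to the baseline operator in the
# pipeline, so the two outputs can be compared directly.
seed = rng.integers(2048)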

@@ -131,14 +131,14 @@ def eager_source(self, i, layout='HWC'):
    return get_tl(np.array(self.fn_source(i)), layout)

Contributor:

I guess for the rng_state we should probably get a test that interleaves operations on the same op with different scalar arguments (so two different instances) and maybe two different rng_state objects (same/distinct seed), and check whether the results are equal/not equal as we expect.

Contributor Author:

done
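
A rough sketch of the kind of interleaving test described above, assuming the eager rng_state API from the PR description (the helper name, seeds, and argument values are illustrative, not the tests that were actually added):

import numpy as np
import nvidia.dali.experimental.eager as eager
from nvidia.dali.tensors import TensorListCPU


def check_rng_state_interleaving():
    tl = TensorListCPU(np.zeros((8, 320, 320, 3), dtype=np.uint8))

    state_a = eager.rng_state(seed=42)
    state_b = eager.rng_state(seed=42)   # same seed as state_a
    state_c = eager.rng_state(seed=13)   # different seed

    # Interleave two "instances" of the same op (different scalar arguments)
    # within each state, using the same call order for state_a and state_b.
    out_a1 = state_a.noise.gaussian(tl, stddev=10.0)
    out_a2 = state_a.noise.gaussian(tl, stddev=50.0)
    out_b1 = state_b.noise.gaussian(tl, stddev=10.0)
    out_b2 = state_b.noise.gaussian(tl, stddev=50.0)
    out_c1 = state_c.noise.gaussian(tl, stddev=10.0)

    # Same seed and same call pattern -> equal results; different seed -> different.
    assert np.allclose(out_a1.as_tensor(), out_b1.as_tensor())
    assert np.allclose(out_a2.as_tensor(), out_b2.as_tensor())
    assert not np.allclose(out_a1.as_tensor(), out_c1.as_tensor())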

Comment on lines 755 to 756
if op_schema.IsDeprecated() or op_name in _excluded_operators:
    return

Contributor:

I think we should start with the simple checks that return us out of this function, so we get them out of the way - I would start with this.

Contributor Author:

done
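
A minimal sketch of the ordering being suggested, using only the identifiers visible in the diff (the surrounding wrapping logic is elided):

# Cheap checks that can bail out of the function come first...
if op_schema.IsDeprecated() or op_name in _excluded_operators:
    return

# ...only then do the heavier work of building the wrapper and resolving the
# submodule it gets inserted into.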

Comment on lines 747 to 751
last_module = eager.rng_state
for cur_module_name in submodule:
    # If nonexistent, registers rng_state's submodule.
    cur_module = last_module._submodule(cur_module_name)
    last_module = cur_module

Contributor:

Just a suggestion, feel free to ignore, but maybe we should have get_stateful_target_module and get_stateless_target_module or something, and then we can have a neat if/else chain that checks the kind of op and prepares the wrapper and the target module to insert the wrapper into?

Contributor Author:

done
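
A rough sketch of that suggestion; apart from _wrap_stateful (mentioned elsewhere in the PR), the names used here (_is_stateful, _wrap_stateless, _get_stateful_target_module, _get_stateless_target_module, wrapper_name) are hypothetical:

# Pick the wrapper and the module to insert it into in one flat if/else chain.
if _is_stateful(op_schema):
    wrapper = _wrap_stateful(op_class, op_name, wrapper_name)
    target_module = _get_stateful_target_module(submodule)    # under eager.rng_state
else:
    wrapper = _wrap_stateless(op_class, op_name, wrapper_name)
    target_module = _get_stateless_target_module(submodule)   # under the eager module

setattr(target_module, wrapper_name, wrapper)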

    return backend_op


def _eager_op_object_factory(op_class, op_name):

Contributor:

I guess this could be grouped together with the _expose part, marked as unused, and its purpose described. Maybe make it a section with a "block" comment, or a separate file?

Contributor Author:

I grouped it together and added comments about it being unused. I don't know about moving it to a separate file; I think it's fine like this.


class rng_state(_create_module_class()):
""" Manager class for stateful operators. Methods of this class correspond to the appropriate
functions in the fn API, they are created by :func:`_wrap_stateful` and are added dynamically.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they are created by :func:_wrap_stateful and are added dynamically. - this part should probably be a comment, and the docstring for the class or for the __init__ could add some info about the seeds, and some rough sketch of what this state promises.

Contributor Author:

done
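
A rough sketch of the kind of docstring being asked for here; the wording is illustrative, not the docstring that was actually added:

class rng_state(_create_module_class()):
    """Manager class for stateful (random) operators.

    Operators invoked through an rng_state instance draw their seeds from the
    state's own seed generator: two states created with the same seed are
    expected to produce the same sequence of results for the same sequence of
    calls, while states created with different seeds diverge.
    """
    # Methods of this class correspond to functions in the fn API; they are
    # created by :func:`_wrap_stateful` and added dynamically.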

key = op_name + _desc_call_args(inputs, call_args) + str(sorted(init_args.items()))

if key not in self._operator_cache:
    seed = self._seed_generator.integers(2**32)

Contributor:

This is the tricky part here - can you add a comment on why we create seeds this way and what we cache (that key - so the scalar arguments and the input dim/type map to a distinct seed)?

Contributor Author:

done
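
A rough sketch of the caching scheme being discussed; the names below appear in the diff, while the backend-op construction itself is elided:

# The cache key encodes the operator name, a description of the inputs and
# call arguments (dtype, layout, ndim) and the scalar init arguments, so each
# distinct "instance" of an operator maps to its own cached backend op.
key = _gen_cache_key(op_name, inputs, init_args, call_args)

if key not in self._operator_cache:
    # A new instance gets its own seed, drawn once from the state's generator;
    # reusing a cached instance keeps its seed (and thus its RNG stream).
    seed = self._seed_generator.integers(2**32)
    ...  # construct the backend operator with this seed and store it under key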

@klecki assigned awolant and unassigned mzient on Jun 29, 2022
Fixes mixed device eager operators and gpu arithmetic operators.

Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
@@ -549,7 +550,7 @@ def __init__(self, exec_func, **kwargs):
     import numpy as np
     seed = kwargs.get('seed', -1)
     if seed < 0:
-        seed = np.random.randint(0, 2**32)
+        seed = np.random.randint(sys.maxsize)

Contributor:

Hmm, wouldn't this be platform dependent? Maybe use iinfo and a fixed-size type?

Contributor Author:

I set a fixed value of (1 << 31) - 1.

@@ -212,7 +212,11 @@ def _arithm_op(name, *inputs):
    categories_idxs, inputs, integers, reals = _ops._group_inputs(
        inputs, edge_type=(_tensors.TensorListCPU, _tensors.TensorListGPU))
    input_desc = _ops._generate_input_desc(categories_idxs, integers, reals)
    device = _ops._choose_device(inputs)

    if any(isinstance(input, _tensors.TensorListGPU) for input in inputs):

Contributor:

Here the inputs are already normalized to TensorListCPU/GPU, correct?

Contributor Author:

Yes
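
A minimal sketch of the device choice this check enables, assuming (as confirmed above) that the inputs are already TensorListCPU/TensorListGPU instances; the actual change may do more (e.g. promote CPU inputs), this only illustrates the dispatch:

# Run on the GPU if any input already lives there, otherwise stay on the CPU.
if any(isinstance(input, _tensors.TensorListGPU) for input in inputs):
    device = 'gpu'
else:
    device = 'cpu'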

@@ -647,6 +668,13 @@ def _desc_call_args(inputs, args):
        [(key, value.dtype, value.layout(), len(value[0].shape())) for key, value in args.items()]))


def _gen_cache_key(op_name, inputs, init_args, call_args):

Contributor:

👍

    assert np.allclose(out_1_2.as_tensor(), out_2_2.as_tensor())


def test_objective_eager_resize():

Contributor:

Please add a comment that this tests the hidden functionality of exposing the Eager Ops classes.

Contributor Author:

done

Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
@ksztenderski (Contributor Author):

!build

@dali-automaton (Collaborator):

CI MESSAGE: [5211570]: BUILD STARTED

@dali-automaton (Collaborator):

CI MESSAGE: [5211570]: BUILD PASSED

Signed-off-by: ksztenderski <ksztenderski@nvidia.com>
@ksztenderski (Contributor Author):

!build

@dali-automaton (Collaborator):

CI MESSAGE: [5215548]: BUILD STARTED

@dali-automaton (Collaborator):

CI MESSAGE: [5215548]: BUILD FAILED

@dali-automaton (Collaborator):

CI MESSAGE: [5215548]: BUILD PASSED

@ksztenderski merged commit 752e770 into NVIDIA:main on Jun 30, 2022
@JanuszL mentioned this pull request on Jan 11, 2023