
Gpu python operator notebook #1715

Merged · 5 commits · Feb 7, 2020

Conversation

@banasraf (Collaborator) commented Feb 5, 2020

Why we need this PR?

  • It adds an example for GPU Python Operators

What happened in this PR?

  • What solution was applied:
    Adding a notebook
  • Affected modules and functionalities:
    Documentation
  • Key points relevant for the review:
    The Notebook
  • Validation and testing:
    Notebook added to QA
  • Documentation (including examples):
    NA

JIRA TASK: [DALI-1260]

Signed-off-by: Rafal <Banas.Rafal97@gmail.com>
Signed-off-by: Rafal <Banas.Rafal97@gmail.com>
Signed-off-by: Rafal <Banas.Rafal97@gmail.com>
@banasraf (Collaborator, Author) commented Feb 5, 2020

!build

@dali-automaton (Collaborator)

CI MESSAGE: [1110232]: BUILD STARTED

@szalpal (Member) commented Feb 5, 2020

General suggestions:

  1. When you describe the issue of synchronization, it would be great if you'd also show what happens if the user forgets to synchronize. As in: "look, that's the problem and here's the solution".
  2. What IMHO is lacking here is a brief reminder of the difference between PythonFunction, DLTensorPythonFunction and TorchPythonFunction. I know it's covered in the PythonFunction example, but a short recap here wouldn't hurt, in my opinion.

Details:

For an introduction and general information about Python Operators family see the Python Operators notebook.

You can put a link here ;)

Below we present a simple kernel interlaying channels

interleaving?

More informationa about

Typo here
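For context on the kernel being discussed: the notebook's actual code isn't shown in this thread, but a channel-interleaving kernel of the kind mentioned above might look like the following NumPy sketch (a hypothetical CPU stand-in; the function name and the "cycle the channels" interpretation are assumptions, not the notebook's code):

```python
import numpy as np

def interleave_channels(img):
    """Hypothetical stand-in for the notebook's kernel: cycle the
    channels of an HWC image so output channel i takes input
    channel (i + 1) % C."""
    return np.roll(img, shift=-1, axis=2)

# Tiny 1x1 RGB image: channels (R=10, G=20, B=30) become (20, 30, 10).
img = np.array([[[10, 20, 30]]], dtype=np.uint8)
out = interleave_channels(img)
```

In the notebook itself the same per-pixel logic would run on the GPU (e.g. via CuPy) inside the Python operator.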

@dali-automaton (Collaborator)

CI MESSAGE: [1110232]: BUILD PASSED

@klecki (Contributor) commented Feb 5, 2020

When using PythonFunction or TorchPythonFunction we do not have to bother about synchronizing our GPU function with the rest of DALI pipeline, becuase it is handled behind the scenes.

IMO it's too informal; I would go with something along the lines of:
For PythonFunction and TorchPythonFunction, the synchronization of the user's GPU code (the provided function) with the rest of the DALI pipeline is handled automatically by the Operator.
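To illustrate the point being made: with PythonFunction the user writes a plain array-in, array-out function and never touches synchronization. A minimal sketch of such a function (shown with NumPy as for the CPU case; the function name and operation are hypothetical, not from the notebook):

```python
import numpy as np

def flip_and_darken(image):
    # An ordinary array-in, array-out function, of the kind passed
    # to PythonFunction. Note there is no synchronization code here;
    # per the discussion above, the Operator handles that itself.
    flipped = image[:, ::-1, :]               # horizontal flip
    return (flipped * 0.5).astype(image.dtype)  # halve intensity

img = np.full((2, 2, 3), 100, dtype=np.uint8)
out = flip_and_darken(img)
```

The contrast drawn in this thread is that DLTensorPythonFunction does not give this guarantee, which is why the notebook has to discuss stream synchronization at all.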

@klecki (Contributor) commented Feb 5, 2020

As for the ending:

To properly synchronize device code in DLTensorPythonFunction we have to ensure that:
* all the preceding GPU work is done before the start of provided function,
* the work we schedule inside finishes before we return the results.
The first condition is warranted by the synchronize_stream=True flag (ON by default). User is responsible for providing the second part. In the example above it is achieved by the line cupy.cuda.get_current_stream().synchronize().

How about:

* all the preceding **DALI** GPU work is done before the start of provided function - this can be handled by the Operator using `synchronize_stream=True` flag (ON by default),
* the work we schedule inside finishes before we return the results - we must use CuPy's cupy.cuda.get_current_stream().synchronize() to synchronize at the end of the function.

But I'm not sure if it's any better.
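The two conditions under discussion can be demonstrated without a GPU by a plain-Python analogy (this is an illustration of the synchronization contract, not DALI or CuPy code): a thread pool stands in for the CUDA stream, submitting a task stands in for launching a kernel, and waiting on the future plays the role of cupy.cuda.get_current_stream().synchronize() at the end of the function.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def user_function(data, pool):
    """Stand-in for the user's GPU function: work is scheduled
    asynchronously (like a CUDA kernel launch), so the output
    buffer must not be returned before that work completes."""
    out = [0] * len(data)

    def kernel():
        time.sleep(0.05)              # pretend the "kernel" takes a while
        for i, x in enumerate(data):
            out[i] = x * 2

    future = pool.submit(kernel)      # asynchronous "launch"
    # Forgetting this wait would return `out` while it may still be
    # all zeros -- the failure mode of a missing stream synchronize.
    future.result()                   # analogue of get_current_stream().synchronize()
    return out

with ThreadPoolExecutor() as pool:
    result = user_function([1, 2, 3], pool)
```

The first condition (preceding work done before the function starts) has no analogue here; in DALI it is what the synchronize_stream=True flag covers.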

@JanuszL (Contributor) commented Feb 5, 2020

1. When you describe the issue of synchronization, it would be great if you'd also show what happens if the user forgets to synchronize. As in: "look, that's the problem and here's the solution".

Anything can happen, or nothing. The question is whether we can assume that someone using the GPU is familiar with multithreading challenges and the concept of synchronization. I don't know if we should describe the details of how synchronization works in CUDA and why we need it here, but we can point to some reference and mention how the DLPack variant of the Python function differs from the plain one.

I agree with the rest of the comments.

@@ -97,7 +97,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Running the pipeline and visualizing results\n",
"## Running the pipeline and visualizing results\n",
Review comment (Contributor):

Nitpick

Suggested change
"## Running the pipeline and visualizing results\n",
"## Running the pipeline and visualizing the results\n",

Reply (Collaborator, Author):

done

@szalpal (Member) commented Feb 6, 2020

When using PythonFunction or TorchPythonFunction we do not have to bother about synchronizing our GPU function with the rest of DALI pipeline, becuase it is handled behind the scenes.

IMO it's too informal; I would go with something along the lines of:
For PythonFunction and TorchPythonFunction, the synchronization of the user's GPU code (the provided function) with the rest of the DALI pipeline is handled automatically by the Operator.

Informal == easy to understand. In an example like this you don't need formal language; that is for contexts where a precise form is required. Here I'd go for being easy to understand.

@klecki (Contributor) commented Feb 6, 2020

When using PythonFunction or TorchPythonFunction we do not have to bother about synchronizing our GPU function with the rest of DALI pipeline, becuase it is handled behind the scenes.
IMO it's too informal; I would go with something along the lines of:
For PythonFunction and TorchPythonFunction, the synchronization of the user's GPU code (the provided function) with the rest of the DALI pipeline is handled automatically by the Operator.

Informal == easy to understand. In an example like this you don't need formal language; that is for contexts where a precise form is required. Here I'd go for being easy to understand.

For me it's also easier to understand when it explicitly states that the Operator handles the synchronization automatically, instead of something magical happening "behind the scenes". If we can give simple and precise information, why not do that?

@banasraf (Collaborator, Author) commented Feb 6, 2020

@szalpal

  1. When you describe the issue of synchronization, it would be great if you'd also show what happens if the user forgets to synchronize. As in: "look, that's the problem and here's the solution".

The problem is that everything can go right by accident, and I would just be showing normal-looking images and saying they are wrong.

  1. What IMHO is lacking here is a brief reminder of the difference between PythonFunction, DLTensorPythonFunction and TorchPythonFunction. I know it's covered in the PythonFunction example, but a short recap here wouldn't hurt, in my opinion.

I'll add a sentence as a reminder.

For an introduction and general information about Python Operators family see the Python Operators notebook.

You can put a link here ;)

We haven't figured out a way to put a link to another notebook inside a notebook, have we?

@banasraf (Collaborator, Author) commented Feb 6, 2020

!build

@dali-automaton (Collaborator)

CI MESSAGE: [1112711]: BUILD STARTED

Signed-off-by: Rafal <Banas.Rafal97@gmail.com>
@banasraf (Collaborator, Author) commented Feb 6, 2020

!build

@dali-automaton (Collaborator)

CI MESSAGE: [1112740]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [1112740]: BUILD PASSED

Signed-off-by: Rafal <Banas.Rafal97@gmail.com>
@banasraf (Collaborator, Author) commented Feb 6, 2020

!build

@dali-automaton (Collaborator)

CI MESSAGE: [1112963]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [1112963]: BUILD PASSED

@banasraf banasraf merged commit 7ff2344 into NVIDIA:master Feb 7, 2020