
Gpu python operator notebook #1715

Merged · 5 commits · Feb 7, 2020

Conversation

@banasraf (Collaborator) commented Feb 5, 2020

Why we need this PR?

  • It adds an example for GPU Python Operators

What happened in this PR?

  • What solution was applied:
    Adding a notebook
  • Affected modules and functionalities:
    Documentation
  • Key points relevant for the review:
    The Notebook
  • Validation and testing:
    Notebook added to QA
  • Documentation (including examples):
    NA

JIRA TASK: [DALI-1260]

Signed-off-by: Rafal <Banas.Rafal97@gmail.com>
Signed-off-by: Rafal <Banas.Rafal97@gmail.com>
Signed-off-by: Rafal <Banas.Rafal97@gmail.com>
@banasraf (Collaborator, Author) commented Feb 5, 2020

!build

@dali-automaton (Collaborator)

CI MESSAGE: [1110232]: BUILD STARTED

@szalpal (Member) commented Feb 5, 2020

General suggestions:

  1. When you describe the issue of synchronization, it would be great if you'd also show what happens if the user forgets to synchronize. As in: "look, that's the problem and here's the solution".
  2. What IMHO is lacking here is a brief reminder of the difference between PythonFunction, DLTensorPythonFunction and TorchPythonFunction. I know it's covered in the PythonFunction example, but a short recap here wouldn't hurt, in my opinion.

Details:

For an introduction and general information about Python Operators family see the Python Operators notebook.

You can put a link here ;)

Below we present a simple kernel interlaying channels

interleaving?

More informationa about

Typo here
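For context on the kernel being discussed: the notebook's actual code isn't shown in this thread, but a channel-interleaving kernel of the kind mentioned above might look like the following NumPy sketch (a hypothetical CPU stand-in; the function name and the "cycle the channels" interpretation are assumptions, not the notebook's code):

```python
import numpy as np

def interleave_channels(img):
    """Hypothetical stand-in for the notebook's kernel: cycle the
    channels of an HWC image so output channel i takes input
    channel (i + 1) % C."""
    return np.roll(img, shift=-1, axis=2)

# Tiny 1x1 RGB image: channels (R=10, G=20, B=30) become (20, 30, 10).
img = np.array([[[10, 20, 30]]], dtype=np.uint8)
out = interleave_channels(img)
```

In the notebook itself the same per-pixel logic would run on the GPU (e.g. via CuPy) inside the Python operator.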

@dali-automaton (Collaborator)

CI MESSAGE: [1110232]: BUILD PASSED

@klecki (Contributor) commented Feb 5, 2020

When using PythonFunction or TorchPythonFunction we do not have to bother about synchronizing our GPU function with the rest of DALI pipeline, becuase it is handled behind the scenes.

IMO it's too informal; I would go with something along the lines of:
For PythonFunction and TorchPythonFunction, the synchronization of the user's GPU code (the provided function) with the rest of the DALI pipeline is handled automatically by the Operator.
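To illustrate the point being made: with PythonFunction the user writes a plain array-in, array-out function and never touches synchronization. A minimal sketch of such a function (shown with NumPy as for the CPU case; the function name and operation are hypothetical, not from the notebook):

```python
import numpy as np

def flip_and_darken(image):
    # An ordinary array-in, array-out function, of the kind passed
    # to PythonFunction. Note there is no synchronization code here;
    # per the discussion above, the Operator handles that itself.
    flipped = image[:, ::-1, :]               # horizontal flip
    return (flipped * 0.5).astype(image.dtype)  # halve intensity

img = np.full((2, 2, 3), 100, dtype=np.uint8)
out = flip_and_darken(img)
```

The contrast drawn in this thread is that DLTensorPythonFunction does not give this guarantee, which is why the notebook has to discuss stream synchronization at all.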

@klecki (Contributor) commented Feb 5, 2020

As for the ending:

To properly synchronize device code in DLTensorPythonFunction we have to ensure that:
* all the preceding GPU work is done before the start of provided function,
* the work we schedule inside finishes before we return the results.
The first condition is warranted by the synchronize_stream=True flag (ON by default). User is responsible for providing the second part. In the example above it is achieved by the line cupy.cuda.get_current_stream().synchronize().

How about:

* all the preceding **DALI** GPU work is done before the start of provided function - this can be handled by the Operator using `synchronize_stream=True` flag (ON by default),
* the work we schedule inside finishes before we return the results - we must use CuPy's cupy.cuda.get_current_stream().synchronize() to synchronize at the end of the function.

But I'm not sure if it's any better.
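The two conditions under discussion can be demonstrated without a GPU by a plain-Python analogy (this is an illustration of the synchronization contract, not DALI or CuPy code): a thread pool stands in for the CUDA stream, submitting a task stands in for launching a kernel, and waiting on the future plays the role of cupy.cuda.get_current_stream().synchronize() at the end of the function.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def user_function(data, pool):
    """Stand-in for the user's GPU function: work is scheduled
    asynchronously (like a CUDA kernel launch), so the output
    buffer must not be returned before that work completes."""
    out = [0] * len(data)

    def kernel():
        time.sleep(0.05)              # pretend the "kernel" takes a while
        for i, x in enumerate(data):
            out[i] = x * 2

    future = pool.submit(kernel)      # asynchronous "launch"
    # Forgetting this wait would return `out` while it may still be
    # all zeros -- the failure mode of a missing stream synchronize.
    future.result()                   # analogue of get_current_stream().synchronize()
    return out

with ThreadPoolExecutor() as pool:
    result = user_function([1, 2, 3], pool)
```

The first condition (preceding work done before the function starts) has no analogue here; in DALI it is what the synchronize_stream=True flag covers.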

@JanuszL (Contributor) commented Feb 5, 2020

1. When you describe the issue of synchronization, it would be great if you'd also show what happens if the user forgets to synchronize. As in: "look, that's the problem and here's the solution".

Anything can happen, or nothing. The question is whether we can assume that someone using the GPU is familiar with multithreading challenges and the concept of synchronization. I don't know if we should describe the details of how synchronization works in CUDA and why we need it here, but we can point to some reference and mention how the DLPack variant of the Python function differs from the plain one.

I agree with the rest of the comments.

@@ -97,7 +97,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Running the pipeline and visualizing results\n",
"## Running the pipeline and visualizing results\n",
Review comment (Contributor):

Nitpick

Suggested change
"## Running the pipeline and visualizing results\n",
"## Running the pipeline and visualizing the results\n",

Reply (Collaborator, Author):

done

@szalpal (Member) commented Feb 6, 2020

When using PythonFunction or TorchPythonFunction we do not have to bother about synchronizing our GPU function with the rest of DALI pipeline, becuase it is handled behind the scenes.

IMO it's too informal; I would go with something along the lines of:
For PythonFunction and TorchPythonFunction, the synchronization of the user's GPU code (the provided function) with the rest of the DALI pipeline is handled automatically by the Operator.

Informal == easy to understand. In an example like this you don't need formal language; that is for contexts where a precise form is required. Here I'd go for being easy to understand.

@klecki (Contributor) commented Feb 6, 2020

When using PythonFunction or TorchPythonFunction we do not have to bother about synchronizing our GPU function with the rest of DALI pipeline, becuase it is handled behind the scenes.
IMO it's too informal; I would go with something along the lines of:
For PythonFunction and TorchPythonFunction, the synchronization of the user's GPU code (the provided function) with the rest of the DALI pipeline is handled automatically by the Operator.

Informal == easy to understand. In an example like this you don't need formal language; that is for contexts where a precise form is required. Here I'd go for being easy to understand.

For me it's also easier to understand when it explicitly states that the Operator handles the synchronization automatically, instead of something magical happening "behind the scenes". If we can give simple and precise information, why not do that?

@banasraf (Collaborator, Author) commented Feb 6, 2020

@szalpal

  1. When you describe the issue of synchronization, it would be great if you'd also show what happens if the user forgets to synchronize. As in: "look, that's the problem and here's the solution".

The problem is that everything can go right by accident, and I would just be showing normal-looking images and saying they are wrong.

  1. What IMHO is lacking here is a brief reminder of the difference between PythonFunction, DLTensorPythonFunction and TorchPythonFunction. I know it's covered in the PythonFunction example, but a short recap here wouldn't hurt, in my opinion.

I'll add a sentence as a reminder.

For an introduction and general information about Python Operators family see the Python Operators notebook.

You can put a link here ;)

We haven't figured out a way to put a link to another notebook inside a notebook, have we?

@banasraf (Collaborator, Author) commented Feb 6, 2020

!build

@dali-automaton (Collaborator)

CI MESSAGE: [1112711]: BUILD STARTED

Signed-off-by: Rafal <Banas.Rafal97@gmail.com>
@banasraf (Collaborator, Author) commented Feb 6, 2020

!build

@dali-automaton (Collaborator)

CI MESSAGE: [1112740]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [1112740]: BUILD PASSED

Signed-off-by: Rafal <Banas.Rafal97@gmail.com>
@banasraf (Collaborator, Author) commented Feb 6, 2020

!build

@dali-automaton (Collaborator)

CI MESSAGE: [1112963]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [1112963]: BUILD PASSED

@banasraf banasraf merged commit 7ff2344 into NVIDIA:master Feb 7, 2020