
DALI Pipeline inputs as named arguments to Pipeline.run() #4712

Merged: 11 commits into NVIDIA:main, Mar 20, 2023

Conversation

@szalpal (Member) commented Mar 14, 2023

Category:

New feature (non-breaking change which adds functionality)

Description:

Currently, providing inputs to a DALI Pipeline is quite cumbersome. There are a few ways:

  1. Using Readers,
  2. Using external_source operator and its argument source,
  3. Using an input operator (e.g. external_source) and feed_input function.

All of these are quite problematic. This PR attempts to tackle this by allowing yet another way of providing inputs.

My proposal is to augment the pipe.run() function with **kwargs in Python. Every such keyword argument, provided that an input operator with a matching name exists in the DALI pipeline, feeds its data to that input operator. For example:

import numpy as np
import nvidia.dali.fn as fn
from nvidia.dali import pipeline_def

@pipeline_def
def pipe():
    return fn.external_source(name="my_inp")

p = pipe(...)
p.build()
p.run(my_inp=np.random.rand(...))

The change is added to .run() only, since I figured it's always possible to add it to schedule_run later too, whereas removing it afterwards would be problematic due to backward-compatibility concerns.
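A minimal, DALI-free sketch of the dispatch this proposal implies. Everything here is a hypothetical stand-in (the MiniPipeline class, its internals, and the error message are not the actual DALI implementation); it only illustrates how keyword arguments to run() can be routed to named input operators:

```python
class MiniPipeline:
    """Toy stand-in for a pipeline with named input operators."""

    def __init__(self, input_names):
        self._inputs = {name: None for name in input_names}

    def feed_input(self, name, data):
        # Reject kwargs that do not match any input operator by name.
        if name not in self._inputs:
            raise KeyError(f"no input operator named {name!r}")
        self._inputs[name] = data

    def run(self, **pipeline_inputs):
        # Route every keyword argument to the matching input operator,
        # then "execute" the graph (here: just return the fed data).
        for name, data in pipeline_inputs.items():
            self.feed_input(name, data)
        return [self._inputs[n] for n in self._inputs]


p = MiniPipeline(["my_inp"])
print(p.run(my_inp=[1, 2, 3]))  # [[1, 2, 3]]
```

A kwarg with no matching input operator fails loudly, which is the behavior a user would expect from a typo in the input name.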

Additional information:

Affected modules and functionalities:

Key points relevant for the review:

Tests:

  • Existing tests apply
  • New tests added
    • Python tests
    • GTests
    • Benchmark
    • Other
  • N/A

Checklist

Documentation

  • Existing documentation applies
  • Documentation updated
    • Docstring
    • Doxygen
    • RST
    • Jupyter
    • Other
  • N/A

DALI team only

Requirements

  • Implements new requirements
  • Affects existing requirements
  • N/A

REQ IDs: N/A

JIRA TASK: N/A

Signed-off-by: szalpal <mszolucha@nvidia.com>
@szalpal (Member Author) commented Mar 14, 2023

!build

@dali-automaton (Collaborator)

CI MESSAGE: [7581426]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [7581426]: BUILD FAILED

Signed-off-by: szalpal <mszolucha@nvidia.com>
@szalpal (Member Author) commented Mar 14, 2023

!build

@dali-automaton (Collaborator)

CI MESSAGE: [7583127]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [7583127]: BUILD FAILED

Signed-off-by: szalpal <mszolucha@nvidia.com>
@szalpal (Member Author) commented Mar 15, 2023

!build

szalpal marked this pull request as ready for review March 15, 2023 09:55
@dali-automaton (Collaborator)

CI MESSAGE: [7593105]: BUILD STARTED

Signed-off-by: szalpal <mszolucha@nvidia.com>
Signed-off-by: szalpal <mszolucha@nvidia.com>
@@ -965,9 +965,9 @@ def schedule_run(self):
         Should not be mixed with :meth:`run` in the same pipeline"""
         with self._check_api_type_scope(types.PipelineAPIType.SCHEDULED):
             if self._first_iter and self._exec_pipelined:
-                self._prefetch()
+                self._prefetch(**pipeline_inputs)
Contributor commented:

That's bad... it would most likely cause very weird pipeline behavior, and explaining it to the user would be extremely hard. Instead, enforce that the queue depth is 1 when pipeline inputs are used.

Member Author commented:

Unfortunately, with how the running logic is currently organised, it has to be there. The condition if self._first_iter and self._exec_pipelined: means that even when we legitimately set exec_pipelined=True and prefetch_queue_depth=1, the _prefetch function will be executed, and we want pipeline_inputs to be available in that scenario.

I've added the enforcement and an extensive explanation in the run function. Hopefully this solves the issue.
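For illustration, the enforcement discussed here could look roughly like this. This is a hedged sketch: the function name, parameters, and error message are hypothetical; only the condition mirrors the check visible in the diff hunk quoted later in this conversation:

```python
def check_pipeline_inputs_allowed(num_inputs, prefetch_queue_depth, exec_pipelined):
    # Hypothetical guard: inputs passed to run() only make sense when at
    # most one iteration is in flight, i.e. with a prefetch queue depth of 1.
    # With a deeper queue, the fed data would apply to an unpredictable
    # (already-prefetched) iteration.
    if num_inputs > 0 and prefetch_queue_depth > 1 and exec_pipelined:
        raise RuntimeError(
            "Passing inputs to run() requires prefetch_queue_depth=1; "
            "a deeper prefetch queue makes it ambiguous which iteration "
            "the fed data belongs to.")


check_pipeline_inputs_allowed(0, 2, True)  # OK: no inputs given
check_pipeline_inputs_allowed(1, 1, True)  # OK: queue depth is 1
```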

Contributor commented:

(to put our call in writing):
We don't have to propagate it down to _prefetch and _run_once - we could just call feed_input in schedule_run or even restrict the new API to run, so we avoid yet another way to feed the (prefetching) pipeline.

Signed-off-by: szalpal <mszolucha@nvidia.com>
Signed-off-by: szalpal <mszolucha@nvidia.com>
A list of `TensorList` objects for respective pipeline outputs
"""
if len(pipeline_inputs) > 0 and (self.prefetch_queue_depth > 1 and self.exec_pipelined):
Contributor commented:

Note: I wonder what the value of prefetch_queue_depth is in the case of separated queues, and specifically what prefetch_queue_depth > 1 would evaluate to. At the least, I'd try specifying separated queues and check that the error below pops up.
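To illustrate the concern: with separated queues, DALI's prefetch_queue_depth is not a plain int but is given as a dict (e.g. {"cpu_size": 2, "gpu_size": 2}; the particular values here are just an illustrative setting). In Python 3, an ordered comparison between a dict and an int raises a TypeError rather than evaluating to a bool, so the check as written would fail before it could report a sensible error:

```python
depth = {"cpu_size": 2, "gpu_size": 2}  # illustrative separated-queue setting

try:
    depth > 1  # dict vs. int has no ordering in Python 3
except TypeError as err:
    print("comparison raised:", err)
```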

@dali-automaton (Collaborator)

CI MESSAGE: [7593105]: BUILD FAILED

@szalpal (Member Author) commented Mar 16, 2023

!build

@dali-automaton (Collaborator)

CI MESSAGE: [7604393]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [7604393]: BUILD FAILED

Signed-off-by: szalpal <mszolucha@nvidia.com>
@szalpal (Member Author) commented Mar 16, 2023

!build

@dali-automaton (Collaborator)

CI MESSAGE: [7606050]: BUILD STARTED

Signed-off-by: szalpal <mszolucha@nvidia.com>
@szalpal (Member Author) commented Mar 16, 2023

!build

@dali-automaton (Collaborator)

CI MESSAGE: [7606101]: BUILD STARTED

Signed-off-by: szalpal <mszolucha@nvidia.com>
@szalpal (Member Author) commented Mar 16, 2023

!build

@dali-automaton (Collaborator)

CI MESSAGE: [7606282]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [7606282]: BUILD FAILED

@szalpal (Member Author) commented Mar 16, 2023

!build

@dali-automaton (Collaborator)

CI MESSAGE: [7607431]: BUILD STARTED

@dali-automaton (Collaborator)

CI MESSAGE: [7607431]: BUILD FAILED

@dali-automaton (Collaborator)

CI MESSAGE: [7606050]: BUILD FAILED

@dali-automaton (Collaborator)

CI MESSAGE: [7606101]: BUILD FAILED

@dali-automaton (Collaborator)

CI MESSAGE: [7606101]: BUILD PASSED

szalpal merged commit 6ccbe3e into NVIDIA:main Mar 20, 2023
szalpal deleted the run_arguments branch March 20, 2023 23:30