DALI Pipeline inputs as named arguments to Pipeline.run()
#4712
Conversation
dali/python/nvidia/dali/pipeline.py
Outdated
```diff
@@ -965,9 +965,9 @@ def schedule_run(self):
         Should not be mixed with :meth:`run` in the same pipeline"""
         with self._check_api_type_scope(types.PipelineAPIType.SCHEDULED):
             if self._first_iter and self._exec_pipelined:
-                self._prefetch()
+                self._prefetch(**pipeline_inputs)
```
That's bad.... It would most likely cause very weird behavior of the pipeline and explaining it to the user would be extremely hard. Instead, rather enforce that queue depth is 1 when using pipeline inputs.
Unfortunately, with how the running logic is currently organised, it has to be there. `if self._first_iter and self._exec_pipelined:` means that when we legitimately set `exec_pipelined=True` and `prefetch_queue_depth=1`, the `_prefetch` function will be executed. And we want to have `pipeline_inputs` in this scenario.

I've added the enforcement and an extensive explanation in the `run` function. Hopefully this will solve the issue.
(To put our call in writing:) We don't have to propagate it down to `_prefetch` and `_run_once` - we could just call `feed_input` in `schedule_run`, or even restrict the new API to `run`, so we avoid yet another way to feed the (prefetching) pipeline.
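To make the alternative concrete, here is a hypothetical plain-Python sketch (a toy stand-in, not DALI code) of feeding named inputs via `feed_input` inside `schedule_run` instead of threading `**pipeline_inputs` down through `_prefetch`. The attribute and method names mirror the snippets in this thread; the class itself is illustrative only.

```python
class SchedulingSketch:
    """Toy model: schedule_run() feeds named inputs, then prefetches as usual."""

    def __init__(self, exec_pipelined=True):
        self._first_iter = True
        self._exec_pipelined = exec_pipelined
        self.fed = {}  # records what was fed, for illustration

    def feed_input(self, name, data):
        # In real DALI this hands data to the named external_source operator.
        self.fed[name] = data

    def _prefetch(self):
        # No **kwargs needed here anymore: inputs were already fed above.
        self._first_iter = False

    def schedule_run(self, **pipeline_inputs):
        for name, data in pipeline_inputs.items():
            self.feed_input(name, data)
        if self._first_iter and self._exec_pipelined:
            self._prefetch()
```

The upside of this shape is that `_prefetch` and `_run_once` keep their existing signatures; the downside, as noted above, is yet another path by which data reaches a prefetching pipeline.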
dali/python/nvidia/dali/pipeline.py
Outdated
```python
        A list of `TensorList` objects for respective pipeline outputs
        """
        if len(pipeline_inputs) > 0 and (self.prefetch_queue_depth > 1 and self.exec_pipelined):
```
Note: I wonder what the value of `prefetch_queue_depth` is in the case of separated queues - and, specifically, what `prefetch_queue_depth > 1` would evaluate to. At the least, I'd try specifying separated queues and check whether the error below pops up.
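A quick illustration of the concern. DALI's `Pipeline` constructor accepts a dict such as `{"cpu_size": 2, "gpu_size": 1}` for `prefetch_queue_depth` when queues are separated; whether the attribute checked above still holds that dict at this point is an assumption, but if it does, a bare `> 1` comparison does not evaluate to a bool at all - it raises `TypeError` in Python 3:

```python
# Hypothetical value the attribute might hold with separated queues:
prefetch_queue_depth = {"cpu_size": 2, "gpu_size": 1}

try:
    exceeded = prefetch_queue_depth > 1
except TypeError:
    # dict and int have no defined ordering in Python 3
    exceeded = None
```

So rather than the intended error message, the user could hit an unrelated `TypeError` from the guard itself.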
Category:
New feature (non-breaking change which adds functionality)
Description:
Currently, providing inputs to a DALI Pipeline is highly cumbersome. We have a couple of ways:

- the `external_source` operator and its argument `source`,
- the `external_source` operator and the `feed_input` function.

All of these are quite problematic. This PR attempts to tackle this by allowing yet another way of providing inputs.
My proposal is to augment the `pipe.run()` function with `**kwargs` in Python. Every such argument, provided that it has a corresponding input operator in the DALI pipeline, can accept any data assigned to the name of that input operator.

The change is added to `.run()` only, since I figured it's always possible to add it to `schedule_run` too later, but removing it would be problematic due to backwards-compatibility issues.

Additional information:
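To make the proposed `run(**kwargs)` semantics concrete, here is a minimal hypothetical sketch in plain Python (a toy stand-in, not DALI code): named arguments are routed to `feed_input`, and the queue-depth restriction discussed in the review is enforced. The input name `my_input` and all class internals are illustrative.

```python
class PipelineSketch:
    """Toy model of the proposed API: run() kwargs are fed by input name."""

    def __init__(self, input_names, prefetch_queue_depth=1, exec_pipelined=True):
        self.prefetch_queue_depth = prefetch_queue_depth
        self.exec_pipelined = exec_pipelined
        self._inputs = {name: None for name in input_names}

    def feed_input(self, name, data):
        if name not in self._inputs:
            raise KeyError(f"no input operator named {name!r} in this pipeline")
        self._inputs[name] = data

    def run(self, **pipeline_inputs):
        # Mirror of the guard discussed in review: named inputs are only
        # allowed when the pipeline does not prefetch ahead.
        if len(pipeline_inputs) > 0 and (
                self.prefetch_queue_depth > 1 and self.exec_pipelined):
            raise RuntimeError("pipeline inputs require prefetch_queue_depth == 1")
        for name, data in pipeline_inputs.items():
            self.feed_input(name, data)
        return dict(self._inputs)  # stand-in for the pipeline outputs
```

Usage follows the shape described above: `PipelineSketch(["my_input"]).run(my_input=data)` delivers `data` to the input operator named `my_input`, while a pipeline built with a deeper prefetch queue rejects named inputs outright.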
Affected modules and functionalities:
Key points relevant for the review:
Tests:
Checklist
Documentation
DALI team only
Requirements
REQ IDs: N/A
JIRA TASK: N/A