Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Functional API + improved ExternalSource + improved Pipeline #1598

Merged
merged 18 commits into from
Mar 11, 2020

Conversation

mzient
Copy link
Contributor

@mzient mzient commented Dec 20, 2019

Signed-off-by: Michal Zientkiewicz michalz@nvidia.com

Why we need this PR?

  • simplified usage of operators - avoid double call; use operators directly, without creating "instances" in constructor
  • greatly simplified ExternalSource usage - pass data generating callbacks directly to ExternalSource
  • pipeline outputs can be set with set_outputs
  • added context manager (with block) to manage current pipeline

What happened in this PR?

Fill relevant points, put NA otherwise. Replace anything inside []

  • What solution was applied:
    • Wrapper functions and type-dependent argument grouping
  • Affected modules and functionalities:
    • Python front-end
  • Key points relevant for the review:
    • ops.py and the example notebook
  • Validation and testing:
    • Regression: existing tests
    • New API: TODO
  • Documentation (including examples):
    • TODO

JIRA TASK: DALI-1289

@mzient mzient requested a review from a team December 20, 2019 13:18
@mzient
Copy link
Contributor Author

mzient commented Dec 20, 2019

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1042036]: BUILD STARTED


if nupper > 0:
out += camel[start:i].lower()
out = out.replace("b_box", "bbox").replace("mx_net", "mxnet")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should have a dict of special cases in one place. Here is easy to miss and extend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes and no. BBox is not a special case - rather a pattern. With MXNet - currently we have just one operator, but should we have more, it's already handled.

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1042036]: BUILD PASSED

klecki
klecki previously requested changes Dec 20, 2019
Copy link
Contributor

@klecki klecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As an experimental API, if we even merge this I would first consider putting the whole thing into another module like:
nvidia.dali.experimental.ops or nvidia.dali.experimental.eager.
Ideally in a way, that would later allow removing the .experimental. from the name (and maybe leaving the fallback for a while).
Still, I think it would be better to have them in module other than ops.

If someone want's to use it, he can always alias the imported functionality.

Some documentation or at least PR description could say, that it reexports the Operators as snake_cased functions.

We may need to introduce the EdgeReference as some first class citizen, so we can describe what it represents and how to use it in the docs. It would be also helpful for the error messages.

I would also be interested in how you plan to show those functions in docs? It would not be ideal to duplicate everything. The tensor argument inputs/python values/python lists of values would be especially tricky, as now they can be mixed in the same call.

There was also different idea to abuse the python even more and:

  • pretend that we have the named inputs that can work as keyword arguments - we can do this by counting the positional inputs and checking if someone passed input as keyword.
  • actually create the proper signature. This can be also useful here, we can use python-forge (the even have Dali picture on the project page :P)
    https://pypi.org/project/python-forge/

@JanuszL
Copy link
Contributor

JanuszL commented Dec 20, 2019

We may need to introduce the EdgeReference as some first class citizen, so we can describe what it represents and how to use it in the docs. It would be also helpful for the error messages.

Definitely it is not an eager mode.

@klecki
Copy link
Contributor

klecki commented Dec 20, 2019

Definitely it is not an eager mode.

Yeah, just a name.

@klecki
Copy link
Contributor

klecki commented Dec 20, 2019

Actually it is kinda eager in the regard how you create and parametrize the ops.
My other worry is that having device="gpu" everywhere in the define_graph really clutters the concise graph definition.

@mzient
Copy link
Contributor Author

mzient commented Dec 20, 2019

@klecki It's not "just a name". It's not eager mode nor is it intended to be.
As for the "creation and parametrization" - it's no more eager than the old design.
As for device="something" - I have two suggestions:
Aliases:

resize
resize_cpu
resize_gpu

Submodules/namespaces/whatever:

ops.resize
ops.cpu.resize
ops.gpu.resize

I would definitely like to keep the parametric version so the user can still provide the device as an argument - though the second solution provides a workaround:

device = ops.cpu
device.resize(...)

@klecki
Copy link
Contributor

klecki commented Dec 20, 2019

@klecki It's not "just a name". It's not eager mode nor is it intended to be.
As for the "creation and parametrization" - it's no more eager than the old design.

You all misinterpret my intention and focus on details.
I say that it would be better to put the functions in a separate dedicated module,
like dali.ops.fn, dali.ops.functions, dali.shortcut, dali.functions, dali.whatever.we.name.them.just.make.them.distinct.and.not.pay.attention.to.a.misnamed.example.

As for the device, I think pytorch had some experience and recent changes in that regard, we should look for hints in regard for the usability of the solution.

The device argument is not the only one that will clutter the visuals of defining graph.

Maybe a good thing to do would be to take some of the example pipelines that we have (especially some of the more complex/configurable ones) and to rewrite/mock them with this approach to identify what becomes harder and what easier, what are the weak points that a change in API could fix.

@mzient
Copy link
Contributor Author

mzient commented Jan 7, 2020

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1060592]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1060592]: BUILD FAILED

@mzient
Copy link
Contributor Author

mzient commented Jan 7, 2020

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1060685]: BUILD STARTED

@mzient
Copy link
Contributor Author

mzient commented Jan 7, 2020

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1060699]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1060699]: BUILD FAILED

@@ -22,3 +22,4 @@ Tutorials
audio_decoder.ipynb
hsv_example.ipynb
brightness_contrast_example.ipynb
experimental_new_api.ipynb
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know if we should call it experimental?

"\n",
" crop_pos = ops.Uniform(range = (0.0, 1.0))\n",
" output = ops.crop_mirror_normalize(\n",
" images, crop_pos_x = crop_pos(),\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
" images, crop_pos_x = crop_pos(),\n",
" images,\n"
" crop_pos_x = crop_pos(),\n",

@JanuszL
Copy link
Contributor

JanuszL commented Jan 10, 2020

I say that it would be better to put the functions in a separate dedicated module,
like dali.ops.fn, dali.ops.functions, dali.shortcut, dali.functions, dali.whatever.we.name.them.just.make.them.distinct.and.not.pay.attention.to.a.misnamed.example.

I'm very much agree to use a separate module/namespace for that.

@mzient
Copy link
Contributor Author

mzient commented Feb 24, 2020

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1147244]: BUILD STARTED

@mzient
Copy link
Contributor Author

mzient commented Mar 6, 2020

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1173371]: BUILD STARTED

Added functional API documentation, marked as experimental.

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1173447]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1173447]: BUILD PASSED

@mzient
Copy link
Contributor Author

mzient commented Mar 10, 2020

!build

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1180016]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1180145]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1180145]: BUILD PASSED

t = _EdgeReference(t_name, output_device, self)
t_name = self._name
if num_output > 1:
t_name += "[{}]".format(i)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possible output name collision if the _name is duplicated - either requires more checking or additional mangling.

Copy link
Contributor

@klecki klecki left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstrings for build() and set_outputs() are not rendered properly.

groups.add(group)
self._input_callbacks = list(groups)

def build(self, define_graph = None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This adds yet another way to build a pipeline and at some point we will start causing confusion without clear doc how to do something.

Especially as this argument will now appear in docs, but just not documented.

Comment on lines 730 to 738
"""Set the outputs of the pipeline.

Use of this function is an alternative to overriding `define_graph` in a derived class.

Args
----
`*output_data_nodes` : unpacked list of :class:`DataNode` objects
The outputs of the pipeline
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The effective docstring is all over the place in sphinx.

docs/api.rst Outdated
.. autoclass:: nvidia.dali.backend.TensorGPU
:members:
:undoc-members:
`Functioal API <functional_api.rst>`_ (Experimental!) section describes a psuedo-imperative API
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`Functioal API <functional_api.rst>`_ (Experimental!) section describes a psuedo-imperative API
`Functional API <functional_api.rst>`_ (Experimental!) section describes a psuedo-imperative API

docs/api.rst Outdated

Tensor
------
`Pipeline <pipeline.rst>`_ section describes the :class:`Pipeline` object - the central and most
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
`Pipeline <pipeline.rst>`_ section describes the :class:`Pipeline` object - the central and most
`Pipeline <pipeline.rst>`_ section describes the :class:`~nvidia.dali.pipelinePipeline` class- the central and most

Adding the tilde is supposed to render it as only Pipeline.

@lgtm-com
Copy link
Contributor

lgtm-com bot commented Mar 11, 2020

This pull request introduces 1 alert when merging f0b1c7d167e298affbdab07836666d45e8643dc6 into 107f603 - view on LGTM.com

new alerts:

  • 1 for Unused import

@mzient
Copy link
Contributor Author

mzient commented Mar 11, 2020

!build

Signed-off-by: Michal Zientkiewicz <michalz@nvidia.com>
@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1182951]: BUILD STARTED

@dali-automaton
Copy link
Collaborator

CI MESSAGE: [1182951]: BUILD PASSED

@mzient mzient changed the title Functional API + improved ExternalSource Functional API + improved ExternalSource + improved Pipeline Mar 11, 2020
@mzient mzient merged commit c984e34 into NVIDIA:master Mar 11, 2020
@mzient mzient deleted the SimplifiedAPI_PoC branch March 27, 2020 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants