Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

onnx-systemds implementation #904

Closed
wants to merge 22 commits into from

Conversation

lukas-jkl
Copy link
Contributor

This PR implements a first poc-implementation for an ONNX importer.
It adds support for the following operators:

  • Add
  • Sub
  • MatMul
  • Neg
  • Xor
  • Or
  • And
  • Relu
  • Tanh
  • Sigmoid
  • Softmax
  • Dropout
  • MaxPool
  • Conv
  • If

As well as the logic for nested sub-graphs.

Copy link
Contributor

@j143 j143 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On a cursory check, the changeset looks good. Can the design be documented thoroughly in a .md file.
Feel free to include as much information in the doc as possible with the uses of tables, graphs etc.

I am facing problems with building onnx for now.

@lukas-jkl
Copy link
Contributor Author

On a cursory check, the changeset looks good. Can the design be documented thoroughly in a .md file.
Feel free to include as much information in the doc as possible with the uses of tables, graphs etc.

I am facing problems with building onnx for now.

Thank you for the quick review, I've added a design document which describes the current implementation in detail.
I think the conda install of onnx works best, I also had problems installing with other methods.

@j143
Copy link
Contributor

j143 commented May 3, 2020

Thank you for the design documentation, @lukas-jkl . I would make a deep pass shortly.

@j143
Copy link
Contributor

j143 commented May 12, 2020

Sorry, @lukas-jkl - I may not be able to review PR anytime soon. But, I posted this on the list, so that someone would come forward for review. Thank you.

@mboehm7
Copy link
Contributor

mboehm7 commented May 12, 2020

thanks for the initial look @j143. @Baunsgaard could you please have a look. I'll also try it out later this week.

Copy link
Contributor

@Baunsgaard Baunsgaard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,

Great PR, some changes needed, but absolutely great addition.
I Like your testing and documentation

I would appreciate if you relocate the Onnx_systemds inside our src/main/python/systemds itself. such that when we release a new pip install this addition is included, and we don't need to provide an extra package.

The documentation is written in markdown, depending on the time you have, i would appreciate if you write it up as a rst (or more) instead inside src/main/python/docs. This would make it integrate with the already existing documents.
Furthermore, there is already the document in /docs that contain another onnx-systemds-design.md. It would be great to consolidate these two.

If you want to make it absolutely perfect then you should add an github action inside .github/workflows/ maybe call it onnx.yml, and if you merge the onnx implementation into systemds then extend the python.yml instead.

A simple test you can have instead of an output file (like the ones in tests/output_reference) is to finish your scripts with a sum(res) and then compare that one number with your result. (you can add more aggregates if you don't trust just sum).

Minor extra things Missing:

  • init.py inside the tests folder.

to run onnx-systemds you need [onnx](https://github.com/onnx/onnx)

* [Installation instructions](https://github.com/onnx/onnx#installation)
* The conda install seems to work best
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is unessesary (line 14), since it depends on how the user want to install,
just having the link to install instructions is fine. Maybe add Install instructions Onnx.


* [Installation instructions](https://github.com/onnx/onnx#installation)
* The conda install seems to work best
* the environment variable `SYSTEMDS_ROOT` needs to be set to the root of the repository
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

link to our /bin/README.md for setting up environment

For a more detailed description of this converter refer to the [description of the converter design](docs/onnx-systemds-design.md)



Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many of your documentation files, have many extra newlines before a new section. one empty line is sufficient

@@ -0,0 +1,37 @@
# onnx-systemds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Onnx onnx or ONNX?
It is in general not consistent throughout the PR,

@@ -0,0 +1,37 @@
# onnx-systemds

A tool for importing/exporting [ONNX](https://github.com/onnx/onnx/blob/master/docs/IR.md) graphs into/from SystemDS DML scripts.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consistency, here the link is ONNX, while later it is onnx.

@@ -0,0 +1,37 @@
# onnx-systemds
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add License

""" Class for preparing onnx value structures for writing them to the dml script """
def __init__(self, value_info: onnx.ValueInfoProto, initializer: onnx.TensorProto = None):

supported_types = ["int", "boolean", "double"]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean int here ? or is it a long value?
I'm assuming this here maps to our internal representations of simple long double or boolean?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those should be the types supported by systems (I've now renamed the variable to make this more clear). So I guess it should be "integer"

10: "double", # float16,
11: "double",
12: "int", # uint32
13: "int", # uint64
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

13 is not possible for all values, to map if a long
12 && 7 is not possible if the int is not a long 😕

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way of throwing an not implemented exception if a type is invalid, if so then i would suggest to make some appropriate error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you are correct, I am however unsure how big the "Integer" type of systemds is? I guess it should be int32. I've now changed those translations and an exception is thrown if we do not support the type. I've also added an additional check to enforce that a matrix must only contain double values, as stated in the dml-language-reference.

# TODO: not sure this is the solution for every instance of this problem
# Multiply all shapes right
rows = self.shape[0]
cols = reduce(lambda s0, s1: s0 * s1, self.shape[1:])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the only place where the functools import is used, and the syntax is a little foreign to me.

  1. thing move the import to be an inline import since it is as i can see it an edge case if it is needed.
  2. consider if the import is necessary to execute this.
  3. If functools is a default python lib then completely ignore this comment 😄

return GeneratedScriptPart(imports=[import_render], dml_script=node_render)


def gen_maxpool_call(env: jinja2.environment.Environment, graph: onnx.GraphProto,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gen conv and gen maxpool, contain much of the same code, maybe some of it could be alleviated using some private helper functions?

Copy link
Contributor

@Baunsgaard Baunsgaard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furter comment:

The Onnx files inside test_models, have an model_generate file.
I think it would be nice if those were not included in the PR, but have a note in the readme to generate them first before tests?


try:
abspath_input = os.path.abspath(input_file)
res = subprocess.run([systemds_root_path + "/bin/systemds.sh", abspath_input] + args,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the issue for why it fails now. we changed the name to /bin/systemds without the .sh.

@Baunsgaard
Copy link
Contributor

On a cursory check, the changeset looks good. Can the design be documented thoroughly in a .md file.
Feel free to include as much information in the doc as possible with the uses of tables, graphs etc.
I am facing problems with building onnx for now.

Thank you for the quick review, I've added a design document which describes the current implementation in detail.
I think the conda install of onnx works best, I also had problems installing with other methods.

i installed not using conda, with no problems. the only thing to remember is to install protobuf, but the Onnx guide specifies this as well.

@Baunsgaard
Copy link
Contributor

The thing to discuss in general for this implementation of Onnx in SystemDS is the use of jinja2.
I would argue that it is fine, but it does come with some drawbacks.

  1. Extra dependency 2. New syntax to learn.

The upside is somewhat hopefully simpler translation to DML. This said in the current implementation it would not be so difficult to simplify and construct the strings needed without jinja2, at least to my understanding, but maybe it is useful down the road.

Any opinions?

@lukas-jkl
Copy link
Contributor Author

Thank you for the quick review @Baunsgaard ! I've implemented your suggested changes and have rewritten the documentation in rst files. I am however still a bit unsure about the yml files. I guess it would not be sufficient to simply add onnx and Jinja to the pip dependencies as protobuf would also need to be installed?

Copy link
Contributor

@Baunsgaard Baunsgaard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the quick review @Baunsgaard ! I've implemented your suggested changes and have rewritten the documentation in rst files. I am however still a bit unsure about the yml files. I guess it would not be sufficient to simply add onnx and Jinja to the pip dependencies as protobuf would also need to be installed?

Since our python tests are executed on Ubuntu-latest it should hopefully be as simple as adding a step,

- name: install protobuf
   run: {INSERT INSTALL COMMAND HERE }

@lukas-jkl
Copy link
Contributor Author

Thank you for the quick review @Baunsgaard ! I've implemented your suggested changes and have rewritten the documentation in rst files. I am however still a bit unsure about the yml files. I guess it would not be sufficient to simply add onnx and Jinja to the pip dependencies as protobuf would also need to be installed?

Since our python tests are executed on Ubuntu-latest it should hopefully be as simple as adding a step,

- name: install protobuf
   run: {INSERT INSTALL COMMAND HERE }

Seems to work now!

@Baunsgaard
Copy link
Contributor

Cool many thanks, LGTM. 🥇
(if i had the power i would merge it)

Up for discussion in the future is still #904 (comment)

Only real thing left to refine in my opinion is the Output files for validation, that i think is to much, because the tests should not reflect a specific result, but a certain behavior. But I also acknowledge this is a highly personal thing, and that some of the python tests already in does not follow this principle.

@mboehm7

@mboehm7
Copy link
Contributor

mboehm7 commented May 14, 2020

LGTM - thanks @lukas-jkl for this awesome patch, and @Baunsgaard for the detailed discussions. Overall, this is a great initial importer and completes the AMLS project. During the merge, I only added **/*.out to the exclude list of the rat check (for proper licenses).

Down the the road, I would also prefer to remove these expected output files and rather generate them dynamically (e.g., via pytorch). Similarly, we should clean up the license headers to use a common text layout (beginning/end lines).

Btw, there are additional rat problems in the data slicing implementation (which I fix when merging Svetlana's PR) and the JolEstimate* tests (@Baunsgaard you might want to fix that).

@asfgit asfgit closed this in 0ac0c25 May 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants