Add Gluon autolog capabilities for simple fitting. #1973

Merged
merged 25 commits into from
Nov 20, 2019

Conversation

cosmincatalin
Contributor

@cosmincatalin cosmincatalin commented Oct 22, 2019

MXNet's Gluon API features a simple and direct `fit` method on an estimator, similar to Keras.
This PR adds the autologger for it.

What changes are proposed in this pull request?

Basic functionality for tracking metrics and parameters of Gluon HybridBlock-based models. Autologging is included.
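
As a rough illustration of the autologging lifecycle (the commits in this PR log estimator parameters at the beginning of training and the network at the end), here is a minimal sketch using only the standard library. `InMemoryTracker`, `AutologHandler`, and the toy `fit` loop are hypothetical stand-ins, not MLflow or MXNet APIs, and the parameter and loss values are made up.

```python
# Illustrative stand-ins only: none of these names are MLflow or MXNet APIs.
class InMemoryTracker:
    """Hypothetical tracking client that records calls in order."""

    def __init__(self):
        self.events = []

    def log_param(self, key, value):
        self.events.append(("param", key, value))

    def log_metric(self, key, value, step):
        self.events.append(("metric", key, value, step))

    def log_model(self, name):
        self.events.append(("model", name))


class AutologHandler:
    """Hypothetical callback: params first, metrics per epoch, model last."""

    def __init__(self, tracker):
        self.tracker = tracker

    def train_begin(self, params):
        for key, value in sorted(params.items()):
            self.tracker.log_param(key, value)

    def epoch_end(self, epoch, metrics):
        for key, value in sorted(metrics.items()):
            self.tracker.log_metric(key, value, step=epoch)

    def train_end(self, model_name):
        self.tracker.log_model(model_name)


def fit(handler, epochs):
    """Toy training loop that fires the handler at each stage."""
    handler.train_begin({"epochs": epochs, "optimizer": "sgd"})
    for epoch in range(epochs):
        # Made-up loss values, only there to give the handler something to log.
        handler.epoch_end(epoch, {"train_loss": 1.0 / (epoch + 1)})
    handler.train_end("net")


tracker = InMemoryTracker()
fit(AutologHandler(tracker), epochs=2)
```

After the call, `tracker.events` holds the two parameters, then one `train_loss` metric per epoch, then the model entry, which is the ordering the later commits settle on.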

How is this patch tested?

Added a Gluon-specific test as part of the pull request.

Release Notes

Is this a user-facing change?

Autolog, save, and load MXNet Gluon HybridBlock-based models.

What component(s) does this PR affect?

  • UI
  • CLI
  • API
  • REST-API
  • Examples
  • Docs
  • Tracking
  • Projects
  • Artifacts
  • Models
  • Scoring
  • Serving
  • R
  • Java
  • Python

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Contributor

@aarondav aarondav left a comment

Overall, looks pretty good to me! I'd like to see if @dbczumar and @smurching could also take a look, as they're familiar with the models and autologging components specifically.

@cosmincatalin
Contributor Author

All comments have been resolved. I can still see that Travis fails, but it does not seem to be related to what I have been doing. Can anyone help with that?

Collaborator

@dbczumar dbczumar left a comment

@cosmincatalin I've restarted the errant test job. I've left a couple comments regarding the need for a conda environment for use with the pyfunc flavor. Is it possible to add a unit test for loading the Gluon model as a python function (pyfunc) and evaluating it on example data?

@cosmincatalin
Contributor Author

One of the tests in Travis still breaks unfortunately. I'm not sure what the problem is with it.

@cosmincatalin
Contributor Author

I've managed to fix everything. 😄

@dbczumar
Collaborator

@cosmincatalin I pushed a fix to your branch to pass the conda environment to the pyfunc configuration when saving the model. Additionally, I added an MXNet Gluon entry to the MLflow Models docs. Please let me know if this looks okay.

Before merging, I'd like to add tests to ensure that conda environments are persisted correctly and added to the pyfunc configuration. I'll aim to do this within the next 24 hours, but please feel free to do so in the meantime!
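
To make the conda-environment point concrete, here is a rough sketch of the property such tests would check: the environment is stored next to the model, and the `python_function` flavor entry points at it. `build_model_files` is a hypothetical helper, not MLflow code; the `loader_module` and `env` keys are meant to mirror what an MLmodel file records for the pyfunc flavor, and the environment contents (including the version pins) are placeholders rather than values from this PR.

```python
def build_model_files(conda_env):
    """Hypothetical helper: return the files a saved model directory would hold.

    The conda environment is stored under its own file name, and the
    python_function flavor refers to that file through its "env" key.
    """
    env_file = "conda.yaml"
    flavors = {
        "python_function": {
            "loader_module": "mlflow.gluon",  # assumed loader module name
            "env": env_file,
        }
    }
    return {env_file: conda_env, "MLmodel": {"flavors": flavors}}


# Placeholder environment; the pins are illustrative only.
conda_env = {
    "name": "mlflow-env",
    "channels": ["defaults"],
    "dependencies": ["python=3.6", "pip", {"pip": ["mxnet==1.5.0"]}],
}

files = build_model_files(conda_env)
pyfunc_conf = files["MLmodel"]["flavors"]["python_function"]
```

The two checks that matter are that `pyfunc_conf["env"]` names a file that actually exists in `files`, and that the stored environment equals the one passed in.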

harupy and others added 4 commits November 18, 2019 14:19
* Enable comma-dangle

* Apply lint:fix
* Update ShowArtifactImageView to handle gif

* Remove imageContainer

* Refactor and add comments

* Add alt text

* Add test
@cosmincatalin
Contributor Author

@dbczumar Everything in the documentation looks fine to me. Please go ahead and add the tests.

Collaborator

@dbczumar dbczumar left a comment

@cosmincatalin I pushed the remaining tests to the branch, as well as tests for scoring. I also updated the pyfunc representation of Gluon models to conform to the pyfunc prediction interface (https://mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#inference-api). Please let me know if you have any questions about that interface. With these changes, this PR looks great and should be ready to merge once tests pass. Thank you for your contribution!

@cosmincatalin
Contributor Author

Everything looks good on my side. I can't wait to see it released. I'll prepare an example scenario on my blog as well.

@dbczumar dbczumar merged commit 0ce1048 into mlflow:master Nov 20, 2019
@dbczumar
Collaborator

@cosmincatalin Awesome! Just merged. Looking forward to your blog post!

@juntai-zheng juntai-zheng added the rn/feature (Mention under Features in Changelogs) and rn/documentation (Mention under Documentation Changes in Changelogs) labels and removed the rn/documentation label on Dec 19, 2019
@cosmincatalin
Contributor Author

@dbczumar Some half a year later I have the blog post if you're interested in advertising it on some channels: https://cosminsanda.com/posts/experiment-tracking-with-mlflow-inside-amazon-sagemaker/

avflor pushed a commit to avflor/mlflow that referenced this pull request Aug 22, 2020
* Add Gluon autolog capabilities for simple fitting.

MXNet's Gluon API features a simple and direct fit method on an estimator. This is similar to Keras.
This is the autologger for it.

* Remove Python type hints to support legacy Python 2.
Add MXNet as a dependency to small requirements instead of large.
Make the computing context an argument to load model.

* Fix a lint error with a line being too long

* Add validation metric test

* Require MXNet for both large and small requirements (not sure this is correct)

* Add a missing documentation reference.

* Log estimator parameters at the beginning of training as opposed to at the end.

* Log the network at the end, after training, as it should

* A lint error. By now, the commits are sure to be squashed.

* Add Gluon pyfunc functionality

* Tests added, and conda env

* Run the Gluon tests just like the other DL frameworks.

* Fix conda envs

* Small fix

* Enable the ESLint comma-dangle rule (mlflow#2069)

* Enable comma-dangle

* Apply lint:fix

* Fix artifact image viewer to support animated GIF (mlflow#2070)

* Update ShowArtifactImageView to handle gif

* Remove imageContainer

* Refactor and add comments

* Add alt text

* Add test

* Show current tag value on edit (mlflow#2105)

* Add and fix tests

* Lint
Labels
rn/feature Mention under Features in Changelogs.

6 participants