Add Gluon autolog capabilities for simple fitting. #1973
Conversation
MXNet's Gluon API features a simple and direct fit method on an estimator, similar to Keras. This is the autologger for it.
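For context, the autologging pattern being described can be sketched in plain Python. This is an illustrative toy, not the actual mlflow.gluon implementation: `log_params`, `log_metrics`, and `ToyEstimator` are hypothetical stand-ins for the MLflow tracking calls and the Gluon estimator.

```python
# Hypothetical sketch of the autologging idea: wrap an estimator's fit()
# so hyperparameters are logged before training and metrics after.
logged = {"params": {}, "metrics": {}}

def log_params(params):
    # Stand-in for MLflow's parameter-logging call.
    logged["params"].update(params)

def log_metrics(metrics):
    # Stand-in for MLflow's metric-logging call.
    logged["metrics"].update(metrics)

class ToyEstimator:
    """Stand-in for a Gluon-style estimator with a fit() method."""
    def __init__(self, learning_rate=0.01):
        self.learning_rate = learning_rate

    def fit(self, epochs):
        # Pretend training produced a final validation accuracy.
        return {"val_accuracy": 0.9}

def autolog_fit(estimator, epochs):
    # Log hyperparameters at the beginning of training...
    log_params({"learning_rate": estimator.learning_rate, "epochs": epochs})
    result = estimator.fit(epochs)
    # ...and the resulting metrics at the end.
    log_metrics(result)
    return result

autolog_fit(ToyEstimator(learning_rate=0.05), epochs=3)
```

The real autologger patches the framework's fit call rather than requiring an explicit wrapper, but the before/after logging shape is the same.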
Overall, looks pretty good to me! I'd like to see if @dbczumar and @smurching could also take a look, as they're familiar with the models and autologging components specifically.
Add MXNet as a dependency to the small requirements instead of the large ones. Make the computing context an argument to load model.
All comments have been resolved. I can still see that Travis fails, but it does not seem to be related to my changes. Can anyone help with that?
@cosmincatalin I've restarted the errant test job. I've left a couple of comments regarding the need for a conda environment for use with the pyfunc flavor. Is it possible to add a unit test for loading the Gluon model as a python function (pyfunc) and evaluating it on example data?
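The shape of the requested unit test can be sketched without the real dependencies. This is a hedged, stdlib-only illustration: `load_model` and `WrappedModel` below are hypothetical stand-ins for loading the pyfunc flavor, not MLflow's actual API.

```python
# Illustrative sketch of the requested test: treat a saved model as an
# opaque "python function", evaluate it on example data, and assert on
# the output.
class WrappedModel:
    def predict(self, rows):
        # A trivial "model": sum each input row.
        return [sum(row) for row in rows]

def load_model(model_uri):
    # Stand-in for deserializing the pyfunc flavor from `model_uri`.
    return WrappedModel()

model = load_model("models:/gluon-example/1")
predictions = model.predict([[1.0, 2.0], [3.0, 4.0]])
```

The point of such a test is that the model is exercised only through the generic predict entry point, independent of the framework that trained it.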
One of the tests in Travis still breaks, unfortunately. I'm not sure what the problem is with it.
I've managed to fix everything. 😄
@cosmincatalin I pushed a fix to your branch to pass the conda environment to the pyfunc configuration when saving the model. Before merging, I'd like to add tests to ensure that conda environments are persisted correctly and added to the pyfunc configuration. I'll aim to do this within the next 24 hours, but please feel free to do so in the meantime!
* Enable comma-dangle
* Apply lint:fix
* Update ShowArtifactImageView to handle gif
* Remove imageContainer
* Refactor and add comments
* Add alt text
* Add test
@dbczumar Everything looks fine to me in the documentation. Please go ahead and add the tests.
@cosmincatalin I pushed the remaining tests to the branch, as well as tests for scoring. I also updated the pyfunc representation of Gluon models to conform to the pyfunc prediction interface (https://mlflow.org/docs/latest/python_api/mlflow.pyfunc.html#inference-api). Please let me know if you have any questions about that interface. With these changes, this PR looks great and should be ready to merge once tests pass. Thank you for your contribution!
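The pyfunc prediction interface essentially requires a wrapper exposing `predict(model_input)` that returns one prediction per input row. A hedged, dependency-free sketch of that shape follows; `GluonModelWrapper` is a hypothetical class only imitating the idea, and plain lists of rows stand in for the pandas DataFrame the real interface accepts.

```python
# Hedged sketch of conforming to a pyfunc-style prediction interface:
# the wrapper hides the framework-specific network behind predict().
class GluonModelWrapper:
    def __init__(self, net):
        # `net` would be the trained Gluon network in the real flavor.
        self.net = net

    def predict(self, model_input):
        # Return one prediction per input row. The real interface takes
        # a pandas DataFrame; lists keep this sketch dependency-free.
        return [self.net(row) for row in model_input]

# A stand-in "network" that averages its inputs.
wrapper = GluonModelWrapper(net=lambda row: sum(row) / len(row))
out = wrapper.predict([[2.0, 4.0], [6.0, 10.0]])
```

Keeping the wrapper to this contract is what lets generic serving tools score the model without knowing it is a Gluon network underneath.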
Everything looks good on my side. I can't wait to see it released. I'll prepare an example scenario on my blog as well.
@cosmincatalin Awesome! Just merged. Looking forward to your blog post!
@dbczumar About half a year later, I have the blog post if you're interested in advertising it on some channels: https://cosminsanda.com/posts/experiment-tracking-with-mlflow-inside-amazon-sagemaker/
* Add Gluon autolog capabilities for simple fitting. MXNet's Gluon API features a simple and direct fit method on an estimator. This is similar to Keras. This is the autologger for it.
* Remove Python type hints to support legacy Python 2. Add MXNet as a dependency to small requirements instead of large. Make the computing context an argument to load model.
* Fix a lint error with a line being too long
* Add validation metric test
* Require MXNet both for large and small requirements (not sure this is correct)
* Add a missing documentation reference.
* Log estimator parameters at the beginning of training as opposed to at the end.
* Log the network at the end, after training, as it should
* A lint error. By now, the commits are sure to be squashed.
* Add Gluon pyfunc functionality
* Tests added, and conda env
* Run the Gluon tests just like the other DL frameworks.
* Fix conda envs
* Small fix
* Enable the ESLint comma-dangle rule (mlflow#2069)
* Enable comma-dangle
* Apply lint:fix
* Fix artifact image viewer to support animated GIF (mlflow#2070)
* Update ShowArtifactImageView to handle gif
* Remove imageContainer
* Refactor and add comments
* Add alt text
* Add test
* Show current tag value on edit (mlflow#2105)
* Add and fix tests
* Lint
MXNet's Gluon API features a simple and direct fit method on an estimator. This is similar to Keras.
This is the autologger for it.
What changes are proposed in this pull request?
Basic functionality for tracking metrics and parameters of Gluon HybridBlock-based models. Autologging is included.
How is this patch tested?
Added a gluon specific test as part of the pull request.
Release Notes
Is this a user-facing change?
Autolog, save, and load MXNet Gluon HybridBlock-based models.
What component(s) does this PR affect?
How should the PR be classified in the release notes? Choose one:
- rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
- rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
- rn/feature - A new user-facing feature worth mentioning in the release notes
- rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
- rn/documentation - A user-facing documentation change worth mentioning in the release notes