feat: add documentation for mlflow autologging on website #1508

Merged
merged 4 commits into from
May 13, 2022

@serena-ruan serena-ruan commented May 10, 2022

Summary

Add documentation for MLflow autologging to the website.

Tests

Rendered the website and checked the output.

Dependency changes

None.

AB#1785118

@serena-ruan
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@codecov-commenter

codecov-commenter commented May 10, 2022

Codecov Report

Merging #1508 (3a9a3d3) into master (9657a53) will decrease coverage by 1.54%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1508      +/-   ##
==========================================
- Coverage   84.30%   82.75%   -1.55%     
==========================================
  Files         296      296              
  Lines       14906    14906              
  Branches      717      717              
==========================================
- Hits        12566    12336     -230     
- Misses       2340     2570     +230     
| Impacted Files | Coverage Δ |
|---|---|
| ...soft/azure/synapse/ml/cognitive/AudioStreams.scala | 0.00% <0.00%> (-87.88%) ⬇️ |
| ...t/azure/synapse/ml/cognitive/SpeechToTextSDK.scala | 18.43% <0.00%> (-72.16%) ⬇️ |
| ...crosoft/azure/synapse/ml/cognitive/SpeechAPI.scala | 0.00% <0.00%> (-70.00%) ⬇️ |
| ...crosoft/azure/synapse/ml/io/http/HTTPClients.scala | 66.17% <0.00%> (-8.83%) ⬇️ |
| ...ft/azure/synapse/ml/core/env/StreamUtilities.scala | 77.77% <0.00%> (-7.41%) ⬇️ |
| ...se/ml/cognitive/MultivariateAnomalyDetection.scala | 87.03% <0.00%> (-0.75%) ⬇️ |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 9657a53...3a9a3d3.

@serena-ruan
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

Collaborator

@mhamilton723 mhamilton723 left a comment

Magnifique, one small Q

2. Upload your customized `log_model_allowlist.txt` file to dbfs by clicking File/Upload Data button on Databricks UI.
3. Set Spark configuration:
```
spark.conf.set("spark.mlflow.pysparkml.autolog.logModelAllowlistFile", "/dbfs/FileStore/PATH_TO_YOUR_log_model_allowlist.txt")
```
Collaborator

does this accept URLs? Perhaps we can host a reasonable default in our blob!

Collaborator

We might also want to mention this can be set in cluster configs too

Contributor Author

Thanks for the idea, it sounds very reasonable! I'll raise a PR against MLflow to make URLs work. I also just tested the `spark.conf.set` call above, and it doesn't take effect because the cluster has already started. I'll change the docs to add the Spark config inside the cluster configuration instead.
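
A minimal sketch of the cluster-level alternative described in this reply, assuming the cluster's Spark config field takes one space-separated key/value pair per line. The helper name `cluster_conf_line` is hypothetical, and the path is the placeholder from the snippet under review.

```python
# Hypothetical helper: build the line to paste into the cluster's Spark config,
# so the setting is already present when the Spark session starts.
ALLOWLIST_CONF_KEY = "spark.mlflow.pysparkml.autolog.logModelAllowlistFile"


def cluster_conf_line(allowlist_path: str) -> str:
    """Return a 'key value' line for a cluster-level Spark configuration."""
    if not allowlist_path or any(ch.isspace() for ch in allowlist_path):
        # Key and value are separated by whitespace, so the value cannot contain any.
        raise ValueError("allowlist path must be non-empty and contain no whitespace")
    return f"{ALLOWLIST_CONF_KEY} {allowlist_path}"


print(cluster_conf_line("/dbfs/FileStore/PATH_TO_YOUR_log_model_allowlist.txt"))
```

Keeping the key in one constant avoids typos when the same setting is written into several cluster definitions.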

Contributor Author

## Configuration process in Databricks as an example

1. Install MLflow via `%pip install mlflow`
2. Upload your customized `log_model_allowlist.txt` file to dbfs by clicking File/Upload Data button on Databricks UI.
Collaborator

if we can pass a URL in this param, perhaps we can make this platform agnostic or at least give advice for both Synapse and Databricks
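
To make the suggestion concrete, here is a rough sketch of what accepting either a local path or a URL could look like. It is not MLflow's implementation; the thread only indicates that URL support still needed a separate PR. The function names are hypothetical, and the file format (one class name per line, blank lines and `#` comments ignored) is an assumption.

```python
from urllib.parse import urlparse
from urllib.request import urlopen


def parse_allowlist(text: str) -> set:
    """Collect one class name per line; skip blanks and '#' comments (assumed format)."""
    names = set()
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            names.add(stripped)
    return names


def read_allowlist(location: str) -> set:
    """Hypothetical loader: read the allowlist from a URL or from a local file path."""
    scheme = urlparse(location).scheme
    if scheme in ("http", "https", "file"):
        # Remote or file:// location: fetch the bytes and decode them as UTF-8.
        with urlopen(location) as response:
            text = response.read().decode("utf-8")
    else:
        # Anything else is treated as a plain filesystem path.
        with open(location, encoding="utf-8") as handle:
            text = handle.read()
    return parse_allowlist(text)
```

With a loader along these lines, the same setting value could point at a DBFS path on Databricks or at a hosted default file on other platforms.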

Contributor Author

Sure, I'll add an example to step 2 of the "To enable autologging for SynapseML" section.

Collaborator

@mhamilton723 mhamilton723 left a comment

Love this! Left a few copy edits and some minor questions and suggestions. Can't wait to get this on the website, and thanks for building this out, you freakin' PM/Dev combo.

@serena-ruan
Contributor Author

/azp run

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).

@serena-ruan serena-ruan merged commit 4b46cc8 into microsoft:master May 13, 2022
@serena-ruan serena-ruan deleted the serena/addMlflowDoc branch May 13, 2022 03:30