Integrating Mlflow. #116

AbhiramE · 2019-06-21T00:29:15Z

Description

This PR integrates MLflow for logging metrics. The change lets users log metrics such as validation loss locally and view them in the MLflow tracking UI (launched with the mlflow ui command).

The change also supports azureml-mlflow. The gensen deep dive notebook now uses this package so that metrics logged through MLflow can be tracked in the Azure portal.

This change helps us move away from using AzureML directly in the utils. Because logging goes through MLflow, users can opt in to or out of AzureML.
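
A minimal sketch of the kind of per-epoch metric logging described above. It is not the code from this PR: the function name `log_validation_losses`, the metric key `validation_loss`, the `InMemoryLogger` stand-in, and the loss values are all assumptions for illustration. The logging callable is passed in so the sketch runs without MLflow installed; `mlflow.log_metric(key, value, step=...)` has a compatible signature.

```python
from typing import Callable, List, Sequence, Tuple


def log_validation_losses(
    log_metric: Callable[..., None], losses: Sequence[float]
) -> None:
    """Log one validation-loss value per epoch through `log_metric`."""
    for epoch, loss in enumerate(losses):
        # `step` lets a tracking UI plot the metric against the epoch.
        log_metric("validation_loss", loss, step=epoch)


class InMemoryLogger:
    """Hypothetical stand-in for a tracking backend; records calls in a list."""

    def __init__(self) -> None:
        self.records: List[Tuple[str, float, int]] = []

    def __call__(self, key: str, value: float, step: int = 0) -> None:
        self.records.append((key, value, step))


recorder = InMemoryLogger()
log_validation_losses(recorder, [0.92, 0.71, 0.64])

# With MLflow installed, the same function can log to a local run:
#   import mlflow
#   with mlflow.start_run():
#       log_validation_losses(mlflow.log_metric, [0.92, 0.71, 0.64])
```

Runs logged this way go to a local `mlruns` directory by default and can be browsed with `mlflow ui`. To send them to AzureML instead, the tracking URI would be pointed at the workspace first, for example `mlflow.set_tracking_uri(ws.get_mlflow_tracking_uri())`, assuming `ws` is an `azureml.core.Workspace` and azureml-mlflow is installed.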

Related Issues

Checklist:

[ x ] My code follows the code style of this project, as detailed in our contribution guidelines.
[ ] I have added tests.
[ x ] I have updated the documentation accordingly.


review-notebook-app · 2019-06-21T00:29:21Z

Check out this pull request on ReviewNB: https://app.reviewnb.com/microsoft/nlp/pull/116

You'll be able to see visual diffs and write comments on notebook cells. Powered by ReviewNB.

eedeleon

I am still looking over the notebooks, but these are the more important changes:

.gitignore

scenarios/sentence_similarity/gensen_aml_deep_dive.ipynb

scenarios/sentence_similarity/gensen_train.py

tools/generate_conda_file.py

eedeleon · 2019-06-21T17:53:42Z

tests/unit/test_word_embeddings.py

@@ -58,7 +59,8 @@ def test_load_pretrained_vectors_glove():

 def test_load_pretrained_vectors_fasttext():
    dir_path = "temp_data/"
-    file_path = os.path.join(os.path.join(dir_path, "fastText"), "wiki.simple.bin")
+    file_path = os.path.join(os.path.join(dir_path, "fastText"),
+                             "wiki.simple.bin")


Is your editor changing these automatically? It seems to be set to 80, which is a bit too short; you could raise it to 100 or 120.

The repo is configured for 79 (PEP 8); black defaults to 88.
I'm for 120 :) but that seems to be far from any standard.

Yeah. I have set my width to 80. I can move it to 100.

In the gensen train file, with its multiple levels of indentation, 80 feels far too narrow. The code almost looks vertical. 😄

Yeah, MLflow uses 100 and our SDK uses 119. The 80 limit is due to a limitation in terminals from around the 1980s.


eedeleon · 2019-06-21T20:16:52Z

scenarios/sentence_similarity/gensen_aml_deep_dive.ipynb

@@ -127,7 +127,7 @@
  },


Why does it have the username like that?


I think the default workspace was pointed at this file store. I am reusing earlier changes made by @catherine667 here; she can answer this better.


AbhiramE · 2019-06-24T17:00:46Z

Rebased and force-pushed to the wrong branch! 😨 Fixed it with the latest push.

eedeleon · 2019-06-24T22:34:10Z

scenarios/sentence_similarity/gensen_aml_deep_dive.ipynb

@@ -132,6 +132,13 @@
    "scrolled": true


This warning is unfortunate. Do you know when you started getting it, and can you remove it from the notebook?


I have been getting these warnings since I started running the gensen deep dive notebook locally, about a week ago. Before running it, I updated my conda environment with the new dependencies.

I can strip the output of this cell if the warning needs to be hidden.


eedeleon · 2019-06-24T22:36:18Z

scenarios/sentence_similarity/gensen_train.py

+
+            # Keep track of indicies to train forward and backward jointly
+            if (
+                "skipthought_next" in tasknames


You do not have to rename tasknames to task_names. I really prefer it, but since the code is vendored I get why you would not.
if all(task_name in task_names for task_name in [names, I, care, about]):

I definitely agree with the suggested change. The change, however, is the tip of the iceberg. 😄 There's a lot that can be improved and refactored in the gensen code. For now, though, it would be great if we could pass on this.
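
For reference, the membership check suggested in this thread can be written with a generator expression. This is a sketch only: `task_names` and `required_tasks` are hypothetical names, and `skipthought_previous` and `nli` are assumed task names; only `skipthought_next` appears in the diff.

```python
# Hypothetical task list; in gensen_train.py the variable is `tasknames`.
task_names = ["skipthought_next", "skipthought_previous", "nli"]

# Tasks that must all be present (the second name is an assumption).
required_tasks = ["skipthought_next", "skipthought_previous"]

# all() stops at the first missing task, and the generator expression
# avoids building an intermediate list.
train_jointly = all(name in task_names for name in required_tasks)

# Equivalent set-based form.
train_jointly_set = set(required_tasks).issubset(task_names)
```

Either form replaces a chain of `"x" in tasknames and "y" in tasknames` conditions with a single expression.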

eedeleon

Changes look good. Notebook did not change much and the other updates are mostly simplifications.

If the 100-characters-per-line PR lands first, I would update this PR rather than make a follow-up PR, since you are currently familiar with all of your changes and it would be harder later.

Once the comments here are addressed, I think it is good to check in.

catherine667 and others added 3 commits June 20, 2019 20:00

add the aml utility function that can get or create workspace as that…

e561dee

… create the workspace if it does not exist

Integrated Mlflow with AzureMl Gensen deep dive notebook

ed26f8f

Fixed documentation to get rid of AzureML logging

8140f67

AbhiramE requested review from catherine667 and saidbleik June 21, 2019 00:29

AbhiramE requested review from miguelgfierro and eedeleon June 21, 2019 00:29

heatherbshapiro mentioned this pull request Jun 21, 2019

Investigate ML Flow #60

Closed

AbhiramE changed the title ~~Abhiram mlflow~~ Integrating Mlflow. Jun 21, 2019

eedeleon reviewed Jun 21, 2019


catherine667 reviewed Jun 21, 2019



eedeleon reviewed Jun 21, 2019


Code changes based on code review comments.

80487c8

AbhiramE requested a review from eedeleon June 21, 2019 21:30

AbhiramE force-pushed the abhiram-mlflow branch 2 times, most recently from 054572f to 80487c8 Compare June 24, 2019 16:58

eedeleon reviewed Jun 24, 2019


eedeleon approved these changes Jun 24, 2019


Updated pip version of AzureML Mlflow used in the Pytorch estimator

5d86d03

saidbleik merged commit 4dac5f1 into staging Jun 25, 2019

AbhiramE deleted the abhiram-mlflow branch June 25, 2019 16:18

