This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Integrating Mlflow. #116

Merged
merged 5 commits into from
Jun 25, 2019

Conversation

AbhiramE
Contributor

Description

This PR integrates MLflow for logging metrics. The change lets users log metrics such as validation loss locally and view them in the MLflow UI (mlflow ui).

The change also supports azureml-mlflow. The gensen deep dive notebook now uses this package to track MLflow-logged metrics in the Azure portal.

This change helps us move away from using AzureML in the utils. By logging through MLflow, we give users the flexibility to opt in to or out of AzureML.
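
As a minimal sketch of the kind of logging described above (not the code in this PR), the helper below sends one validation-loss value per epoch to a logging callable. The names log_validation_losses and make_recorder are hypothetical. The assumption is that mlflow.log_metric(key, value, step=...) would be passed as the callable when MLflow is installed; the in-memory recorder stands in for it so the example runs without extra dependencies.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Record = Tuple[str, float, Optional[int]]


def log_validation_losses(
    losses: Iterable[float],
    log_metric: Callable[..., None],
    key: str = "validation_loss",
) -> int:
    """Log one value per epoch through log_metric; return how many were logged."""
    count = 0
    for step, loss in enumerate(losses):
        # Same call shape as mlflow.log_metric(key, value, step=step).
        log_metric(key, float(loss), step=step)
        count += 1
    return count


def make_recorder() -> Tuple[Callable[..., None], List[Record]]:
    """Hypothetical in-memory stand-in for mlflow.log_metric (dry runs, tests)."""
    records: List[Record] = []

    def record(key: str, value: float, step: Optional[int] = None) -> None:
        records.append((key, value, step))

    return record, records


# Dry run without MLflow: collect the metrics in a list instead.
record, records = make_recorder()
logged = log_validation_losses([0.9, 0.5, 0.3], record)
print(logged, records[-1])
```

With MLflow installed, passing mlflow.log_metric instead of the recorder should write the same values to the local tracking store, where the MLflow UI can display them.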

Related Issues

Checklist:

  • [x] My code follows the code style of this project, as detailed in our contribution guidelines.
  • [ ] I have added tests.
  • [x] I have updated the documentation accordingly.

@review-notebook-app

Check out this pull request on ReviewNB: https://app.reviewnb.com/microsoft/nlp/pull/116

You'll be able to see visual diffs and write comments on notebook cells. Powered by ReviewNB.

@AbhiramE AbhiramE changed the title from "Abhiram mlflow" to "Integrating Mlflow." on Jun 21, 2019
Contributor

@eedeleon eedeleon left a comment

Currently looking over the notebooks but these are the more important changes

.gitignore (outdated, resolved)
scenarios/sentence_similarity/gensen_aml_deep_dive.ipynb (outdated, resolved)
scenarios/sentence_similarity/gensen_train.py (outdated, resolved)
scenarios/sentence_similarity/gensen_train.py (outdated, resolved)
scenarios/sentence_similarity/gensen_train.py (outdated, resolved)
scenarios/sentence_similarity/gensen_train.py (resolved)
scenarios/sentence_similarity/gensen_train.py (resolved)
scenarios/sentence_similarity/gensen_train.py (outdated, resolved)
tools/generate_conda_file.py (outdated, resolved)
@@ -58,7 +59,8 @@ def test_load_pretrained_vectors_glove():

 def test_load_pretrained_vectors_fasttext():
     dir_path = "temp_data/"
-    file_path = os.path.join(os.path.join(dir_path, "fastText"), "wiki.simple.bin")
+    file_path = os.path.join(os.path.join(dir_path, "fastText"),
+                             "wiki.simple.bin")
Contributor

Is your editor auto-changing these? You can update it to 120 or 100, since it seems to be at 80 and that is a bit too short.

Collaborator

the repo is configured at 79 (pep8) - black defaults to 88...
I'm for 120 :) but that seems to be far from any standard.

Contributor Author

Yeah. I have set my width to 80. I can move it to 100.

Contributor Author

In the gensen train file, with its multiple levels of indentation, 80 just feels too narrow. The code almost looks vertical. 😄

Contributor

Yeah, mlflow uses 100. Our SDK uses 119. 80 is due to a limitation in terminals from around the 1980s.

@@ -127,7 +127,7 @@
},
Contributor

Why does it have the username like that?


Contributor Author

I think this file store was pointed at the default workspace. I am reusing earlier changes made by @catherine667 here; she can answer this better.

scenarios/sentence_similarity/gensen_aml_deep_dive.ipynb (outdated, resolved)
@AbhiramE AbhiramE requested a review from eedeleon June 21, 2019 21:30
@AbhiramE AbhiramE force-pushed the abhiram-mlflow branch 2 times, most recently from 054572f to 80487c8 Compare June 24, 2019 16:58
@AbhiramE
Contributor Author

Rebased and force-pushed to the wrong branch! 😨 Fixed it with the latest push.

@@ -132,6 +132,13 @@
"scrolled": true
Contributor

This warning is unfortunate. Do you know when you started getting it, and can you remove it from the notebook?


Contributor Author

I have been getting these warnings since I started running the gensen deep dive notebook locally (a week ago). Before running it, I had updated my current conda environment with the updated dependencies.

I can strip the output of this cell if the warning needs to be hidden.


# Keep track of indicies to train forward and backward jointly
if (
"skipthought_next" in tasknames
Contributor

@eedeleon eedeleon Jun 24, 2019

You do not have to change tasknames to task_names. I really prefer it, but since the code is vendored I get why you would not.
if all(task_name in task_names for task_name in [names, I, care, about]):
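
The suggestion above can be sketched as a runnable check. This is an illustration only, not the vendored gensen code: has_all_tasks is a hypothetical helper, and every task name other than "skipthought_next" is an assumption made for the example.

```python
from typing import Iterable, Sequence


def has_all_tasks(task_names: Sequence[str], required: Iterable[str]) -> bool:
    """Return True only if every required task appears in task_names."""
    # A generator expression lets all() stop at the first missing task.
    return all(name in task_names for name in required)


# "skipthought_next" appears in the diff; the other names are assumed.
task_names = ["skipthought_next", "skipthought_previous", "nli"]
joint_tasks = ["skipthought_next", "skipthought_previous"]

if has_all_tasks(task_names, joint_tasks):
    print("train forward and backward jointly")
```

Passing a generator to all() avoids building the intermediate list that the bracketed form would create.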

Contributor Author

I definitely agree with the suggested change. The change, however, is the tip of the iceberg. 😄 There's a lot that can be improved and refactored in the gensen code. For now, though, it would be great if we could pass on this.

Contributor

@eedeleon eedeleon left a comment

Changes look good. Notebook did not change much and the other updates are mostly simplifications.

If the 100-characters-per-line PR lands first, I would update this PR to match rather than make a follow-up PR: you are familiar with all of your changes right now, and it would be harder later.

Once the comments here are addressed I think it is good for check in.

@saidbleik saidbleik merged commit 4dac5f1 into staging Jun 25, 2019
@AbhiramE AbhiramE deleted the abhiram-mlflow branch June 25, 2019 16:18
4 participants