GenSen on AML deep dive notebook (sentence similarity) #78
Conversation
Check out this pull request on ReviewNB: https://app.reviewnb.com/microsoft/nlp/pull/78 Visit www.reviewnb.com to learn how we simplify your Jupyter Notebook workflows.
This is really good. Several questions:
At a bare minimum, I would split the notebook into two, one for training and one for hyperparameter tuning. In reco we are doing this.
It would be good to explicitly call out why users should use this and how AzureML helps. Does HyperDrive improve the accuracy? We spin up the GPU compute for you, and it would be very difficult to run without it... etc.
@miguelgfierro The following are the replies:
Thanks!
Great work!
For multi-GPU support, see here: https://github.com/microsoft/nlp/blob/bleik/utils_nlp/pytorch/device_utils.py
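For readers who follow that link, the generic multi-GPU pattern in plain PyTorch looks roughly like the sketch below (a hedged illustration, not necessarily what device_utils.py implements; move_to_device is a hypothetical name):

```python
import torch
import torch.nn as nn

def move_to_device(model, num_gpus=None):
    """Move `model` to GPU(s) if available; wrap it in DataParallel when
    more than one GPU should be used."""
    if not torch.cuda.is_available():
        return model
    num_gpus = num_gpus or torch.cuda.device_count()
    model = model.cuda()
    if num_gpus > 1:
        model = nn.DataParallel(model, device_ids=list(range(num_gpus)))
    return model
```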
@saidbleik On the structure you mentioned: that's exactly what we are planning to do. @AbhiramE is working on gensen_deep_dive.ipynb, which trains on a VM without AML. He will raise a separate PR once this PR is merged. We are currently running experiments on the performance of AML vs. VM. Once the evaluations are done, we will put the results in the README file. I think it may be better to put all the sections (preprocessing, training, tuning, evaluation) in one gensen_deep_dive_aml.ipynb notebook, because our purpose is to show the whole end-to-end pipeline, and this also avoids duplicating the AML configuration code. For train.py, do you recommend putting it in the same folder as the notebook? @saidbleik
@saidbleik However, we do not have permission for multiple GPUs at the moment; we can only use Standard_NC6, which has a single GPU.
This is fine, but it can be less verbose. For example, you don't need to describe GenSen in the AML version. You can link to the base notebook instead (or perhaps move this common description to the README).
@heatherbshapiro Added section 2.3.5 to explain Horovod.
Several changes have been made:
- …l never stop; min_epoch_loss always equals val_epoch_loss
- …training: 1. add random seeds for iterators; 2. set learning rate = lr * hvd.size(); 3. sync the optimizer; 4. remove DataParallel (see the Horovod sketch below)
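For context, a minimal sketch of what those four changes typically look like with horovod.torch (the model, seed, and base learning rate below are illustrative placeholders, not the actual train.py values):

```python
import torch
import torch.nn as nn
import horovod.torch as hvd

hvd.init()
torch.manual_seed(1234 + hvd.rank())        # 1. distinct random seed per worker
if torch.cuda.is_available():
    torch.cuda.set_device(hvd.local_rank())

# 4. one process per GPU, so the model is moved to a single device and
# DataParallel is no longer needed
model = nn.Linear(512, 512)                 # placeholder for the GenSen model
if torch.cuda.is_available():
    model = model.cuda()

base_lr = 1e-4                              # illustrative base learning rate
optimizer = torch.optim.Adam(model.parameters(),
                             lr=base_lr * hvd.size())  # 2. scale lr by world size

# 3. keep workers in sync: average gradients across workers and broadcast the
# initial model/optimizer state from rank 0
optimizer = hvd.DistributedOptimizer(optimizer,
                                     named_parameters=model.named_parameters())
hvd.broadcast_parameters(model.state_dict(), root_rank=0)
hvd.broadcast_optimizer_state(optimizer, root_rank=0)
```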
This looks great. Thanks for making the updates!
Force-pushed from 8a4b0e6 to c5362cd.
The notebook includes data loading and preprocessing, training the GenSen model with distributed PyTorch and Horovod on AzureML, and tuning with HyperDrive (a submission sketch follows below). Evaluation and deployment will be added later. In addition, the comparison results for training and tuning on AML vs. VM will be added once this initial PR is merged into staging.
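As a rough illustration of the AzureML pieces involved, the submission in such a notebook might look like the sketch below (azureml-sdk v1-era API; the compute target name, script argument, metric name, and search space are assumptions for illustration, not the notebook's actual values):

```python
from azureml.core import Workspace, Experiment
from azureml.train.dnn import PyTorch
from azureml.train.hyperdrive import (HyperDriveConfig, PrimaryMetricGoal,
                                      RandomParameterSampling, uniform)

ws = Workspace.from_config()

# Distributed PyTorch estimator: Horovod runs over MPI across the nodes
estimator = PyTorch(source_directory=".",
                    entry_script="train.py",
                    compute_target="gpu-cluster",   # assumed AmlCompute name
                    node_count=2,
                    process_count_per_node=1,
                    distributed_backend="mpi",
                    use_gpu=True)

# HyperDrive sweep over the learning rate (illustrative search space/metric)
hd_config = HyperDriveConfig(
    estimator=estimator,
    hyperparameter_sampling=RandomParameterSampling({"--lr": uniform(1e-4, 1e-3)}),
    primary_metric_name="val_loss",
    primary_metric_goal=PrimaryMetricGoal.MINIMIZE,
    max_total_runs=8)

run = Experiment(ws, "gensen-train").submit(hd_config)
run.wait_for_completion(show_output=True)
```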
We provide a distributed PyTorch implementation of the paper using Horovod, along with pre-trained models and code to evaluate these models on a variety of transfer learning benchmarks.
This code is based on the GitHub codebase from Maluuba, but we have refactored the code in the following aspects:
- Changed the training loop (train.py) from never stopping to stopping when the validation loss reaches a local minimum (a sketch of this early-stopping logic follows below)
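A hedged sketch of that stopping criterion (the patience value and the loss sequence are illustrative; the actual train.py logic may differ):

```python
def stop_epoch(val_losses, patience=3):
    """Return the epoch at which training would stop, given per-epoch
    validation losses (a stand-in for the real train/validate loop)."""
    best_val_loss = float("inf")   # must be tracked as a separate value; the
    bad_epochs = 0                 # reported bug kept min_epoch_loss always
                                   # equal to val_epoch_loss, so training
                                   # never stopped
    for epoch, val_epoch_loss in enumerate(val_losses):
        if val_epoch_loss < best_val_loss:
            best_val_loss = val_epoch_loss
            bad_epochs = 0
        else:
            bad_epochs += 1
        if bad_epochs >= patience:
            return epoch           # validation loss reached a local minimum
    return len(val_losses) - 1

# Stops at epoch 5: three consecutive epochs without improvement on 0.5
print(stop_epoch([0.9, 0.7, 0.5, 0.6, 0.6, 0.7]))
```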