This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

[ASK] Improve user experience for long running notebooks #213

Closed
yijingchen opened this issue Jul 30, 2019 · 7 comments
Labels
enhancement New feature or request

Comments

@yijingchen
Contributor

yijingchen commented Jul 30, 2019

Description

Some notebooks take a long time to run. For an external data scientist who wants to try them out quickly and see how things work, this is not a pleasant experience. Here are some ideas for improvement:

  • Add a note section to each notebook describing the machine configuration (e.g., number of GPUs) and the estimated time to finish running the notebook, so that users won't be surprised.
  • Alternatively, set the notebook defaults to run on smaller data with smaller parameters, and then add another section guiding users on changing to the larger experiment, so they know they'll face a long running time.
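The second idea could be sketched as a small parameters cell near the top of each notebook. This is only an illustration of the pattern; the names (`QUICK_RUN`, `TRAIN_SAMPLE_SIZE`, `NUM_EPOCHS`) are hypothetical and not taken from the repo:

```python
# Hypothetical parameters cell: defaults to a fast, small-scale run.
QUICK_RUN = True  # default: quick configuration for a first try

if QUICK_RUN:
    # small subsample and few epochs: finishes in minutes
    TRAIN_SAMPLE_SIZE = 1000
    NUM_EPOCHS = 1
else:
    # full dataset and full training: expect hours on the machines listed below
    TRAIN_SAMPLE_SIZE = None  # None = use all training examples
    NUM_EPOCHS = 3

print(TRAIN_SAMPLE_SIZE, NUM_EPOCHS)
```

A user who wants the reported model performance would flip the flag and re-run; everyone else gets a fast first experience by default.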

Notebook running time (Last update: 8/1/2019)

Machine: Azure DLVM Standard_NC12 with 2 GPUs

| Scenario | Notebook Name | CPU | GPU |
| --- | --- | --- | --- |
| entailment | entailment_xnli_multilingual | NA | ~20 hrs |
| name_entity_recognition | ner_wikigold_bert | ~37 mins | ~6 mins |
| embeddings | embedding_trainer | ~5 mins | ~5 mins |
| interpret_NLP_models | understand_models | ~4 mins | ~2 mins |
| text_classification | tc_mnil_bert | ~8.2 hrs | ~1.2 hrs |
@yijingchen yijingchen added the enhancement New feature or request label Jul 30, 2019
@daden-ms
Contributor

daden-ms commented Jul 30, 2019

I second the idea of making the notebooks run on smaller data with smaller parameters, or providing an option to do so. I have a similar issue for the entailment notebook (#215).

@yijingchen
Contributor Author

> I second the idea of making the notebooks run on smaller data with smaller parameters, or providing an option to do so. I have a similar issue for the entailment notebook (#215).

I believe @hlums is looking into this. I personally also prefer having the notebook run on smaller data by default, because it is very likely people won't read the instructions and will just click run.
Hong, I'm thinking maybe we can both add the note section and update the defaults. For those who want to know the true model performance, we can put the numbers in the note section and guide them on how to change the dataset/parameters to achieve the same performance.

@yijingchen
Contributor Author

@miguelgfierro FYI, I will keep updating the notebook end-to-end running times in this issue. I thought it could be useful for your notebook testing pipeline as well.

@hlums hlums self-assigned this Aug 1, 2019
@hlums
Collaborator

hlums commented Aug 1, 2019

@yijingchen @daden-ms I had a discussion with @saidbleik , and both of us think it's not ideal to run the notebooks on a smaller dataset by default, because the model performance in the notebook will look bad. Can you take a look at https://github.com/microsoft/nlp/blob/hlu/update_entailment_notebook_running_time/scenarios/entailment/entailment_multinli_bert.ipynb? Does this help improve the user experience?
If it helps, I plan to test all notebooks on a CPU machine and a machine with a single GPU, for both QUICK_RUN=True and QUICK_RUN=False, and provide the running time information.

@yijingchen
Contributor Author

@hlums This notebook looks great. You could also add a comment on the 'quick_run' line telling users they need to make changes on that line.
It would be great if we could have this consistent format across all the notebooks.
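The suggested inline comment might look like the following; the exact wording and the `QUICK_RUN` variable placement are illustrative, not copied from the notebook:

```python
# Hypothetical annotation on the flag line so users notice it.
# Set QUICK_RUN = False to reproduce the full results; this runs much longer
# (see the running-time table in this issue).
QUICK_RUN = True

print(QUICK_RUN)
```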

@hlums
Collaborator

hlums commented Aug 2, 2019

> @hlums This notebook looks great. You could also add a comment on the 'quick_run' line telling users they need to make changes on that line.
> It would be great if we could have this consistent format across all the notebooks.

Sounds great! I will work on that.

@daden-ms
Contributor

daden-ms commented Aug 6, 2019

This is awesome. Thanks!


3 participants