This repository has been archived by the owner on Nov 16, 2023. It is now read-only.

Staging to master to add the latest fixes #503

Merged
merged 26 commits from staging into master on Nov 30, 2019
Commits
342bf36
update mlflow version to match the other azureml versions
miguelgfierro Nov 20, 2019
e91b9ef
Update generate_conda_file.py
miguelgfierro Nov 21, 2019
00d9ca0
added temporary
miguelgfierro Nov 21, 2019
9311727
Merge pull request #483 from microsoft/miguel/temporary
miguelgfierro Nov 22, 2019
f928b0d
Merge pull request #481 from microsoft/miguelgfierro-patch-1
Nov 25, 2019
2f9bfad
doc: update github url references
Nov 25, 2019
c8abcbe
docs: update nlp recipes references
Nov 25, 2019
99d00d4
Minor bug fix for text classification of multi languages notebook
kehuangms Nov 25, 2019
d71de4a
remove bert and xlnet notebooks
saidbleik Nov 25, 2019
3d7c037
Merge pull request #490 from microsoft/emawa/docs/update-nlp-references
saidbleik Nov 25, 2019
c3528d5
Merge pull request #493 from microsoft/kehuan
saidbleik Nov 25, 2019
7df12d8
Merge pull request #494 from microsoft/transformers2
saidbleik Nov 25, 2019
b0dc696
remove obsolete tests and links
saidbleik Nov 26, 2019
0b4b256
Add missing tmp directories.
hlums Nov 26, 2019
a39143f
fix import error and max_nodes for the cluster
daden-ms Nov 27, 2019
e578682
Merge pull request #497 from microsoft/transformers2
miguelgfierro Nov 27, 2019
bc41256
Merge pull request #499 from microsoft/daden/issue496
miguelgfierro Nov 27, 2019
d13cce1
Minor edits.
hlums Nov 27, 2019
6c2ab2a
Attempt to fix test device error.
hlums Nov 27, 2019
4b13b9d
Temporarily pin transformers version
hlums Nov 27, 2019
3e72fb0
Remove gpu tags temporarily
hlums Nov 27, 2019
40ae2b7
Test whether device error also occurs for SequenceClassifier.
hlums Nov 27, 2019
321032e
Revert temporary changes.
hlums Nov 27, 2019
3bb5cce
Revert temporary changes.
hlums Nov 27, 2019
857ce5c
Merge pull request #498 from microsoft/hlu/fix_temp_directories
miguelgfierro Nov 28, 2019
25b6643
Merge pull request #500 from microsoft/hlu/temporary_test_fix
miguelgfierro Nov 28, 2019
2 changes: 2 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,8 @@ The following is a list of related repositories that we like and think are usefu
|[AzureML-BERT](https://github.com/Microsoft/AzureML-BERT)|End-to-end recipes for pre-training and fine-tuning BERT using Azure Machine Learning service.|
|[MASS](https://github.com/microsoft/MASS)|MASS: Masked Sequence to Sequence Pre-training for Language Generation.|
|[MT-DNN](https://github.com/namisan/mt-dnn)|Multi-Task Deep Neural Networks for Natural Language Understanding.|
|[UniLM](https://github.com/microsoft/unilm)|Unified Language Model Pre-training.|



## Build Status
Expand Down
10 changes: 5 additions & 5 deletions SETUP.md
Expand Up @@ -47,9 +47,9 @@ You can learn how to create a Notebook VM [here](https://docs.microsoft.com/en-u
We provide a script, [generate_conda_file.py](tools/generate_conda_file.py), to generate a conda-environment yaml file
which you can use to create the target environment using the Python version 3.6 with all the correct dependencies.

Assuming the repo is cloned as `nlp` in the system, to install **a default (Python CPU) environment**:
Assuming the repo is cloned as `nlp-recipes` in the system, to install **a default (Python CPU) environment**:

cd nlp
cd nlp-recipes
python tools/generate_conda_file.py
conda env create -f nlp_cpu.yaml
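The generator's core behavior — merging a base dependency list with optional GPU extras and emitting a conda environment YAML — can be sketched as follows. This is a simplified illustration, not the actual `tools/generate_conda_file.py`; the dependency names and versions here are placeholders.

```python
# Minimal sketch of a conda-environment file generator, loosely modeled on
# tools/generate_conda_file.py. Dependencies and versions are illustrative.
CONDA_BASE = {"python": "python==3.6.8", "pip": "pip>=19.1.1"}
PIP_BASE = {"azureml-sdk": "azureml-sdk[notebooks]==1.0.57"}
PIP_GPU = {"torch": "torch==1.2.0"}


def generate_conda_file(gpu=False):
    """Return the text of a conda environment YAML file."""
    name = "nlp_gpu" if gpu else "nlp_cpu"
    pip_deps = dict(PIP_BASE)
    if gpu:
        # GPU environments layer extra packages on top of the CPU base.
        pip_deps.update(PIP_GPU)
    lines = ["name: {}".format(name), "dependencies:"]
    lines += ["- {}".format(v) for v in CONDA_BASE.values()]
    lines.append("- pip:")
    lines += ["  - {}".format(v) for v in sorted(pip_deps.values())]
    return "\n".join(lines) + "\n"
```

The resulting text would be written to `nlp_cpu.yaml` or `nlp_gpu.yaml` and passed to `conda env create -f`, matching the commands shown above.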

Expand All @@ -62,7 +62,7 @@ Click on the following menus to see how to install the Python GPU environment:

Assuming that you have a GPU machine, to install the Python GPU environment, which by default installs the CPU environment:

cd nlp
cd nlp-recipes
python tools/generate_conda_file.py --gpu
conda env create -n nlp_gpu -f nlp_gpu.yaml

Expand All @@ -79,7 +79,7 @@ Assuming that you have an Azure GPU DSVM machine, here are the steps to setup th

2. Install the GPU environment.

cd nlp
cd nlp-recipes
python tools/generate_conda_file.py --gpu
conda env create -n nlp_gpu -f nlp_gpu.yaml

Expand Down Expand Up @@ -110,7 +110,7 @@ Running the command tells pip to install the `utils_nlp` package from source in

> It is also possible to install directly from Github, which is the best way to utilize the `utils_nlp` package in external projects (while still reflecting updates to the source as it's installed as an editable `'-e'` package).

> `pip install -e git+git@github.com:microsoft/nlp.git@master#egg=utils_nlp`
> `pip install -e git+git@github.com:microsoft/nlp-recipes.git@master#egg=utils_nlp`

Either command, from above, makes `utils_nlp` available in your conda virtual environment. You can verify it was properly installed by running:

Expand Down
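One way to confirm that `utils_nlp` resolved after either install command is to probe for it with the standard library's `importlib`; this is a generic check of our own, not a command from the repo's docs.

```python
import importlib.util


def is_installed(package):
    """True if `package` can be imported in the current environment."""
    return importlib.util.find_spec(package) is not None


# After a successful `pip install -e .` in the repo root, this should
# report True for "utils_nlp" inside the activated conda environment.
print(is_installed("utils_nlp"))
```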
2 changes: 1 addition & 1 deletion docs/source/conf.py
Expand Up @@ -34,7 +34,7 @@
# The full version, including alpha/beta/rc tags
release = VERSION

prefix = "NLP"
prefix = "NLPRecipes"

# -- General configuration ---------------------------------------------------

Expand Down
4 changes: 2 additions & 2 deletions docs/source/index.rst
Expand Up @@ -2,9 +2,9 @@
NLP Utilities
===================================================

The `NLP repository <https://github.com/Microsoft/NLP>`_ provides examples and best practices for building NLP systems, provided as Jupyter notebooks.
The `NLP repository <https://github.com/microsoft/nlp-recipes>`_ provides examples and best practices for building NLP systems, provided as Jupyter notebooks.

The module `utils_nlp <https://github.com/microsoft/nlp/tree/master/utils_nlp>`_ contains functions to simplify common tasks used when developing and
The module `utils_nlp <https://github.com/microsoft/nlp-recipes/tree/master/utils_nlp>`_ contains functions to simplify common tasks used when developing and
evaluating NLP systems.

.. toctree::
Expand Down
8 changes: 4 additions & 4 deletions examples/entailment/entailment_xnli_bert_azureml.ipynb
Expand Up @@ -45,7 +45,7 @@
"from azureml.core.runconfig import MpiConfiguration\n",
"from azureml.core import Experiment\n",
"from azureml.widgets import RunDetails\n",
"from azureml.core.compute import ComputeTarget\n",
"from azureml.core.compute import ComputeTarget, AmlCompute\n",
"from azureml.exceptions import ComputeTargetException\n",
"from utils_nlp.azureml.azureml_utils import get_or_create_workspace, get_output_files"
]
Expand Down Expand Up @@ -169,7 +169,7 @@
"except ComputeTargetException:\n",
" print(\"Creating new compute target: {}\".format(cluster_name))\n",
" compute_config = AmlCompute.provisioning_configuration(\n",
" vm_size=\"STANDARD_NC6\", max_nodes=1\n",
" vm_size=\"STANDARD_NC6\", max_nodes=NODE_COUNT\n",
" )\n",
" compute_target = ComputeTarget.create(ws, cluster_name, compute_config)\n",
" compute_target.wait_for_completion(show_output=True)\n",
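The hunk above replaces a hard-coded `max_nodes=1` with `NODE_COUNT`, so the cluster cap tracks the node count requested for the distributed run. A framework-free sketch of the get-or-create pattern follows; the dict-based "workspace" and the exception class are stand-ins for the AzureML SDK's `Workspace`, `ComputeTarget`, and `AmlCompute` objects, not real API calls.

```python
# Simplified stand-in for the AzureML get-or-create compute pattern.
# Real code uses azureml.core.compute.{ComputeTarget, AmlCompute}; this
# sketch only illustrates why max_nodes must follow NODE_COUNT.
NODE_COUNT = 2  # nodes requested for the MPI run


class ComputeTargetException(Exception):
    """Raised when the named cluster does not exist yet."""


def get_compute_target(workspace, cluster_name):
    try:
        return workspace[cluster_name]  # reuse an existing cluster
    except KeyError:
        raise ComputeTargetException(cluster_name)


def get_or_create(workspace, cluster_name, node_count):
    try:
        return get_compute_target(workspace, cluster_name)
    except ComputeTargetException:
        # A cluster capped at max_nodes=1 cannot host a multi-node MPI
        # job, so the cap is tied to the requested node count.
        config = {"vm_size": "STANDARD_NC6", "max_nodes": node_count}
        workspace[cluster_name] = config
        return config
```

With the original `max_nodes=1`, a freshly provisioned cluster could never scale to the `NODE_COUNT` nodes the `MpiConfiguration` run asks for; parameterizing it removes that mismatch.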
Expand Down Expand Up @@ -524,9 +524,9 @@
"metadata": {
"celltoolbar": "Tags",
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python (nlp_gpu_transformer_bug_bash)",
"language": "python",
"name": "python3"
"name": "nlp_gpu_transformer_bug_bash"
},
"language_info": {
"codemirror_mode": {
Expand Down
Expand Up @@ -175,7 +175,7 @@
"metadata": {},
"source": [
"This step downloads the pre-trained [AllenNLP](https://allennlp.org/models) pretrained model and registers the model in our Workspace. The pre-trained AllenNLP model we use is called Bidirectional Attention Flow for Machine Comprehension ([BiDAF](https://www.semanticscholar.org/paper/Bidirectional-Attention-Flow-for-Machine-Seo-Kembhavi/007ab5528b3bd310a80d553cccad4b78dc496b02\n",
")) It achieved state-of-the-art performance on the [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/) dataset in 2017 and is a well-respected, performant baseline for QA. AllenNLP's pre-trained BIDAF model is trained on the SQuAD training set and achieves an EM score of 68.3 on the SQuAD development set. See the [BIDAF deep dive notebook](https://github.com/microsoft/nlp/examples/question_answering/bidaf_deep_dive.ipynb\n",
")) It achieved state-of-the-art performance on the [SQuAD](https://rajpurkar.github.io/SQuAD-explorer/) dataset in 2017 and is a well-respected, performant baseline for QA. AllenNLP's pre-trained BIDAF model is trained on the SQuAD training set and achieves an EM score of 68.3 on the SQuAD development set. See the [BIDAF deep dive notebook](https://github.com/microsoft/nlp-recipes/examples/question_answering/bidaf_deep_dive.ipynb\n",
") for more information on this algorithm and AllenNLP implementation."
]
},
Expand Down
3 changes: 0 additions & 3 deletions examples/text_classification/README.md
Expand Up @@ -19,8 +19,5 @@ The following summarizes each notebook for Text Classification. Each notebook pr
|Notebook|Environment|Description|Dataset|
|---|---|---|---|
|[BERT for text classification on AzureML](tc_bert_azureml.ipynb) |Azure ML|A notebook which walks through fine-tuning and evaluating pre-trained BERT model on a distributed setup with AzureML. |[MultiNLI](https://www.nyu.edu/projects/bowman/multinli/)|
|[XLNet for text classification with MNLI](tc_mnli_xlnet.ipynb)|Local| A notebook which walks through fine-tuning and evaluating a pre-trained XLNet model on a subset of the MultiNLI dataset|[MultiNLI](https://www.nyu.edu/projects/bowman/multinli/)|
|[BERT for text classification of Hindi BBC News](tc_bbc_bert_hi.ipynb)|Local| A notebook which walks through fine-tuning and evaluating a pre-trained BERT model on Hindi BBC news data|[BBC Hindi News](https://github.com/NirantK/hindi2vec/releases/tag/bbc-hindi-v0.1)|
|[BERT for text classification of Arabic News](tc_dac_bert_ar.ipynb)|Local| A notebook which walks through fine-tuning and evaluating a pre-trained BERT model on Arabic news articles|[DAC](https://data.mendeley.com/datasets/v524p5dhpj/2)|
|[Text Classification of MultiNLI Sentences using Multiple Transformer Models](tc_mnli_transformers.ipynb)|Local| A notebook which walks through fine-tuning and evaluating a number of pre-trained transformer models|[MultiNLI](https://www.nyu.edu/projects/bowman/multinli/)|
|[Text Classification of Multi Language Datasets using Transformer Model](tc_multi_languages_transformers.ipynb)|Local|A notebook which walks through fine-tuning and evaluating a pre-trained transformer model for multiple datasets in different language|[MultiNLI](https://www.nyu.edu/projects/bowman/multinli/) <br> [BBC Hindi News](https://github.com/NirantK/hindi2vec/releases/tag/bbc-hindi-v0.1) <br> [DAC](https://data.mendeley.com/datasets/v524p5dhpj/2)