<i>Copyright (c) Microsoft Corporation. All rights reserved.</i>

<i>Licensed under the MIT License.</i>

# Evaluating a GenSen model with SentEval on AML

SentEval is a library for evaluating the quality of sentence embeddings. It assesses their generalization power by using them as features on a broad and diverse set of "transfer" tasks. SentEval currently includes 17 downstream tasks.

This notebook shows how to run SentEval and evaluate a trained GenSen model on Azure Machine Learning (AML). We used the [SentEval](https://github.com/facebookresearch/SentEval) toolkit to run most of our transfer learning experiments. To replicate these numbers, clone the SentEval repository and follow its setup instructions. Once complete, copy this notebook and `gensen.py` into its examples folder and run the following commands to reproduce different rows in Table 2 of our paper. Note: Set the paths to the pretrained GloVe embeddings (`glove.840B.300d.h5`) and to the model folder as appropriate.

## 0 Global settings

Most of the functions used in this notebook are in the `gensen.py` file. We will submit a job to AML that runs `gensen_senteval.py`. Set `PATH_SENTEVAL` to the SentEval data path and `PATH_TO_DATA` to the model data path. Put your models under the model data path: pre-trained models under `embedding/` and trained models under `models/`.

Upload the contents of both paths to the datastore and access them through `script_params`, which is described later in this notebook.

### Prerequisites
* Go through the [Configuration](../../../configuration.ipynb) notebook to install the Azure Machine Learning Python SDK and create an Azure ML `Workspace`
* Review the [tutorial](../train-hyperparameter-tune-deploy-with-pytorch/train-hyperparameter-tune-deploy-with-pytorch.ipynb) on single-node PyTorch training using Azure Machine Learning

In [31]:
# Check core SDK version number
import azureml.core
print("SDK version:", azureml.core.VERSION)

SDK version: 1.0.33


### Diagnostics
Opt-in diagnostics for better experience, quality, and security of future releases.

In [32]:
from azureml.telemetry import set_diagnostics_collection

set_diagnostics_collection(send_diagnostics=True)

Turning diagnostics collection on. 


### Initialize workspace

Initialize a [Workspace](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#workspace) object from the existing workspace you created in the Prerequisites step. `Workspace.from_config()` creates a workspace object from the details stored in `config.json`.

In [33]:
from azureml.core.workspace import Workspace

ws = Workspace.from_config()
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep='\n')

Workspace name: MAIDAPNLP
Azure region: eastus2
Subscription id: 15ae9cb6-95c1-483d-a0e3-b1a1a3b06324
Resource group: nlprg


### Create or attach existing AmlCompute
You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#compute-target) for running your job. In this tutorial, we use Azure ML managed compute ([AmlCompute](https://docs.microsoft.com/azure/machine-learning/service/how-to-set-up-training-targets#amlcompute)) as our remote compute resource. Specifically, the code below creates a `STANDARD_NC6` GPU cluster that autoscales from `0` to `4` nodes.

**Creation of AmlCompute takes approximately 5 minutes.** If the AmlCompute with that name is already in your workspace, this code will skip the creation process.

As with other Azure services, there are limits on certain resources (e.g. AmlCompute) associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota.

**Use Standard_NC6 for now.**

In [34]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

# choose a name for your cluster
cluster_name = "gpucluster"

try:
    compute_target = ComputeTarget(workspace=ws, name=cluster_name)
    print('Found existing compute target.')
except ComputeTargetException:
    print('Creating a new compute target...')
    compute_config = AmlCompute.provisioning_configuration(vm_size='STANDARD_NC6',
                                                           max_nodes=4)

    # create the cluster
    compute_target = ComputeTarget.create(ws, cluster_name, compute_config)

    compute_target.wait_for_completion(show_output=True)

# use get_status() to get a detailed status for the current AmlCompute. 
print(compute_target.get_status().serialize())

Found existing compute target.
{'currentNodeCount': 0, 'targetNodeCount': 0, 'nodeStateCounts': {'preparingNodeCount': 0, 'runningNodeCount': 0, 'idleNodeCount': 0, 'unusableNodeCount': 0, 'leavingNodeCount': 0, 'preemptedNodeCount': 0}, 'allocationState': 'Steady', 'allocationStateTransitionTime': '2019-05-01T20:59:10.053000+00:00', 'errors': None, 'creationTime': '2019-04-17T17:21:26.968570+00:00', 'modifiedTime': '2019-04-17T17:27:28.740980+00:00', 'provisioningState': 'Succeeded', 'provisioningStateTransitionTime': None, 'scaleSettings': {'minNodeCount': 0, 'maxNodeCount': 4, 'nodeIdleTimeBeforeScaleDown': 'PT7200S'}, 'vmPriority': 'Dedicated', 'vmSize': 'STANDARD_NC6'}


## Access to a project directory
Create a directory that will contain all the necessary code from your local machine that you will need access to on the remote resource. This includes the training script and any additional files your training script depends on.

`project_folder` contains all the code you want to submit to AmlCompute. The size of the folder cannot exceed 300 MB. Because `gensen_senteval.py` loads large pre-trained embedding files into the model, we store those large files in the datastore and upload only code to `project_folder`.
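
Before submitting, it can help to confirm that the folder stays under the size limit. The following is a minimal sketch using only the standard library; the helper names are hypothetical, and it assumes the 300 MB limit is counted in binary megabytes, which the text above does not specify.

```python
import os

# Assumption: 300 MB means 300 * 1024 * 1024 bytes.
MAX_PROJECT_BYTES = 300 * 1024 * 1024


def folder_size_bytes(folder):
    """Return the total size in bytes of all regular files under `folder`."""
    total = 0
    for root, _dirs, files in os.walk(folder):
        for name in files:
            path = os.path.join(root, name)
            if os.path.isfile(path):  # skip broken symlinks
                total += os.path.getsize(path)
    return total


def fits_in_project_folder(folder, limit=MAX_PROJECT_BYTES):
    """Return True if `folder` is small enough to submit as a project folder."""
    return folder_size_bytes(folder) <= limit
```

Calling `fits_in_project_folder(project_folder)` after the cell below has set `project_folder` gives a quick check before the job is submitted.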

In [35]:
import os

# Change this path to the location of your model code.
project_folder = '../../amlcode/'
os.makedirs(project_folder, exist_ok=True)

## Access to Datastore
To download some of the data required to evaluate a GenSen model, run the bash file [here](https://github.com/facebookresearch/SentEval/blob/master/data/downstream/get_transfer_data.bash). Make sure to upload all the large files to the Azure file share. You can access the datastore by using `ds.as_mount()`.

In [36]:
from azureml.core import Datastore
ds = Datastore.register_azure_file_share(workspace=ws,
                                        datastore_name='GenSenSentEval',
                                        file_share_name='azureml-filestore-09b72610-7938-4ed2-86a2-5004896b12d9',
                                        account_name='maidapnlp0056795534',
                                        account_key='8LtGFZErNlvI6fSrgODqCxJCckkVgq3AL/5S/8ma7Re7xUHgWrNRCfTFnP/QDhF7KDY6ScAORsUpSm7ziog5/Q==')

Upload the contents of the data directory to the path ./data on the above datastore.

In [None]:
# Upload files from local.
# ds.upload(src_dir='data', target_path='data', overwrite=True, show_progress=True)

In [37]:
ds.as_mount()
# We can fetch the datastore path by: os.environ['AZUREML_DATAREFERENCE_datarefname'].
# path_on_datastore = 'data/'
# ds_data = ds.path(path_on_datastore)
# print(ds_data)

$AZUREML_DATAREFERENCE_gensensenteval
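
As the comment in the cell above notes, the mounted path can be read from an `AZUREML_DATAREFERENCE_<name>` environment variable inside the job. A minimal sketch of such a lookup follows; the helper name `datastore_mount_path` and the local fallback `./data` are assumptions for illustration and are not part of `gensen_senteval.py`.

```python
import os


def datastore_mount_path(reference_name, default=None):
    """Look up the mount path exposed for a data reference.

    `reference_name` is the part after 'AZUREML_DATAREFERENCE_'.
    `default` is returned when the variable is unset, e.g. on a local run.
    """
    env_var = "AZUREML_DATAREFERENCE_" + reference_name
    path = os.environ.get(env_var, default)
    if path is None:
        raise KeyError(f"{env_var} is not set and no default was given")
    return path


# Hypothetical usage: fall back to a local folder outside of AML.
print(datastore_mount_path("gensensenteval", default="./data"))
```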

## Run the job on the remote compute
Now that we have the AmlCompute ready to go, let's run our evaluation job.

### Prepare the evaluation script
Now you will need to create your evaluation script. In this tutorial, the script for evaluating GenSen is already provided for you at `gensen_senteval.py`. In practice, you should be able to take any custom PyTorch script as is and run it with Azure ML without having to modify your code.

However, if you would like to use Azure ML's [metric logging](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#logging) capabilities, you will have to add a small amount of Azure ML logic inside your training script. In this example, at each logging interval, we will log the loss for that minibatch to our Azure ML run.

To do so, in `gensen_senteval.py`, we will first access the Azure ML `Run` object within the script:
```Python
from azureml.core.run import Run
run = Run.get_context()
```
Later within the script, we log the loss metric to our run:
```Python
run.log('loss', loss.item())
```
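
Evaluation scores could be logged in the same way. The sketch below is an assumption about how that might look, not the code in `gensen_senteval.py`: it takes any object with a `log(name, value)` method (such as the `Run` above) and assumes `results` is a dict that maps each task name to a dict of metrics, logging only the scalar numeric ones. The function name is hypothetical.

```python
import numbers


def log_senteval_results(run, results):
    """Log every scalar numeric metric in `results` as '<task>_<metric>'.

    Returns the list of metric names that were logged.
    """
    logged = []
    for task, metrics in sorted(results.items()):
        if not isinstance(metrics, dict):
            continue  # unexpected shape, nothing to log for this task
        for metric, value in sorted(metrics.items()):
            # Skip lists, nested dicts and booleans; keep ints and floats.
            if isinstance(value, numbers.Real) and not isinstance(value, bool):
                name = f"{task}_{metric}"
                run.log(name, float(value))
                logged.append(name)
    return logged
```

Prefixing each metric with its task name keeps the charts for different tasks separate in the run view.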

### Create an experiment
Create an [Experiment](https://docs.microsoft.com/azure/machine-learning/service/concept-azure-machine-learning-architecture#experiment) to track all the runs in your workspace for this tutorial.

In [38]:
from azureml.core import Experiment, get_run

experiment_name = 'gensen'
experiment = Experiment(ws, name=experiment_name)

## Create a PyTorch estimator

First we import the Estimator class and also a widget to visualize a run.

In [39]:
from azureml.train.estimator import Estimator
from azureml.widgets import RunDetails

Prepare your PyTorch estimator.

In [58]:
from azureml.train.dnn import PyTorch
script_params = {
    '--folder_path': ds.path("data/models"),
    '--pretrain': ds.path("data/embedding/glove.840B.300d.h5"),
    '--path_senteval': ds.path("senteval/data"),
    '--path_to_data': ds.as_mount()
}

estimator = PyTorch(source_directory=project_folder,
                    script_params=script_params,
                    compute_target=compute_target,
                    entry_script='gensen/gensen_senteval.py',
                    node_count=4,
                    process_count_per_node=1,
                    distributed_backend='mpi',
                    use_gpu=True,
                    conda_packages=['scikit-learn=0.20.3', 'h5py', 'nltk']
                   )
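
Azure ML passes each entry of `script_params` to the entry script as a command-line argument. As a minimal sketch, and assuming the script reads them with `argparse` (the actual parsing in `gensen_senteval.py` may differ), the receiving side could look like this. The flag names mirror the keys above; the help texts and the example paths are made up for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the flags that the estimator passes via `script_params`."""
    parser = argparse.ArgumentParser(description="Evaluate GenSen with SentEval")
    parser.add_argument("--folder_path", required=True,
                        help="Folder with the trained GenSen models")
    parser.add_argument("--pretrain", required=True,
                        help="Path to glove.840B.300d.h5")
    parser.add_argument("--path_senteval", required=True,
                        help="Folder with the SentEval task data")
    parser.add_argument("--path_to_data", required=True,
                        help="Mount point of the datastore")
    return parser.parse_args(argv)


# Example with made-up local paths; on AML the values are resolved mount paths.
args = parse_args([
    "--folder_path", "/mnt/ds/data/models",
    "--pretrain", "/mnt/ds/data/embedding/glove.840B.300d.h5",
    "--path_senteval", "/mnt/ds/senteval/data",
    "--path_to_data", "/mnt/ds",
])
print(args.folder_path)
```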

### Submit job and monitor your job
Run your experiment by submitting your estimator object. Note that this call is asynchronous.

In [59]:
run = experiment.submit(estimator)
print(run)
RunDetails(run).show()

Run(Experiment: gensen,
Id: gensen_1556823378_ffe734a7,
Type: azureml.scriptrun,
Status: Queued)


_UserRunWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': True, 'log_level': 'INFO', 's…

In [None]:
run.wait_for_completion(show_output=True) # this provides a verbose log

## Cancel the job
It is better to cancel the job manually once you no longer need it, so that you do not waste resources.

In [25]:
# Cancel the job with id.
# job_id = "pytorch-gensen_1555533596_d9cc75fe"
# run = get_run(experiment, job_id)

# Cancel jobs.
run.cancel()
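
To avoid cancelling a run that has already finished, the status can be checked first. The helper below is a sketch with a hypothetical name; it relies only on the `get_status()` and `cancel()` methods of the run object and treats `Completed`, `Failed`, and `Canceled` as terminal states.

```python
# Run states after which cancelling has no effect.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def cancel_if_active(run):
    """Cancel `run` unless it has already reached a terminal state.

    Returns True if a cancel request was sent, False otherwise.
    """
    status = run.get_status()
    if status in TERMINAL_STATES:
        return False
    run.cancel()
    return True
```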

## References

1. A. Conneau, D. Kiela, [*SentEval: An Evaluation Toolkit for Universal Sentence Representations*](https://arxiv.org/abs/1803.05449).