# Machine Translation Project 

This project translates sentences from German to English. The training data consists of 200,000 sentence pairs from IWSLT 2016. The model script, `transformer.py`, contains a BERT tokenizer and a transformer model. This notebook is an example of how easily Azure Machine Learning integrates with your own code, and of how to use an Azure Datastore.

Table of Contents:
1. [Set up workspace](#Set-up-workspace)
2. [Set up datastore](#Set-up-datastore)
3. [Specify compute target](#Specify-compute-target)
4. [Create estimator and submit an experiment](#Create-estimator-and-submit-an-experiment)
5. [Cancel runs](#Cancel-runs)


# Set up workspace

In [2]:
import os
import numpy as np
import matplotlib.pyplot as plt
import simplejson as json

import azureml.core
from azureml.core import Workspace, Experiment, Run, Datastore, ScriptRunConfig
from azureml.data.data_reference import DataReference

# import compute target 
from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget

from azureml.train.dnn import PyTorch

# check core SDK version number
print("Azure ML SDK Version: ", azureml.core.VERSION)

Azure ML SDK Version:  1.0.23


In [3]:
ws = Workspace.from_config()

Found the config file in: /Users/chengyineng/Documents/Projects/microsoft-azure-ml-notebooks/machine-translation/config.json


In [4]:
# load config file

config_path = os.path.join(os.getcwd(), 'config.json')
print(config_path)
with open(config_path, 'r') as f:
    config = json.load(f)

/Users/chengyineng/Documents/Projects/microsoft-azure-ml-notebooks/machine-translation/config.json
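The datastore cell further down reads `container_name`, `account_name`, and `account_key` from this `config` dictionary. A minimal sketch of a check that fails early if one of them is missing is shown here; the helper name `missing_config_keys` and the example values are hypothetical and not part of the original notebook.

```python
# Keys that the datastore registration cell reads from config.json.
REQUIRED_KEYS = ("container_name", "account_name", "account_key")


def missing_config_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in `config`."""
    return [key for key in required if not config.get(key)]


# Hypothetical example values, for illustration only.
example_config = {
    "container_name": "my-container",
    "account_name": "mystorageaccount",
}
print(missing_config_keys(example_config))  # ['account_key']
```

In the notebook you could call `missing_config_keys(config)` right after loading the file and stop if the returned list is not empty.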


# Set up datastore

Here, we register a datastore specifically for this machine translation project. To read more about using a datastore to access your data, see the [documentation](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-access-data).

In [5]:
ds = ws.get_default_datastore()
print(ds.datastore_type, ds.account_name, ds.container_name)


# register the project's blob container as a datastore in the current workspace
ds = Datastore.register_azure_blob_container(workspace=ws, 
                                             datastore_name='machine_translation', 
                                             container_name=config["container_name"],
                                             account_name=config["account_name"], 
                                             account_key=config["account_key"],
                                             create_if_not_exists=True)

# get the named datastore from the current workspace
ds = Datastore.get(ws, datastore_name='machine_translation')

ds.path('./machine_translation').as_download()

AzureBlob amherstwstorageinnganzr azureml-blobstore-fe92660d-c6c1-4086-b2f7-71f9c508e6c7


$AZUREML_DATAREFERENCE_4b1e1205ec574e4e8b1a1e86d6fe3d00

You can locate your datastore on the portal itself.
<br>
<img src="images/datastore.png" width="1500">

Create an experiment, with a name of your choice, in the workspace.

In [6]:
experiment_name = 'machine_translation-transformer'
exp = Experiment(workspace=ws, name=experiment_name)

<br>
<img src="images/experiment_name.png" width="1500">

# Specify compute target

In [7]:
from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget
import os

# choose a name for your cluster
compute_name = os.environ.get("AML_COMPUTE_CLUSTER_NAME", "nv12")
# environment variables are strings, so convert the node counts to int
compute_min_nodes = int(os.environ.get("AML_COMPUTE_CLUSTER_MIN_NODES", 0))
compute_max_nodes = int(os.environ.get("AML_COMPUTE_CLUSTER_MAX_NODES", 4))

# This example uses a GPU VM (STANDARD_NC24). For a CPU VM, set the SKU to e.g. STANDARD_D2_V2
vm_size = os.environ.get("AML_COMPUTE_CLUSTER_STANDARD_NC24", "STANDARD_NC24")

if compute_name in ws.compute_targets:
    compute_target = ws.compute_targets[compute_name]
    if compute_target and isinstance(compute_target, AmlCompute):
        print('found compute target. just use it. ' + compute_name)
else:
    print('creating a new compute target...')
    provisioning_config = AmlCompute.provisioning_configuration(vm_size = vm_size,
                                                                min_nodes = compute_min_nodes, 
                                                                max_nodes = compute_max_nodes)

    # create the cluster
    compute_target = ComputeTarget.create(ws, compute_name, provisioning_config)
    
    # can poll for a minimum number of nodes and for a specific timeout. 
    # if no min node count is provided it will use the scale settings for the cluster
    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)
    
    # For a more detailed view of current AmlCompute status, use get_status()
    print(compute_target.get_status().serialize())

found compute target. just use it. nv12


You can go to the portal to view existing compute targets or create a new target.
<br>
<img src="images/compute_target.png" width="1500">

Mount the datastore that you just registered so that the submitted run knows where to find the data. The mount point is passed to the entry script as the `--data-folder` argument.

In [18]:
script_params = {
    '--data-folder': ds.as_mount()
}
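On the script side, the entry script has to accept the `--data-folder` argument. The contents of `transformer.py` are not shown in this notebook, so the following is only a sketch of how such a script might parse the argument with `argparse`; the function name `parse_args` and the example path are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line arguments passed to the entry script."""
    parser = argparse.ArgumentParser(description="Train the translation model")
    # '--data-folder' matches the key used in script_params in the notebook.
    parser.add_argument(
        "--data-folder",
        dest="data_folder",
        type=str,
        required=True,
        help="path where the datastore is mounted on the compute target",
    )
    return parser.parse_args(argv)


# Hypothetical path, for illustration only; on the compute target the value
# is filled in by Azure ML when the run starts.
args = parse_args(["--data-folder", "/mnt/data"])
print(args.data_folder)  # /mnt/data
```

Inside the training script, paths to the data files can then be built relative to `args.data_folder`.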

# Create estimator and submit an experiment

In [17]:
pt_est = PyTorch(source_directory='./', 
                 script_params=script_params,
                 compute_target=compute_target,
                 entry_script='transformer.py',
                 pip_packages=['torchtext', 'bert-embedding'],  # extra pip packages to install for the run
                 use_gpu=True)

In [18]:
run = exp.submit(pt_est)

### To show the run details widget

In [None]:
from azureml.widgets import RunDetails
RunDetails(run).show()

You can retrieve the logs for the experiment runs in the portal. 
<br>
<img src="images/log_details.png" width="1500">

Even if you submit many runs, you do not need to track your script changes yourself, because Azure stores a snapshot of the scripts with each run. You can download the snapshot of any run from the portal.
<br>
<img src="images/snapshot.png" width="1500">

# Cancel runs

### To cancel the last run:

If you realize that you made a mistake in a submitted run, you can cancel it immediately.

In [10]:
local_script_run = exp.submit(pt_est)
print("Status after submit:", local_script_run.get_status())

local_script_run.cancel()
print("Did the run cancel?", local_script_run.get_status())

Did the run cancel? Canceled
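A status read right after `cancel()` may not always be final yet. The sketch below polls `get_status()` until the run reaches a terminal state or a timeout expires. The helper `wait_for_terminal_status` is hypothetical, the set of terminal state names is an assumption based on the statuses Azure ML reports, and `FakeRun` is a stand-in so the example runs without a workspace.

```python
import time

# Assumed terminal states; other statuses such as "Running" or
# "CancelRequested" are treated as still in progress.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def wait_for_terminal_status(run, poll_seconds=5.0, timeout_seconds=300.0):
    """Poll run.get_status() until a terminal state or the timeout is reached."""
    deadline = time.monotonic() + timeout_seconds
    status = run.get_status()
    while status not in TERMINAL_STATES and time.monotonic() < deadline:
        time.sleep(poll_seconds)
        status = run.get_status()
    return status


class FakeRun:
    """Stand-in for a run object; returns the given statuses in order."""

    def __init__(self, statuses):
        self._statuses = list(statuses)

    def get_status(self):
        if len(self._statuses) > 1:
            return self._statuses.pop(0)
        return self._statuses[0]


fake = FakeRun(["Running", "CancelRequested", "Canceled"])
print(wait_for_terminal_status(fake, poll_seconds=0))  # Canceled
```

With a real run you would pass `local_script_run` instead of `FakeRun` and keep a non-zero `poll_seconds`.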


### To cancel a run based on its run ID

If you submitted multiple runs and want to cancel a specific one, you can look up its `Run Id` in the portal and use it to cancel the run directly.

In [9]:
from azureml.core import get_run

run_cpu_id = 'machine_translation_1554915123_b8da5e9d'  # copy the Run Id from the portal
run = get_run(exp, run_cpu_id)
run.cancel()

Alternatively, navigate to the run in the portal and use the `cancel` button.
<br>
<img src="images/cancel_runs.png" width="800">
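When several runs are still active, cancelling them one by one becomes tedious. The following sketch cancels every run whose status is in an assumed set of active states. The helper `cancel_active_runs`, the state names in `ACTIVE_STATES`, and the `FakeRun` class are illustrative assumptions; with a real experiment you would pass `exp.get_runs()` as the argument.

```python
# Assumed names of states in which a run can still be cancelled.
ACTIVE_STATES = {"NotStarted", "Starting", "Preparing", "Queued", "Running"}


def cancel_active_runs(runs):
    """Cancel every run in an active state and return the cancelled run IDs."""
    cancelled_ids = []
    for run in runs:
        if run.get_status() in ACTIVE_STATES:
            run.cancel()
            cancelled_ids.append(run.id)
    return cancelled_ids


class FakeRun:
    """Stand-in for a run object so the sketch works without a workspace."""

    def __init__(self, run_id, status):
        self.id = run_id
        self._status = status

    def get_status(self):
        return self._status

    def cancel(self):
        self._status = "Canceled"


# Hypothetical run IDs, for illustration only.
runs = [FakeRun("run_a", "Running"), FakeRun("run_b", "Completed"), FakeRun("run_c", "Queued")]
print(cancel_active_runs(runs))  # ['run_a', 'run_c']
```

Runs that have already finished are left untouched, so the helper can be called repeatedly without side effects on completed work.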