# Identifying accents from speech spectrograms using a CNN in Azure ML Services

In this notebook we develop a convolutional neural network (CNN) using Keras with TensorFlow. The model is trained and evaluated with Azure Machine Learning Services, invoking the SDK API for each step.

## Importing libraries and configuring the Azure ML services

In [1]:
%matplotlib inline
import numpy as np
import os
import matplotlib.pyplot as plt

In [3]:
import azureml
from azureml.core import Workspace

# check core SDK version number
print("Azure ML SDK Version: ", azureml.core.VERSION)

Azure ML SDK Version:  1.0.48


## Defining Azure values in global variables

In [4]:
subscription_id = os.getenv("SUBSCRIPTION_ID", default="<SUBSCRIPTION_ID>")
resource_group = os.getenv("RESOURCE_GROUP", default="<RESOURCE_GROUP>")
workspace_name = os.getenv("WORKSPACE_NAME", default="<WORKSPACE_NAME>")
workspace_region = os.getenv("WORKSPACE_REGION", default="<WORKSPACE_REGION>")
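
The defaults above are placeholder strings such as "<SUBSCRIPTION_ID>", so a forgotten environment variable would only surface later as an authentication or lookup error. Below is a minimal sketch of an early check, assuming the same angle-bracket placeholder convention. The helper name `find_unset_settings` is hypothetical and not part of the Azure ML SDK.

```python
import os

def find_unset_settings(names):
    """Return the names whose value is empty or still an '<...>' placeholder."""
    unset = []
    for name in names:
        value = os.getenv(name, default="<" + name + ">")
        if not value or (value.startswith("<") and value.endswith(">")):
            unset.append(name)
    return unset

required = ["SUBSCRIPTION_ID", "RESOURCE_GROUP", "WORKSPACE_NAME", "WORKSPACE_REGION"]
missing = find_unset_settings(required)
if missing:
    print("Set these environment variables before continuing:", ", ".join(missing))
```

Running this before the workspace cell makes a missing setting visible immediately instead of during sign-in.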

### Initialize workspace
Initialize a Workspace object from an existing workspace, for example one created in a previous training session; if the workspace does not exist, it is created. The workspace details are then written to a config.json file.

In [5]:

try:
    ws = Workspace(subscription_id = subscription_id, resource_group = resource_group, workspace_name = workspace_name)
    # write the details of the workspace to a configuration file to the notebook library
    ws.write_config()
    print("Workspace configuration succeeded. Skip the workspace creation steps below")
except Exception:
    print("Workspace not accessible. Creating a new workspace below")
    # Create the workspace using the specified parameters
    ws = Workspace.create(name = workspace_name,
                      subscription_id = subscription_id,
                      resource_group = resource_group, 
                      location = workspace_region,
                      create_resource_group = True,
                      exist_ok = True)
    ws.get_details()

    # write the details of the workspace to a configuration file to the notebook library
    ws.write_config()

Performing interactive authentication. Please follow the instructions on the terminal.
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code FF33HUK5P to authenticate.
Interactive authentication successfully completed.
Workspace configuration succeeded. Skip the workspace creation steps below
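
`ws.write_config()` stores the workspace coordinates in a small JSON file so that later sessions can reconnect with `Workspace.from_config()` instead of repeating the IDs. The sketch below only illustrates what such a file holds. It writes a stand-in file with placeholder values to a temporary directory, and the three key names are an assumption about the SDK's layout, so check the file generated on your machine.

```python
import json
import os
import tempfile

# Placeholder values; the key names are assumed, not read from a real workspace.
example_config = {
    "subscription_id": "<SUBSCRIPTION_ID>",
    "resource_group": "<RESOURCE_GROUP>",
    "workspace_name": "<WORKSPACE_NAME>",
}

with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = os.path.join(tmp_dir, "config.json")
    with open(config_path, "w") as f:
        json.dump(example_config, f, indent=4)

    # Read the file back, as a later session could do to inspect it.
    with open(config_path) as f:
        loaded = json.load(f)

print(sorted(loaded))
```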


In [None]:
# Print the values for the ML workspace
print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep='\n')

## Create compute resources for our training experiments

In this section we obtain the compute resource on which the model will be trained. We look for an existing compute resource and create it if it does not exist. We can use a pre-created virtual machine, such as an Azure Data Science Virtual Machine, or create a new compute resource by specifying its type and size.

To create a cluster, you specify a compute configuration that defines the type of machine to use and the scaling behavior. You then choose a cluster name that is unique within the workspace, which is used to address the cluster later. Many VM sizes are available, for example "STANDARD_DS12_V2" or "STANDARD_D4_V2".

From Microsoft docs:

The cluster parameters are:

- vm_size: the virtual machine type and size used in the cluster. All machines in the cluster are the same type. You can list the VM sizes available in your region with the CLI command `az vm list-skus -o tsv`.
- min_nodes: the minimum size of the cluster. If you set the minimum to 0, the cluster shuts down all nodes while not in use. Setting it to a value higher than 0 allows faster start-up times, but you are also billed while the cluster is idle.
- max_nodes: the maximum size of the cluster. A larger number allows more concurrency and greater distribution of scale-out jobs.

To create a CPU cluster now, run the cell below. The autoscale settings mean that the cluster will scale down to 0 nodes when inactive and up to 4 nodes when busy.

In [7]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

# Choose a name for your CPU cluster
cpu_cluster_name = "cpucluster"
#cpu_cluster_name = "ML-VM-DSVM"

# Verify that cluster does not exist already
try:
    cpu_cluster = ComputeTarget(workspace=ws, name=cpu_cluster_name)
    print("Found existing cpucluster")
except ComputeTargetException:
    print("Creating new cpucluster")
    
    # Specify the configuration for the new cluster
    # "STANDARD_DS12_V2" "STANDARD_D4_V2"
    compute_config = AmlCompute.provisioning_configuration(vm_size="STANDARD_D12_V2",
                                                           min_nodes=0,
                                                           max_nodes=4)

    # Create the cluster with the specified name and configuration
    cpu_cluster = ComputeTarget.create(ws, cpu_cluster_name, compute_config)
    
    # Wait for the cluster to complete, show the output log
    cpu_cluster.wait_for_completion(show_output=True)

Found existing cpucluster
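
The effect of `min_nodes` and `max_nodes` described above can be pictured as clamping demand between two bounds. The function below is a toy model for intuition only, not the service's actual scheduling logic. The names `target_node_count` and `nodes_requested` are hypothetical.

```python
def target_node_count(nodes_requested, min_nodes=0, max_nodes=4):
    """Toy model: cluster size follows demand but stays within [min_nodes, max_nodes]."""
    return max(min_nodes, min(nodes_requested, max_nodes))

for requested in (0, 1, 3, 10):
    print(requested, "node(s) requested ->", target_node_count(requested), "node(s) running")
```

With the settings used in this notebook, an idle cluster drops to 0 nodes and a burst of work is capped at 4 nodes.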


### Create/Open an Azure ML experiment

Let's create an experiment named "speech". An experiment is a container that stores our training runs together with their metrics and outputs. We can then analyze and compare results across different architectures or model parameters. It serves as our diary while training and evaluating an ML model.

We will create a folder to hold the training scripts. The script runs will be recorded under the experiment in Azure.

In [None]:
from azureml.core import Experiment

script_folder = './speech_cnn'
os.makedirs(script_folder, exist_ok=True)

exp = Experiment(workspace=ws, name='speech')
print("Experiment: ",exp.name)
print("Experiments in WS: ",exp.list(ws))

### Upload Spectrograms of Speech dataset to default datastore

A datastore is a place where data can be stored and then made accessible to a Run, either by mounting or by copying the data to the compute target. A datastore can be backed by either Azure Blob Storage or an Azure File Share (ADLS will be supported in the future).
For simple data handling, each workspace provides a default datastore that can be used when the data is not already in Blob Storage or a File Share. We will use that default datastore.

In [9]:
ds = ws.get_default_datastore()

In the next step, we upload the training and test datasets to the workspace's default datastore, from which they will later be made available to the AmlCompute cluster for training.

In [9]:
ds.upload(src_dir='./data', target_path='speech_specs', overwrite=True, show_progress=True)

Uploading ./data/x_images_arrays_zip_1000.npz
Uploading ./data/y_infected_labels_1000.npz
Uploaded ./data/y_infected_labels_1000.npz, 1 files out of an estimated total of 2
Uploaded ./data/x_images_arrays_zip_1000.npz, 2 files out of an estimated total of 2


$AZUREML_DATAREFERENCE_b3879516944548e9b31877cc2fcb6b3c

## Get default Compute resource

Now we retrieve the compute resource, cpucluster, that we created previously.

In [12]:
# List all the existing compute resources to identify which one we would like to use.
compute_targets = ws.compute_targets
for name, ct in compute_targets.items():
    print(name, ct.type, ct.provisioning_state)

cpucluster AmlCompute Succeeded


In [18]:
from azureml.core.compute import ComputeTarget

compute_target = ComputeTarget(ws, 'cpucluster')
# use get_status() to get a detailed status for the current cluster. 
print(compute_target.get_status())

<azureml.core.compute.amlcompute.AmlComputeStatus object at 0x7f7214f2b160>


In [None]:
cpu_cluster

## Copy the training files into the script folder

The script folder should contain all the files the Azure job needs to train the model. A single .py file defines all the steps of the training process: read the data, split the data, create the model, set the parameters, compile and fit the model, and evaluate it.

In the next cell we copy that .py file to the script folder.

In [23]:
import shutil

# the training logic is in the train_cnn_gen.py file.
shutil.copy('./train_cnn_gen.py', script_folder)

'./speech_cnn/train_cnn_gen.py'

In [24]:
script_folder

'./speech_cnn'

## Create TensorFlow estimator & add Keras
Next, we construct an azureml.train.dnn.TensorFlow estimator object, assign it the compute target, and pass the datastore reference to the training code as a parameter. Because the reference is created with as_download(), the data is copied to the compute target rather than mounted. The TensorFlow estimator provides a simple way of launching a TensorFlow training job on a compute target. It automatically provides a Docker image that has TensorFlow installed. In this case, we add the keras package (for the Keras framework), the matplotlib package for plotting an "Accuracy vs Loss" chart that is recorded in the run history, and scikit-learn as a conda package.

In [25]:
from azureml.train.dnn import TensorFlow

script_params = {
    '--data-folder': ds.path('speech_specs').as_download(),
    '--batch-size': 32,
    '--x_filename': 'x_speech_arrays_zip_4500.npz',
    '--y_filename': 'y_speech_labels_4500.npz',
    '--training_size': '4500',
    '--n_epochs': 100
}

est = TensorFlow(source_directory=script_folder,
                 script_params=script_params,
                 compute_target=cpu_cluster, 
                 pip_packages=['keras', 'matplotlib'],
                 conda_packages=['scikit-learn'],
                 entry_script='train_cnn_gen.py', 
                 use_gpu=False)
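
The keys of `script_params` become command-line flags of the entry script, and the `arguments` list in the run details further down shows the flattened form. The real `train_cnn_gen.py` is not shown in this notebook, so the parser below is only a guess at how such a script could read those flags. The argument types and the helper names `to_argv` and `build_parser` are assumptions, and the data folder is replaced by a literal path for illustration.

```python
import argparse

def to_argv(params):
    """Flatten a {flag: value} dict into the list of strings a script receives."""
    argv = []
    for flag, value in params.items():
        argv.extend([flag, str(value)])
    return argv

def build_parser():
    # Hypothetical parser; the actual training script may differ.
    parser = argparse.ArgumentParser(description="Sketch of a training-script argument parser")
    parser.add_argument("--data-folder", type=str, help="folder holding the .npz files")
    parser.add_argument("--batch-size", type=int, default=32)
    parser.add_argument("--x_filename", type=str)
    parser.add_argument("--y_filename", type=str)
    parser.add_argument("--training_size", type=int)
    parser.add_argument("--n_epochs", type=int, default=100)
    return parser

example_params = {
    "--data-folder": "/tmp/speech_specs",  # stand-in for the datastore reference
    "--batch-size": 32,
    "--x_filename": "x_speech_arrays_zip_4500.npz",
    "--y_filename": "y_speech_labels_4500.npz",
    "--training_size": "4500",
    "--n_epochs": 100,
}

args = build_parser().parse_args(to_argv(example_params))
print(args.data_folder, args.batch_size, args.training_size, args.n_epochs)
```

Note that argparse exposes `--data-folder` and `--batch-size` as `args.data_folder` and `args.batch_size`, replacing the hyphens with underscores.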



## Submit job to run
Submit the estimator to the Azure ML experiment to kick off the execution.

In [26]:
run = exp.submit(est)

### Monitor the Run
As the Run is executed, it will go through the following stages:

Preparing: A Docker image matching the Python environment specified by the TensorFlow estimator is created and uploaded to the workspace's Azure Container Registry. This step happens only once for each Python environment; the container is then cached for subsequent runs. Creating and uploading the image takes about 5 minutes. While the job is preparing, logs are streamed to the run history and can be viewed to monitor the progress of the image creation.

Scaling: If the compute needs to be scaled up (i.e. the AmlCompute cluster requires more nodes to execute the run than are currently available), the cluster will attempt to scale up to make the required number of nodes available. Scaling typically takes about 5 minutes.

Running: All scripts in the script folder are uploaded to the compute target, datastores are mounted or copied, and the entry_script is executed. While the job is running, stdout and the ./logs folder are streamed to the run history and can be viewed to monitor the progress of the run.

Post-Processing: The ./outputs folder of the run is copied over to the run history.
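
Because the ./outputs folder is what gets copied to the run history during post-processing, a training script should save anything it wants to keep there. Below is a minimal sketch of that pattern, using a temporary directory as the working directory and a small JSON file as a stand-in for a model artifact. The file name `summary.json` and the helper `save_artifact` are hypothetical.

```python
import json
import os
import tempfile

def save_artifact(base_dir, relative_path, payload):
    """Write a JSON payload under <base_dir>/outputs/, creating folders as needed."""
    target = os.path.join(base_dir, "outputs", relative_path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "w") as f:
        json.dump(payload, f)
    return target

with tempfile.TemporaryDirectory() as work_dir:
    saved_path = save_artifact(work_dir, os.path.join("model", "summary.json"), {"n_epochs": 100})
    saved_exists = os.path.exists(saved_path)
    print(os.path.relpath(saved_path, work_dir))
```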

There are multiple ways to check the progress of a running job. One is a Jupyter notebook widget.

Note: The widget updates automatically every 10-15 seconds, always showing the most up-to-date information about the run.

In [27]:
from azureml.widgets import RunDetails
RunDetails(run).show()




We can also periodically check the status of the run object, or navigate to the Azure portal to monitor the run.

In [None]:
run.wait_for_completion(show_output=True)
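
Checking the run periodically amounts to a polling loop over `run.get_status()` until a terminal state is reached. The sketch below uses a stand-in `FakeRun` class so that it can run without a workspace. The set of terminal states and the helper `poll_until_done` are assumptions for illustration.

```python
import time

TERMINAL_STATES = {"Completed", "Failed", "Canceled"}  # assumed terminal states

class FakeRun:
    """Stand-in for a run object; yields a scripted sequence of statuses."""
    def __init__(self, statuses):
        self._statuses = list(statuses)

    def get_status(self):
        # Return the next scripted status, then keep repeating the last one.
        if len(self._statuses) > 1:
            return self._statuses.pop(0)
        return self._statuses[0]

def poll_until_done(run, interval_seconds=0.0, max_polls=100):
    """Poll run.get_status() until a terminal state or the poll limit is reached."""
    seen = []
    for _ in range(max_polls):
        status = run.get_status()
        seen.append(status)
        if status in TERMINAL_STATES:
            break
        time.sleep(interval_seconds)
    return seen

history = poll_until_done(FakeRun(["Preparing", "Queued", "Running", "Completed"]))
print(" -> ".join(history))
```

With a real run, a longer `interval_seconds` (for example 30) avoids querying the service more often than needed; `max_polls` keeps the loop from running forever.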

## Show some metrics from the experiment run

In [16]:
run.get_metrics()

{'Loss': [1.0955833164708955,
  1.075153040779011,
  1.0529363508181722,
  1.0343222243903463,
  1.023221089166376,
  0.9967844357939579,
  0.9911308208388598,
  0.9797448336810809,
  0.9574432784666395,
  0.9499125918999916,
  0.9314793423152291,
  0.9293175091123367,
  0.9167442514222833,
  0.9081856488112376,
  0.8985620865372799,
  0.8870494969757149,
  0.8835849441220408,
  0.8736895226576937,
  0.8554565441448058,
  0.8586099487249093,
  0.8246951279618815,
  0.8404530548728635,
  0.8231880239841649,
  0.8058394335310556,
  0.8093994051351675,
  0.7874928226385416,
  0.7822506700396004,
  0.7824323663797079,
  0.7598635075872789,
  0.756289806601178,
  0.7387851864233145,
  0.7325044742079594,
  0.733957331811366,
  0.7251220433701314,
  0.7126694824128942,
  0.7044057672333824,
  0.7086608976526645,
  0.6988241864961359,
  0.6883164684334143,
  0.6872050270371373,
  0.6763843257865564,
  0.658654801781402,
  0.6673667163592283,
  0.6511916895083782,
  0.6381610899228152,
  0.646
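
`run.get_metrics()` returns a plain dictionary, so the logged series can be inspected with ordinary Python. Below is a minimal sketch that uses only the first five `Loss` values from the output above as sample data. The helper name `summarize_series` is hypothetical.

```python
def summarize_series(values):
    """Return (index of the lowest value, lowest value, drop from first to last)."""
    best_index = min(range(len(values)), key=lambda i: values[i])
    return best_index, values[best_index], values[0] - values[-1]

# First five values copied from the 'Loss' list shown above.
metrics = {"Loss": [1.0955833164708955, 1.075153040779011, 1.0529363508181722,
                    1.0343222243903463, 1.023221089166376]}

best_epoch, best_loss, total_drop = summarize_series(metrics["Loss"])
print("Lowest loss %.4f at epoch index %d; total drop %.4f" % (best_loss, best_epoch, total_drop))
```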

In [17]:
run.get_details()

{'runId': 'speech_1562584118_2df48015',
 'target': 'cpucluster',
 'status': 'Running',
 'startTimeUtc': '2019-07-08T11:12:53.863613Z',
 'properties': {'azureml.runsource': 'experiment',
  'ContentSnapshotId': 'edac3eca-cfa8-4a4c-bd34-1cd41b45c850',
  'AzureML.DerivedImageName': 'azureml/azureml_d11eca37f102303834454016a14b7700'},
 'runDefinition': {'script': 'train_cnn_gen_large.py',
  'arguments': ['--data-folder',
   '$AZUREML_DATAREFERENCE_3983386c7b6245298bb9b4d5e56cbee3',
   '--batch-size',
   '32',
   '--x_filename',
   'x_speech_arrays_zip_4500.npz',
   '--y_filename',
   'y_speech_labels_4500.npz',
   '--training_size',
   '4500',
   '--n_epochs',
   '60'],
  'sourceDirectoryDataStore': None,
  'framework': 'Python',
  'communicator': 'None',
  'target': 'cpucluster',
  'dataReferences': {'3983386c7b6245298bb9b4d5e56cbee3': {'dataStoreName': 'workspaceblobstore',
    'mode': 'Download',
    'pathOnDataStore': 'speech_specs',
    'pathOnCompute': None,
    'overwrite': False}},


In [28]:
run.get_file_names()

['Accuracy vs Loss.png',
 'azureml-logs/55_batchai_execution.txt',
 'azureml-logs/60_control_log.txt',
 'azureml-logs/80_driver_log.txt',
 'logs/azureml/azureml.log',
 'outputs/model/model.h5',
 'outputs/model/model.json']
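
The list returned by `run.get_file_names()` mixes logs, plots and model files. Below is a short sketch of picking out the artifacts under `outputs/`, using the list shown above as sample data. Each selected name could then be passed to `run.download_file(...)`, which is left as a comment here because it needs a live run.

```python
file_names = [
    "Accuracy vs Loss.png",
    "azureml-logs/55_batchai_execution.txt",
    "azureml-logs/60_control_log.txt",
    "azureml-logs/80_driver_log.txt",
    "logs/azureml/azureml.log",
    "outputs/model/model.h5",
    "outputs/model/model.json",
]

# Keep only the files that the training script saved under outputs/.
model_files = [name for name in file_names if name.startswith("outputs/")]

for name in model_files:
    # With a live run: run.download_file(name, output_file_path=...)
    print(name)
```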

In [28]:
run.cancel()