# Create Azure and Batch AI Resources
In this notebook we will create the resources needed to train a [ResNet50](https://arxiv.org/abs/1512.03385) model in a distributed fashion using [Horovod](https://github.com/uber/horovod) on the ImageNet dataset. If you plan to use fake data, you can skip the sections marked optional. This notebook takes you through the following steps:
 * [Create Azure Resources](#azure_resources)
 * [Upload Data to Blob (Optional)](#upload_data)
 * [Create Fileserver (NFS) (Optional)](#create_fileshare)
 * [Configure Batch AI Cluster](#configure_cluster)

In [3]:
import sys
sys.path.append("../common") 

from dotenv import dotenv_values, set_key, find_dotenv, get_key
from getpass import getpass
import os
import json
from utils import get_password, write_json_to_file, dotenv_for
from pathlib import Path

Below are the variables that describe our experiment. By default we use NC24rs_v3 (Standard_NC24rs_v3) VMs, which have V100 GPUs and InfiniBand. The default is 2 nodes with 4 GPUs each, for a total of 8 GPUs. Feel free to increase the number of nodes, but be aware of any quota limits on your subscription.

Set `USE_FAKE` to True if you want to use fake data rather than the ImageNet dataset. This is often a good way to debug your model and to measure how much overhead IO adds.

In [4]:
# Variables for Batch AI - change as necessary
ID                     = "ddpytorch"
GROUP_NAME             = f"batch{ID}rg"
STORAGE_ACCOUNT_NAME   = f"batch{ID}st"
FILE_SHARE_NAME        = f"batch{ID}share"
SELECTED_SUBSCRIPTION  = "Boston Team Danielle" #"<YOUR SUBSCRIPTION>"
WORKSPACE              = "workspace"
NUM_NODES              = 2
CLUSTER_NAME           = "msv100"
VM_SIZE                = "Standard_NC24rs_v3"
GPU_TYPE               = "V100"
PROCESSES_PER_NODE     = 4
LOCATION               = "eastus"
NFS_NAME               = f"batch{ID}nfs"
EXPERIMENT             = f"distributed_pytorch_{GPU_TYPE}"
USERNAME               = "batchai_user"
USE_FAKE               = False
DOCKERHUB              = "masalvar" #"<YOUR DOCKERHUB>"
DATA                   = Path("/data/imagenet")
CONTAINER_NAME         = f"batch{ID}container"
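
Several of the names above are derived from `ID`, and Azure enforces naming rules on some of them. Storage account names, for instance, must be 3 to 24 characters long and contain only lowercase letters and digits. The check below is a minimal sketch of that rule; `is_valid_storage_account_name` is a hypothetical helper and is not part of the notebook's `utils` module.

```python
import re


def is_valid_storage_account_name(name: str) -> bool:
    """Return True if `name` is 3-24 characters of lowercase letters and digits."""
    return re.fullmatch(r"[a-z0-9]{3,24}", name) is not None


# The default ID from the variables cell, repeated so this sketch runs on its own.
ID = "ddpytorch"
print(is_valid_storage_account_name(f"batch{ID}st"))
print(is_valid_storage_account_name("Batch_Storage"))
```

Running a check like this before creating resources can catch a badly chosen `ID` early, before any `az` command fails.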

In [5]:
FAKE='-env FAKE=True' if USE_FAKE else ''
TOTAL_PROCESSES = PROCESSES_PER_NODE * NUM_NODES
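
As a quick check of the arithmetic described earlier, the default settings give 2 nodes x 4 processes per node = 8 processes, one per GPU. The sketch below recomputes both derived values with the defaults passed in explicitly so that it runs on its own; `launch_settings` is a hypothetical helper name.

```python
def launch_settings(num_nodes: int, processes_per_node: int, use_fake: bool):
    """Return (total_processes, fake_flag), mirroring the cell above."""
    total = num_nodes * processes_per_node
    fake_flag = "-env FAKE=True" if use_fake else ""
    return total, fake_flag


total, flag = launch_settings(num_nodes=2, processes_per_node=4, use_fake=False)
print(total)       # 8
print(repr(flag))  # ''
```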

<a id='azure_resources'></a>
## Create Azure Resources
First we need to log in to our Azure account. 

In [6]:
!az login -o table

To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code AVJMP2QER to authenticate.
CloudName    IsDefault    Name                        State     TenantId
-----------  -----------  --------------------------  --------  ------------------------------------
AzureCloud   True         Visual Studio Enterprise    Enabled   72f988bf-86f1-41af-91ab-2d7cd011db47
AzureCloud   False        Azure Internal - London     Enabled   72f988bf-86f1-41af-91ab-2d7cd011db47
AzureCloud   False        Boston Team Danielle        Enabled   72f988bf-86f1-41af-91ab-2d7cd011db47
AzureCloud   False        Cosmos_WDG_Core_BnB_100348  Enabled   72f988bf-86f1-41af-91ab-2d7cd011db47
AzureCloud   False        PhillyExt                   Enabled   72f988bf-86f1-41af-91ab-2d7cd011db47
AzureCloud   False        PhillyInt                   Enabled   72f988bf-86f1-41af-91ab-2d7cd011db47
AzureCloud   False        Azure Cat E2E               Enabled   72f988bf-86f1-41af-9

If you have more than one Azure subscription, select the one you want to use with the command below. If you only have one subscription you can skip this step.

In [7]:
!az account set --subscription "$SELECTED_SUBSCRIPTION"

In [8]:
!az account list -o table

A few accounts are skipped as they don't have 'Enabled' state. Use '--all' to display them.
Name                        CloudName    SubscriptionId                        State    IsDefault
--------------------------  -----------  ------------------------------------  -------  -----------
Visual Studio Enterprise    AzureCloud   fb11e9eb-22e1-4347-8d0a-84ef60157664  Enabled  False
Azure Internal - London     AzureCloud   1ba81249-8edd-4619-a486-3d28a2176aad  Enabled  False
Boston Team Danielle        AzureCloud   edf507a2-6235-46c5-b560-fd463ba2e771  Enabled  True
Cosmos_WDG_Core_BnB_100348  AzureCloud   dae41bd3-9db4-4b9b-943e-832b57cac828  Enabled  False
PhillyExt                   AzureCloud   a20c82c7-4497-4d44-952a-3105f790e26b  Enabled  False
PhillyInt                   AzureCloud   d50e5f6b-6c27-4ab1-8587-3d85cef6426e  Enabled  False
Azure Cat E2E               AzureCloud   fc4ea3c9-1d30-4f18-b33b-7404e7da0123  Enabled  False
CAT_Eng                     AzureC

Next we create the resource group that will hold all our Azure resources.

In [9]:
!az group create -n $GROUP_NAME -l $LOCATION -o table

Location    Name
----------  ----------------
eastus      batchddpytorchrg


Next we create the storage account. It will hold the fileshare where all the outputs from the jobs are stored.

In [10]:
json_data = !az storage account create -l $LOCATION -n $STORAGE_ACCOUNT_NAME -g $GROUP_NAME --sku Standard_LRS
print('Storage account {} provisioning state: {}'.format(STORAGE_ACCOUNT_NAME, 
                                                         json.loads(''.join(json_data))['provisioningState']))

Storage account batchddpytorchst provisioning state: Succeeded


In [11]:
json_data = !az storage account keys list -n $STORAGE_ACCOUNT_NAME -g $GROUP_NAME
storage_account_key = json.loads(''.join([i for i in json_data if 'WARNING' not in i]))[0]['value']
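
The `!az ...` syntax returns the command output as a list of lines, which may include warning lines ahead of the JSON. The key lookup above joins the lines that do not contain `WARNING` and parses the result. A minimal sketch of that pattern as a reusable function follows; `parse_cli_json` is a hypothetical name and the sample lines are made up for illustration.

```python
import json


def parse_cli_json(lines):
    """Join CLI output lines, skipping any that contain 'WARNING', and parse as JSON."""
    return json.loads("".join(line for line in lines if "WARNING" not in line))


# Made-up output in the shape of a key listing; the value is not a real key.
sample_output = [
    "WARNING: this line is not JSON",
    "[",
    '  {"keyName": "key1", "value": "not-a-real-key"}',
    "]",
]
keys = parse_cli_json(sample_output)
print(keys[0]["value"])
```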

In [12]:
!az storage share create --account-name $STORAGE_ACCOUNT_NAME \
--account-key $storage_account_key --name $FILE_SHARE_NAME

{
  "created": true
}


In [13]:
!az storage directory create --share-name $FILE_SHARE_NAME  --name scripts \
--account-name $STORAGE_ACCOUNT_NAME --account-key $storage_account_key

{
  "created": true
}


Here we set some defaults so that we don't have to add them to every command.

In [14]:
!az configure --defaults location=$LOCATION
!az configure --defaults group=$GROUP_NAME

In [15]:
%env AZURE_STORAGE_ACCOUNT $STORAGE_ACCOUNT_NAME
%env AZURE_STORAGE_KEY=$storage_account_key

env: AZURE_STORAGE_ACCOUNT=batchddpytorchst
env: AZURE_STORAGE_KEY=WrYVH6klSlvcccnpbLPD5dBTGuEEhIl9gf+1a3TsLuxo5566hx4NXnkMAm2MVNVxPFqYhuSSgdBhQ0ln4AC6nA==
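
The Azure CLI storage commands read `AZURE_STORAGE_ACCOUNT` and `AZURE_STORAGE_KEY` from the environment when the account name and key are not passed explicitly. Outside IPython, where the `%env` magic is not available, the same variables could be set with `os.environ`. This is a sketch with placeholder values, not real credentials; `set_storage_env` is a hypothetical helper.

```python
import os


def set_storage_env(account_name: str, account_key: str) -> None:
    """Export storage credentials so that child processes such as `az` inherit them."""
    os.environ["AZURE_STORAGE_ACCOUNT"] = account_name
    os.environ["AZURE_STORAGE_KEY"] = account_key


# Placeholder key: substitute the value returned by `az storage account keys list`.
set_storage_env("batchddpytorchst", "<storage-account-key>")
print(os.environ["AZURE_STORAGE_ACCOUNT"])
```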


#### Create Workspace
Batch AI has the concept of workspaces and experiments. Below we will create the workspace for our work.

In [33]:
!az batchai workspace create -n $WORKSPACE -g $GROUP_NAME

{
  "creationTime": "2018-11-28T10:15:46.868000+00:00",
  "id": "/subscriptions/edf507a2-6235-46c5-b560-fd463ba2e771/resourceGroups/batchddpytorchrg/providers/Microsoft.BatchAI/workspaces/workspace",
  "location": "eastus",
  "name": "workspace",
  "provisioningState": "succeeded",
  "provisioningStateTransitionTime": "2018-11-28T10:15:46.868000+00:00",
  "resourceGroup": "batchddpytorchrg",
  "tags": null,
  "type": "Microsoft.BatchAI/workspaces"
}


<a id='upload_data'></a>
## Upload Data to Blob (Optional)
In this section we will create a blob container and upload the ImageNet data we prepared locally in the previous notebook.
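
The upload commands in this section address each blob with a URL of the form `https://<account>.blob.core.windows.net/<container>/<blob>`. A small sketch of building that URL follows; `blob_url` is a hypothetical helper, and the values mirror the defaults set at the top of the notebook.

```python
def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build the HTTPS URL of a blob in an Azure storage account."""
    return f"https://{account}.blob.core.windows.net/{container}/{blob_name}"


print(blob_url("batchddpytorchst", "batchddpytorchcontainer", "train.tar.gz"))
```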

In [16]:
!az storage container create --account-name {STORAGE_ACCOUNT_NAME} \
                             --account-key {storage_account_key} \
                             --name {CONTAINER_NAME}

{
  "created": true
}


In [20]:
# Should take about 20 minutes
!azcopy --source {DATA/"train.tar.gz"} \
--destination https://{STORAGE_ACCOUNT_NAME}.blob.core.windows.net/{CONTAINER_NAME}/train.tar.gz \
--dest-key {storage_account_key} --quiet

[?1h=[6nFinished: 0 file(s), 0 B; Average Speed:0 B/s.                                 [6n[1;1H[6nFinished: 0 file(s), 0 B; Average Speed:0 B/s.                                 [6n[1;1H[6nFinished: 0 file(s), 0 B; Average Speed:0 B/s.                                 [6n[1;1H[6nFinished: 0 file(s), 0 B; Average Speed:0 B/s.                                 [6n[1;1H[6nFinished: 0 file(s), 4 MB; Average Speed:428.89 KB/s.                          [6n[1;1H[6nFinished: 0 file(s), 148 MB; Average Speed:12.78 MB/s.                         [6n[1;1H[6nFinished: 0 file(s), 504 MB; Average Speed:36.97 MB/s.                         [6n[1;1H[6nFinished: 0 file(s), 596 MB; Average Speed:38.01 MB/s.                         [6n[1;1H[6nFinished: 0 file(s), 832 MB; Average Speed:46.93 MB/s.                         [6n[1;1H[6nFinished: 0 file(s), 1.098 GB; Average Speed:56.84 MB/s.                       [6n[1;1H[6nFinished: 0 file(s), 1.102 GB; Average Speed:51.69 MB/s.   

Finished: 0 file(s), 21.426 GB; Average Speed:105.35 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 21.813 GB; Average Speed:106.21 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 22.02 GB; Average Speed:106.18 MB/s.                      [6n[1;1H[6nFinished: 0 file(s), 22.301 GB; Average Speed:106.51 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 22.641 GB; Average Speed:107.11 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 22.867 GB; Average Speed:107.17 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 22.914 GB; Average Speed:106.39 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 23.395 GB; Average Speed:107.62 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 23.754 GB; Average Speed:108.28 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 24.152 GB; Average Speed:109.1 MB/s.                      [6n[1;1H[6nFinished: 0 file(s), 24.645 GB; Average Speed:110.33 MB/s.            

[1;1H[6nFinished: 0 file(s), 46.543 GB; Average Speed:118.35 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 46.91 GB; Average Speed:118.68 MB/s.                      [6n[1;1H[6nFinished: 0 file(s), 47.477 GB; Average Speed:119.51 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 47.832 GB; Average Speed:119.8 MB/s.                      [6n[1;1H[6nFinished: 0 file(s), 48.223 GB; Average Speed:120.18 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 48.668 GB; Average Speed:120.69 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 49.051 GB; Average Speed:121.04 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 49.32 GB; Average Speed:121.11 MB/s.                      [6n[1;1H[6nFinished: 0 file(s), 49.762 GB; Average Speed:121.59 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 49.902 GB; Average Speed:121.34 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 50.336 GB; Average Speed:121.81 MB/s.  

Finished: 0 file(s), 72.52 GB; Average Speed:127.15 MB/s.                      [6n[1;1H[6nFinished: 0 file(s), 72.93 GB; Average Speed:127.42 MB/s.                      [6n[1;1H[6nFinished: 0 file(s), 73.172 GB; Average Speed:127.4 MB/s.                      [6n[1;1H[6nFinished: 0 file(s), 73.453 GB; Average Speed:127.45 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 74.059 GB; Average Speed:128.05 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 74.098 GB; Average Speed:127.68 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 74.469 GB; Average Speed:127.88 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 74.809 GB; Average Speed:128.02 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 75.316 GB; Average Speed:128.45 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 75.391 GB; Average Speed:128.14 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 75.395 GB; Average Speed:127.23 MB/s.            

[1;1H[6nFinished: 0 file(s), 99.938 GB; Average Speed:133.15 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 100.23 GB; Average Speed:133.19 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 100.234 GB; Average Speed:132.83 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 100.859 GB; Average Speed:133.31 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 101.113 GB; Average Speed:133.29 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 101.328 GB; Average Speed:133.22 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 101.57 GB; Average Speed:133.19 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 101.684 GB; Average Speed:132.99 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 101.965 GB; Average Speed:133.01 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 102.41 GB; Average Speed:133.24 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 102.746 GB; Average Speed:133.33 MB/s. 

Finished: 0 file(s), 124.598 GB; Average Speed:134.11 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 124.988 GB; Average Speed:134.24 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 125.332 GB; Average Speed:134.32 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 125.715 GB; Average Speed:134.45 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 126.113 GB; Average Speed:134.58 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 126.117 GB; Average Speed:134.16 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 126.559 GB; Average Speed:134.34 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 127.02 GB; Average Speed:134.55 MB/s.                     [6n[1;1H[6nFinished: 0 file(s), 127.023 GB; Average Speed:134.26 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 127.316 GB; Average Speed:134.28 MB/s.                    [6n[1;1H[6nFinished: 0 file(s), 127.531 GB; Average Speed:134.23 MB/s.           

In [21]:
!azcopy --source {DATA/"validation.tar.gz"} \
--destination https://{STORAGE_ACCOUNT_NAME}.blob.core.windows.net/{CONTAINER_NAME}/validation.tar.gz \
--dest-key {storage_account_key} --quiet

[?1h=[6nFinished: 0 file(s), 0 B; Average Speed:0 B/s.                                 [6n[1;1H[6nFinished: 0 file(s), 0 B; Average Speed:0 B/s.                                 [6n[1;1H[6nFinished: 0 file(s), 0 B; Average Speed:0 B/s.                                 [6n[1;1H[6nFinished: 0 file(s), 0 B; Average Speed:0 B/s.                                 [6n[1;1H[6nFinished: 0 file(s), 4 MB; Average Speed:448.83 KB/s.                          [6n[1;1H[6nFinished: 0 file(s), 180 MB; Average Speed:16.1 MB/s.                          [6n[1;1H[6nFinished: 0 file(s), 592 MB; Average Speed:44.76 MB/s.                         [6n[1;1H[6nFinished: 0 file(s), 688 MB; Average Speed:45.04 MB/s.                         [6n[1;1H[6nFinished: 0 file(s), 1.129 GB; Average Speed:66.73 MB/s.                       [6n[1;1H[6nFinished: 0 file(s), 1.133 GB; Average Speed:59.88 MB/s.                       [6n[1;1H[6nFinished: 0 file(s), 1.508 GB; Average Speed:72.09 MB/s.   

<a id='create_fileshare'></a>
## Create Fileserver (Optional)
In this example we will store the data on an NFS fileshare. Batch AI supports many storage solutions, and NFS offers the best tradeoff between performance and ease of use. The best performance is achieved by loading the data from local disk, but this can be cumbersome since every node has to download the data, which can take hours for the ImageNet dataset.

In [40]:
!az batchai file-server create -n $NFS_NAME --disk-count 4 --disk-size 250 -w $WORKSPACE \
-s Standard_DS4_v2 -u $USERNAME -p {get_password(dotenv_for())} -g $GROUP_NAME --storage-sku Premium_LRS

Password not set
Please enter password to use for the cluster········
{
  "creationTime": "2018-11-28T11:21:55.710000+00:00",
  "dataDisks": {
    "cachingType": "none",
    "diskCount": 4,
    "diskSizeInGb": 250,
    "storageAccountType": "Premium_LRS"
  },
  "id": "/subscriptions/edf507a2-6235-46c5-b560-fd463ba2e771/resourceGroups/batchddpytorchrg/providers/Microsoft.BatchAI/workspaces/workspace/fileservers/batchddpytorchnfs",
  "mountSettings": {
    "fileServerInternalIp": "10.0.0.4",
    "fileServerPublicIp": "40.114.88.44",
    "mountPoint": "/data"
  },
  "name": "batchddpytorchnfs",
  "provisioningState": "succeeded",
  "provisioningStateTransitionTime": "2018-11-28T11:25:23.686000+00:00",
  "resourceGroup": "batchddpytorchrg",
  "sshConfiguration": {
    "publicIpsToAllow": null,
    "userAccountSettings": {
      "adminUserName": "batchai_user",
      "adminUserPassword": null,
      "adminUserSshPublicKey": null
    }
  },
  "subnet": {
    "id": "/subscript

In [41]:
!az batchai file-server list -o table -w $WORKSPACE -g $GROUP_NAME

Name               Resource Group    Size             Disks       Public IP     Internal IP    Mount Point
-----------------  ----------------  ---------------  ----------  ------------  -------------  -------------
batchddpytorchnfs  batchddpytorchrg  Standard_DS4_v2  4 x 250 Gb  40.114.88.44  10.0.0.4       /data


In [42]:
json_data = !az batchai file-server list -w $WORKSPACE -g $GROUP_NAME
nfs_ip=json.loads(''.join([i for i in json_data if 'WARNING' not in i]))[0]['mountSettings']['fileServerPublicIp']
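
The lookup above takes the first entry of the list, which is fine while the workspace holds a single file server. If the workspace may hold several, selecting by name is safer. The following is a sketch under that assumption; `file_server_public_ip` is a hypothetical helper, and the sample record keeps only fields that appear in the output earlier in this section.

```python
def file_server_public_ip(servers, name):
    """Return the public IP of the file server called `name`."""
    for server in servers:
        if server["name"] == name:
            return server["mountSettings"]["fileServerPublicIp"]
    raise KeyError(f"No file server named {name!r}")


# Trimmed-down record in the shape returned by `az batchai file-server list`.
servers = [
    {
        "name": "batchddpytorchnfs",
        "mountSettings": {
            "fileServerInternalIp": "10.0.0.4",
            "fileServerPublicIp": "40.114.88.44",
            "mountPoint": "/data",
        },
    }
]
print(file_server_public_ip(servers, "batchddpytorchnfs"))
```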

After we have created the NFS share we need to copy the data to it. To do this we write the script below which will be executed on the fileserver. It installs a tool called azcopy and then downloads and extracts the data to the appropriate directory.

In [43]:
nodeprep_script = f"""#!/usr/bin/env bash
wget https://gist.githubusercontent.com/msalvaris/073c28a9993d58498957294d20d74202/raw/87a78275879f7c9bb8d6fb9de8a2d2996bb66c24/install_azcopy
chmod 777 install_azcopy
sudo ./install_azcopy

mkdir -p /data/imagenet

azcopy --source https://{STORAGE_ACCOUNT_NAME}.blob.core.windows.net/{CONTAINER_NAME}/validation.tar.gz \
        --destination  /data/imagenet/validation.tar.gz\
        --source-key {storage_account_key}\
        --quiet


azcopy --source https://{STORAGE_ACCOUNT_NAME}.blob.core.windows.net/{CONTAINER_NAME}/train.tar.gz \
        --destination  /data/imagenet/train.tar.gz\
        --source-key {storage_account_key}\
        --quiet

cd /data/imagenet
tar -xzf train.tar.gz
tar -xzf validation.tar.gz
"""

In [44]:
with open('nodeprep.sh', 'w') as f:
    f.write(nodeprep_script)
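
Two details of the script string are easy to miss. Inside a regular (non-raw) triple-quoted Python string, a backslash at the end of a line joins that line with the next, so each multi-line `azcopy` call is written to `nodeprep.sh` as a single line. Also, a shebang is honored only when `#!` is at the very start of the file. The sketch below demonstrates both points with a toy string; the variable names and the `echo` command are illustrative only.

```python
name = "world"

# The backslash-newline pair is removed by Python when the literal is parsed,
# so the two source lines of the echo command become one line of text.
script = f"""#!/usr/bin/env bash
echo hello \
     {name}
"""

lines = script.splitlines()
print(len(lines))                # 2
print(script.startswith("#!"))   # True
```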

Next we copy the script to the NFS VM and run it there. This installs azcopy, then downloads and extracts the data.

In [45]:
!sshpass -p {get_password(dotenv_for())} scp -o "StrictHostKeyChecking=no" nodeprep.sh $USERNAME@{nfs_ip}:~/



In [46]:
!sshpass -p {get_password(dotenv_for())} ssh -o "StrictHostKeyChecking=no" $USERNAME@{nfs_ip} "sudo chmod 777 ~/nodeprep.sh && ./nodeprep.sh"

--2018-11-28 11:27:12--  https://gist.githubusercontent.com/msalvaris/073c28a9993d58498957294d20d74202/raw/87a78275879f7c9bb8d6fb9de8a2d2996bb66c24/install_azcopy
Resolving gist.githubusercontent.com (gist.githubusercontent.com)... 151.101.32.133
Connecting to gist.githubusercontent.com (gist.githubusercontent.com)|151.101.32.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 481 [text/plain]
Saving to: ‘install_azcopy’

     0K                                                       100%  107M=0s

2018-11-28 11:27:12 (107 MB/s) - ‘install_azcopy’ saved [481/481]

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   983  100   983    0     0   4168      0 --:--:-- --:--:-- --:--:--  4182
Hit:1 http://azure.archive.ubuntu.com/ubuntu xenial InRelease
Get:2 http://azure.archive.ubuntu.com/ubuntu xenial-updates InRelease [109 kB]
Get:3 http://azure.archive

Processing triggers for libc-bin (2.23-0ubuntu10) ...
--2018-11-28 11:27:59--  https://aka.ms/downloadazcopyprlinux
Resolving aka.ms (aka.ms)... 23.222.209.19
Connecting to aka.ms (aka.ms)|23.222.209.19|:443... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: https://azcopy.azureedge.net/azcopy-7-1-0-netcorepreview/azcopy_7.1.0-netcorepreview_all.tar.gz [following]
--2018-11-28 11:27:59--  https://azcopy.azureedge.net/azcopy-7-1-0-netcorepreview/azcopy_7.1.0-netcorepreview_all.tar.gz
Resolving azcopy.azureedge.net (azcopy.azureedge.net)... 72.21.81.200, 2606:2800:11f:17a5:191a:18d5:537:22f9
Connecting to azcopy.azureedge.net (azcopy.azureedge.net)|72.21.81.200|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3841375 (3.7M) [application/octet-stream]
Saving to: ‘azcopy.tar.gz’

     0K .......... .......... .......... .......... ..........  1% 23.7M 0s
    50K .......... .......... .......... .......... ..........  2%  211M 0s
 


sent 11,683,102 bytes  received 1,290 bytes  23,368,784.00 bytes/sec
total size is 11,675,344  speedup is 1.00
[2018/11/28 11:28:20] Transfer summary:
-----------------
Total files transferred: 1
Transfer successfully:   1
Transfer skipped:        0
Transfer failed:         0
Elapsed time:            00.00:00:20
[2018/11/28 11:35:38] Transfer summary:
-----------------
Total files transferred: 1
Transfer successfully:   1
Transfer skipped:        0
Transfer failed:         0
Elapsed time:            00.00:07:12


<a id='configure_cluster'></a>
## Configure Batch AI Cluster
We then upload the scripts we wish to execute onto the fileshare, which Batch AI will mount later. An alternative to uploading the scripts would be to embed them inside the Docker container.

Below is the command to create the cluster.

In [None]:
!az batchai cluster create \
    -w $WORKSPACE \
    --name $CLUSTER_NAME \
    --image UbuntuLTS \
    --vm-size $VM_SIZE \
    --min $NUM_NODES --max $NUM_NODES \
    --afs-name $FILE_SHARE_NAME \
    --afs-mount-path extfs \
    --user-name $USERNAME \
    --password {get_password(dotenv_for())} \
    --storage-account-name $STORAGE_ACCOUNT_NAME \
    --storage-account-key $storage_account_key \
    --nfs $NFS_NAME \
    --nfs-mount-path nfs \
    --config-file cluster_config/cluster.json

Let's check that the cluster was created successfully.

In [None]:
!az batchai cluster show -n $CLUSTER_NAME -w $WORKSPACE

In [None]:
!az batchai cluster list -w $WORKSPACE -o table

In [None]:
!az batchai cluster node list -c $CLUSTER_NAME -w $WORKSPACE -o table
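
Nodes can take several minutes to become available after the cluster is created, so it may be convenient to poll rather than rerun the cells above by hand. The following is a sketch only. `wait_for_idle_nodes` is a hypothetical helper, and it assumes that the cluster description contains a `nodeStateCounts` object with an `idleNodeCount` field; check the actual output of `az batchai cluster show` before relying on those names. The fetch function is injected so that the example runs without calling Azure.

```python
import time


def wait_for_idle_nodes(fetch_cluster, expected_nodes, poll_seconds=30,
                        max_polls=60, sleep=time.sleep):
    """Poll `fetch_cluster()` until it reports `expected_nodes` idle nodes.

    `fetch_cluster` is a zero-argument callable returning the cluster
    description as a dict. ASSUMPTION: the dict exposes the idle node count
    at ["nodeStateCounts"]["idleNodeCount"].
    """
    for _ in range(max_polls):
        cluster = fetch_cluster()
        idle = cluster.get("nodeStateCounts", {}).get("idleNodeCount", 0)
        if idle >= expected_nodes:
            return True
        sleep(poll_seconds)
    return False


# Fake responses standing in for two successive CLI calls.
responses = iter([
    {"nodeStateCounts": {"idleNodeCount": 0}},
    {"nodeStateCounts": {"idleNodeCount": 2}},
])
ready = wait_for_idle_nodes(lambda: next(responses), expected_nodes=2,
                            sleep=lambda seconds: None)
print(ready)
```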