# Create Cluster
Copy the datasets and scripts to Azure storage, and create the Batch AI cluster in the workspace.

The steps are:
- [import the libraries and dotenv parameters](#import),
- [create a Batch AI client](#client),
- [copy the scripts and data to Azure storage](#copy), and
- [create the Batch AI cluster](#cluster).

## Imports <a id='import'></a>

In [None]:
from __future__ import print_function
import os
import sys
import glob
import dotenv
import azure.mgmt.batchai.models as models
from azure.storage.blob import BlockBlobService
from azure.storage.file import FileService
sys.path.append('.')
import utilities as utils
%load_ext dotenv

The next cell records, in the `.env` file, the names of the files and services used or created in this notebook.

In [None]:
# The location of the dotenv file
dotenv_path = dotenv.find_dotenv()
# The Azure blob container created for the datasets
dotenv.set_key(dotenv_path, 'azure_blob_container_name', 'batchaisample')
# The Azure blob container directory containing the datasets
dotenv.set_key(dotenv_path, 'dataset_path', 'dataset')
# The Azure file share created for the scripts and outputs
dotenv.set_key(dotenv_path, 'azure_file_share_name', 'batchaisample')
# The Azure file share directory containing the Python scripts
dotenv.set_key(dotenv_path, 'script_path', 'scripts')
# The script to be run
dotenv.set_key(dotenv_path, 'script_name', 'TrainTestClassifier.py')
# The Batch AI cluster
dotenv.set_key(dotenv_path, 'cluster_name', 'd4')

Import the contents of the `.env` file into the environment, overriding any values that are already set.

In [None]:
%dotenv -o

Define Python variables used in this notebook.

In [None]:
configuration_path = os.getenv('configuration_path')
azure_blob_container_name = os.getenv('azure_blob_container_name')
dataset_path = os.getenv('dataset_path')
azure_file_share_name = os.getenv('azure_file_share_name')
script_path = os.getenv('script_path')
script_name = os.getenv('script_name')
cluster_name = os.getenv('cluster_name')
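
`os.getenv` returns `None` for a name that is not set, so a missing entry in the `.env` file would only surface later as a confusing error. The following is a minimal sketch of an early check, not part of the original notebook; the helper name `require_env` and the in-memory `example_env` mapping are hypothetical and used only for illustration.

```python
import os


def require_env(names, environ=os.environ):
    """Return a dict of the named settings, raising if any is missing or empty."""
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise KeyError('Missing settings: ' + ', '.join(missing))
    return {name: environ[name] for name in names}


# Hypothetical stand-in for the real environment, so the sketch runs anywhere.
example_env = {'cluster_name': 'd4', 'script_path': 'scripts'}
settings = require_env(['cluster_name', 'script_path'], environ=example_env)
print(settings['cluster_name'])
```

In the notebook itself, the same helper could be called with the default `environ` and the seven names read above.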

## Create a Batch AI client <a id='client'></a>
Read the configuration, and use it to create a Batch AI client.

In [None]:
cfg = utils.config.Configuration(configuration_path)
client = utils.config.create_batchai_client(cfg)

## Copy training datasets and script to Azure storage <a id='copy'></a>

### Azure blob container

We create a blob container named `batchaisample` in the storage account for storing the training and testing datasets created in the [data prep notebook](00_Data_Prep.ipynb).

**Note** You don't need to create a new blob container for every cluster. We are doing this here to simplify resource management.

In [None]:
blob_service = BlockBlobService(cfg.storage_account_name, cfg.storage_account_key)
blob_service.create_container(azure_blob_container_name, fail_on_exist=False)

We upload the dataset TSVs to an Azure blob container directory named `dataset` using the Azure SDK for Python.

In [None]:
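# Upload every TSV file in the current directory to the dataset directory
dataset_files = glob.glob('*.tsv')
for dataset_file in dataset_files:
    print(dataset_file)
    blob_service.create_blob_from_path(azure_blob_container_name,
                                       dataset_path + '/' + dataset_file,
                                       dataset_file)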
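
A blob container is a flat namespace, so the `dataset` "directory" is simply a prefix in the blob name, separated by `/`. As a rough sketch of how those names are formed, the hypothetical helper below (`blob_name_for` is not part of the notebook or the Azure SDK) builds the blob name for a local file; the file names are placeholders.

```python
import os
import posixpath


def blob_name_for(local_path, blob_directory):
    """Build a blob name that places a local file under a virtual directory.

    posixpath.join always uses '/', which is the separator blob names expect,
    regardless of the operating system the notebook runs on.
    """
    return posixpath.join(blob_directory, os.path.basename(local_path))


# Placeholder file names, for illustration only.
for path in ['train.tsv', 'test.tsv']:
    print(blob_name_for(path, 'dataset'))
```

Taking the base name first keeps a local subdirectory from leaking into the blob name.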

### Azure file share

We create a file share named `batchaisample` in the storage account to hold the training script file created in the [create model notebook](01_Create_Model.ipynb). The share will also hold the output files that the script writes while it runs.

**Note** You don't need to create a new file share for every cluster. We are doing this here to simplify resource management.

In [None]:
file_service = FileService(cfg.storage_account_name, cfg.storage_account_key)
file_service.create_share(azure_file_share_name, fail_on_exist=False)

Upload the training script to the `scripts` directory of the file share.

In [None]:
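# Create the scripts directory in the share if it does not already exist
file_service.create_directory(
    azure_file_share_name, script_path, fail_on_exist=False)
# Arguments: share, directory, file name in the share, local file path
file_service.create_file_from_path(
    azure_file_share_name, script_path, script_name, script_name)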

## Create the Azure Batch AI compute cluster <a id='cluster'></a>

We create a compute cluster named `d4` that can grow to `maximum_node_count` nodes of type `Standard_D4_v2`. Auto-scale settings let the cluster grow to meet the load when we submit jobs. Because you are charged for the Batch AI cluster while its nodes are running, we set the minimum number of nodes to 0 so that the cluster shrinks back down once the jobs are done. At cluster creation time, one node will be allocated for initialization.

In [None]:
vm_size = 'Standard_D4_v2'
maximum_node_count = 16
scale_settings = models.ScaleSettings(
    auto_scale=models.AutoScaleSettings(minimum_node_count=0,
                                        maximum_node_count=maximum_node_count))
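
To make the effect of the two bounds concrete, here is a toy model of how a requested node count is clamped between the minimum and the maximum. It is only an illustration of the limits, under the assumption that demand can be expressed as a single number; it is not the scaling algorithm the Batch AI service actually uses, and `target_node_count` is a hypothetical name.

```python
def target_node_count(nodes_requested, minimum_node_count=0, maximum_node_count=16):
    """Clamp the number of nodes the workload asks for to the configured bounds."""
    return max(minimum_node_count, min(nodes_requested, maximum_node_count))


print(target_node_count(0))   # idle cluster: shrinks to the minimum, 0
print(target_node_count(4))   # modest load: 4 nodes
print(target_node_count(40))  # heavy load: capped at the maximum, 16
```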

Assemble the cluster creation parameters.

In [None]:
cluster_parameters = models.ClusterCreateParameters(
    vm_size=vm_size,
    scale_settings=scale_settings,
    user_account_settings=models.UserAccountSettings(
        admin_user_name=cfg.admin,
        admin_user_password=cfg.admin_password or None,
        admin_user_ssh_public_key=cfg.admin_ssh_key or None,
    )
)

Create the cluster.

In [None]:
# result() blocks until the cluster creation operation completes
_ = client.clusters.create(cfg.resource_group, cfg.workspace, cluster_name, cluster_parameters).result()

Monitor the newly created cluster. The `utilities` module contains a helper function that prints a detailed status of the cluster.

In [None]:
cluster = client.clusters.get(cfg.resource_group, cfg.workspace, cluster_name)
utils.cluster.print_cluster_status(cluster)
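
The cell above reports the status once. If you would rather wait until the cluster settles, a polling loop along the following lines may help. This is a sketch under assumptions: `wait_until` is a hypothetical helper, and the idea that the cluster object exposes an `allocation_state` that reads `'steady'` once resizing has finished should be checked against the SDK version you have installed. The example run uses a fake sequence of states so that it executes without an Azure connection.

```python
import time


def wait_until(get_state, target_state='steady', poll_seconds=30,
               max_polls=40, sleep=time.sleep):
    """Call get_state() until it returns target_state; return the number of polls."""
    for attempt in range(max_polls):
        if get_state() == target_state:
            return attempt + 1
        sleep(poll_seconds)
    raise RuntimeError('State did not reach %r after %d polls'
                       % (target_state, max_polls))


# Fake states standing in for successive calls to the service.
fake_states = iter(['resizing', 'resizing', 'steady'])
polls = wait_until(lambda: next(fake_states), sleep=lambda seconds: None)
print(polls)

# In the notebook, get_state might look like this (assumed attribute name):
# lambda: client.clusters.get(cfg.resource_group, cfg.workspace,
#                             cluster_name).allocation_state
```

Capping the number of polls keeps a misconfigured cluster from hanging the notebook indefinitely.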

In the [next notebook](05_Hyperparameter_Search.ipynb), we set up and run the hyperparameter search to tune the parameters.