Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# Setup AML Workspace
---

This notebook walks you through all the necessary steps to configure your environment for this solution accelerator including:

1. Set the path to the utils folder to take advantage of environment variables
2. Connect to your workspace
3. Create a config.json (this can be skipped if running on an AML compute instance)
4. Create a compute cluster


## Environment Variable

Environment variables let you, or your operations team, configure this solution on the local compute or on the build agent. To change a value, keep the name of the environment variable and assign it a new value, whether you are working locally or deploying through Operations.

[NOTE] If you change values in the .env file, you must close and re-open your IDE, because the values in the .env file are loaded when the IDE opens. Similarly, if you change values on the compute, you will have to restart your terminal. On Linux or macOS you can instead reload your Bash configuration by executing `source ~/.bashrc`.
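
The "read once" behavior described in the note can be illustrated with a small sketch. `EnvSketch` below is a hypothetical stand-in, not the actual `Env` class from `utils.env_variables`, and the variable names `WORKSPACE_NAME` and `RESOURCE_GROUP` are assumptions made for illustration. It shows that values captured when the object is created do not follow later changes to the environment, which is the same reason a restart or reload is needed.

```python
import os
from dataclasses import dataclass, field


def _from_env(name, default):
    """Build a dataclass field that reads `name` when the object is created."""
    return field(default_factory=lambda: os.getenv(name, default))


@dataclass(frozen=True)
class EnvSketch:
    """Hypothetical stand-in for the Env helper; variable names are assumptions."""
    workspace_name: str = _from_env("WORKSPACE_NAME", "my-aml-workspace")
    resource_group: str = _from_env("RESOURCE_GROUP", "my-resource-group")


os.environ["WORKSPACE_NAME"] = "first-ws"
snapshot = EnvSketch()                      # values are captured here
os.environ["WORKSPACE_NAME"] = "second-ws"  # later change to the environment

print(snapshot.workspace_name)     # first-ws (the old snapshot is unchanged)
print(EnvSketch().workspace_name)  # second-ws (a fresh read sees the new value)
```

Creating a new object plays the role of restarting the IDE or terminal: only a fresh read picks up the changed value.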


In [1]:
import sys
sys.path.append("../../")

from utils.env_variables import Env
from utils.aml_workspace import Connect
e = Env()

Failure while loading azureml_run_type_providers. Failed to load entrypoint hyperdrive = azureml.train.hyperdrive:HyperDriveRun._from_run_dto with exception (cryptography 2.9.2 (/home/brandon/miniconda3/envs/azureml/lib/python3.7/site-packages), Requirement.parse('cryptography>=3.2'), {'pyopenssl', 'PyOpenSSL'}).
Failure while loading azureml_run_type_providers. Failed to load entrypoint automl = azureml.train.automl.run:AutoMLRun._from_run_dto with exception (cryptography 2.9.2 (/home/brandon/miniconda3/envs/azureml/lib/python3.7/site-packages), Requirement.parse('cryptography>=3.2'), {'pyopenssl', 'PyOpenSSL'}).
Failure while loading azureml_run_type_providers. Failed to load entrypoint azureml.PipelineRun = azureml.pipeline.core.run:PipelineRun._from_dto with exception (cryptography 2.9.2 (/home/brandon/miniconda3/envs/azureml/lib/python3.7/site-packages), Requirement.parse('cryptography>=3.2'), {'pyopenssl', 'PyOpenSSL'}).
Failure while loading azureml_run_type_providers. Failed to

## 1.0 Connect to workspace

Connect this solution accelerator to your AML workspace. This step isn't necessary if you're using a Notebook VM.

The following cell sets your workspace parameters. The values are read from environment variables (via the Python function `os.getenv`), which is useful for automation. If an environment variable is not set, the corresponding parameter falls back to its specified default value.

In [2]:
%load_ext dotenv
import os

subscription_id = e.subscription_id
resource_group = e.resource_group
workspace_name = e.workspace_name
workspace_region = e.workspace_region

In [3]:
from azureml.core import Workspace

try:
    connect = Connect()
    ws = connect.authenticate()
    print("Workspace configuration succeeded. Skip the workspace creation steps below")
except Exception:  # a bare except would also swallow KeyboardInterrupt and SystemExit
    print("Workspace does not exist. Creating workspace")
    ws = Workspace.create(name=workspace_name, subscription_id=subscription_id, resource_group=resource_group,
                            location=workspace_region, create_resource_group=True, sku='enterprise', exist_ok=True)

If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.
Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.


Workspace configuration succeeded. Skip the workspace creation steps below
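
The SDK message above recommends `ServicePrincipalAuthentication` for unattended runs. The following is a minimal sketch of that approach, not the method used by this accelerator's `Connect` helper. The environment variable names `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, and `AZURE_CLIENT_SECRET` and both function names are assumptions chosen for illustration; the `azureml` imports are deferred so the credential check can run without the SDK installed.

```python
import os

# Hypothetical variable names; use whatever your build agent provides.
_SP_VARIABLES = {
    "tenant_id": "AZURE_TENANT_ID",
    "service_principal_id": "AZURE_CLIENT_ID",
    "service_principal_password": "AZURE_CLIENT_SECRET",
}


def service_principal_settings():
    """Read service principal credentials from environment variables."""
    settings = {key: os.getenv(var) for key, var in _SP_VARIABLES.items()}
    missing = [_SP_VARIABLES[key] for key, value in settings.items() if not value]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return settings


def connect_unattended(subscription_id, resource_group, workspace_name):
    """Connect to an existing workspace without an interactive login prompt."""
    # Imported lazily so the helper above works without azureml-core installed.
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(**service_principal_settings())
    return Workspace(subscription_id=subscription_id,
                     resource_group=resource_group,
                     workspace_name=workspace_name,
                     auth=auth)
```

In a pipeline you would call `connect_unattended(subscription_id, resource_group, workspace_name)` with the values from cell `In [2]`, after the service principal has been granted access to the workspace.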


In [4]:
ws.get_details()

{'id': '/subscriptions/af3877c2-18a2-4ce2-b67c-a8e21e968128/resourceGroups/coding-forge-rg/providers/Microsoft.MachineLearningServices/workspaces/coding-forge-ml-ws',
 'name': 'coding-forge-ml-ws',
 'identity': {'principal_id': '62dda26f-3571-42bf-8c44-ccb3db7f889b',
  'tenant_id': '72f988bf-86f1-41af-91ab-2d7cd011db47',
  'type': 'SystemAssigned'},
 'location': 'eastus',
 'type': 'Microsoft.MachineLearningServices/workspaces',
 'tags': {'Created By': 'brandon campbell',
  'contact': 'brandon.campbell@microsoft.com',
  'phone': '770.853.0352'},
 'sku': 'Basic',
 'workspaceid': 'e057c1f8-0319-435b-92cd-84642f9e37d5',
 'sdkTelemetryAppInsightsKey': 'f5784ccd-178d-4ecc-9998-b05841b44ae9',
 'description': '',
 'friendlyName': 'coding-forge-ml-ws',
 'creationTime': '2021-09-01T12:30:38.7634249+00:00',
 'containerRegistry': '/subscriptions/af3877c2-18a2-4ce2-b67c-a8e21e968128/resourcegroups/coding-forge-rg/providers/microsoft.containerregistry/registries/codingforgeacr',
 'keyVault': '/subsc

## 2.0 Write config file
Write the details of the workspace to a config.json file:

In [6]:
from azureml.core import Workspace
#ws = Workspace.from_config()
ws.write_config()
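
For reference, `ws.write_config()` called without arguments writes to `.azureml/config.json` under the current directory, and the file holds three keys: `subscription_id`, `resource_group`, and `workspace_name`. The sketch below reproduces that layout with the standard library only, so you can see what the file contains or create it by hand. The function name `write_config_sketch` and the placeholder values are hypothetical.

```python
import json
from pathlib import Path


def write_config_sketch(subscription_id, resource_group, workspace_name, root="."):
    """Write a config.json with the same three keys that ws.write_config() stores."""
    config_dir = Path(root) / ".azureml"
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / "config.json"
    config = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    config_path.write_text(json.dumps(config, indent=4))
    return config_path


# Example call with placeholder values (replace with your own):
# write_config_sketch("<subscription-id>", "<resource-group>", "<workspace-name>")
```

Once the file exists, later notebooks can connect with `Workspace.from_config()` instead of passing the parameters explicitly.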

## 3.0 Create compute cluster

You will need a compute cluster for training and batch forecasting.
This is a one-time setup, so you won't need to re-run it in later notebooks, but you will need to use the same cluster name to reference this cluster in those notebooks.

In [7]:
# Choose a name for your compute cluster
cpu_cluster_name = "gpu-clusters" # e.compute_name

The cell below requests a STANDARD_NC6 compute cluster, a GPU-enabled NC-series VM size. If your workload is CPU-only, a D-series size such as STANDARD_D13_V2 is an alternative; D-series VMs are used for tasks that require higher compute power and temporary disk performance. This [page](https://docs.microsoft.com/azure/cloud-services/cloud-services-sizes-specs) gives more information on VM sizes to help you decide which best fits your use case.

In [8]:
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException

# Verify that cluster does not exist already
try:
    cpu_cluster = ComputeTarget(workspace=ws, name=cpu_cluster_name)
    print('Found an existing cluster, using it instead.')
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size="STANDARD_NC6",
                                                           min_nodes=e.min_nodes,
                                                           max_nodes=e.max_nodes)
    cpu_cluster = ComputeTarget.create(ws, cpu_cluster_name, compute_config)
    cpu_cluster.wait_for_completion(show_output=True)

InProgress...
SucceededProvisioning operation finished, operation "Succeeded"
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned
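
Because later notebooks must refer to the cluster by the same name, it can help to keep the name and node counts in one place. The following is a minimal sketch under stated assumptions: the environment variable names `COMPUTE_NAME`, `COMPUTE_MIN_NODES`, and `COMPUTE_MAX_NODES`, their defaults, and both function names are hypothetical and are not taken from this accelerator's `Env` class.

```python
import os


def cluster_settings(default_name="gpu-clusters"):
    """Collect cluster settings from (hypothetical) environment variables."""
    return {
        "name": os.getenv("COMPUTE_NAME", default_name),
        "min_nodes": int(os.getenv("COMPUTE_MIN_NODES", "0")),
        "max_nodes": int(os.getenv("COMPUTE_MAX_NODES", "4")),
    }


def get_existing_cluster(ws, name):
    """Look up an already-provisioned cluster by name in a later notebook."""
    # Imported lazily so cluster_settings() works without azureml-core installed.
    from azureml.core.compute import ComputeTarget

    # Raises ComputeTargetException if no compute target with this name exists.
    return ComputeTarget(workspace=ws, name=name)
```

A later notebook could then call `get_existing_cluster(ws, cluster_settings()["name"])` rather than repeating the cluster name as a string literal.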


## Next Steps

Now that the AML Workspace is configured, it's time to create the sample datasets. Follow the steps in [01_Data_Preparation.ipynb](01_Data_Preparation.ipynb) to do that.