Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# Installation and configuration
This notebook configures the notebooks in this tutorial to connect to an Azure Machine Learning (AML) Workspace.  You can use an existing workspace or create a new one.

In [1]:
from pathlib import Path

import azureml.core
from AIHelpers.utilities import get_auth
from azureml.core import Workspace, Datastore

## Prerequisites

If you have already completed the prerequisites and selected the correct kernel for this notebook, the AML Python SDK is already installed. Let's check the AML SDK version.

In [25]:
print("AML SDK Version:", azureml.core.VERSION)

AML SDK Version: 1.0.76
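
If later notebooks depend on a particular SDK release, the printed version can be compared against a minimum. The following is a minimal sketch, not part of the SDK: the helper names and the minimum version `1.0.70` are hypothetical, and it assumes a purely numeric dotted version string such as the one printed above.

```python
def parse_version(version):
    """Turn a dotted version string such as "1.0.76" into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def meets_minimum(installed, minimum):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


# "1.0.76" is the version printed above; "1.0.70" is a hypothetical minimum.
print(meets_minimum("1.0.76", "1.0.70"))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating `1.0.9` as newer than `1.0.76`.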


## Set up your Azure Machine Learning workspace

To create or access an Azure ML Workspace, you will need the following information:

* Your subscription id
* A resource group name
* A name for your workspace
* A region for your workspace

**Note**: As with other Azure services, there are limits on certain resources like cluster size associated with the Azure Machine Learning service. Please read [this article](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-manage-quotas) on the default limits and how to request more quota.

If you already have a workspace, you need your subscription and workspace information, which you can find by visiting your workspace in the [Azure portal](http://portal.azure.com). If you don't have a workspace, the create workspace command in the next section will create a resource group and a workspace using the names you provide.

Replace the values in the following cell with your information. If you would like to use service principal authentication as described [here](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/manage-azureml-service/authentication-in-azureml/authentication-in-azure-ml.ipynb), make sure you provide the optional values as well.
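
The optional service principal values appear in the configuration cell as placeholders that begin with `YOUR_`. As a rough sketch, a check like the one below could tell whether all three have been filled in. The helper names are hypothetical, and this is not necessarily how `get_auth` decides which authentication method to use.

```python
PLACEHOLDER_PREFIX = "YOUR_"  # unfilled values in this notebook start with this


def is_provided(value):
    """A value counts as provided when it is non-empty and not a placeholder."""
    return bool(value) and not value.startswith(PLACEHOLDER_PREFIX)


def use_service_principal(tenant_id, username, password):
    """Service principal authentication needs all three optional values."""
    return all(is_provided(v) for v in (tenant_id, username, password))


# One value is still a placeholder, so this combination is incomplete.
print(use_service_principal("YOUR_TENANT_ID", "app-id", "secret"))
# All three values are filled in (illustrative strings, not real credentials).
print(use_service_principal("tenant", "app-id", "secret"))
```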

In [None]:
from dotenv import find_dotenv
env_path = find_dotenv()
if env_path == "":
    Path(".env").touch()
    env_path = find_dotenv()

In [7]:
import yaml

with open("workspace_conf.yml", 'r') as ymlfile:
    cfg = yaml.safe_load(ymlfile)  # safe_load avoids the warning raised by yaml.load without a Loader


In [18]:
# Azure resources
subscription_id = cfg['subscription_id']
resource_group = cfg['resource_group']
workspace_name = cfg['workspace_name']
workspace_region = cfg['workspace_region']
image_name = cfg['image_name']  # Avoid underscores in the image name

sql_server_name = cfg['sql_server_name']      # Name of the Azure SQL server
sql_database_name = cfg['sql_database_name']  # Name of the Azure SQL database
sql_username = cfg['sql_username']            # Database user name
sql_password = cfg['sql_password']            # Database user password

datastore_rg = cfg['datastore_rg']      # Resource group of the storage account
container_name = cfg['container_name']  # Name of the Azure blob container
account_name = cfg['account_name']      # Storage account name
account_key = cfg['account_key']        # Storage account key

tenant_id = "YOUR_TENANT_ID" # Optional for service principal authentication
username = "YOUR_SERVICE_PRINCIPAL_APPLICATION_ID" # Optional for service principal authentication
password = "YOUR_SERVICE_PRINCIPAL_PASSWORD" # Optional for service principal authentication
storageConnString = "YOUR_STORAGE_CONNECTION_STRING"
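
The cell above raises a `KeyError` at the first key missing from `workspace_conf.yml`. A small check run beforehand can report every missing or empty key at once. This is a sketch under assumptions: `REQUIRED_KEYS` simply mirrors the keys read above, and `missing_keys` and `example_cfg` are hypothetical names used only for illustration.

```python
REQUIRED_KEYS = [
    "subscription_id", "resource_group", "workspace_name", "workspace_region",
    "image_name", "sql_server_name", "sql_database_name", "sql_username",
    "sql_password", "datastore_rg", "container_name", "account_name",
    "account_key",
]


def missing_keys(cfg, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in cfg."""
    cfg = cfg or {}  # an empty YAML file loads as None
    return [key for key in required if not cfg.get(key)]


# Hypothetical, partially filled configuration used only for illustration.
example_cfg = {"subscription_id": "0000", "resource_group": "my-rg"}
print(missing_keys(example_cfg))
```

With the real configuration, `missing_keys(cfg)` returning an empty list would mean every value the notebook reads is present.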


In [19]:
sql_datastore_name = "ado_sql_datastore"  # Hard-coded for this example
blob_datastore_name = "ado_blob_datastore"

### Create the workspace
This cell will create an AML workspace for you in a subscription, provided you have the correct permissions.

This will fail when:
1. You do not have permission to create a workspace in the resource group.
2. You do not have permission to create a resource group, and the one you named does not exist yet.
3. You are not a subscription owner or contributor and no Azure ML workspaces have ever been created in this subscription.

If workspace creation fails, please work with your IT admin to provide you with the appropriate permissions or to provision the required resources. If this cell succeeds, you're done configuring AML!  


In [27]:
ws = Workspace.create(
    name=workspace_name,
    subscription_id=subscription_id,
    resource_group=resource_group,
    location=workspace_region,
    create_resource_group=True,
    auth=get_auth(env_path),
    exist_ok=True,
)
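
If one of the permission problems listed above applies, the create call raises an exception. One possible way to surface a more actionable message is to wrap the call, as in the sketch below. `create_with_guidance` is a hypothetical helper, not part of the AML SDK, and it deliberately catches a generic `Exception` because the SDK's specific exception types are not assumed here.

```python
def create_with_guidance(create_fn):
    """Call create_fn and, on failure, re-raise with a hint about permissions."""
    try:
        return create_fn()
    except Exception as err:  # the SDK's specific exception types are not assumed
        raise RuntimeError(
            "Workspace creation failed. Check that you can create workspaces "
            "and resource groups in this subscription, or ask your IT admin "
            "to provision them."
        ) from err


# Usage with the cell above would look like this (left commented out because
# it needs the AML SDK and valid credentials):
# ws = create_with_guidance(lambda: Workspace.create(name=workspace_name, ...))
```

Chaining with `from err` keeps the original exception visible in the traceback, so the underlying cause is not lost.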

Let's check the details of the workspace.

In [28]:
ws.get_details()

{'id': '/subscriptions/0ca618d2-22a8-413a-96d0-0f1b531129c3/resourceGroups/dciborow-ado-test/providers/Microsoft.MachineLearningServices/workspaces/dciborowws',
 'name': 'dciborowws',
 'location': 'westus',
 'type': 'Microsoft.MachineLearningServices/workspaces',
 'sku': 'Enterprise',
 'workspaceid': '3db1660f-e512-4152-a3ec-387abea8b04a',
 'description': '',
 'friendlyName': 'dciborowws',
 'creationTime': '2019-12-18T01:26:53.6336661+00:00',
 'containerRegistry': '/subscriptions/0ca618d2-22a8-413a-96d0-0f1b531129c3/resourceGroups/dciborow-ado-test/providers/Microsoft.ContainerRegistry/registries/dciborowws193f0455',
 'keyVault': '/subscriptions/0ca618d2-22a8-413a-96d0-0f1b531129c3/resourcegroups/dciborow-ado-test/providers/microsoft.keyvault/vaults/dciborowkeyvaultb9b3508b',
 'applicationInsights': '/subscriptions/0ca618d2-22a8-413a-96d0-0f1b531129c3/resourcegroups/dciborow-ado-test/providers/microsoft.insights/components/dciborowinsights3fa1a864',
 'identityPrincipalId': 'ccc30575-90

In [29]:
blob_datastore = Datastore.register_azure_blob_container(workspace=ws, 
                                                         datastore_name=blob_datastore_name, 
                                                         container_name=container_name, 
                                                         account_name=account_name,
                                                         account_key=account_key,
                                                         resource_group=datastore_rg,
                                                         overwrite=True)

In [30]:
from azureml.core.dataset import Dataset

datastore_paths = [(blob_datastore, 'ai_impact_scores.csv')]

ai_impact_scores = Dataset.Tabular.from_delimited_files(path=datastore_paths)

ai_impact_scores.register(workspace=ws,
                          name="ai_impact_scores",
                          description="ai subset of feedback items")


{
  "source": [
    "('ado_blob_datastore', 'ai_impact_scores.csv')"
  ],
  "definition": [
    "GetDatastoreFiles",
    "ParseDelimited",
    "DropColumns",
    "SetColumnTypes"
  ],
  "registration": {
    "id": "cfbfaa8a-e68b-438a-9a23-be134b15c3da",
    "name": "ai_impact_scores",
    "version": 1,
    "description": "ai subset of feedback items",
    "workspace": "Workspace.create(name='dciborowws', subscription_id='0ca618d2-22a8-413a-96d0-0f1b531129c3', resource_group='dciborow-ado-test')"
  }
}

In [None]:
from azureml.core.dataset import Dataset

datastore_paths = [(blob_datastore, 'feedback_items.csv')]

# Use a variable name that matches the dataset being registered
feedback_items = Dataset.Tabular.from_delimited_files(path=datastore_paths)

feedback_items.register(workspace=ws,
                        name="feedback_items_sample",
                        description="subset of feedback items from SQL")

In [None]:
from azureml.core.dataset import Dataset

datastore_paths = [(blob_datastore, 'improvements.csv')]

# Use a variable name that matches the dataset being registered
improvements = Dataset.Tabular.from_delimited_files(path=datastore_paths)

improvements.register(workspace=ws,
                      name="improvements_sample",
                      description="subset of improvements items from SQL")

In [31]:
sql_datastore = Datastore.register_azure_sql_database(workspace=ws,
                                                      datastore_name=sql_datastore_name,
                                                      server_name=sql_server_name,
                                                      database_name=sql_database_name,
                                                      username=sql_username,
                                                      password=sql_password)

Let's write the workspace configuration for the rest of the notebooks to connect to the workspace.

In [33]:
ws.write_config()
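
To make the hand-off to the other notebooks concrete, the sketch below imitates the kind of file `ws.write_config()` produces, using only the standard library. It rests on assumptions: in recent 1.x SDK releases the file is, to our understanding, `.azureml/config.json` with the three keys shown, and other notebooks would typically load it with `Workspace.from_config()`. The helper names and the values are hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_workspace_config(directory, subscription_id, resource_group, workspace_name):
    """Write a config file with the three fields that identify a workspace."""
    config_dir = Path(directory) / ".azureml"  # assumed default folder name
    config_dir.mkdir(parents=True, exist_ok=True)
    config_path = config_dir / "config.json"
    config_path.write_text(json.dumps({
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }, indent=2))
    return config_path


def read_workspace_config(directory):
    """Read the config file back as a dictionary."""
    return json.loads((Path(directory) / ".azureml" / "config.json").read_text())


# Round trip in a temporary directory with illustrative values.
with tempfile.TemporaryDirectory() as tmp:
    write_workspace_config(tmp, "my-subscription-id", "my-resource-group", "my-workspace")
    loaded = read_workspace_config(tmp)

print(sorted(loaded))
```

Because the file holds only identifiers and no credentials, each notebook still authenticates on its own when it connects.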

You are now ready to move on to the [data preparation](01_DataPrep.ipynb) notebook.