# Access Azure blob and file stores using RBAC, SAS tokens, and the Python Storage SDK

In this example, we create a new storage account and explore role-based and token-based access to file shares and blob storage beyond the workspace. Role-Based Access Control (RBAC) and Shared Access Signatures (SAS) offer ways to control access to data and let managed entities such as Managed Online Endpoints authenticate themselves to other services. To access datastores from online endpoints in simple scenarios without access controls, see [Mount Workspace Storage in a Managed Online Endpoint]().

## Prerequisites

* To use Azure Machine Learning, you must have an Azure subscription. If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning](https://azure.microsoft.com/free/).

* Install and configure the [Python SDK v2](sdk/setup.sh).

* You must have an Azure resource group, and you (or the service principal you use) must have Contributor access to it.

* You must have an Azure Machine Learning workspace. 

* You must have an Azure Container Registry. One is created automatically on first use for a workspace that lacks one; however, this example references the container registry explicitly by name, so you need it beforehand. You can create one through the Azure portal.

# 1. Connect to Azure Machine Learning Workspace

## 1.1. Import the required libraries

In [10]:
from azure.ml import MLClient
from azure.storage.blob import (BlobClient, BlobSasPermissions, BlobServiceClient,
                                ContainerSasPermissions, generate_container_sas)
from azure.mgmt.storage import StorageManagementClient
from azure.mgmt.storage.models import StorageAccountCreateParameters, Sku, FileShare, BlobContainer
from azure.mgmt.containerregistry import ContainerRegistryManagementClient
from azure.mgmt.authorization import AuthorizationManagementClient
from azure.ml.entities import ManagedOnlineDeployment, ManagedOnlineEndpoint
from azure.identity import DefaultAzureCredential, InteractiveBrowserCredential

In [1]:
# Enter details of your AML Workspace
subscription_id = '<YOUR_SUBSCRIPTION_ID>'
resource_group = '<YOUR_RESOURCE_GROUP>'
workspace = '<YOUR_WORKSPACE>'
# Existing resources associated with the workspace
container_registry_name = '<YOUR_CONTAINER_REGISTRY>'
storage_account_name = '<YOUR_STORAGE_ACCOUNT_NAME>'
# Names for the new storage resources created in this example
new_storage_account_name = '<NEW_STORAGE_ACCOUNT_NAME>'
new_file_share_name = '<NEW_FILE_SHARE_NAME>'
new_container_name = '<NEW_CONTAINER_NAME>'
new_blob_name = '<NEW_BLOB_NAME>'
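
Azure Storage rejects names that break its naming rules, so it can help to check the values above before any resources are created. The following is a minimal sketch, assuming the documented rules (account names: 3 to 24 lowercase letters and digits; container and file share names: 3 to 63 lowercase letters, digits, and single hyphens, starting and ending with a letter or digit). The helper names and sample values are hypothetical and not part of any Azure SDK.

```python
import re

# Storage account names: 3-24 characters, lowercase letters and digits only.
ACCOUNT_RE = re.compile(r"[a-z0-9]{3,24}")
# Container and file share names: lowercase letters, digits, and hyphens;
# every hyphen must sit between two letters or digits.
CONTAINER_RE = re.compile(r"[a-z0-9](?:-?[a-z0-9])+")


def is_valid_account_name(name: str) -> bool:
    """Return True if `name` satisfies the storage account naming rules."""
    return ACCOUNT_RE.fullmatch(name) is not None


def is_valid_container_name(name: str) -> bool:
    """Return True if `name` satisfies the container/file share naming rules."""
    return 3 <= len(name) <= 63 and CONTAINER_RE.fullmatch(name) is not None


# Hypothetical sample values; unfilled placeholders are rejected as well.
for candidate in ("mystorageacct01", "<NEW_STORAGE_ACCOUNT_NAME>"):
    print(candidate, is_valid_account_name(candidate))
```

A check like this also catches placeholders that were never filled in, since angle brackets and uppercase letters are not allowed.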

## 1.2. Configure workspace details and get Azure handles

In [9]:
ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)
storage_client = StorageManagementClient(DefaultAzureCredential(),subscription_id)

### Create an endpoint

In [None]:
from random import randint
# Required
deployment_name = 'docker-storage'
container_name = 'docker-storage'
# Optional
endpoint_name = f'docker-storage-{randint(1_000, 10_000_000)}'

endpoint = ManagedOnlineEndpoint(name=endpoint_name)
ml_client.online_endpoints.begin_create_or_update(endpoint, local=True)

# Check endpoint status
print(f'Endpoint Status: {endpoint.provisioning_state}')

## Allow a managed endpoint to access a file store with RBAC

Use a storage management client to handle the initial provisioning of new storage resources.

`StorageManagementClient(DefaultAzureCredential(),subscription_id)`

### Create a new storage account and file share

In [None]:
# Create a new storage account
response = storage_client.storage_accounts.begin_create(
                resource_group,
                new_storage_account_name,
                StorageAccountCreateParameters(
                    sku=Sku(name='Standard_LRS'),
                    kind='Storage',
                    location='eastus2'))

stor_acct_details = response.result()

# Create a file share
storage_client.file_shares.create(resource_group,
                                  new_storage_account_name,
                                  new_file_share_name,
                                  file_share=FileShare())

### Inspect properties, principal IDs and account IDs

In [None]:
# Retrieve the new file share
new_fs = storage_client.file_shares.get(resource_group,
                                        new_storage_account_name,
                                        new_file_share_name) 
# View storage properties
storage_acct_properties = storage_client.storage_accounts.get_properties(resource_group, 
                                                                         new_storage_account_name)

# Get system identity
system_identity = endpoint.identity.principal_id
storage_id = storage_acct_properties.id

### Create a role assignment for the endpoint

In [None]:
# Grant the endpoint's system-assigned identity (a service principal) read access to SMB file shares in the storage account
!az role assignment create --assignee-object-id {system_identity} --assignee-principal-type ServicePrincipal --role "Storage File Data SMB Share Reader" --scope {storage_id}
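
The `--scope` argument above is an Azure resource ID, and the role applies to everything beneath it. The sketch below shows how such IDs are composed, assuming the standard resource ID layout for storage accounts and the share-level scope format described in the Azure Files documentation. The function names and sample values are hypothetical.

```python
def storage_account_scope(subscription_id: str, resource_group: str,
                          account_name: str) -> str:
    """Build the resource ID that scopes a role to a whole storage account."""
    return (f"/subscriptions/{subscription_id}"
            f"/resourceGroups/{resource_group}"
            f"/providers/Microsoft.Storage/storageAccounts/{account_name}")


def file_share_scope(subscription_id: str, resource_group: str,
                     account_name: str, share_name: str) -> str:
    """Narrow the scope to a single file share (assumed share-level format)."""
    account = storage_account_scope(subscription_id, resource_group, account_name)
    return f"{account}/fileServices/default/fileshares/{share_name}"


# Hypothetical sample values for illustration only.
print(file_share_scope("00000000-0000-0000-0000-000000000000",
                       "my-resource-group", "mystorageacct01", "endpoint-share"))
```

Passing the narrower share-level ID as the scope would limit the endpoint to one share rather than every share in the account.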

## Create a blob container with Shared Access Signature (SAS) access
Begin with the general storage client and use it to retrieve the keys to the storage account.

In [None]:
from datetime import datetime, timedelta 
container = storage_client.blob_containers.create(resource_group,
                                                  new_storage_account_name,
                                                  new_container_name,
                                                  BlobContainer())

storage_url = storage_acct_properties.primary_endpoints.

keys = storage_client.storage_accounts.list_keys(resource_group, 
                                                 new_storage_account_name)

### Use a storage account key to generate a SAS token for a blob container 

In [None]:
container_permissions = ContainerSasPermissions(
    read=True, add=True, list=True, tag=True,
    move=True, create=True,write=True, delete=True)

privileged_container_sas = generate_container_sas(
                                new_storage_account_name,
                                new_container_name,
                                account_key=keys.keys[0].value,
                                expiry = (datetime.utcnow() + timedelta(hours=1)),
                                permission=container_permissions)
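
A SAS token is a URL query string, so its scope can be inspected before it is handed to anyone. The sketch below parses a token with the standard library, assuming the usual field names (`sp` for permissions, `sr` for the signed resource, `se` for the expiry) and an expiry formatted as `YYYY-MM-DDTHH:MM:SSZ`. The token shown is a made-up placeholder, and `describe_sas` is a hypothetical helper.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs


def describe_sas(sas_token: str) -> dict:
    """Summarize the permissions, resource type, and expiry of a SAS token."""
    fields = {key: values[0]
              for key, values in parse_qs(sas_token.lstrip("?")).items()}
    # Assumes the expiry uses second precision with a trailing 'Z' (UTC).
    expires = datetime.strptime(fields["se"], "%Y-%m-%dT%H:%M:%SZ")
    return {
        "permissions": fields.get("sp", ""),
        "resource": fields.get("sr", ""),
        "expires": expires.replace(tzinfo=timezone.utc),
    }


# Placeholder token; a real one carries a genuine signature in `sig`.
example_sas = "sv=2021-08-06&se=2030-01-01T00%3A00%3A00Z&sr=c&sp=rl&sig=PLACEHOLDER"
info = describe_sas(example_sas)
print(info["permissions"], info["resource"], info["expires"].isoformat())
```

Here `sr=c` marks a container-level token and `sp=rl` grants read and list only.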

### Use a blob service client to generate admin blob and container clients

In [None]:
privileged_blobserv_client = BlobServiceClient(storage_url,
                                               privileged_container_sas)
privileged_container_client = privileged_blobserv_client.get_container_client(new_container_name)
privileged_blob_client = privileged_blobserv_client.get_blob_client(new_container_name,
                                                                    new_blob_name)

# Upload a trained model to blob storage
if not privileged_blob_client.exists():
    with open('docker_storage/sklearn_regression_model.pkl', 'rb') as f:
        privileged_blob_client.upload_blob(f)

### Get read-only SAS tokens to provide to consumers

In [None]:
readonly_container_sas = generate_container_sas(
                            new_storage_account_name,
                            new_container_name,
                            account_key=keys.keys[0].value,
                            expiry=datetime.utcnow() + timedelta(days=30),
                            permission=ContainerSasPermissions(read=True))

readonly_blobserv_client = BlobServiceClient(storage_url,
                                             readonly_container_sas)
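
Consumers that do not construct a service client can be given a single URL with the SAS token appended as the query string. This is a minimal sketch, assuming the public-cloud endpoint suffix `blob.core.windows.net`. The helper `blob_sas_url` and the sample values are hypothetical.

```python
from urllib.parse import quote


def blob_sas_url(account: str, container: str, blob: str, sas_token: str,
                 endpoint_suffix: str = "blob.core.windows.net") -> str:
    """Compose a blob URL that carries its own SAS token."""
    # quote() escapes characters such as spaces but keeps '/' so that
    # virtual directories in the blob name survive.
    path = f"{container}/{quote(blob)}"
    return f"https://{account}.{endpoint_suffix}/{path}?{sas_token.lstrip('?')}"


# Placeholder values for illustration only.
url = blob_sas_url("mystorageacct01", "endpoint-container",
                   "models/my model.pkl", "sp=r&sig=PLACEHOLDER")
print(url)
```

A URL of this shape can be passed to `BlobClient.from_blob_url` or fetched with any HTTP client until the token expires.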

## Access a blob from a container with Azure storage clients and SAS tokens

### Compile signatures and sensitive environment variables

In [None]:
import json

# json.dump writes text, so open the file in text mode
with open('storageclient.config', 'w') as f:
    json.dump({'account': new_storage_account_name,
               'container': new_container_name,
               'blob': new_blob_name,
               'credential': readonly_container_sas},
              f)
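
Because this file holds a live SAS token, it is worth creating it with owner-only permissions. The sketch below is one way to do that with the standard library, assuming a POSIX file system (permission bits are largely ignored on Windows). `write_private_json` is a hypothetical helper, and the payload is a placeholder.

```python
import json
import os
import tempfile


def write_private_json(path: str, payload: dict) -> None:
    """Write `payload` as JSON to a file readable and writable by the owner only."""
    # The mode 0o600 is applied when the file is created.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as f:
        json.dump(payload, f)


# Demonstrate in a throwaway directory with placeholder values.
demo_path = os.path.join(tempfile.mkdtemp(), "storageclient.config")
write_private_json(demo_path, {"account": "mystorageacct01",
                               "container": "endpoint-container",
                               "blob": "model.pkl",
                               "credential": "sp=r&sig=PLACEHOLDER"})

with open(demo_path) as f:
    loaded = json.load(f)
print(sorted(loaded))
```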

### Adjust the score.py file to pull the trained model from the blob

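```python
import json
import pickle

from azure.storage.blob import BlobClient

# json.load expects an open file object, not a path
with open('/run/secrets/storageclient.config') as f:
    config = json.load(f)
```

```python
# BlobClient expects the account URL rather than the bare account name
client = BlobClient(f"https://{config['account']}.blob.core.windows.net",
                    config['container'],
                    config['blob'],
                    credential=config['credential'])
```
```python
if client.exists():
    # Download once, keep a local copy at model_path, then deserialize
    blob_bytes = client.download_blob().readall()
    with open(model_path, 'wb') as f:
        f.write(blob_bytes)
    model = pickle.loads(blob_bytes)
else:
    raise FileNotFoundError('No model was found in blob storage')
```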
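
The snippets above fit into the `init()` and `run()` entry points that an Azure Machine Learning scoring script exposes. The following sketch shows that shape without any Azure dependency. `fetch_model_bytes` is a hypothetical stand-in for the blob download, the pickled "model" is a plain dictionary of made-up linear coefficients, and the `{"data": [...]}` request layout is an assumption.

```python
import json
import pickle

model = None


def fetch_model_bytes() -> bytes:
    """Hypothetical stand-in for client.download_blob().readall()."""
    return pickle.dumps({"coef": [2.0, -1.0], "intercept": 0.5})


def init():
    """Called once when the deployment starts: load the model into memory."""
    global model
    # Only unpickle data from a source you trust.
    model = pickle.loads(fetch_model_bytes())


def run(raw_data: str) -> str:
    """Called per request: parse the JSON body and return predictions as JSON."""
    rows = json.loads(raw_data)["data"]
    predictions = [
        sum(c * x for c, x in zip(model["coef"], row)) + model["intercept"]
        for row in rows
    ]
    return json.dumps({"predictions": predictions})


init()
result = run('{"data": [[1.0, 2.0], [0.0, 0.0]]}')
print(result)
```

Loading the model in `init()` rather than `run()` means the blob is downloaded once per container start instead of once per request.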

### Obfuscate the configuration with Docker secrets and push to ACR

In [None]:
!az login 
!az acr login --name {container_registry_name}
!docker build --secret id=storageclientconfig,src=storageclient.config -t  {container_registry_name}.azurecr.io/storage-client . 

### Configure the deployment with YAML 
This deployment has additional Conda dependencies. We update default values below. 

```YAML
$schema: https://azuremlschemas.azureedge.net/latest/managedOnlineDeployment.schema.json
name: storage-client 
endpoint_name: endpoint-name
code_configuration: 
  code: "."
  scoring_script: score.py
environment:
  image: container_registry.azurecr.io/docker-conda:latest
  conda_file: /docker_storage/conda.yml
instance_type: Standard_F2s_v2
instance_count: 1
```

In [None]:
import yaml
with open('deployment_local.yml','r') as f:
    deployment_yaml = yaml.safe_load(f)

deployment_yaml['endpoint_name'] = endpoint_name
deployment_yaml['environment']['image'] = f'{container_registry_name}.azurecr.io/{container_name}:latest'

### Create the endpoint and deployment

In [None]:
import os
endpoint = ManagedOnlineEndpoint(name=f'{endpoint_name}')
ml_client.online_endpoints.begin_create_or_update(endpoint, local=True)
endpoint.traffic = 
deployment = ManagedOnlineDeployment.load(os.path.join(deployment_name,'deployment.yml'))
ml_client.online_deployments.begin_create_or_update(deployment)

In [None]:
auth_token = ml_client.online_endpoints.list_keys(endpoint_name).primary_key
endpoint = ml_client.online_endpoints.get(endpoint_name)
scoring_uri = endpoint.scoring_uri

with open('sample-request.json') as f:
    data = json.loads(f.read())
ml_client.online_endpoints.invoke(endpoint.name, data)
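
The `auth_token` and `scoring_uri` retrieved above also allow the endpoint to be called over plain HTTPS, which is how clients without the SDK would reach it. This sketch builds such a request with the standard library, assuming key-based auth sent as a bearer token. The URL, key, and payload are placeholders, `build_scoring_request` is a hypothetical helper, and the request is constructed but not sent.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri: str, auth_token: str,
                          payload: dict) -> urllib.request.Request:
    """Build a POST request carrying a JSON body and a bearer token."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        scoring_uri,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {auth_token}",
        },
    )


# Placeholder values; substitute scoring_uri and auth_token from the cell above.
request = build_scoring_request("https://example.invalid/score",
                                "PLACEHOLDER_KEY",
                                {"data": [[1, 2]]})
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request).read()
```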