In [None]:
# Load Workspace
import azureml.core
from azureml.core import Workspace

# Load the workspace from the saved config file
ws = Workspace.from_config()
print('Ready to use Azure ML {} to work with {}'.format(azureml.core.VERSION, ws.name))

# Load Credentials
from kedro.config import ConfigLoader

conf_paths = ["conf/base", "conf/local"]
conf_loader = ConfigLoader(conf_paths)
conf_catalog = conf_loader.get("credentials*")

# 1. Datastores

In Azure Machine Learning, datastores are abstractions (connectors) for cloud data sources. They hold all the information required to connect to a data source, and you can use them to:

* Ingest data into an experiment
* Write outputs from an experiment

## 1.1 Types of Datastore

Azure Machine Learning supports the creation of <b>datastores</b> for several kinds of Azure data sources, including:

* Azure Storage (blob and file containers)
* Azure Data Lake stores
* Azure SQL Database
* Azure Databricks file system (DBFS)

## 1.2 Recommended Azure Machine Learning Data Workflow

This workflow assumes you have an Azure storage account and data in a cloud-based storage service in Azure.

1. Create an Azure Machine Learning datastore to store connection information to your Azure storage.

2. From that datastore, create an Azure Machine Learning dataset that points to one or more specific files in your underlying storage.

3. To use that dataset in your machine learning experiment, you can either:

   * a. Mount it to your experiment's compute target for model training, or
   * b. Consume it directly in Azure Machine Learning solutions such as automated machine learning (automated ML) experiment runs, machine learning pipelines, or the Azure Machine Learning designer.

4. Create dataset monitors for your model output dataset to detect data drift.

5. If data drift is detected, update your input dataset and retrain your model accordingly.

The following diagram provides a visual demonstration of this recommended workflow.
<img src="https://docs.microsoft.com/en-gb/azure/machine-learning/media/concept-data/data-concept-diagram.svg">
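
Steps 1 to 3a of the workflow above can be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes the `azureml-core` (v1) SDK and a datastore that is already registered. The function names `dataset_paths` and `file_dataset_for_training`, the input name `training_files`, and the example path pattern are hypothetical and chosen only for illustration.

```python
def dataset_paths(datastore, patterns):
    """Pair a datastore with relative path patterns.

    Produces the list of (datastore, relative_path) tuples that
    Dataset.File.from_files accepts as its `path` argument.
    """
    return [(datastore, pattern) for pattern in patterns]


def file_dataset_for_training(ws, datastore_name, patterns, dataset_name):
    """Create and register a file dataset, then return a mount configuration.

    Hypothetical helper; assumes the datastore is already registered in `ws`.
    """
    # Imported lazily so the pure helper above works without the SDK installed.
    from azureml.core import Dataset, Datastore

    # Step 1 (already done elsewhere): the datastore holds the connection details.
    datastore = Datastore.get(ws, datastore_name=datastore_name)

    # Step 2: point a dataset at specific files in the underlying storage.
    file_ds = Dataset.File.from_files(path=dataset_paths(datastore, patterns))
    file_ds = file_ds.register(workspace=ws, name=dataset_name,
                               create_new_version=True)

    # Step 3a: request that the files be mounted on the compute target.
    return file_ds.as_named_input("training_files").as_mount()


# The pure helper can be tried without a workspace; a string stands in
# for the datastore object here.
print(dataset_paths("blob_data", ["data/*.csv"]))
```

The value returned by `file_dataset_for_training` is intended to be passed to a training script as an argument when the run is configured. Dataset monitors and retraining (steps 4 and 5) are not covered by this sketch.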


In [None]:
# Get the default datastore
default_ds = ws.get_default_datastore()

# Enumerate all datastores, indicating which is the default
for ds_name in ws.datastores:
    print(ds_name, "- Default =", ds_name == default_ds.name)

# 2. Adding Datastores to a Workspace

Every workspace has two built-in datastores:

* an Azure Storage blob container, and
* an Azure Storage file container.

Azure Machine Learning uses both as system storage. You can also store a limited amount of your own data in these built-in datastores for experiments, model training, and so on.

However, in most machine learning projects, you will likely need to work with data sources of your own - either because you need to store larger volumes of data than the built-in datastores support, or because you need to integrate your machine learning solution with data from existing applications.

## 2.1 Registering a Datastore

To add a datastore to your workspace, you can register it using the graphical interface in Azure Machine Learning Studio, or you can use the Azure Machine Learning SDK. For example, the following code registers an Azure Storage blob container as a datastore named <b>blob_data</b>.

```python
from azureml.core import Workspace, Datastore

ws = Workspace.from_config()

# Register a new datastore
blob_ds = Datastore.register_azure_blob_container(workspace=ws, 
                                                  datastore_name='blob_data', 
                                                  container_name='data_container',
                                                  account_name='az_store_acct',
                                                  account_key='123456abcde789…')
```


In [None]:
from azureml.core import Workspace, Datastore

# Register a new datastore
blob_ds = Datastore.register_azure_blob_container(workspace=ws, 
                                                  datastore_name=conf_catalog['azuremlprimary']['storage_name'], 
                                                  container_name='datacontainer',
                                                  account_name=conf_catalog['azuremlprimary']['storage_name'],
                                                  account_key=conf_catalog['azuremlprimary']['key'])
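
The cell above indexes into the credentials mapping directly, so a missing entry or key surfaces only as a bare `KeyError` in the middle of the registration call. One possible way to check the entry first is sketched below. The helper `blob_registration_kwargs` is hypothetical, and the entry layout (`storage_name` and `key`) is assumed from the cell above; the example values are placeholders, not real credentials.

```python
def blob_registration_kwargs(credentials, entry, container_name):
    """Build keyword arguments for Datastore.register_azure_blob_container.

    Hypothetical helper: assumes credentials[entry] holds 'storage_name'
    and 'key', mirroring the registration cell above.
    """
    if entry not in credentials:
        raise KeyError(f"no credentials entry named {entry!r}")
    section = credentials[entry]
    # Collect keys that are absent or empty so the error names all of them.
    missing = [k for k in ("storage_name", "key") if not section.get(k)]
    if missing:
        raise ValueError(
            f"credentials entry {entry!r} is missing: {', '.join(missing)}"
        )
    return {
        "datastore_name": section["storage_name"],
        "container_name": container_name,
        "account_name": section["storage_name"],
        "account_key": section["key"],
    }


# Placeholder values for illustration only.
example = {"azuremlprimary": {"storage_name": "examplestore",
                              "key": "not-a-real-key"}}
kwargs = blob_registration_kwargs(example, "azuremlprimary", "datacontainer")
print(sorted(kwargs))
```

With a real workspace, the result could then be unpacked into the call used above, for example `Datastore.register_azure_blob_container(workspace=ws, **blob_registration_kwargs(conf_catalog, 'azuremlprimary', 'datacontainer'))`.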

## 2.2 Managing Datastores

You can view and manage datastores in Azure Machine Learning Studio, or you can use the Azure Machine Learning SDK. For example, the following code lists the names of each datastore in the workspace.

```python
for ds_name in ws.datastores:
    print(ds_name)
```

You can get a reference to any datastore by using the Datastore.get() method as shown here:

```python
blob_store = Datastore.get(ws, datastore_name='blob_data')
```

The workspace always includes a default datastore (initially, this is the built-in workspaceblobstore datastore), which you can retrieve by using the get_default_datastore() method of a Workspace object, like this:

```python
default_store = ws.get_default_datastore()
```

To change the default datastore, use the set_default_datastore() method:

```python
ws.set_default_datastore('blob_data')
```
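
Before changing the default, it can help to confirm locally that the name is actually registered, so that a typo produces an error listing the available names. The helper below is a hypothetical sketch; it relies only on the fact that iterating `ws.datastores` yields datastore names, as in the listing loop above, and uses a plain dictionary as a stand-in for a real workspace.

```python
def checked_datastore_name(registered_names, wanted):
    """Return `wanted` if it is registered, otherwise raise a descriptive error.

    Hypothetical helper; `registered_names` is any iterable of datastore names.
    """
    names = sorted(registered_names)
    if wanted not in names:
        raise KeyError(
            f"datastore {wanted!r} is not registered; available: {names}"
        )
    return wanted


# Stand-in for ws.datastores (names chosen for illustration).
registered = {"workspaceblobstore": None, "workspacefilestore": None,
              "blob_data": None}
print(checked_datastore_name(registered, "blob_data"))
```

With a real workspace this could be combined with the call above as `ws.set_default_datastore(checked_datastore_name(ws.datastores, 'blob_data'))`.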

In [None]:
for ds_name in ws.datastores:
    print(ds_name)

In [None]:
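# Get a reference to the datastore registered earlier.
# The name must match the datastore_name used at registration.
blob_store = Datastore.get(ws, datastore_name='azuremlprimary')

# Make it the workspace default
ws.set_default_datastore('azuremlprimary')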
