# Working with AML Compute and other AML Assets

In [None]:
import azureml.core
from azureml.core import Workspace, Environment, Dataset
import pandas as pd

In [None]:
USER_NAME = 'ENTER_YOUR_NAME_HERE'

ENVIRONMENT_NAME = f'sklearn-{USER_NAME}'
DATASET_NAME = f'diabetes-{USER_NAME}'

DATA_PATH = "./data"
BLOB_PATH = f'/data/{USER_NAME}'

ws = Workspace.from_config()

## AML Compute Assets

![Compute Assets](../../media/7-compute-assets.gif)

In [None]:
# Print the name and type of each available compute target
pd.DataFrame.from_records(
    [
        {'Compute Name': name, 'Compute Type': ct.type}
        for name, ct in ws.compute_targets.items()
    ]
)
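
The cell above treats `ws.compute_targets` as a mapping from target name to an object with a `type` attribute. As a minimal sketch under that assumption, the same mapping can be grouped by compute type. The helper `group_targets_by_type` and the stand-in targets below are hypothetical and only illustrate the shape of the data; no AML workspace is contacted.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_targets_by_type(compute_targets):
    """Map each compute type to the sorted names of the targets of that type."""
    grouped = defaultdict(list)
    for name, target in compute_targets.items():
        grouped[target.type].append(name)
    return {ctype: sorted(names) for ctype, names in grouped.items()}


# Stand-in for ws.compute_targets; these names and types are made up.
fake_targets = {
    "cpu-cluster": SimpleNamespace(type="AmlCompute"),
    "gpu-cluster": SimpleNamespace(type="AmlCompute"),
    "dev-instance": SimpleNamespace(type="ComputeInstance"),
}

print(group_targets_by_type(fake_targets))
```

In the notebook itself, passing `ws.compute_targets` in place of `fake_targets` should give the same kind of summary for the real workspace.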

---
## AML Environments

With AML, you can define and register "Environments" that can be reused across the workspace. An environment can specify everything from the Docker base image to the environment variables to set and the Python packages to install.

Let's define and register a new environment.

In [None]:
sklearn_env = Environment(ENVIRONMENT_NAME)
sklearn_env

### Adding more Python Packages
By default (at the time this notebook was written), our new environment is based on Python 3.6.2 and includes only one pip package. Let's update the Python version and add a few more required packages.

In [None]:
# Update Python Version
sklearn_env.python.conda_dependencies.set_python_version("3.9.2")

# Add conda packages
sklearn_env.python.conda_dependencies.add_conda_package("pip")
sklearn_env.python.conda_dependencies.add_conda_package("scikit-learn=0.24.1")
sklearn_env.python.conda_dependencies.add_conda_package("seaborn=0.11.1")
sklearn_env.python.conda_dependencies.add_conda_package("click=7.1.2")
sklearn_env.python.conda_dependencies.add_conda_package("joblib=1.0.1")

# Enable Docker for the environment
sklearn_env.docker.enabled = True

In [None]:
sklearn_env
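
The version pins added to the environment use conda's `name=version` syntax, so the same dependencies could also be written as a conda environment file. The sketch below renders such a file with the standard library only. The helper `render_conda_spec`, the environment name `sklearn-demo`, and the `conda-forge` channel are assumptions made for illustration, and the text AML serializes for the environment may look different.

```python
def render_conda_spec(name, python_version, conda_packages, channels=("conda-forge",)):
    """Render a conda environment file as text (hand-rolled, not AML's serializer)."""
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines.append(f"  - python={python_version}")
    lines += [f"  - {package}" for package in conda_packages]
    return "\n".join(lines) + "\n"


# Same pins as the notebook cell above; the channel choice is an assumption.
spec = render_conda_spec(
    "sklearn-demo",
    "3.9.2",
    ["pip", "scikit-learn=0.24.1", "seaborn=0.11.1", "click=7.1.2", "joblib=1.0.1"],
)
print(spec)
```

A file like this can be passed to `Environment.from_conda_specification(name, file_path)` as an alternative to calling `add_conda_package` once per package.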

### Environment Registration
Next, let's "register" this environment to the AML Workspace. That will allow us to reuse the environment across multiple use cases.

In [None]:
sklearn_env = sklearn_env.register(ws)

In [None]:
# Optional - build the underlying Docker container
build = sklearn_env.build(ws)

In [None]:
# Block until the build completes, streaming the build log as it runs.
build.wait_for_completion(show_output=True)

---
## Datastores and Datasets

Next, we will examine Datastores and Datasets. 

In [None]:
print("Datastores:")
for k in ws.datastores.keys():
    print(f"- {k}")
print()

print("Datasets:")
for k in ws.datasets.keys():
    print(f"- {k}")

### Creating Diabetes Dataset
Next, let's create a new dataset for our diabetes data using the CSV file in the data folder. We can do this in a few ways.

First, it's possible in the UI.
<br>![Dataset Creation](../../media/8-dataset-creation-ui.gif)

Below, we'll upload the file and register the dataset programmatically.

In [None]:
# First, we'll upload the diabetes CSV to our workspaceblobstore
datastore = ws.datastores['workspaceblobstore']

uploaded_file = datastore.upload_files(
    files=[f'{DATA_PATH}/diabetes.csv'], 
    relative_root=DATA_PATH, 
    target_path=BLOB_PATH
)
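
In the upload call, `relative_root` is stripped from each local file path and the remainder is placed under `target_path` in the datastore. The sketch below mimics that path mapping with the standard library so the resulting blob path is easy to predict. The helper `blob_destination` is hypothetical, the user name `alice` is made up, and dropping the leading slash from `target_path` is an assumption; this is not the SDK's implementation.

```python
import posixpath
from pathlib import PurePosixPath


def blob_destination(local_file, relative_root, target_path):
    """Predict the datastore path for a file uploaded with a relative root."""
    # Normalize "./data/diabetes.csv" to "data/diabetes.csv" before comparing.
    local = PurePosixPath(posixpath.normpath(local_file))
    root = PurePosixPath(posixpath.normpath(relative_root))
    # Part of the local path that remains once the root is removed.
    remainder = local.relative_to(root)
    # Assumption: blob paths are written without a leading slash.
    return str(PurePosixPath(target_path.strip("/")) / remainder)


# Same layout as the notebook, with a made-up user name.
print(blob_destination("./data/diabetes.csv", "./data", "/data/alice"))
```

This is why the registration step that follows looks for the file at `f'{BLOB_PATH}/diabetes.csv'` rather than at a path that still contains the local `./data` prefix.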

In [None]:
# Next, we'll create a file dataset that points at the uploaded CSV and register it.
dataset = Dataset.File.from_files((datastore, f'{BLOB_PATH}/diabetes.csv'))

dataset = dataset.register(
    workspace=ws, 
    name=DATASET_NAME, 
    description=f"The diabetes CSV file for {USER_NAME}",
    create_new_version=True
)

In [None]:
print("Datastores:")
for k in ws.datastores.keys():
    print(f"- {k}")
print()

print("Datasets:")
for k in ws.datasets.keys():
    print(f"- {k}")


###### Copyright (c) Microsoft Corporation. All rights reserved.  
###### Licensed under the MIT License.