# Building Machine Learning Systems That Don't Suck


References:

1. [Azure ML Studio Setup](https://learn.microsoft.com/en-us/azure/machine-learning/quickstart-create-resources?view=azureml-api-2)
2. [ML Studio Compute creation](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-create-compute-instance?view=azureml-api-2&tabs=python)
3. [ML Studio instance pricing](https://azure.microsoft.com/en-us/pricing/details/machine-learning/#pricing)
4. [AWS-Azure Compute comparison](https://www.justaftermidnight247.com/insights/aws-to-azure-instance-mapping-for-easy-comparison/)

Also see `azure_setup.md`, which documents the initial setup steps to get you up and running.


## Section 1: Introduction and Initial Setup


In [1]:
%load_ext autoreload
%autoreload 2
%load_ext dotenv
%dotenv

import logging
import sys
from pathlib import Path

import ipytest

CODE_FOLDER = Path("code")
sys.path.extend([f"./{CODE_FOLDER}"])

DATA_FILEPATH = "penguins.csv"

ipytest.autoconfig(raise_on_error=True)

# By default, basic information about HTTP sessions (URLs, headers, etc.)
# is logged at INFO level. Detailed DEBUG level logging, including request/response
# bodies and unredacted headers, can be enabled on a client with the `logging_enable` argument.
# See full SDK logging documentation with examples in the link below.
# https://learn.microsoft.com/en-us/azure/developer/python/sdk/azure-sdk-logging
#
# To prevent these messages from cluttering the output of this notebook's
# cells, we raise the logging level to WARNING instead.
logging.getLogger("azure.ai.ml").setLevel(logging.WARNING)

Let's now load the workspace configuration using the SDK and create a client for later use.


In [2]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

client = MLClient.from_config(
    credential=DefaultAzureCredential(), file_path="../config.json"
)

Found the config file in: D:\DataspellProjects\ml.school\config.json
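
`MLClient.from_config` reads the workspace details from `config.json`. As a minimal sketch, assuming the file uses the usual `subscription_id`, `resource_group`, and `workspace_name` keys, the hypothetical helper `missing_config_keys` below (not part of the SDK) reports which of them are absent before you try to create the client:

```python
import json
from pathlib import Path

# Keys a workspace config.json is assumed to carry (assumption, see lead-in).
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def missing_config_keys(config_path):
    """Return the required keys that are absent from the JSON file."""
    config = json.loads(Path(config_path).read_text(encoding="utf-8"))
    return [key for key in REQUIRED_KEYS if key not in config]
```

Calling `missing_config_keys("../config.json")` should return an empty list when the file is complete.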


If you are running the pipeline in Local Mode on an ARM64 machine (for example, on Apple Silicon), you will need to use a custom Docker image to train and evaluate the model. Let's create a variable indicating if we are running on an ARM64 machine.


In [3]:
# We can retrieve the architecture of the local
# computer using the `uname -m` command.
architecture = !(uname -m)

IS_ARM64_ARCHITECTURE = architecture[0] == "arm64"
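
The `uname -m` command is not available in a default Windows shell. A portable alternative, sketched here under the assumption that checking the CPU name is all we need, is the standard library's `platform.machine()`. The helper name `is_arm64` is hypothetical, and it also accepts `aarch64`, which Linux on ARM typically reports:

```python
import platform


def is_arm64(machine=None):
    """Return True when the machine string names a 64-bit ARM CPU."""
    if machine is None:
        # Ask the standard library instead of shelling out to `uname -m`.
        machine = platform.machine()
    # Apple Silicon reports "arm64"; Linux on ARM usually reports "aarch64".
    return machine.lower() in {"arm64", "aarch64"}
```

With this helper, `IS_ARM64_ARCHITECTURE = is_arm64()` would give the same flag without depending on a shell command.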

In [4]:
# Create/manage compute
# https://learn.microsoft.com/en-us/azure/machine-learning/how-to-create-compute-instance?view=azureml-api-2&tabs=python#create

# Compute instance names must be unique within the Azure region.
# We use a fixed name here. To generate a unique one instead, uncomment the
# line below (it needs `import datetime`) to append the current date and time.
from azure.ai.ml.entities import ComputeInstance

# ci_basic_name = "mlschool" + datetime.datetime.now().strftime("%Y%m%d%H%M")
ci_basic_name = "mlschool"
ci_basic = ComputeInstance(
    name=ci_basic_name, size="Standard_F4s_v2", idle_time_before_shutdown_minutes=15
)
client.begin_create_or_update(ci_basic).result()

ComputeInstance({'state': 'Stopped', 'last_operation': {'operation_name': 'Stop', 'operation_time': '2024-04-06T11:35:59.453Z', 'operation_status': 'Succeeded', 'operation_trigger': 'User'}, 'os_image_metadata': <azure.ai.ml.entities._compute._image_metadata.ImageMetadata object at 0x0000019F04C2A150>, 'services': [{'display_name': 'Jupyter', 'endpoint_uri': 'https://mlschool.centralindia.instances.azureml.ms/tree/'}, {'display_name': 'Jupyter Lab', 'endpoint_uri': 'https://mlschool.centralindia.instances.azureml.ms/lab'}], 'type': 'computeinstance', 'created_on': '2024-04-06T11:31:30.618821+0000', 'provisioning_state': 'Succeeded', 'provisioning_errors': None, 'name': 'mlschool', 'description': None, 'tags': None, 'properties': {}, 'print_as_yaml': False, 'id': '/subscriptions/2acef264-d285-40af-be00-8dae3516307c/resourceGroups/ml-school-rg/providers/Microsoft.MachineLearningServices/workspaces/ml-school/computes/mlschool', 'Resource__source_path': '', 'base_path': 'D:\\DataspellProje
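
If the fixed name `mlschool` is already taken in your region, the timestamp idea from the commented-out line can be sketched as follows. The helper name `unique_compute_name` and its parameters are hypothetical. Two calls within the same minute still produce the same name.

```python
from datetime import datetime


def unique_compute_name(prefix="mlschool", now=None):
    """Append a minute-resolution timestamp to the prefix."""
    if now is None:
        now = datetime.now()
    # Same format string as the commented-out line in the cell above.
    return prefix + now.strftime("%Y%m%d%H%M")
```

The result can be passed as `name=` when constructing the `ComputeInstance`.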

In [5]:
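# Make sure the compute is not running unnecessarily. We will change this later.
from azure.core.exceptions import HttpResponseError

try:
    client.compute.begin_stop(ci_basic_name).wait()
except HttpResponseError:
    # Stopping an instance that is already stopped raises this error; ignore it.
    pass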

HttpResponseError: (BadRequest) {"error":{"code":"ComputeInstanceAlreadyStopped","message":"The specified Azure ML Compute Instance mlschool is already stopped"}}
Code: BadRequest
Message: {"error":{"code":"ComputeInstanceAlreadyStopped","message":"The specified Azure ML Compute Instance mlschool is already stopped"}}
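
The error above appears because the instance was already stopped. One way to avoid the request altogether is to check the reported state first. This is a minimal sketch: the function name `stop_if_running` is hypothetical, and it assumes the object returned by `client.compute.get` exposes the `state` value seen in the output above. Transitional states (for example, an instance that is still stopping) are not handled.

```python
def stop_if_running(ml_client, compute_name):
    """Stop the compute instance unless it already reports 'Stopped'.

    Returns True when a stop request was sent, False otherwise.
    """
    compute = ml_client.compute.get(compute_name)
    if compute.state == "Stopped":
        return False
    # begin_stop returns a poller; wait() blocks until the operation ends.
    ml_client.compute.begin_stop(compute_name).wait()
    return True
```

It would be called as `stop_if_running(client, ci_basic_name)`.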

In [6]:
# Get compute
ci_basic_state = client.compute.get(ci_basic_name)
ci_basic_state

ComputeInstance({'state': 'Stopped', 'last_operation': {'operation_name': 'Stop', 'operation_time': '2024-04-06T11:35:59.453Z', 'operation_status': 'Succeeded', 'operation_trigger': 'User'}, 'os_image_metadata': <azure.ai.ml.entities._compute._image_metadata.ImageMetadata object at 0x0000019F04BF2B90>, 'services': [{'display_name': 'Jupyter', 'endpoint_uri': 'https://mlschool.centralindia.instances.azureml.ms/tree/'}, {'display_name': 'Jupyter Lab', 'endpoint_uri': 'https://mlschool.centralindia.instances.azureml.ms/lab'}], 'type': 'computeinstance', 'created_on': '2024-04-06T11:31:30.618821+0000', 'provisioning_state': 'Succeeded', 'provisioning_errors': None, 'name': 'mlschool', 'description': None, 'tags': None, 'properties': {}, 'print_as_yaml': False, 'id': '/subscriptions/2acef264-d285-40af-be00-8dae3516307c/resourceGroups/ml-school-rg/providers/Microsoft.MachineLearningServices/workspaces/ml-school/computes/mlschool', 'Resource__source_path': '', 'base_path': 'D:\\DataspellProje

In [14]:
from azure.ai.ml.entities import PipelineJobSettings, PipelineJob

# Create a PipelineJobSettings object
settings = PipelineJobSettings()

# Set the properties
settings.default_datastore = "penguins"
settings.default_compute = ci_basic_name
settings.continue_on_step_failure = False
settings.framework_version = "2.12"
settings.py_version = "py311"

# Create a PipelineJob
pipeline_job = PipelineJob(settings=settings)

# Now you can inspect the settings of the pipeline_job
print(pipeline_job.settings.default_datastore)
print(pipeline_job.settings.default_compute)
print(pipeline_job.settings.continue_on_step_failure)
print(pipeline_job.settings.framework_version)
print(pipeline_job.settings.py_version)


/subscriptions/2acef264-d285-40af-be00-8dae3516307c/resourceGroups/ml-school-rg/providers/Microsoft.MachineLearningServices/workspaces/ml-school/datastores/penguins
/subscriptions/2acef264-d285-40af-be00-8dae3516307c/resourceGroups/ml-school-rg/providers/Microsoft.MachineLearningServices/workspaces/ml-school/computes/mlschool
False
2.12
py311


In [21]:
DATA_ASSET = "penguins"
# The SDK expects the data asset version as a string.
DATA_VERSION = "1"
data = client.data.get(DATA_ASSET, DATA_VERSION)

## Section 2: Exploratory Data Analysis

Let's run Exploratory Data Analysis on the [Penguins dataset](https://www.kaggle.com/parulpandey/palmer-archipelago-antarctica-penguin-data). The goal of this session is to understand the data and the problem we are trying to solve.


In [27]:
import pandas as pd

penguins = pd.read_csv(DATA_FILEPATH)
penguins.describe(include='all')

| | species | island | culmen_length_mm | culmen_depth_mm | flipper_length_mm | body_mass_g | sex |
|---|---|---|---|---|---|---|---|
| count | 344 | 344 | 342.0 | 342.0 | 342.0 | 342.0 | 334 |
| unique | 3 | 3 | | | | | 3 |
| top | Adelie | Biscoe | | | | | MALE |
| freq | 152 | 168 | | | | | 168 |
| mean | | | 43.92193 | 17.15117 | 200.915205 | 4201.754386 | |
| std | | | 5.459584 | 1.974793 | 14.061714 | 801.954536 | |
| min | | | 32.1 | 13.1 | 172.0 | 2700.0 | |
| 25% | | | 39.225 | 15.6 | 190.0 | 3550.0 | |
| 50% | | | 44.45 | 17.3 | 197.0 | 4050.0 | |
| 75% | | | 48.5 | 18.7 | 213.0 | 4750.0 | |


In [26]:
penguins.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 344 entries, 0 to 343
Data columns (total 7 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   species            344 non-null    object 
 1   island             344 non-null    object 
 2   culmen_length_mm   342 non-null    float64
 3   culmen_depth_mm    342 non-null    float64
 4   flipper_length_mm  342 non-null    float64
 5   body_mass_g        342 non-null    float64
 6   sex                334 non-null    object 
dtypes: float64(4), object(3)
memory usage: 18.9+ KB
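
The `info()` output shows 342 non-null values out of 344 for the numeric columns and 334 for `sex`, and `describe()` reports three unique values for `sex`. A natural next step is to count the missing values per column and list every value a categorical column takes. The sketch below uses a tiny made-up frame rather than `penguins.csv`, so its rows and the stray `"?"` value are purely illustrative:

```python
import pandas as pd

# Toy rows, made up for illustration; they are not taken from penguins.csv.
toy = pd.DataFrame(
    {
        "species": ["Adelie", "Gentoo", "Chinstrap", "Adelie"],
        "body_mass_g": [3750.0, None, 3800.0, 3450.0],
        "sex": ["MALE", "FEMALE", None, "?"],
    }
)

# Number of missing values in each column.
missing_per_column = toy.isna().sum()

# Frequency of every value in `sex`, keeping missing values as their own row.
sex_counts = toy["sex"].value_counts(dropna=False)

print(missing_per_column)
print(sex_counts)
```

Running the same two calls on `penguins` would show where the gaps are and what the third value of `sex` is.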
