# Heart Failure Prediction using HyperDrive

This notebook demonstrates hyperparameter tuning using Azure HyperDrive
for a Logistic Regression model to predict heart failure mortality.

## Overview
1. Set up workspace and compute
2. Load and register the dataset
3. Configure HyperDrive with hyperparameter search space
4. Run HyperDrive experiment
5. Analyze results and retrieve best model
6. Register and deploy the best model

## 1. Import Libraries and Set Up Workspace

In [None]:
import os
import json
import numpy as np
import pandas as pd

from azureml.core import Workspace, Experiment, Dataset, Environment, ScriptRunConfig
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.train.hyperdrive import (
    HyperDriveConfig,
    RandomParameterSampling,
    BanditPolicy,
    choice,
    uniform,
    loguniform,
    PrimaryMetricGoal
)
from azureml.widgets import RunDetails
from azureml.core.model import Model, InferenceConfig
from azureml.core.webservice import AciWebservice

In [None]:
# Connect to the workspace
ws = Workspace.from_config()
print(f"Workspace name: {ws.name}")
print(f"Subscription ID: {ws.subscription_id}")
print(f"Resource group: {ws.resource_group}")

## 2. Create Compute Cluster

In [None]:
# Define the compute cluster name
compute_name = "cpu-cluster"

try:
    # Check if the compute target already exists
    compute_target = ComputeTarget(workspace=ws, name=compute_name)
    print(f"Found existing compute target: {compute_name}")
except ComputeTargetException:
    # Create a new compute cluster
    print(f"Creating new compute cluster: {compute_name}")
    
    compute_config = AmlCompute.provisioning_configuration(
        vm_size="STANDARD_D2_V2",
        max_nodes=4,
        min_nodes=0
    )
    
    compute_target = ComputeTarget.create(ws, compute_name, compute_config)
    compute_target.wait_for_completion(show_output=True)

print(f"Compute target status: {compute_target.get_status().serialize()}")

## 3. Load and Register Dataset

In [None]:
# Load the local dataset
df = pd.read_csv('heart_failure_clinical_records_dataset.csv')

# Display dataset info
print(f"Dataset shape: {df.shape}")
print(f"\nColumn names: {df.columns.tolist()}")
print(f"\nTarget variable distribution:")
print(df['DEATH_EVENT'].value_counts())
print(f"\nDataset statistics:")
df.describe()

In [None]:
# Check if dataset is already registered, if not register it
try:
    dataset = Dataset.get_by_name(ws, name='heart-failure-dataset')
    print("Dataset already registered.")
except Exception:
    # Get the default datastore
    datastore = ws.get_default_datastore()
    
    # Upload the dataset to the datastore
    datastore.upload_files(
        files=['heart_failure_clinical_records_dataset.csv'],
        target_path='heart-failure-data/',
        overwrite=True,
        show_progress=True
    )
    
    # Create and register a TabularDataset
    dataset = Dataset.Tabular.from_delimited_files(
        path=(datastore, 'heart-failure-data/heart_failure_clinical_records_dataset.csv')
    )
    
    dataset = dataset.register(
        workspace=ws,
        name='heart-failure-dataset',
        description='Heart Failure Clinical Records Dataset from Kaggle',
        create_new_version=True
    )
    print(f"Dataset registered: {dataset.name}")

## 4. Create Environment

In [None]:
# Create a Python environment for the training script
env = Environment.from_conda_specification(
    name='heart-failure-env',
    file_path='conda_env.yml'
)

# Register the environment
env.register(workspace=ws)
print(f"Environment registered: {env.name}")

## 5. Configure HyperDrive

### Hyperparameter Search Space

We are tuning the following hyperparameters for Logistic Regression:

1. **C (Inverse Regularization Strength)**: Inverse of the regularization strength. Smaller values mean stronger regularization.
   - Range: 0.001 to 100 (log-uniform distribution; HyperDrive's `loguniform` takes the natural logarithms of these bounds)
   - Rationale: Useful values of C span several orders of magnitude, so sampling uniformly in log space gives each order of magnitude equal coverage

2. **max_iter (Maximum Iterations)**: Maximum number of iterations for the solver to converge.
   - Values: 50, 100, 150, 200, 300
   - Rationale: Discrete choices to ensure convergence while not over-iterating

3. **solver (Optimization Algorithm)**: Algorithm to use for optimization.
   - Values: 'lbfgs', 'liblinear', 'saga'
   - Rationale: Different solvers work better for different data characteristics
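
As a rough illustration of what log-uniform sampling does, the sketch below mimics it with the standard library. It assumes the behavior described in the Azure ML documentation, where `loguniform(min_value, max_value)` draws `exp(uniform(min_value, max_value))`. The helper name `sample_loguniform` is hypothetical and not part of the SDK.

```python
import math
import random

def sample_loguniform(low, high, rng):
    """Draw exp(u) with u ~ Uniform(low, high); low and high are natural logs of the bounds."""
    return math.exp(rng.uniform(low, high))

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# To cover C in [0.001, 100], the arguments are the natural logs of those bounds.
low, high = math.log(0.001), math.log(100)
samples = [sample_loguniform(low, high, rng) for _ in range(1000)]
print(f"min={min(samples):.4f}, max={max(samples):.2f}")

# Passing -3 and 2 directly would instead cover [exp(-3), exp(2)].
print(f"exp(-3)={math.exp(-3):.4f}, exp(2)={math.exp(2):.3f}")
```

Under that assumption, bounds of -3 and 2 correspond to C between roughly 0.05 and 7.39, which is much narrower than 0.001 to 100.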

### Sampling Method: Random Sampling
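- Random sampling supports continuous distributions such as `loguniform`, whereas HyperDrive's grid sampling accepts only discrete `choice` values
- With a fixed run budget, it usually finds good configurations faster than an exhaustive grid, because no runs are spent on every combination of the discrete parameters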

### Early Termination Policy: Bandit Policy
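- Terminates runs whose primary metric falls outside the allowed slack relative to the best-performing run
- `slack_factor=0.1`: with a maximized metric, a run is terminated when its accuracy is below best / (1 + 0.1), roughly 91% of the best run's accuracy at the same interval
- `evaluation_interval=2`: the policy is applied at every second interval at which the training script logs the primary metric
- `delay_evaluation=5`: the first policy evaluation is postponed for 5 intervals, so a training script that logs the metric only once per run gives the policy little or no opportunity to act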
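
The slack-factor rule can be sketched in a few lines. This illustrates the cutoff formula for a maximized metric, best / (1 + slack_factor), and is not the service's implementation. `bandit_cutoff` and `should_terminate` are hypothetical helper names.

```python
def bandit_cutoff(best_metric, slack_factor):
    """Lowest value a maximized metric may take before the run is cancelled."""
    return best_metric / (1 + slack_factor)

def should_terminate(run_metric, best_metric, slack_factor=0.1):
    """True when the run falls below the cutoff derived from the best run so far."""
    return run_metric < bandit_cutoff(best_metric, slack_factor)

best = 0.88  # example value, not a result from this experiment
cutoff = bandit_cutoff(best, 0.1)
print(f"cutoff={cutoff:.3f}")
print(should_terminate(0.79, best), should_terminate(0.81, best))
```

With a best accuracy of 0.88, the cutoff is about 0.80, so a run at 0.79 would be stopped and a run at 0.81 would continue.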

In [None]:
# Define the hyperparameter search space
param_sampling = RandomParameterSampling({
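    # HyperDrive's loguniform(a, b) samples exp(uniform(a, b)), so pass the natural logs of the bounds
    '--C': loguniform(np.log(0.001), np.log(100)),  # Regularization: 0.001 to 100
    '--max_iter': choice(50, 100, 150, 200, 300),
    '--solver': choice('lbfgs', 'liblinear', 'saga')
})

print("Parameter sampling configured:")
print("  - C: loguniform(ln 0.001, ln 100) -> [0.001, 100]")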
print("  - max_iter: choice(50, 100, 150, 200, 300)")
print("  - solver: choice('lbfgs', 'liblinear', 'saga')")

In [None]:
# Define early termination policy
early_termination_policy = BanditPolicy(
    slack_factor=0.1,
    evaluation_interval=2,
    delay_evaluation=5
)

print("Early termination policy configured:")
print("  - Type: Bandit Policy")
print("  - Slack factor: 0.1 (terminate if accuracy < best / 1.1, about 91% of best)")
print("  - Evaluation interval: 2")
print("  - Delay evaluation: 5 (start evaluating after 5 intervals)")

In [None]:
# Create the ScriptRunConfig
script_config = ScriptRunConfig(
    source_directory='.',
    script='train.py',
    compute_target=compute_target,
    environment=env
)

print("Script run configuration created.")
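
`train.py` is not shown in this notebook. For the sweep to work, it has to accept the sampled arguments under the same names used in the search space and log a metric named exactly `Accuracy`. A minimal sketch of the argument handling, assuming a standard `argparse` setup (the defaults and the `parse_args` helper are illustrative, not taken from the actual script):

```python
import argparse

def parse_args(argv=None):
    """Parse the hyperparameters that HyperDrive passes on the command line."""
    parser = argparse.ArgumentParser(description="Train a logistic regression model")
    parser.add_argument("--C", type=float, default=1.0,
                        help="Inverse of regularization strength")
    parser.add_argument("--max_iter", type=int, default=100,
                        help="Maximum number of solver iterations")
    parser.add_argument("--solver", type=str, default="lbfgs",
                        choices=["lbfgs", "liblinear", "saga"],
                        help="Optimization algorithm")
    return parser.parse_args(argv)

# Simulate the command line of one sampled configuration.
args = parse_args(["--C", "0.5", "--max_iter", "200", "--solver", "saga"])
print(args.C, args.max_iter, args.solver)

# In the real script, the evaluation accuracy would then be logged under the
# name 'Accuracy' (for example with Run.get_context().log('Accuracy', value))
# so that it matches primary_metric_name in the HyperDrive configuration.
```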

In [None]:
# Create the HyperDrive configuration
hyperdrive_config = HyperDriveConfig(
    run_config=script_config,
    hyperparameter_sampling=param_sampling,
    policy=early_termination_policy,
    primary_metric_name='Accuracy',
    primary_metric_goal=PrimaryMetricGoal.MAXIMIZE,
    max_total_runs=20,
    max_concurrent_runs=4
)

print("HyperDrive configuration created:")
print("  - Primary metric: Accuracy (maximize)")
print("  - Max total runs: 20")
print("  - Max concurrent runs: 4")

## 6. Run HyperDrive Experiment

In [None]:
# Create the experiment
experiment = Experiment(ws, "heart-failure-hyperdrive")

# Submit the HyperDrive run
print("Submitting HyperDrive experiment...")
hyperdrive_run = experiment.submit(hyperdrive_config)
print(f"Run ID: {hyperdrive_run.id}")

In [None]:
# Display the RunDetails widget to monitor progress
RunDetails(hyperdrive_run).show()

In [None]:
# Wait for the run to complete
hyperdrive_run.wait_for_completion(show_output=True)

## 7. Retrieve and Analyze Best Model

In [None]:
# Get the best run
best_run = hyperdrive_run.get_best_run_by_primary_metric()

# Display best run details
print(f"Best Run ID: {best_run.id}")
print(f"\nBest Run Metrics:")

# Get metrics
metrics = best_run.get_metrics()
for metric_name, metric_value in metrics.items():
    print(f"  {metric_name}: {metric_value}")

In [None]:
# Get the best hyperparameters
best_run_params = best_run.get_details()['runDefinition']['arguments']

print("\nBest Hyperparameters:")
for i in range(0, len(best_run_params), 2):
    print(f"  {best_run_params[i]}: {best_run_params[i+1]}")
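
The pairwise loop above assumes the argument list strictly alternates between flag and value. A slightly more defensive sketch, using a hypothetical helper `args_to_dict`, skips anything that is not a `--flag value` pair:

```python
def args_to_dict(arguments):
    """Convert ['--C', '0.5', '--solver', 'saga'] into {'C': '0.5', 'solver': 'saga'}."""
    params = {}
    i = 0
    while i < len(arguments):
        token = str(arguments[i])
        has_value = i + 1 < len(arguments) and not str(arguments[i + 1]).startswith('--')
        if token.startswith('--') and has_value:
            params[token[2:]] = arguments[i + 1]
            i += 2
        else:
            i += 1  # stray token or flag without a value; skip it
    return params

example = ['--C', '0.5', '--max_iter', '200', '--solver', 'saga']
print(args_to_dict(example))
```

In the notebook it could be called as `args_to_dict(best_run.get_details()['runDefinition']['arguments'])`.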

In [None]:
# List the top 10 child runs by accuracy
print("\nTop 10 HyperDrive Runs by Accuracy:")
print("-" * 80)

child_runs = list(hyperdrive_run.get_children())
for run in sorted(child_runs, key=lambda r: r.get_metrics().get('Accuracy', 0), reverse=True)[:10]:
    # Use a loop-local name so the best run's `metrics` variable is not overwritten
    run_metrics = run.get_metrics()
    params = run.get_details()['runDefinition']['arguments']
    print(f"Run ID: {run.id}")
    print(f"  Accuracy: {run_metrics.get('Accuracy', 'N/A')}")
    print(f"  Parameters: {params}")
    print()
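
`get_metrics()` returns a list rather than a single number when a script logs the same metric name more than once, which can break the numeric sort key above. A small sketch of a normalizing helper (`scalar_metric` is a hypothetical name) that keeps the last logged value:

```python
def scalar_metric(value, default=0.0):
    """Return a single float for a metric that may be missing, a number, or a list of numbers."""
    if value is None:
        return default
    if isinstance(value, (list, tuple)):
        return float(value[-1]) if value else default
    return float(value)

print(scalar_metric(0.81), scalar_metric([0.70, 0.78, 0.83]), scalar_metric(None))
```

The sort key would then become `lambda r: scalar_metric(r.get_metrics().get('Accuracy'))`.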

## 8. Register the Best Model

In [None]:
# Download the best model (download_file does not return a path, so nothing is assigned)
os.makedirs('outputs', exist_ok=True)
best_run.download_file('outputs/model.joblib', output_file_path='outputs/model.joblib')
print("Model downloaded to: outputs/model.joblib")

In [None]:
# Register the best model
model_name = 'heart-failure-hyperdrive-model'

# Get the best hyperparameters for tags
best_params = {}
args = best_run.get_details()['runDefinition']['arguments']
for i in range(0, len(args), 2):
    best_params[args[i].replace('--', '')] = args[i+1]

# Read metrics directly from the best run so the tag cannot pick up another run's values
best_metrics = best_run.get_metrics()

registered_model = best_run.register_model(
    model_name=model_name,
    model_path='outputs/model.joblib',
    description='Heart Failure Prediction Model trained with HyperDrive',
    tags={
        'algorithm': 'LogisticRegression',
        'accuracy': str(best_metrics.get('Accuracy', 'N/A')),
        'C': str(best_params.get('C', 'N/A')),
        'max_iter': str(best_params.get('max_iter', 'N/A')),
        'solver': str(best_params.get('solver', 'N/A'))
    }
)

print(f"Model registered: {registered_model.name}")
print(f"Model version: {registered_model.version}")
print(f"Model ID: {registered_model.id}")

## 9. Deploy the Model

In [None]:
# Configure inference
inference_config = InferenceConfig(
    entry_script='score.py',
    environment=env
)

# Configure the ACI deployment
aci_config = AciWebservice.deploy_configuration(
    cpu_cores=1,
    memory_gb=1,
    auth_enabled=True,
    enable_app_insights=True,
    description='Heart Failure Prediction Service (HyperDrive)'
)

print("Deployment configuration created!")

In [None]:
# Deploy the model
service_name = 'heart-failure-hd-service'

service = Model.deploy(
    workspace=ws,
    name=service_name,
    models=[registered_model],
    inference_config=inference_config,
    deployment_config=aci_config,
    overwrite=True
)

service.wait_for_deployment(show_output=True)
print(f"\nService state: {service.state}")
print(f"Scoring URI: {service.scoring_uri}")

## 10. Test the Deployed Model

In [None]:
import requests

# Get the scoring URI and keys
scoring_uri = service.scoring_uri
primary_key, secondary_key = service.get_keys()

# Prepare sample data for testing
# Feature order: age, anaemia, creatinine_phosphokinase, diabetes, ejection_fraction,
#                high_blood_pressure, platelets, serum_creatinine, serum_sodium, sex, smoking, time
sample_data = {
    "data": [
        [75, 0, 582, 0, 20, 1, 265000, 1.9, 130, 1, 0, 4],   # Expected: DEATH_EVENT=1
        [55, 0, 7861, 0, 38, 0, 263358.03, 1.1, 136, 1, 0, 6],  # Expected: DEATH_EVENT=1
        [45, 0, 2060, 1, 60, 0, 742000, 0.8, 138, 0, 0, 278]  # Expected: DEATH_EVENT=0
    ]
}

# Set the headers
headers = {
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {primary_key}'
}

# Make the request
response = requests.post(scoring_uri, json=sample_data, headers=headers)

print(f"Status code: {response.status_code}")
print(f"Response: {response.json()}")
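
Before sending a request, it can help to check that every row carries the 12 features in the order listed above, since a positional payload gives the service no way to detect a missing or swapped column. A minimal sketch with a hypothetical `validate_payload` helper, assuming the `{"data": [...]}` request shape used in this cell:

```python
import json

FEATURES = [
    "age", "anaemia", "creatinine_phosphokinase", "diabetes", "ejection_fraction",
    "high_blood_pressure", "platelets", "serum_creatinine", "serum_sodium",
    "sex", "smoking", "time",
]

def validate_payload(payload):
    """Raise ValueError unless every row has exactly one numeric value per feature."""
    rows = payload.get("data")
    if not isinstance(rows, list) or not rows:
        raise ValueError("payload must contain a non-empty 'data' list")
    for index, row in enumerate(rows):
        if len(row) != len(FEATURES):
            raise ValueError(f"row {index} has {len(row)} values, expected {len(FEATURES)}")
        if not all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in row):
            raise ValueError(f"row {index} contains a non-numeric value")
    return json.dumps(payload)  # serialized body, ready to send

body = validate_payload({"data": [[75, 0, 582, 0, 20, 1, 265000, 1.9, 130, 1, 0, 4]]})
print(body)
```

The check only catches wrong row lengths and non-numeric values; it cannot tell whether two numeric columns were swapped.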

In [None]:
# Print service logs for debugging
print("Service Logs:")
print(service.get_logs())

## 11. Compare with AutoML Model

After running both notebooks, compare the results:

| Metric | AutoML Model | HyperDrive Model |
|--------|--------------|------------------|
| Accuracy | [Fill after running] | [Fill after running] |
| Algorithm | [Auto-selected] | Logistic Regression |
| Training Time | [Fill after running] | [Fill after running] |

Deploy whichever of the two models achieves the higher accuracy in this comparison.

## 12. Cleanup (Optional)

In [None]:
# Delete the web service (uncomment to run)
# service.delete()
# print("Service deleted.")

In [None]:
# Delete the compute cluster (uncomment to run)
# compute_target.delete()
# print("Compute cluster deleted.")