# Fine-Tuning and Deploying Phi-4 for PII Extraction

This notebook walks through the entire process of fine-tuning the `microsoft/Phi-4-mini-instruct` model on a custom PII dataset, registering it, and deploying it to a batch endpoint using the Azure ML SDK v2.

### 1. Setup and Imports

First, we'll import the necessary libraries and load the configuration from our `.env` file. This file contains your Azure subscription details and resource names.

In [None]:
import os
import json
import time
from dotenv import load_dotenv
from azure.ai.ml import MLClient, command, Input, Output
from azure.ai.ml.entities import (
    Model,
    Environment,
    CodeConfiguration,
    Data,
    AmlCompute
)
from azure.identity import DefaultAzureCredential
from azure.core.exceptions import ResourceNotFoundError

load_dotenv()

# Retrieve configuration values
subscription_id = os.getenv("SUBSCRIPTION_ID")
resource_group = os.getenv("RESOURCE_GROUP")
workspace_name = os.getenv("WORKSPACE_NAME")
cluster_name = os.getenv("CLUSTER_NAME")
vm_size = os.getenv("VM_SIZE")
min_nodes = int(os.getenv("MIN_NODES", 0))
max_nodes = int(os.getenv("MAX_NODES", 1))
endpoint_name = os.getenv("ENDPOINT_NAME")
deployment_name = os.getenv("DEPLOYMENT_NAME")
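
`os.getenv` returns `None` when a variable is missing, so a typo in `.env` tends to surface later as a confusing SDK error. A minimal sketch of an early check follows; the helper name `find_missing_settings` and the choice of which variables count as required are assumptions, not part of the original notebook.

```python
import os

# Variable names taken from the cell above. MIN_NODES and MAX_NODES are
# omitted because they have defaults. Which names are "required" is an assumption.
REQUIRED_SETTINGS = [
    "SUBSCRIPTION_ID",
    "RESOURCE_GROUP",
    "WORKSPACE_NAME",
    "CLUSTER_NAME",
    "VM_SIZE",
]


def find_missing_settings(env, required=REQUIRED_SETTINGS):
    """Return the names in `required` that are absent or blank in `env`."""
    return [name for name in required if not (env.get(name) or "").strip()]


missing = find_missing_settings(os.environ)
if missing:
    print(f"Missing settings in .env: {', '.join(missing)}")
else:
    print("All required settings are present.")
```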

### 2. Connect to Azure ML Workspace

Using the loaded configuration and `DefaultAzureCredential`, we create an `MLClient` object to interact with our workspace.

In [None]:
ml_client = MLClient(
    DefaultAzureCredential(), subscription_id, resource_group, workspace_name
)

print(f"Connected to workspace: {ml_client.workspace_name}")

### 3. Create or Get Compute Cluster

We need a GPU-powered compute cluster to run the fine-tuning job. This cell checks if a cluster with the specified name already exists. If not, it creates a new one. This step can take a few minutes if the cluster is being provisioned for the first time.

In [None]:
try:
    gpu_cluster = ml_client.compute.get(cluster_name)
    print(f"Found existing cluster '{cluster_name}'.")
except ResourceNotFoundError:
    print(f"Cluster '{cluster_name}' not found. Creating a new one...")
    gpu_cluster = AmlCompute(
        name=cluster_name,
        type="amlcompute",
        size=vm_size,
        min_instances=min_nodes,
        max_instances=max_nodes,
        tier="LowPriority",
    )
    ml_client.compute.begin_create_or_update(gpu_cluster).result()
    print(f"Cluster '{cluster_name}' created successfully.")

### 4. Create Data Assets

Next, we upload our local `pii_train.jsonl` and `pii_eval.jsonl` files to Azure ML and register them as Data Assets. This makes them accessible and versioned within the workspace.

In [None]:
train_data_asset = Data(
    path="./data/pii_train.jsonl",
    type="uri_file",
    name="pii_train_data",
    description="Training data for PII detection.",
)
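# Keep the returned asset so that .path points at the uploaded copy in the workspace
train_data_asset = ml_client.data.create_or_update(train_data_asset)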
print(f"Data asset '{train_data_asset.name}' created.")

eval_data_asset = Data(
    path="./data/pii_eval.jsonl",
    type="uri_file",
    name="pii_eval_data",
    description="Evaluation data for PII detection.",
)
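# Keep the returned asset so that .path points at the uploaded copy in the workspace
eval_data_asset = ml_client.data.create_or_update(eval_data_asset)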
print(f"Data asset '{eval_data_asset.name}' created.")
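
Before uploading, it can help to confirm that each file really is valid JSON Lines, since a malformed record would otherwise only fail once the GPU job is running. The sketch below checks JSON syntax only and makes no assumption about the fields inside each record; `check_jsonl` is a hypothetical helper name.

```python
import json


def check_jsonl(lines):
    """Count lines that parse as JSON and collect the 1-based numbers of lines that do not."""
    valid, bad_lines = 0, []
    for line_number, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # tolerate blank lines
        try:
            json.loads(line)
            valid += 1
        except json.JSONDecodeError:
            bad_lines.append(line_number)
    return valid, bad_lines


# Usage with one of the local files from this section:
# with open("./data/pii_train.jsonl", "r", encoding="utf-8") as f:
#     valid, bad_lines = check_jsonl(f)
#     print(f"{valid} valid records, malformed lines: {bad_lines}")
```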

### 5. Define the Training Job

Here, we define the `command` job. This specifies:
- The code to run (`train/train.py`).
- The command-line arguments, including inputs and outputs.
- The compute target (our GPU cluster).
- The software environment needed to run the code.

In [None]:
custom_job_environment = Environment(
    image="mcr.microsoft.com/azureml/curated/acpt-pytorch-2.2-cuda12.1:latest",
    conda_file="./train/environment.yml",
)

job_name = f"phi4-pii-finetune_{int(time.time())}"

# Pass the data paths and hyperparameters to train.py as command-line arguments
job_command = (
    "python train.py "
    "--train_data ${{inputs.train_data}} "
    "--eval_data ${{inputs.eval_data}} "
    "--model_output ${{outputs.model_output}} "
    "--save_merged_model True "
    "--epochs 3 "
    "--learning_rate 2e-5 "
    "--gradient_accumulation_steps 2 "
)

train_job = command(
    name=job_name,
    code="./train",
    command=job_command,
    inputs={
        "train_data": Input(type="uri_file", path=train_data_asset.path),
        "eval_data": Input(type="uri_file", path=eval_data_asset.path),
    },
    outputs={"model_output": Output(type="uri_folder")},
    environment=custom_job_environment,
    compute=cluster_name,
    display_name="Fine-tune Phi-4 Mini for PII (Expert)",
    experiment_name="phi4-pii-finetuning",
)

print(f"Training job '{job_name}' defined with expert configuration.")
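
`train/train.py` is not shown in this notebook. As a rough sketch, a script receiving the command above might parse its arguments as follows. Azure ML replaces `${{inputs.*}}` and `${{outputs.*}}` with paths on the compute node before the command runs. The default values and the `str_to_bool` helper are assumptions; the helper is there because `argparse` with `type=bool` treats any non-empty string, including `"False"`, as `True`.

```python
import argparse


def str_to_bool(value):
    """Convert 'True'/'False'-style strings to a real boolean."""
    lowered = value.lower()
    if lowered in {"true", "1", "yes"}:
        return True
    if lowered in {"false", "0", "no"}:
        return False
    raise argparse.ArgumentTypeError(f"Expected a boolean, got '{value}'")


def parse_args(argv=None):
    """Hypothetical argument parser matching the flags in job_command."""
    parser = argparse.ArgumentParser(description="Fine-tuning arguments (sketch)")
    parser.add_argument("--train_data", required=True)
    parser.add_argument("--eval_data", required=True)
    parser.add_argument("--model_output", required=True)
    # Defaults below are illustrative assumptions, not values from train.py.
    parser.add_argument("--save_merged_model", type=str_to_bool, default=False)
    parser.add_argument("--epochs", type=int, default=3)
    parser.add_argument("--learning_rate", type=float, default=2e-5)
    parser.add_argument("--gradient_accumulation_steps", type=int, default=1)
    return parser.parse_args(argv)
```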

### 6. Submit and Stream the Training Job

This is the main training step. We submit the job defined above to Azure ML. The `.stream()` method will display the logs from the remote compute cluster directly in the notebook's output. 

**This will take a significant amount of time (e.g., 30-60+ minutes).**

In [None]:
print(f"Submitting training job: {job_name}")
returned_job = ml_client.jobs.create_or_update(train_job)
ml_client.jobs.stream(returned_job.name)
print(f"Training job '{returned_job.name}' completed.")
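
`stream` blocks the notebook for the whole run, but the job itself keeps running on the cluster if the kernel disconnects. A sketch of polling the job status instead is shown below, so the final state can be checked explicitly. The status callable is injected so the loop can be exercised without Azure; `wait_for_job` and its default intervals are assumptions.

```python
import time

# Terminal states reported for Azure ML jobs.
TERMINAL_STATUSES = {"Completed", "Failed", "Canceled"}


def wait_for_job(get_status, poll_seconds=30, timeout_seconds=4 * 60 * 60):
    """Poll `get_status()` until it returns a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still '{status}' after {timeout_seconds} seconds")
        time.sleep(poll_seconds)


# Usage with the job submitted above:
# final_status = wait_for_job(lambda: ml_client.jobs.get(returned_job.name).status)
# if final_status != "Completed":
#     raise RuntimeError(f"Training job ended with status '{final_status}'")
```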

### 7. Register the Model

Once the training job is complete, the fine-tuned model artifacts are stored in the job's output. We now register these artifacts as a versioned Model in the Azure ML workspace, which makes it easy to track and deploy.

In [None]:
model_name = "phi-4-pii-model"
model_path = f"azureml://jobs/{returned_job.name}/outputs/model_output"

registered_model = ml_client.models.create_or_update(
    Model(path=model_path, name=model_name, description="Fine-tuned Phi-4 for PII detection.")
)
print(f"Model '{registered_model.name}' version '{registered_model.version}' registered.")

### 8. Create Batch Endpoint and Deployment

An endpoint is an HTTPS URL that clients call to get predictions from your model. We create a batch endpoint here because its deployments can run on low-priority (spot) VMs, such as the cluster from step 3. Online endpoints need dedicated VMs, which are more expensive and typically require an organizational account. Azure handles the underlying infrastructure.

We also deploy a model to the endpoint. A deployment is a set of resources required for hosting the model. This step provisions the compute, deploys the model, and configures the scoring logic.

**This step will also take several minutes to complete.**

In [None]:
from azure.ai.ml.entities import (
    BatchEndpoint,
    BatchDeployment,
    BatchRetrySettings,
)

# Create a unique name for the batch endpoint
batch_endpoint_name = f"pii-batch-{int(time.time())}"

# Create the endpoint
print(f"Creating batch endpoint '{batch_endpoint_name}'...")
endpoint = BatchEndpoint(
    name=batch_endpoint_name,
    description="Batch endpoint for PII extraction with the fine-tuned Phi-4 model.",
)
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()
print("Batch endpoint created successfully.")

# Define the retry settings with a longer timeout (e.g., 1 hour = 3600 seconds)
retry_settings = BatchRetrySettings(max_retries=1, timeout=3600)

# Create the deployment
print("Creating batch deployment...")
deployment = BatchDeployment(
    name="pii-batch-deployment",
    endpoint_name=batch_endpoint_name,
    model=registered_model,
    code_configuration=CodeConfiguration(
        code="./deployment",
        scoring_script="score.py",
    ),
    environment=Environment(
        conda_file="./deployment/environment.yml",
        image="mcr.microsoft.com/azureml/curated/acpt-pytorch-2.2-cuda12.1:latest"
    ),
    compute=cluster_name,
    instance_count=1,
    max_concurrency_per_instance=1,
    mini_batch_size=1,
    logging_level="INFO",
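    retry_settings=retry_settings,
    # Name the output file to match the lookup in step 9 (the default is predictions.csv)
    output_file_name="predictions.jsonl",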
)
ml_client.batch_deployments.begin_create_or_update(deployment).result()
print("Batch deployment created successfully.")

# Set the default deployment for the endpoint
endpoint = ml_client.batch_endpoints.get(batch_endpoint_name)
endpoint.defaults.deployment_name = deployment.name
ml_client.batch_endpoints.begin_create_or_update(endpoint).result()
print("Default deployment set.")
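
The scoring script in `./deployment/score.py` is not shown here. Batch deployments expect it to define `init()`, called once per worker, and `run(mini_batch)`, called with a list of input file paths. The sketch below illustrates that contract only: the regex-based `extract_pii` is a stand-in for real model inference, and the `text` field name is an assumption about the dataset.

```python
import json
import os
import re

# Hypothetical stand-in for the fine-tuned model: a regex that finds email addresses.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

model_dir = None


def init():
    """Called once per worker; a real script would load the model from this folder."""
    global model_dir
    model_dir = os.environ.get("AZUREML_MODEL_DIR", ".")


def extract_pii(text):
    """Placeholder for model inference; returns the entities found by the regex."""
    return [{"type": "EMAIL", "value": match} for match in EMAIL_PATTERN.findall(text)]


def run(mini_batch):
    """Receive a list of file paths and return one JSON string per input record."""
    results = []
    for file_path in mini_batch:
        with open(file_path, "r", encoding="utf-8") as f:
            for line in f:
                if not line.strip():
                    continue
                record = json.loads(line)
                text = record.get("text", "")  # "text" is an assumed field name
                results.append(json.dumps({"text": text, "entities": extract_pii(text)}))
    return results
```

With `mini_batch_size=1` and a single `uri_file` input, `run` would receive a list containing one file path.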

### 9. Invoke the Batch Endpoint

Finally, we send our evaluation dataset to the endpoint for batch scoring. The results are saved to the job's output in Azure ML, then downloaded and previewed in the notebook.

Because the batch endpoint runs on low-priority VMs, the underlying job may take some time to start and complete.

In [None]:
import glob

# --- Step 1: Define the Input for the Job ---
# The input for the job is our evaluation data asset
input_data = Input(type="uri_file", path=eval_data_asset.path)

# --- Step 2: Invoke the Endpoint to Kick Off the Batch Job ---
print(f"Invoking batch endpoint '{batch_endpoint_name}'...")
job = ml_client.batch_endpoints.invoke(
    endpoint_name=batch_endpoint_name,
    input=input_data,
)
print(f"Batch job '{job.name}' started. Waiting for completion...")

# --- Step 3: Stream Logs and Wait for the Job to Finish ---
ml_client.jobs.stream(job.name)
print("Batch job finished.")

# --- Step 4: Download the Results ---
output_dir = "./batch_results"
print("Downloading results...")
# Batch scoring jobs write their predictions to the named output "score"
ml_client.jobs.download(name=job.name, output_name="score", download_path=output_dir)
print(f"Results downloaded to {output_dir}")

# --- Step 5: Find and Display the Raw Output File ---
output_files = glob.glob(f"{output_dir}/**/predictions.jsonl", recursive=True)

if output_files:
    print(f"Found output file at: {output_files[0]}")
    
    print("\n--- Raw Batch Scoring Results (First 5 lines) ---")
    with open(output_files[0], 'r') as f:
        for i, line in enumerate(f):
            if i >= 5:
                break
            print(f"--- Record {i+1} ---\n{line.strip()}\n")
else:
    print("Could not find the output JSONL file in the results folder.")
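
To work with the results as Python objects rather than raw text, each line can be parsed as JSON. This is a sketch that assumes the scoring script emits one JSON object per line; lines that do not parse are kept as raw text instead of being dropped. `parse_prediction_lines` is a hypothetical helper.

```python
import json


def parse_prediction_lines(lines, limit=None):
    """Parse each non-blank line as JSON, keeping unparseable lines under a 'raw' key."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            records.append({"raw": line})
        if limit is not None and len(records) >= limit:
            break
    return records


# Usage with the file located above:
# with open(output_files[0], "r", encoding="utf-8") as f:
#     for record in parse_prediction_lines(f, limit=5):
#         print(record)
```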

### 10. Cleanup

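Run these commands in your terminal to delete the batch endpoint and compute cluster.

In [None]:
# Scope the commands to this workspace in case no CLI defaults are configured
scope = f"--resource-group {resource_group} --workspace-name {workspace_name}"
print("To clean up, run the following commands in your terminal:")
print(f"\naz ml batch-endpoint delete --name {batch_endpoint_name} {scope} --yes")
print(f"az ml compute delete --name {cluster_name} {scope} --yes")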
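
The same cleanup can also be done from the notebook. A minimal sketch, assuming the `ml_client`, `batch_endpoint_name`, and `cluster_name` variables defined earlier; `delete_resources` is a hypothetical wrapper around the SDK's `begin_delete` operations.

```python
def delete_resources(ml_client, batch_endpoint_name, cluster_name):
    """Delete the batch endpoint first, then the compute cluster it runs on."""
    # Each begin_delete call returns a poller; .result() blocks until deletion finishes.
    ml_client.batch_endpoints.begin_delete(name=batch_endpoint_name).result()
    ml_client.compute.begin_delete(name=cluster_name).result()


# Usage in this notebook:
# delete_resources(ml_client, batch_endpoint_name, cluster_name)
```

Deleting the endpoint before the cluster avoids leaving a deployment that points at compute that no longer exists.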