# Fine-Tuning an LLM with Model-as-a-Service (MaaS) Serverless

This sample shows how to create a serverless fine-tuning job that fine-tunes a model on a synthetic dataset generated with RAFT.

#### Training data
We use the dataset generated in the previous [gen.ipynb](./0_gen.ipynb) notebook using the RAFT method. The dataset has three splits, suitable for:
* Fine-tuning
* Validation
* Evaluation

We will use the fine-tuning and validation splits in this notebook.

#### Model
We will use one of the smaller models of the Llama family to show how to fine-tune a model for a chat-completion task.

#### Outline
1. Set up prerequisites.
2. Pick a model to fine-tune.
3. Create training and validation datasets.
4. Configure the fine-tuning job.
5. Submit the fine-tuning job.

## Overview
![](./doc/raft-process-ft.png)

## Running time and cost

The fine-tuning job typically takes about 1.5 hours. Serverless fine-tuning is billed by time, at roughly $35/hour at the time of writing, so running this notebook costs roughly $50.
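
As a quick sanity check of that estimate, the arithmetic can be sketched as follows. The duration and hourly rate are the approximate figures quoted above, not values read from a pricing API, so treat the result as an order of magnitude only.

```python
# Back-of-the-envelope cost estimate using the approximate figures quoted above.
estimated_hours = 1.5     # typical job duration, in hours
hourly_rate_usd = 35.0    # approximate serverless fine-tuning rate, in USD per hour

estimated_cost_usd = estimated_hours * hourly_rate_usd
print(f"Estimated cost: ~${estimated_cost_usd:.2f}")
```

A longer run or a different rate scales the cost linearly.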

## Pre-requisites

#### Quickstart

* Authenticate to Azure using the `az login --use-device-code` command and select an account and subscription configured with **MaaS Serverless**
* Copy the file `config.json.sample` to `config.json` and replace `<WORKSPACE_NAME>`, `<RESOURCE_GROUP>` and `<SUBSCRIPTION_ID>` in `config.json`
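
Before going further, it can help to confirm that no placeholder was left in `config.json`. The following is a minimal sketch: it assumes the file uses the `subscription_id`, `resource_group` and `workspace_name` keys that `MLClient.from_config` reads, and `find_config_problems` is a hypothetical helper, not part of this repository.

```python
import json
from pathlib import Path

# Keys assumed to be present in config.json (the ones MLClient.from_config reads).
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def find_config_problems(config: dict) -> list[str]:
    """Return a list of human-readable problems found in a workspace config."""
    problems = []
    for key in REQUIRED_KEYS:
        value = str(config.get(key, ""))
        if not value:
            problems.append(f"missing key: {key}")
        elif value.startswith("<") and value.endswith(">"):
            # Value still looks like a template placeholder such as <WORKSPACE_NAME>.
            problems.append(f"placeholder not replaced: {key}")
    return problems


config_path = Path("config.json")
if config_path.exists():
    found = find_config_problems(json.loads(config_path.read_text()))
    print(found or "config.json looks complete")
else:
    print("config.json not found; copy config.json.sample first")
```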

#### In case of issue

Check out the following documentation, which walks you through setting up the subscription and the Azure ML workspace:
* Follow the prerequisites section of the MS Learn article ["Fine-tune Meta Llama models in Azure AI Studio"](https://aka.ms/c/learn-ft-prereq).
* Connect to the AzureML workspace. Learn more at [set up SDK authentication](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-setup-authentication?tabs=sdk).

**Install the dependencies by running the cell below. This step is required when running in a new environment.**

The requirements should have been installed automatically if you opened the project in a Dev Container or Codespaces. If not, uncomment the lines in the following cell to install them.

In [1]:
#%pip install azure-ai-ml
#%pip install azure-identity

#%pip install mlflow
#%pip install azureml-mlflow

### Create AzureML Workspace connections

In [None]:
from azure.ai.ml import MLClient
from azure.identity import (
    DefaultAzureCredential,
    InteractiveBrowserCredential,
)

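try:
    credential = DefaultAzureCredential()
    # Check that the default credential can actually obtain a token.
    credential.get_token("https://management.azure.com/.default")
except Exception:
    # Fall back to an interactive browser login if the default credential fails.
    credential = InteractiveBrowserCredential()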

# Expects config.json in the same directory as this notebook.
workspace_ml_client = MLClient.from_config(credential=credential)

# The models, fine-tuning pipelines and environments are available in the AzureML system registries "azureml" and "azureml-meta".
registry_ml_client = MLClient(credential, registry_name="azureml")
registry_ml_client_meta = MLClient(credential, registry_name="azureml-meta")

# Get AzureML workspace object.
workspace = workspace_ml_client._workspaces.get(workspace_ml_client.workspace_name)
workspace_id = workspace._workspace_id

### 2. Pick a foundation model to fine tune

We will be fine-tuning a `Llama` model for this recipe.

At the time of writing, the following student models have been tested successfully:

In [None]:
from pathlib import Path
from strictyaml import load

for model in [load(x.read_text()).data for x in Path("parameters").glob("*.yaml")]:
    print(f"- {model['model_name']} ({model['format']} model)")

You may want to try other Llama models available in the model registry. Note that your experience may vary: some of them might not yet be fine-tunable through the Python SDK.

If you find another model, feel free to submit a [PR on this repository](https://github.com/Azure-Samples/raft-distillation-recipe/pulls).

In [None]:
for m in filter(lambda x: "llama" in x.name.lower() and "-hf" not in x.name.lower(), registry_ml_client_meta.models.list()):
    print(f"- {m.name}")

## Notebook parameters

These parameters are introspected by Papermill in the [`./run_all.sh`](./run_all.sh) script and can be used to parameterize headless execution of this notebook from the command line, using parameter files from the [`./parameters`](./parameters/) folder.

In [5]:
model_name: str = "Meta-Llama-3.1-8B-Instruct"
format: str = "chat"
learning_rate: float = 0.0002
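
To illustrate how such parameters can be overridden from the command line, here is a minimal sketch that builds a Papermill invocation using its `-p name value` option. The notebook file names are hypothetical, `papermill_command` is an illustrative helper, and this is not necessarily how `run_all.sh` calls Papermill.

```python
import shlex


def papermill_command(notebook: str, output: str, parameters: dict) -> list[str]:
    """Build a papermill CLI call that overrides one notebook parameter per -p option."""
    command = ["papermill", notebook, output]
    for name, value in parameters.items():
        command += ["-p", name, str(value)]
    return command


example_command = papermill_command(
    "finetune.ipynb",       # hypothetical input notebook name
    "finetune.out.ipynb",   # hypothetical output notebook name
    {"model_name": "Meta-Llama-3.1-8B-Instruct", "format": "chat", "learning_rate": 0.0002},
)
print(shlex.join(example_command))
```

Papermill injects the overrides right after the cell tagged `parameters`, so the defaults above still apply when the notebook is run interactively.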

In [None]:
foundation_model = registry_ml_client_meta.models.get(model_name, label="latest")
print("\n\nUsing model name: {0}, version: {1}, id: {2} for fine tuning".format(foundation_model.name, foundation_model.version, foundation_model.id))

In [7]:
from azure.ai.ml.constants._common import AssetTypes
from azure.ai.ml.entities._inputs_outputs import Input

model_to_finetune = Input(type=AssetTypes.MLFLOW_MODEL, path=foundation_model.id)

### 3. Prepare data

We use the data generated previously with RAFT.

#### Create data inputs

In [None]:
import os
from dotenv import load_dotenv

# Variables passed by previous notebooks
load_dotenv(".env.state")

ds_name = os.getenv("DATASET_NAME")
ds_path = f"dataset/{ds_name}"
dataset_path_ft_train = f"{ds_path}-files/{ds_name}-ft.train.jsonl"
dataset_path_ft_valid = f"{ds_path}-files/{ds_name}-ft.valid.jsonl"

print(f"Using dataset {ds_name} for fine tuning")

#### Preview of training split

In [None]:
import pandas as pd

pd.read_json(dataset_path_ft_train, lines=True).head(2)

#### Preview of validation split

In [None]:
import pandas as pd

pd.read_json(dataset_path_ft_valid, lines=True).head(2)

Let's calculate the hashes of the dataset files. We'll combine those hashes with the dataset names to distinguish between different generated datasets and to track which ones have already been uploaded.

In [11]:
from utils import file_sha256

train_hash = file_sha256(dataset_path_ft_train)[:4]
valid_hash = file_sha256(dataset_path_ft_valid)[:4]
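
The `file_sha256` helper comes from this repository's `utils` module, and its implementation is not shown in this notebook. A minimal sketch of an equivalent function, assuming it returns the hexadecimal SHA-256 digest of the file contents, could look like this (`file_sha256_sketch` is a hypothetical name):

```python
import hashlib


def file_sha256_sketch(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps reading until read() returns empty bytes (end of file).
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example usage in this notebook (assumes the dataset file exists):
# short_hash = file_sha256_sketch(dataset_path_ft_train)[:4]
```

Keeping only the first four hex characters gives 65,536 possible values, which is enough to tell a handful of generated datasets apart but is not collision-proof.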

Upload training dataset

In [None]:
from azure.ai.ml.entities import Data

dataset_version = "1"
train_dataset_name = f"{ds_name}_train_{train_hash}"
try:
    train_data_created = workspace_ml_client.data.get(train_dataset_name, version=dataset_version)
    print(f"Dataset {train_dataset_name} already exists")
except Exception:  # lookup failed, most likely because the dataset does not exist yet
    print(f"Creating dataset {train_dataset_name}")
    train_data = Data(
        path=dataset_path_ft_train,
        type=AssetTypes.URI_FILE,
        description=f"{ds_name} training dataset",
        name=train_dataset_name,
        version=dataset_version,
    )
    train_data_created = workspace_ml_client.data.create_or_update(train_data)

Upload validation dataset

In [None]:
from azure.ai.ml.entities import Data

dataset_version = "1"
validation_dataset_name = f"{ds_name}_validation_{valid_hash}"
try:
    validation_data_created = workspace_ml_client.data.get(validation_dataset_name, version=dataset_version)
    print(f"Dataset {validation_dataset_name} already exists")
except Exception:  # lookup failed, most likely because the dataset does not exist yet
    print(f"Creating dataset {validation_dataset_name}")
    validation_data = Data(
        path=dataset_path_ft_valid,
        type=AssetTypes.URI_FILE,
        description=f"{ds_name} validation dataset",
        name=validation_dataset_name,
        version=dataset_version,
    )
    validation_data_created = workspace_ml_client.data.create_or_update(validation_data)

Create training and validation inputs

In [14]:
from azure.ai.ml.entities._inputs_outputs import Input

training_data = Input(
    type=train_data_created.type, path=f"azureml://locations/{workspace.location}/workspaces/{workspace._workspace_id}/data/{train_data_created.name}/versions/{train_data_created.version}"
)
validation_data = Input(
    type=validation_data_created.type,
    path=f"azureml://locations/{workspace.location}/workspaces/{workspace._workspace_id}/data/{validation_data_created.name}/versions/{validation_data_created.version}",
)
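
Both inputs above use the same `azureml://` URI pattern. As a sketch, the pattern can be factored into a small helper; `data_asset_uri` is a hypothetical function and the argument values below are placeholders, not real resources.

```python
def data_asset_uri(location: str, workspace_id: str, name: str, version: str) -> str:
    """Build the data asset URI following the pattern used for the inputs above."""
    return (
        f"azureml://locations/{location}/workspaces/{workspace_id}"
        f"/data/{name}/versions/{version}"
    )


# Placeholder values for illustration only.
example_uri = data_asset_uri("eastus2", "00000000-0000-0000-0000-000000000000", "demo_train_ab12", "1")
print(example_uri)
```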

### Subscribe to Marketplace offer

In [None]:
from azure.ai.ml.entities import MarketplaceSubscription
from azure.core.exceptions import ResourceExistsError

# Strip the trailing "/versions/<n>" from the model asset ID to get the model ID.
model_id = "/".join(foundation_model.id.split("/")[:-2])
# Derive a subscription name from the model name, replacing "." and "_" with "-".
subscription_name = model_id.split("/")[-1].replace(".", "-").replace("_", "-")

print(f"Subscribing to Marketplace model: {model_id}")

marketplace_subscription = MarketplaceSubscription(
    model_id=model_id,
    name=subscription_name,
)

try:
    marketplace_subscription = workspace_ml_client.marketplace_subscriptions.begin_create_or_update(marketplace_subscription).result()
except ResourceExistsError:
    print(f"Marketplace subscription {subscription_name} already exists for model {model_id}")
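
To make the string manipulation in the cell above concrete, here is the same derivation applied to an example asset ID. The ID below is an assumed example of the registry ID format (the version number in particular is made up), and `marketplace_names` is a hypothetical helper.

```python
def marketplace_names(model_asset_id: str) -> tuple[str, str]:
    """Derive the model ID and a subscription name from a versioned model asset ID."""
    # Drop the last two path segments ("versions" and the version number).
    model_id = "/".join(model_asset_id.split("/")[:-2])
    # Use the model name as the subscription name, replacing "." and "_" with "-".
    subscription_name = model_id.split("/")[-1].replace(".", "-").replace("_", "-")
    return model_id, subscription_name


# Assumed example of a registry model asset ID; the version is illustrative.
example_asset_id = "azureml://registries/azureml-meta/models/Meta-Llama-3.1-8B-Instruct/versions/1"
example_model_id, example_subscription_name = marketplace_names(example_asset_id)
print(example_model_id)
print(example_subscription_name)
```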

### 4. Configure and submit the fine-tuning job using the model and data as inputs

Create the fine-tuning job using all the inputs prepared so far.

#### Define finetune parameters

##### The following parameters are required:

1. `model` - Base model to fine-tune.
2. `training_data` - Training data for fine-tuning the base model.
3. `task` - Fine-tuning task to perform, e.g. `TEXT_COMPLETION` for text-generation fine-tuning jobs or `CHAT_COMPLETION` for chat fine-tuning jobs.
4. `outputs` - Output registered model name.

##### The following parameters are optional:

1. `hyperparameters` - Parameters that control the fine-tuning behavior at runtime.
2. `name` - Fine-tuning job name.
3. `experiment_name` - Experiment name for the fine-tuning job.
4. `display_name` - Fine-tuning job display name.

In [None]:
import uuid

from azure.ai.ml._restclient.v2024_01_01_preview.models import (
    FineTuningTaskType,
)
from azure.ai.ml.entities._inputs_outputs import Output
from azure.ai.ml.entities._job.finetuning.custom_model_finetuning_job import CustomModelFineTuningJob
from utils import update_state

# A short random suffix keeps job names unique across submissions.
guid = uuid.uuid4()
short_guid = str(guid)[:4]
experiment_name = f"ft-raft-{ds_name}"
registered_model_name = f"ft-raft-{model_name}-{ds_name}-{train_hash}-v{dataset_version}".replace(".", "_")
job_name = f"{registered_model_name}-{short_guid}"

task = FineTuningTaskType.CHAT_COMPLETION if format == "chat" else FineTuningTaskType.TEXT_COMPLETION
print(f"Model format: {format}")
print(f"Fine tuning task: {task}")

finetuning_job = CustomModelFineTuningJob(
    task=task,
    training_data=training_data,
    validation_data=validation_data,
    hyperparameters={
        "per_device_train_batch_size": "1",
        "learning_rate": str(learning_rate),
        "num_train_epochs": "1",
        "registered_model_name": registered_model_name,
    },
    model=model_to_finetune,
    display_name=job_name,
    name=job_name,
    experiment_name=experiment_name,
    outputs={"registered_model": Output(type="mlflow_model", name=f"ft-job-finetune-registered-{short_guid}")},
)

update_state("FINETUNED_MODEL_NAME", finetuning_job.outputs['registered_model'].name)
update_state("FINETUNED_MODEL_FORMAT", format)

Submit the fine-tuning job

In [None]:
try:
    print(f"Submitting job {finetuning_job.name}")
    created_job = workspace_ml_client.jobs.create_or_update(finetuning_job)
    print(f"Successfully created job {finetuning_job.name}")
    print(f"Studio URL is {created_job.studio_url}")
    print(f"Registered model name will be {registered_model_name}")
except Exception as e:
    print("Error creating job", e)
    raise  # re-raise with the original traceback
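
Since the job runs for a while, you may want to wait for it to finish before moving on. The following is a minimal polling sketch, not part of the original recipe: it assumes the job status eventually becomes one of `Completed`, `Failed` or `Canceled`, and `wait_for_terminal_status` is a hypothetical helper.

```python
import time
from typing import Callable

# Assumed set of terminal job statuses.
TERMINAL_STATUSES = {"Completed", "Failed", "Canceled"}


def wait_for_terminal_status(
    get_status: Callable[[], str],
    poll_seconds: float = 60.0,
    max_polls: int = 180,
) -> str:
    """Poll get_status() until it returns a terminal status or max_polls is reached."""
    status = get_status()
    for _ in range(max_polls):
        if status in TERMINAL_STATUSES:
            break
        time.sleep(poll_seconds)
        status = get_status()
    return status


# Example usage in this notebook (assumes `created_job` from the cell above):
# final_status = wait_for_terminal_status(
#     lambda: workspace_ml_client.jobs.get(created_job.name).status
# )
# print(f"Job finished with status {final_status}")
```

With the defaults, the helper gives up after about three hours and returns the last status it saw, so check the returned value before proceeding.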

## Next step -> Deployment

Continue with [./3_deploy.ipynb](./3_deploy.ipynb) to deploy the fine-tuned student model.