# Supervised Fine-Tuning with CNN/DailyMail Dataset on Microsoft Foundry

This notebook demonstrates how to fine-tune language models using **Supervised Fine-Tuning (SFT)** with the CNN/DailyMail News Summarization dataset.

## What You'll Learn
1. Understand supervised fine-tuning for summarization tasks
2. Prepare and format news summarization data
3. Upload datasets to Microsoft Foundry
4. Create and monitor a supervised fine-tuning job
5. Deploy and test your fine-tuned model

**Note**: Execute each cell in sequence.

## 1. Setup and Installation

Install the required packages listed in `requirements.txt`.

In [1]:
pip install -r requirements.txt

Collecting azure-ai-projects>=2.0.0b1 (from -r requirements.txt (line 2))
  Downloading azure_ai_projects-2.0.0b3-py3-none-any.whl.metadata (68 kB)
Collecting openai (from -r requirements.txt (line 5))
  Using cached openai-2.14.0-py3-none-any.whl.metadata (29 kB)
Collecting azure-identity (from -r requirements.txt (line 8))
  Using cached azure_identity-1.25.1-py3-none-any.whl.metadata (88 kB)
Collecting azure-mgmt-cognitiveservices (from -r requirements.txt (line 9))
  Using cached azure_mgmt_cognitiveservices-14.1.0-py3-none-any.whl.metadata (32 kB)
Collecting python-dotenv (from -r requirements.txt (line 12))
  Using cached python_dotenv-1.2.1-py3-none-any.whl.metadata (25 kB)
Collecting isodate>=0.6.1 (from azure-ai-projects>=2.0.0b1->-r requirements.tx




## 2. Import Libraries

In [1]:
import os
from dotenv import load_dotenv
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient

print("All libraries imported successfully")

All libraries imported successfully


## 3. Configure Azure Environment

Set your Microsoft Foundry project endpoint, model name, and other environment variables. This example uses **gpt-4.1**, but you can use other supported GPT models. Copy the file `.env.template` (located in this folder), save the copy as `.env`, and enter the appropriate values for the job you want to run.

```
MICROSOFT_FOUNDRY_PROJECT_ENDPOINT=<your-endpoint>
MODEL_NAME=gpt-4.1
AZURE_SUBSCRIPTION_ID=<your-subscription-id>
AZURE_RESOURCE_GROUP=<your-resource-group>
AZURE_AOAI_ACCOUNT=<your-foundry-account-name>
```
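
A missing or placeholder value in `.env` can surface later as a confusing authentication or deployment error. The sketch below checks the five variables listed above before any Azure call is made. The helper name `find_missing_vars` and the placeholder test (a value still wrapped in `<...>`) are illustrative assumptions, not part of any Azure SDK.

```python
# Variable names are taken from the .env example above.
REQUIRED_VARS = (
    "MICROSOFT_FOUNDRY_PROJECT_ENDPOINT",
    "MODEL_NAME",
    "AZURE_SUBSCRIPTION_ID",
    "AZURE_RESOURCE_GROUP",
    "AZURE_AOAI_ACCOUNT",
)


def find_missing_vars(env, required=REQUIRED_VARS):
    """Return the required names that are unset, blank, or still a <placeholder>."""
    missing = []
    for name in required:
        value = (env.get(name) or "").strip()
        if not value or (value.startswith("<") and value.endswith(">")):
            missing.append(name)
    return missing


# Hypothetical, partially filled environment used only for illustration.
example_env = {
    "MICROSOFT_FOUNDRY_PROJECT_ENDPOINT": "<your-endpoint>",
    "MODEL_NAME": "gpt-4.1",
}
print(find_missing_vars(example_env))
```

In the notebook itself, the same check could be run as `find_missing_vars(os.environ)` right after `load_dotenv()`.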

In [2]:
load_dotenv()

endpoint = os.environ.get("MICROSOFT_FOUNDRY_PROJECT_ENDPOINT")
model_name = os.environ.get("MODEL_NAME")

# Define dataset file paths
training_file_path = "training.jsonl"
validation_file_path = "validation.jsonl"
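
The notebook does not show what `training.jsonl` and `validation.jsonl` contain. As a minimal sketch, assuming the chat-format JSONL used for supervised fine-tuning (one JSON object per line with a `messages` list) and the `article` / `highlights` fields of CNN/DailyMail, each pair could be converted as shown here. The system prompt, the user prompt wording, and the helper names are assumptions chosen for illustration; the files shipped with this notebook may use different prompts.

```python
import json

# Assumed prompt; not taken from the notebook's dataset files.
SYSTEM_PROMPT = "You summarize news articles."


def to_chat_record(article, highlights):
    """Build one chat-format training example from an article and its reference summary."""
    return {
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": f"Summarize this article:\n\n{article}"},
            {"role": "assistant", "content": highlights},
        ]
    }


def to_jsonl(pairs):
    """Serialize (article, highlights) pairs to JSONL text, one record per line."""
    return "".join(
        json.dumps(to_chat_record(article, highlights), ensure_ascii=False) + "\n"
        for article, highlights in pairs
    )


# Hypothetical sample pair, for illustration only.
sample_pairs = [
    ("The city council approved a new park on Monday.", "City council approves new park ."),
]
jsonl_text = to_jsonl(sample_pairs)
print(jsonl_text)
```

Whatever prompt wording is used for training should also be used at inference time, so the model sees the same input shape in both settings.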

## 4. Connect to Microsoft Foundry Project

Connect to Microsoft Foundry Project using Azure credential authentication. This initializes the project client and OpenAI client needed for fine-tuning workflows.

**Important**: Ensure you have the **Azure AI User** role assigned to your account for the Microsoft Foundry Project resource.

In [3]:
credential = DefaultAzureCredential()
project_client = AIProjectClient(endpoint=endpoint, credential=credential)
openai_client = project_client.get_openai_client()

print("Connected to Microsoft Foundry Project")

Connected to Microsoft Foundry Project


## 5. Upload Training Files

Upload the training and validation JSONL files to Microsoft Foundry. Each file is assigned a unique ID that you reference when creating the fine-tuning job.
If your training or validation files are already in an Azure storage account, you can import them directly instead of uploading them again (see Section 6).
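
Because a malformed file is only rejected after upload and processing, a quick local check can save a round trip. The sketch below assumes chat-format records (a `messages` list whose last entry is the assistant's summary) and accepts only the `system`, `user`, and `assistant` roles; both are simplifying assumptions, and the service performs its own, more complete validation. The function name is hypothetical.

```python
import json

# Assumption: only these three roles appear in summarization examples.
ALLOWED_ROLES = ("system", "user", "assistant")


def validate_jsonl_lines(lines):
    """Return (line_number, problem) tuples; an empty list means no problems were found."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            problems.append((number, "blank line"))
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, f"invalid JSON: {exc.msg}"))
            continue
        messages = record.get("messages") if isinstance(record, dict) else None
        if not isinstance(messages, list) or not messages:
            problems.append((number, "missing or empty 'messages' list"))
            continue
        if not all(isinstance(m, dict) and m.get("role") in ALLOWED_ROLES for m in messages):
            problems.append((number, "unexpected message entry or role"))
        elif messages[-1]["role"] != "assistant":
            problems.append((number, "last message is not from the assistant"))
    return problems


# Hypothetical lines: one valid record, one without a target summary, one not JSON.
sample_lines = [
    '{"messages": [{"role": "user", "content": "Summarize: ..."}, {"role": "assistant", "content": "..."}]}',
    '{"messages": [{"role": "user", "content": "Summarize: ..."}]}',
    "not json",
]
for problem in validate_jsonl_lines(sample_lines):
    print(problem)
```

To check a real file, pass the open file object, for example `validate_jsonl_lines(open(training_file_path, encoding="utf-8"))`.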

In [17]:
print("Uploading training file...")
with open(training_file_path, "rb") as f:
    train_file = openai_client.files.create(file=f, purpose="fine-tune")

print("Uploading validation file...")
with open(validation_file_path, "rb") as f:
    validation_file = openai_client.files.create(file=f, purpose="fine-tune")

train_file_id = train_file.id
val_file_id = validation_file.id

print(f"Training file ID: {train_file_id}")
print(f"Validation file ID: {val_file_id}")

Uploading training file...
Uploading validation file...
Training file ID: file-cd8eca04fa5c4d97bdbe9b8f6df3e9a0
Validation file ID: file-1de00b2fbb2e416a858d6cf8a93aaa9b


## 6. Import Training Files from Azure Blob Storage (Optional)

If your training data is already stored in Azure Blob Storage, you can import it directly without re-uploading.

**Prerequisites:**
1. Upload `training.jsonl` and `validation.jsonl` to Azure Blob Storage
2. Generate SAS tokens with read permissions.
3. Set environment variables:
   ```bash
   TRAINING_FILE_BLOB_URL=https://<storage-account>.blob.core.windows.net/<container>/training.jsonl?<sas-token>
   VALIDATION_FILE_BLOB_URL=https://<storage-account>.blob.core.windows.net/<container>/validation.jsonl?<sas-token>
   ```

**Note:** This uses the Azure-specific `/openai/files/import` endpoint (preview API). Use either Section 5 (upload) or this section (import), not both: skip this section if you have already uploaded the files in Section 5.
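
An expired or write-only SAS token is an easy mistake to make when preparing the blob URLs. The sketch below inspects the standard SAS query parameters (`sp` for permissions, `se` for expiry, `sig` for the signature) before the import request is sent. It assumes an ad hoc SAS with these fields in the URL; a SAS tied to a stored access policy may omit `sp` and `se`, in which case this check would report false problems. The function name and the example URL are hypothetical.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlsplit


def check_sas_url(url, now=None):
    """Return a list of problems found in a blob SAS URL; an empty list means none were found."""
    now = now or datetime.now(timezone.utc)
    query = parse_qs(urlsplit(url).query)
    problems = []

    if "sig" not in query:
        problems.append("no 'sig' parameter: this does not look like a SAS URL")
    if "r" not in query.get("sp", [""])[0]:
        problems.append("'sp' does not grant read permission")

    expiry = query.get("se", [None])[0]
    if expiry is None:
        problems.append("no 'se' (expiry) parameter")
    else:
        try:
            expires_at = datetime.fromisoformat(expiry.replace("Z", "+00:00"))
        except ValueError:
            problems.append(f"could not parse expiry {expiry!r}")
        else:
            if expires_at.tzinfo is None:
                # SAS times are expressed in UTC.
                expires_at = expires_at.replace(tzinfo=timezone.utc)
            if expires_at <= now:
                problems.append("SAS token has expired")
    return problems


# Hypothetical URL with a fake signature, for illustration only.
example_url = (
    "https://example.blob.core.windows.net/data/training.jsonl"
    "?sv=2024-11-04&sp=r&se=2030-01-01T00:00:00Z&sr=b&sig=FAKE"
)
print(check_sas_url(example_url, now=datetime(2029, 1, 1, tzinfo=timezone.utc)))
```

Avoid printing the real blob URLs while debugging, since the SAS token embedded in them grants access to the file.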

In [12]:
import httpx

training_blob_url = os.environ.get("TRAINING_FILE_BLOB_URL")
validation_blob_url = os.environ.get("VALIDATION_FILE_BLOB_URL")

if training_blob_url:
    print("Importing files from Azure Blob Storage...")
    
    import_url = f"{endpoint}/openai/files/import?api-version=2025-11-15-preview"
    
    token = credential.get_token("https://ai.azure.com/.default")
    
    headers = {
        "Authorization": f"Bearer {token.token}",
        "Content-Type": "application/json"
    }
    
    print("Importing training file...")
    train_import_request = {
        "filename": "training.jsonl",
        "purpose": "fine-tune",
        "content_url": training_blob_url
    }
    
    with httpx.Client() as client:
        response = client.post(import_url, headers=headers, json=train_import_request)
        response.raise_for_status()
        train_file_data = response.json()
        train_file_id = train_file_data["id"]
        print(f"Training file imported: {train_file_id}")
        
        if validation_blob_url:
            print("Importing validation file...")
            val_import_request = {
                "filename": "validation.jsonl",
                "purpose": "fine-tune",
                "content_url": validation_blob_url
            }
            
            response = client.post(import_url, headers=headers, json=val_import_request)
            response.raise_for_status()
            val_file_data = response.json()
            val_file_id = val_file_data["id"]
            print(f"Validation file imported: {val_file_id}")
    
    print("Import completed! Files are being processed...")
else:
    print("Blob URLs not found in environment variables.")

Importing files from Azure Blob Storage...
Importing training file...
Training file imported: file-9e05ab8899db40cfaf06cbbf03d8f74d
Importing validation file...
Validation file imported: file-120e7bb085c04454a24598380e3a8688
Import completed! Files are being processed...


## 7. Wait for File Processing

Microsoft Foundry needs to process the uploaded files before they can be used for fine-tuning.

In [18]:
print("Waiting for files to be processed...")
openai_client.files.wait_for_processing(train_file_id)
openai_client.files.wait_for_processing(val_file_id)
print("Files ready!")

Waiting for files to be processed...
Files ready!


## 8. Create Supervised Fine-Tuning Job

Create a supervised fine-tuning job with your uploaded datasets. Configure the following hyperparameters to control the training process:

**Hyperparameters:**
1. **n_epochs (3)**: Number of complete passes through the training dataset. More epochs can improve performance but may lead to overfitting. Typical range: 1-10.
2. **batch_size (1)**: Number of training examples processed together in each iteration. Smaller batches provide more frequent updates. Typical range: 1-8.
3. **learning_rate_multiplier (1.0)**: Scales the default learning rate. Values < 1.0 make training more conservative, while values > 1.0 speed up learning but may cause instability. Typical range: 0.1-2.0.

**Note**: Adjust these based on your dataset size and quality.
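
To get a feel for how `n_epochs` and `batch_size` interact, the arithmetic below estimates the number of optimizer steps using the usual mini-batch accounting: steps per epoch is the number of examples divided by the batch size, rounded up. This is a rough sketch; the service may count steps differently, and the dataset size of 1,000 examples is a hypothetical figure.

```python
import math


def estimate_training_steps(num_examples, n_epochs=3, batch_size=1):
    """Estimate total optimizer steps as ceil(num_examples / batch_size) * n_epochs."""
    if num_examples < 1 or n_epochs < 1 or batch_size < 1:
        raise ValueError("num_examples, n_epochs, and batch_size must all be at least 1")
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * n_epochs


# Hypothetical dataset of 1,000 examples trained for 3 epochs.
for batch_size in (1, 4, 8):
    steps = estimate_training_steps(1000, n_epochs=3, batch_size=batch_size)
    print(f"batch_size={batch_size}: {steps} steps")
```

With the notebook's settings (`batch_size` of 1), every example triggers its own update, which is why small batches mean more frequent but noisier updates.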

In [None]:
print("Creating supervised fine-tuning job...")

fine_tune_job = openai_client.fine_tuning.jobs.create(
    model=model_name,
    training_file=train_file_id,
    validation_file=val_file_id,
    method={
        "type": "supervised",
        "supervised": {"hyperparameters": {"n_epochs": 3, "batch_size": 1, "learning_rate_multiplier": 1.0}},
    },
    extra_body={"trainingType": "Standard"},
    suffix="cnn-dailymail-summarization"
)

print("Fine-tuning job created!")
print(f"Job ID: {fine_tune_job.id}")
print(f"Status: {fine_tune_job.status}")
print(f"Model: {fine_tune_job.model}")

Creating supervised fine-tuning job...
Fine-tuning job created!
Job ID: ftjob-8aad139b19084ebf968c1e58f2dc3326
Status: pending
Model: gpt-4.1-2025-04-14


## 9. Monitor Training Progress

Track the status of your fine-tuning job. You can view the current status and recent training events. Training duration varies with dataset size, model, and hyperparameters, and typically ranges from minutes to several hours.

In [8]:
job_status = openai_client.fine_tuning.jobs.retrieve(fine_tune_job.id)
print(f"Status: {job_status.status}")

Status: pending
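
Rather than rerunning the status cell by hand, the check can be wrapped in a polling loop that stops once the job reaches a terminal status. The sketch below keeps the status lookup as an injected function so it runs without a live job; the function name, the default interval, and the timeout are assumptions, and the `pending` / `running` sequence is simulated.

```python
import time

# Terminal statuses of a fine-tuning job.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def wait_for_terminal_status(get_status, poll_interval=60.0, timeout=6 * 60 * 60,
                             sleep=time.sleep, clock=time.monotonic):
    """Call get_status() until it returns a terminal status or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        status = get_status()
        print(f"Status: {status}")
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout} seconds")
        sleep(poll_interval)


# Simulated status sequence so the example runs offline and without waiting.
simulated = iter(["pending", "running", "succeeded"])
final_status = wait_for_terminal_status(lambda: next(simulated), sleep=lambda seconds: None)

# With a real job, the lookup would be something like:
# wait_for_terminal_status(
#     lambda: openai_client.fine_tuning.jobs.retrieve(fine_tune_job.id).status
# )
```

A one-minute interval is a guess that keeps request volume low for jobs that run for hours; adjust it to taste.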


## 10. Retrieve Fine-Tuned Model

After the fine-tuning job succeeds, retrieve the fine-tuned model ID. This ID identifies your customized model and is required to deploy it for inference.

In [4]:
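# Job ID printed in Section 8; replace it with your own, or use fine_tune_job.id in the same session.
completed_job = openai_client.fine_tuning.jobs.retrieve("ftjob-8aad139b19084ebf968c1e58f2dc3326")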

if completed_job.status == "succeeded":
    fine_tuned_model_id = completed_job.fine_tuned_model
    print(f"Fine-tuned Model ID: {fine_tuned_model_id}")
else:
    print(f"Status: {completed_job.status}")

Fine-tuned Model ID: gpt-4.1-2025-04-14.ft-8aad139b19084ebf968c1e58f2dc3326-cnn-dailymail-summarization


## 11. Deploy the Fine-Tuned Model

Deploy the fine-tuned model to Azure OpenAI as a deployment endpoint. This step is required before making inference calls. The deployment uses the GlobalStandard SKU with a capacity of 50.

In [5]:
from azure.mgmt.cognitiveservices import CognitiveServicesManagementClient
from azure.mgmt.cognitiveservices.models import Deployment, DeploymentProperties, DeploymentModel, Sku

subscription_id = os.environ.get("AZURE_SUBSCRIPTION_ID")
resource_group = os.environ.get("AZURE_RESOURCE_GROUP")
account_name = os.environ.get("AZURE_AOAI_ACCOUNT")

deployment_name = "gpt-4.1-cnn-dailymail-finetuned"

with CognitiveServicesManagementClient(credential=credential, subscription_id=subscription_id) as cogsvc_client:
    deployment_model = DeploymentModel(format="OpenAI", name=fine_tuned_model_id, version="1")
    deployment_properties = DeploymentProperties(model=deployment_model)
    deployment_sku = Sku(name="GlobalStandard", capacity=50)
    deployment_config = Deployment(properties=deployment_properties, sku=deployment_sku)
    
    print(f"Deploying fine-tuned model: {fine_tuned_model_id}")
    deployment = cogsvc_client.deployments.begin_create_or_update(
        resource_group_name=resource_group,
        account_name=account_name,
        deployment_name=deployment_name,
        deployment=deployment_config,
    )
    
    print("Waiting for deployment to complete...")
    deployment.result()

print(f"Model deployment completed: {deployment_name}")

Deploying fine-tuned model: gpt-4.1-2025-04-14.ft-8aad139b19084ebf968c1e58f2dc3326-cnn-dailymail-summarization
Waiting for deployment to complete...
Model deployment completed: gpt-4.1-cnn-dailymail-finetuned


## 12. Test Fine-Tuned Model

Test your fine-tuned model by generating a summary for a sample news article.

In [9]:
test_article = """Scientists at a leading research university have made a breakthrough in renewable energy technology. The team developed a new type of solar panel that is 40% more efficient than current models. The innovation uses a special coating that captures more sunlight and converts it into electricity more effectively. Researchers say the technology could be available for commercial use within the next five years. The discovery is expected to significantly reduce the cost of solar energy and help accelerate the transition to clean energy sources. Environmental experts have praised the development as a major step forward in the fight against climate change."""

response = openai_client.responses.create(
    model=deployment_name,
    input=[
        {"role": "user", "content": f"Summarize this article:{test_article}"}
    ]
)

print(response.output_text)

Scientists at a leading research university have created a more efficient solar panel .
The new panels are 40% more efficient than current models .
A special coating captures more sunlight and converts it to electricity .
The technology could be available for commercial use within five years .
Experts say the breakthrough will help reduce the cost of clean energy .


## Congratulations!

You've successfully fine-tuned a model for news summarization using the CNN/DailyMail dataset!

### Next Steps:
1. **Test with more examples**: Try different news articles to evaluate performance
2. **Adjust hyperparameters**: Experiment with different epoch counts, batch sizes, or learning rates
3. **Deploy to production**: Integrate your fine-tuned model into applications
4. **Fine-tune further**: Use your own domain-specific articles for specialized summarization
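
For the first step above, a quick numeric signal can complement reading the summaries by eye. The sketch below scores a generated summary against a reference using unigram-overlap F1. It is a deliberately simplified stand-in in the spirit of ROUGE-1, not the official ROUGE implementation (no stemming, no multi-reference support), and the function names and sample strings are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def unigram_f1(candidate, reference):
    """F1 of unigram overlap between a candidate summary and a reference summary."""
    candidate_counts = Counter(tokenize(candidate))
    reference_counts = Counter(tokenize(reference))
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((candidate_counts & reference_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(candidate_counts.values())
    recall = overlap / sum(reference_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical model output and reference summary.
generated = "The new panels are 40% more efficient than current models ."
reference = "New solar panels are 40% more efficient than existing models ."
print(f"Unigram F1: {unigram_f1(generated, reference):.3f}")
```

Averaging this score over a held-out set before and after fine-tuning gives a rough comparison; a maintained ROUGE package is the better choice for results you intend to report.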