In [5]:
## Install the necessary libraries
!pip install --upgrade pip

Collecting pip
  Downloading pip-25.0-py3-none-any.whl.metadata (3.7 kB)
Downloading pip-25.0-py3-none-any.whl (1.8 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 24.3.1
    Uninstalling pip-24.3.1:
      Successfully uninstalled pip-24.3.1
Successfully installed pip-25.0


In [9]:
!pip install -q tiktoken openai
!pip install numpy

Collecting numpy
  Using cached numpy-2.2.2-cp310-cp310-macosx_14_0_arm64.whl.metadata (62 kB)
Using cached numpy-2.2.2-cp310-cp310-macosx_14_0_arm64.whl (5.4 MB)
Installing collected packages: numpy
Successfully installed numpy-2.2.2


# Preparing a Dataset for Fine-Tuning

This guide outlines the steps to prepare datasets for **Supervised Fine-Tuning (SFT)** and **Direct Preference Optimization (DPO)**. Both methods require well-structured datasets in JSONL format.

---

## Supervised Fine-Tuning (SFT)

Supervised Fine-Tuning requires a dataset containing demonstration examples of the desired behavior. Each example should consist of a **conversation** formatted like the Chat Completions API.

### Dataset Format
Each line in the dataset should be a JSON object whose `messages` list represents one conversation. Each message contains the following keys:
- `role`: The role of the speaker (`system`, `user`, or `assistant`).
- `content`: The text of the message.

### Example
```json
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "What's the weather like today?"}, {"role": "assistant", "content": "Today is sunny with a high of 75°F."}]}
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Can you tell me a joke?"}, {"role": "assistant", "content": "Why don't scientists trust atoms? Because they make up everything!"}]}
{"messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "How do I bake a cake?"}, {"role": "assistant", "content": "To bake a cake, you'll need flour, sugar, eggs, butter, and baking powder. Mix them, pour the batter into a pan, and bake at 350°F for 30 minutes."}]}
```
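
A minimal sketch of how such a file could be written from Python: each conversation is serialized with `json.dumps` onto its own line. The file name `sft_example.jsonl` and the sample conversation are placeholders for illustration, not the dataset used later in this notebook.

```python
import json
import os
import tempfile

# Hypothetical conversations; replace with your own demonstration examples.
conversations = [
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Can you tell me a joke?"},
        {"role": "assistant", "content": "Why don't scientists trust atoms? Because they make up everything!"},
    ],
]

# Placeholder output path in the system temp directory.
output_path = os.path.join(tempfile.gettempdir(), "sft_example.jsonl")

# Write one JSON object per line; ensure_ascii=False keeps characters such as ° readable.
with open(output_path, "w", encoding="utf-8") as f:
    for messages in conversations:
        f.write(json.dumps({"messages": messages}, ensure_ascii=False) + "\n")

# Read the file back to confirm that every line parses on its own.
with open(output_path, "r", encoding="utf-8") as f:
    records = [json.loads(line) for line in f if line.strip()]

print(len(records), records[0]["messages"][-1]["role"])
```

Writing through `json.dumps` rather than by hand avoids unescaped quotes and accidental line breaks inside a record.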

# Preparing a Dataset for Direct Preference Optimization (DPO)

Direct Preference Optimization (DPO) fine-tuning requires a dataset containing examples of **prompts** paired with a **preferred output** (ideal response) and a **non-preferred output** (suboptimal response). The model learns to prioritize the preferred output during training.

---

## Dataset Format

Each line in the dataset should be a single JSON object with the following structure:

### Keys:
1. **`input`**:
   - Contains the context or conversation leading to the model’s response.
   - Should follow the Chat Completions API format:
     - `messages`: A list of messages forming the conversation.
       - Each message includes:
         - `role`: Role of the speaker (`system`, `user`, `assistant`).
         - `content`: Text content of the message.
     - Optional fields:
       - `tools`: Tools available for the model to use.
       - `parallel_tool_calls`: Boolean to indicate if tools can be used concurrently.

2. **`preferred_output`**:
   - The ideal assistant response for the given input.
   - Follows the same message format as the Chat Completions API.

3. **`non_preferred_output`**:
   - A suboptimal response that demonstrates behavior you want the model to avoid.
   - Follows the same message format as the Chat Completions API.
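
The structure described above can be sanity-checked before upload. The sketch below is an assumption-laden helper, not an official validator: the function name `check_dpo_record` and the exact rules (non-empty lists, `assistant` role, string content) are illustrative choices based on the key descriptions in this section.

```python
import json

def check_dpo_record(record):
    """Return a list of problems found in one DPO record (an empty list means none were found)."""
    problems = []

    # The input must carry a non-empty list of messages.
    input_block = record.get("input")
    messages = input_block.get("messages") if isinstance(input_block, dict) else None
    if not isinstance(messages, list) or not messages:
        problems.append("input.messages must be a non-empty list")

    # Both outputs are checked with the same (assumed) rules.
    for key in ("preferred_output", "non_preferred_output"):
        output = record.get(key)
        if not isinstance(output, list) or not output:
            problems.append(f"{key} must be a non-empty list of messages")
            continue
        for message in output:
            if (
                not isinstance(message, dict)
                or message.get("role") != "assistant"
                or not isinstance(message.get("content"), str)
            ):
                problems.append(f"{key} entries need role 'assistant' and string content")
    return problems

# Hypothetical record used only to exercise the check.
record = {
    "input": {"messages": [{"role": "user", "content": "How cold is San Francisco today?"}]},
    "preferred_output": [{"role": "assistant", "content": "A high near 68°F and a low around 57°F."}],
    "non_preferred_output": [{"role": "assistant", "content": "Not particularly cold."}],
}

print(check_dpo_record(record))

# Serialize the record onto a single line, as JSONL requires.
line = json.dumps(record, ensure_ascii=False)
```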

---

## Example Dataset

```jsonl
{
  "input": {
    "messages": [
      {
        "role": "user",
        "content": "Hello, can you tell me how cold San Francisco is today?"
      }
    ],
    "tools": [],
    "parallel_tool_calls": true
  },
  "preferred_output": [
    {
      "role": "assistant",
      "content": "Today in San Francisco, it is not quite as cold as expected. Morning clouds will give way to sunshine, with a high near 68°F (20°C) and a low around 57°F (14°C)."
    }
  ],
  "non_preferred_output": [
    {
      "role": "assistant",
      "content": "It is not particularly cold in San Francisco today."
    }
  ]
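}
```

The record above is pretty-printed for readability. In the actual JSONL file, each record must occupy a single line.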


In [1]:
from openai import OpenAI

In [None]:
# Only for Google Colab
from google.colab import userdata
api_key = userdata.get("OPENAI_API_KEY")

In [2]:
import json
import tiktoken # for token counting
import numpy as np
from collections import defaultdict

In [3]:
import json
from collections import defaultdict
from tiktoken import get_encoding

def validate_and_estimate_finetuning_data(file_path):
    # Setup
    format_errors = defaultdict(int)
    token_counts = []
    total_tokens = 0
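    # cl100k_base is the GPT-4 / GPT-3.5 encoding. gpt-4o models use o200k_base,
    # so treat these counts as estimates when targeting a gpt-4o model.
    encoding = get_encoding("cl100k_base")

    # Load the dataset, skipping blank lines
    with open(file_path, 'r', encoding='utf-8') as f:
        dataset = [json.loads(line) for line in f if line.strip()]

    for ex in dataset: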
        if not isinstance(ex, dict):
            format_errors["data_type"] += 1
            continue

        messages = ex.get("messages", None)
        if not messages:
            format_errors["missing_messages_list"] += 1
            continue

        # Validate format
        conversation_tokens = 0
        assistant_message_found = False

        for message in messages:
            if "role" not in message or "content" not in message:
                format_errors["message_missing_key"] += 1
                continue

            if any(k not in ("role", "content", "name", "function_call", "weight") for k in message):
                format_errors["message_unrecognized_key"] += 1

            if message.get("role", None) not in ("system", "user", "assistant"):
                format_errors["unrecognized_role"] += 1

            content = message.get("content", None)
            function_call = message.get("function_call", None)

            if (not content and not function_call) or not isinstance(content, str):
                format_errors["missing_content"] += 1

            # Count tokens for each message (content only; per-message
            # formatting overhead is not included)
            try:
                message_tokens = len(encoding.encode(content or ""))
                conversation_tokens += message_tokens
            except Exception:
                format_errors["tokenization_error"] += 1

            if message.get("role") == "assistant":
                assistant_message_found = True

        if not assistant_message_found:
            format_errors["example_missing_assistant_message"] += 1

        token_counts.append(conversation_tokens)
        total_tokens += conversation_tokens

    # Output results
    return {
        "format_errors": dict(format_errors),
        "token_counts": token_counts,
        "total_tokens": total_tokens,
    }



In [6]:
import os
print(os.getcwd())
training_File_Path = os.path.join(os.getcwd(),"Week_3/Day_3/Files/train_data_aurosociety.jsonl")
validation_File_Path = os.path.join(os.getcwd(),"Week_3/Day_3/Files/validation_data_aurosociety.jsonl")
print(training_File_Path)
print(validation_File_Path)

/Users/ashish/Desktop/vettura-genai/Codes
/Users/ashish/Desktop/vettura-genai/Codes/Week_3/Day_3/Files/train_data_aurosociety.jsonl
/Users/ashish/Desktop/vettura-genai/Codes/Week_3/Day_3/Files/validation_data_aurosociety.jsonl


In [7]:
## Training data
result = validate_and_estimate_finetuning_data(training_File_Path)

# Print Results
print("Training Data")
print("Format Errors:", result["format_errors"])
print("Token Counts per Conversation:", result["token_counts"])
print("Total Tokens:", result["total_tokens"])

## Validation data
result = validate_and_estimate_finetuning_data(validation_File_Path)

print("\n\nValidation Data")
print("Format Errors:", result["format_errors"])
print("Token Counts per Conversation:", result["token_counts"])
print("Total Tokens:", result["total_tokens"])

Training Data
Format Errors: {'missing_content': 3}
Token Counts per Conversation: [39, 72, 625, 73, 71, 77, 66, 138, 97, 514, 182, 70, 64, 109, 41, 108, 178, 85, 62, 63, 247, 120, 56, 131, 143, 68, 47, 133, 87, 471, 54, 50, 59, 55, 55, 70, 294, 777, 65, 105, 49, 73, 72, 100, 82, 61, 83, 74, 67, 63, 58, 46, 83, 121, 70, 63, 76, 73, 53, 53, 75, 130, 220, 79, 46, 163, 57, 89, 51, 60, 75, 147, 50, 606, 43, 58, 52, 55, 56, 86, 88, 43, 70, 101, 100, 118, 75, 53, 47, 174, 60, 104, 87, 47, 75, 59, 60, 529, 122, 60, 50, 78, 87, 61, 132, 58, 70, 95, 73, 194, 80, 61, 198, 77, 59, 75, 45, 59, 71, 54, 93, 96, 78, 51, 81, 69, 59, 72, 54, 88, 46, 69, 147, 57, 68, 96, 63, 62, 275, 60, 405, 132, 74, 95, 155, 87, 96, 248, 226, 64, 50, 171, 78, 60, 101, 54, 112, 50, 146, 57, 75, 399, 53, 51, 43, 54, 652, 60, 65, 70, 171, 72, 49, 60, 55, 42, 85, 54, 54, 56, 60, 90, 69, 230, 69, 67, 73, 91, 79, 180, 67, 56, 42, 43, 102, 67, 130, 118, 66, 140, 132, 63, 98, 91, 66, 77, 61, 46, 70, 63, 72, 75, 45, 83, 61, 82
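
The per-conversation list is hard to read at a glance, so a short summary can help spot outliers. The sketch below uses only the standard library; the helper name `summarize_token_counts` and the `max_tokens=500` threshold are hypothetical choices for illustration, and the sample holds just the first five counts from the output above.

```python
import statistics

def summarize_token_counts(token_counts, max_tokens):
    """Summarize a list of per-conversation token counts."""
    return {
        "examples": len(token_counts),
        "min": min(token_counts),
        "max": max(token_counts),
        "mean": statistics.mean(token_counts),
        "median": statistics.median(token_counts),
        # Number of conversations longer than the chosen threshold
        "over_limit": sum(1 for n in token_counts if n > max_tokens),
    }

# First five values from the training output; pass result["token_counts"] for the full list.
sample_counts = [39, 72, 625, 73, 71]
summary = summarize_token_counts(sample_counts, max_tokens=500)
print(summary)
```

A large gap between the mean and the median, as in this sample, suggests that a few long conversations dominate the token total.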

In [8]:
# Run on a local machine only; do not run on Colab
!pip install python-dotenv
!pip install wandb
from dotenv import load_dotenv
import wandb
import os

# Load environment variables from a .env file
load_dotenv()

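# Get the OpenAI API key and fail early if it is missing
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    raise ValueError("OPENAI_API_KEY is not set; add it to your .env file.")

print("OpenAI API Key loaded successfully.")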

wandb.login()

OpenAI API Key loaded successfully.


[34m[1mwandb[0m: Currently logged in as: [33mashishkumarsahani[0m ([33mashishkumarsahani-vettura[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin


True

In [9]:
## create a client
client = OpenAI(api_key=api_key)

# Function to check if a file already exists on OpenAI
def get_existing_file_id(filename):
    files = client.files.list()
    for file in files.data:
        if file.filename == filename:
            return file.id  # Return the existing file ID
    return None  # File does not exist

# Function to delete a file by ID
def delete_file(file_id):
    response = client.files.delete(file_id)
    return response.deleted

# Check and delete training file
file_name = os.path.basename(training_File_Path)
training_file_id = get_existing_file_id(file_name)
if training_file_id:
    print(f"Deleting existing training file: {training_File_Path}")
    delete_file(training_file_id)

# Check and delete validation file
file_name = os.path.basename(validation_File_Path)
validation_file_id = get_existing_file_id(file_name)
if validation_file_id:
    print(f"Deleting existing validation file: {validation_File_Path}")
    delete_file(validation_file_id)

# Upload the training file (the context manager closes the file handle afterwards)
with open(training_File_Path, "rb") as f:
    training = client.files.create(file=f, purpose="fine-tune")
print(f"Training file uploaded: {training.id}")

# Upload the validation file
with open(validation_File_Path, "rb") as f:
    validation = client.files.create(file=f, purpose="fine-tune")
print(f"Validation file uploaded: {validation.id}")

Deleting existing training file: /Users/ashish/Desktop/vettura-genai/Codes/Week_3/Day_3/Files/train_data_aurosociety.jsonl
Deleting existing validation file: /Users/ashish/Desktop/vettura-genai/Codes/Week_3/Day_3/Files/validation_data_aurosociety.jsonl
Training file uploaded: file-CRNq4Y3RiQNaujDKo8d5cH
Validation file uploaded: file-DuGstCdcNcX6gGZMm7LCTh


In [10]:
## List all uploaded files to find the ID of the file to fine-tune on
files = client.files.list()
print(files.data)

[FileObject(id='file-DuGstCdcNcX6gGZMm7LCTh', bytes=230717, created_at=1738806607, filename='validation_data_aurosociety.jsonl', object='file', purpose='fine-tune', status='processed', status_details=None), FileObject(id='file-CRNq4Y3RiQNaujDKo8d5cH', bytes=325644, created_at=1738806605, filename='train_data_aurosociety.jsonl', object='file', purpose='fine-tune', status='processed', status_details=None), FileObject(id='file-SjXXLLvHUsdYHGVqS8Ls7C', bytes=3816, created_at=1738806126, filename='step_metrics.csv', object='file', purpose='fine-tune-results', status='processed', status_details=None), FileObject(id='file-V7eSNQnY5i5R923coeCir9', bytes=3958, created_at=1738805149, filename='Sarcastic_Bot_Validation.jsonl', object='file', purpose='fine-tune', status='processed', status_details=None), FileObject(id='file-JNz8KtJVgg7nymK5S9NhmK', bytes=6507, created_at=1738805107, filename='Sarcastic_Bot_Training.jsonl', object='file', purpose='fine-tune', status='processed', status_details=None

In [11]:
## Create the fine-tuning job from the uploaded file IDs; choose the base model and adjust the hyperparameters as needed
job = client.fine_tuning.jobs.create(
    training_file=training.id,
    validation_file=validation.id,
    model="gpt-4o-mini-2024-07-18",
    method={
        "type": "supervised",
        "supervised": {
            "hyperparameters": {
                "n_epochs": 3,  # Number of epochs
                "batch_size": 128,  # Batch size
                "learning_rate_multiplier": 0.8,  # Learning rate scaling factor
            }
        }
    },
    integrations=[
        {
            "type": "wandb",
            "wandb": {
                "project": "aurobindo_bot_finetuning_project",
                "tags": ["bot", "aurobindo", "finetuning"]
            }
        }
    ]
)
print(job)

FineTuningJob(id='ftjob-B6AaNvTuF5C2EBWluXMCgsKz', created_at=1738807247, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(batch_size=128, learning_rate_multiplier=0.8, n_epochs=3), model='gpt-4o-mini-2024-07-18', object='fine_tuning.job', organization_id='org-zlIVXJ18RvnGhemNfGP9NDlz', result_files=[], seed=12487095, status='validating_files', trained_tokens=None, training_file='file-CRNq4Y3RiQNaujDKo8d5cH', validation_file='file-DuGstCdcNcX6gGZMm7LCTh', estimated_finish=None, integrations=[FineTuningJobWandbIntegrationObject(type='wandb', wandb=FineTuningJobWandbIntegration(project='aurobindo_bot_finetuning_project', entity=None, name=None, tags=None, run_id='ftjob-B6AaNvTuF5C2EBWluXMCgsKz'))], method=Method(dpo=None, supervised=MethodSupervised(hyperparameters=MethodSupervisedHyperparameters(batch_size=128, learning_rate_multiplier=0.8, n_epochs=3)), type='supervised'), user_provided_suffix=None)


In [12]:
## Listing all the recent jobs
all_jobs = client.fine_tuning.jobs.list(limit=10).data
print(all_jobs)

[FineTuningJob(id='ftjob-B6AaNvTuF5C2EBWluXMCgsKz', created_at=1738807247, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(batch_size=128, learning_rate_multiplier=0.8, n_epochs=3), model='gpt-4o-mini-2024-07-18', object='fine_tuning.job', organization_id='org-zlIVXJ18RvnGhemNfGP9NDlz', result_files=[], seed=12487095, status='running', trained_tokens=None, training_file='file-CRNq4Y3RiQNaujDKo8d5cH', validation_file='file-DuGstCdcNcX6gGZMm7LCTh', estimated_finish=None, integrations=[FineTuningJobWandbIntegrationObject(type='wandb', wandb=FineTuningJobWandbIntegration(project='aurobindo_bot_finetuning_project', entity=None, name=None, tags=None, run_id='ftjob-B6AaNvTuF5C2EBWluXMCgsKz'))], method=Method(dpo=None, supervised=MethodSupervised(hyperparameters=MethodSupervisedHyperparameters(batch_size=128, learning_rate_multiplier=0.8, n_epochs=3)), type='supervised'), user_provided_suffix=None), FineTuningJob(id='ft

In [13]:
## Print the most recent job to get the fine-tuned model name
print(all_jobs[0])
print(client.fine_tuning.jobs.retrieve(all_jobs[0].id))

FineTuningJob(id='ftjob-B6AaNvTuF5C2EBWluXMCgsKz', created_at=1738807247, error=Error(code=None, message=None, param=None), fine_tuned_model=None, finished_at=None, hyperparameters=Hyperparameters(batch_size=128, learning_rate_multiplier=0.8, n_epochs=3), model='gpt-4o-mini-2024-07-18', object='fine_tuning.job', organization_id='org-zlIVXJ18RvnGhemNfGP9NDlz', result_files=[], seed=12487095, status='running', trained_tokens=None, training_file='file-CRNq4Y3RiQNaujDKo8d5cH', validation_file='file-DuGstCdcNcX6gGZMm7LCTh', estimated_finish=None, integrations=[FineTuningJobWandbIntegrationObject(type='wandb', wandb=FineTuningJobWandbIntegration(project='aurobindo_bot_finetuning_project', entity=None, name=None, tags=None, run_id='ftjob-B6AaNvTuF5C2EBWluXMCgsKz'))], method=Method(dpo=None, supervised=MethodSupervised(hyperparameters=MethodSupervisedHyperparameters(batch_size=128, learning_rate_multiplier=0.8, n_epochs=3)), type='supervised'), user_provided_suffix=None)
FineTuningJob(id='ftjo

In [14]:
import time
import requests
checkpoints = None

# Function to get the latest validation accuracy and loss from checkpoints
def get_latest_accuracy(job_id, api_key):
    url = f"https://api.openai.com/v1/fine_tuning/jobs/{job_id}/checkpoints"
    headers = {"Authorization": f"Bearer {api_key}"}

    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()
    checkpoints = response.json().get("data", [])

    if not checkpoints:
        return None, None  # Return None if no checkpoints are available

    # Find the latest checkpoint based on step_number
    latest_checkpoint = max(checkpoints, key=lambda c: c["step_number"])
    # Metrics can be missing from a checkpoint, so use .get() to avoid a KeyError
    metrics = latest_checkpoint.get("metrics", {})
    latest_accuracy = metrics.get("full_valid_mean_token_accuracy")
    latest_loss = metrics.get("full_valid_loss")
    return latest_accuracy, latest_loss

# Function to monitor fine-tuning job and print training/validation metrics
def monitor_finetuning_progress(job_id, api_key, check_interval=10):
    while True:
        try:
            # Retrieve the fine-tuning job status
            job_status = client.fine_tuning.jobs.retrieve(job_id)

            # Print basic job details
            print(f"Job ID: {job_status.id}")
            print(f"Status: {job_status.status}")

            # Check if the job has reached a terminal state
            if job_status.status in ["succeeded", "failed", "cancelled"]:
                print(f"Fine-tuning job {job_status.status}.")
                model_id = job_status.fine_tuned_model
                # result_files can be empty when the job did not succeed
                result_file_id = job_status.result_files[0] if job_status.result_files else None
                return job_status, model_id, result_file_id
            
            # Retrieve and print the latest accuracy and loss
            latest_accuracy, latest_loss = get_latest_accuracy(job_id, api_key)
            if latest_accuracy is not None and latest_loss is not None:
                print(f"Latest Accuracy: {latest_accuracy:.3f}")
                print(f"Latest Loss: {latest_loss:.3f}")
            else:
                print("No checkpoints available yet.")
                
            # Wait before the next check
            print(f"Checking again in {check_interval} seconds...\n")
            time.sleep(check_interval)

        except Exception as e:
            print(f"An error occurred: {e}. Retrying in {check_interval} seconds...\n")
            time.sleep(check_interval)


# Replace `fine_tuning_job_id` with your actual job ID
fine_tuning_job_id = all_jobs[0].id
status, model_name, result_file_id = monitor_finetuning_progress(fine_tuning_job_id, api_key, 10)
print(f"Status: {status}")
print(f"Model Name: {model_name}")
print(f"Result file id: {result_file_id}")

Job ID: ftjob-B6AaNvTuF5C2EBWluXMCgsKz
Status: running
No checkpoints available yet.
Checking again in 10 seconds...

Job ID: ftjob-B6AaNvTuF5C2EBWluXMCgsKz
Status: running
No checkpoints available yet.
Checking again in 10 seconds...

Job ID: ftjob-B6AaNvTuF5C2EBWluXMCgsKz
Status: running
No checkpoints available yet.
Checking again in 10 seconds...

Job ID: ftjob-B6AaNvTuF5C2EBWluXMCgsKz
Status: running
No checkpoints available yet.
Checking again in 10 seconds...

Job ID: ftjob-B6AaNvTuF5C2EBWluXMCgsKz
Status: running
No checkpoints available yet.
Checking again in 10 seconds...

Job ID: ftjob-B6AaNvTuF5C2EBWluXMCgsKz
Status: running
No checkpoints available yet.
Checking again in 10 seconds...

Job ID: ftjob-B6AaNvTuF5C2EBWluXMCgsKz
Status: running
No checkpoints available yet.
Checking again in 10 seconds...

Job ID: ftjob-B6AaNvTuF5C2EBWluXMCgsKz
Status: running
No checkpoints available yet.
Checking again in 10 seconds...

Job ID: ftjob-B6AaNvTuF5C2EBWluXMCgsKz
Status: running
N
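
The monitoring loop above polls until the job finishes, with no upper bound on how long it waits. A minimal sketch of a bounded variant follows; `wait_for_terminal_status`, its parameters, and the fake status sequence are hypothetical names introduced here so the example runs offline without calling the API.

```python
import time

# Job statuses after which no further progress is expected.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}

def wait_for_terminal_status(get_status, check_interval=10, timeout=3600, sleep=time.sleep):
    """Poll get_status() until it returns a terminal status or the timeout elapses."""
    waited = 0
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if waited >= timeout:
            raise TimeoutError(f"Job still '{status}' after {waited} seconds")
        sleep(check_interval)
        waited += check_interval

# Offline demonstration: a fake status sequence and a no-op sleep.
fake_statuses = iter(["validating_files", "running", "running", "succeeded"])
final_status = wait_for_terminal_status(lambda: next(fake_statuses), sleep=lambda seconds: None)
print(final_status)
```

With the real client, the callable could be something like `lambda: client.fine_tuning.jobs.retrieve(fine_tuning_job_id).status`.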

In [15]:
response = requests.get(
    f"https://api.openai.com/v1/fine_tuning/jobs/{all_jobs[0].id}/checkpoints",
    headers={"Authorization": f"Bearer {api_key}"}
)
checkpoints = response.json().get("data", [])
for checkpoint in checkpoints:
    print(checkpoint)

{'object': 'fine_tuning.job.checkpoint', 'id': 'ftckpt_g9QwdBURXb7hj0LLcUvpHQ1u', 'created_at': 1738807736, 'fine_tuned_model_checkpoint': 'ft:gpt-4o-mini-2024-07-18:personal::AxlcK35f', 'fine_tuning_job_id': 'ftjob-B6AaNvTuF5C2EBWluXMCgsKz', 'metrics': {'step': 13}, 'step_number': 13}
{'object': 'fine_tuning.job.checkpoint', 'id': 'ftckpt_txi1AY2KG9ap3ROd2qfK6hw2', 'created_at': 1738807671, 'fine_tuned_model_checkpoint': 'ft:gpt-4o-mini-2024-07-18:personal::AxlcJjbK:ckpt-step-10', 'fine_tuning_job_id': 'ftjob-B6AaNvTuF5C2EBWluXMCgsKz', 'metrics': {'step': 10}, 'step_number': 10}
{'object': 'fine_tuning.job.checkpoint', 'id': 'ftckpt_JkS6E1LZ0w8OCuya1TahZSmX', 'created_at': 1738807593, 'fine_tuned_model_checkpoint': 'ft:gpt-4o-mini-2024-07-18:personal::AxlcJyUY:ckpt-step-5', 'fine_tuning_job_id': 'ftjob-B6AaNvTuF5C2EBWluXMCgsKz', 'metrics': {'step': 5}, 'step_number': 5}
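
Each checkpoint record carries its own model name in `fine_tuned_model_checkpoint`, which can typically be used for inference like the final model. The sketch below orders the records by `step_number` and picks the latest; the helper name `checkpoint_models_by_step` is hypothetical, and the list holds trimmed copies of the three records printed above.

```python
# Trimmed copies of the checkpoint records printed above (only the fields used here).
checkpoints = [
    {"fine_tuned_model_checkpoint": "ft:gpt-4o-mini-2024-07-18:personal::AxlcK35f", "step_number": 13},
    {"fine_tuned_model_checkpoint": "ft:gpt-4o-mini-2024-07-18:personal::AxlcJjbK:ckpt-step-10", "step_number": 10},
    {"fine_tuned_model_checkpoint": "ft:gpt-4o-mini-2024-07-18:personal::AxlcJyUY:ckpt-step-5", "step_number": 5},
]

def checkpoint_models_by_step(checkpoints):
    """Map step_number to model name, ordered from the earliest to the latest step."""
    ordered = sorted(checkpoints, key=lambda c: c["step_number"])
    return {c["step_number"]: c["fine_tuned_model_checkpoint"] for c in ordered}

models = checkpoint_models_by_step(checkpoints)
latest_step = max(models)
print(latest_step, models[latest_step])
```

Comparing an earlier checkpoint against the final one is a simple way to check for overfitting in later steps.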


In [16]:
import requests

def print_result_file_content(file_id, api_key):
    # API endpoint to retrieve file content
    url = f"https://api.openai.com/v1/files/{file_id}/content"
    headers = {"Authorization": f"Bearer {api_key}"}

    # Request the file content
    response = requests.get(url, headers=headers)
    
    if response.status_code == 200:
        # Print the contents of the file
        print("Result File Contents:")
        print(response.text)
    else:
        print(f"Failed to retrieve file content. Status Code: {response.status_code}")
        print(f"Error: {response.json()}")

# Print the result file content
print_result_file_content(result_file_id, api_key)


Result File Contents:
c3RlcCx0cmFpbl9sb3NzLHRyYWluX2FjY3VyYWN5LHZhbGlkX2xvc3MsdmFsaWRfbWVhbl90b2tlbl9hY2N1cmFjeSx0cmFpbl9tZWFuX3Jld2FyZCxmdWxsX3ZhbGlkYXRpb25fbWVhbl9yZXdhcmQKMSwzLjQ4NzAyLDAuNDExMDIsMy43NDc3NCwwLjM3ODEsLAoyLDMuNjU4MTcsMC4zODc0NiwyLjgxMDE3LDAuNDMyMzQsLAozLDIuODY2MDQsMC40Mjg4NiwyLjU1MzMzLDAuNDQyNTEsLAo0LDIuNjA5MywwLjQzNzQ1LDIuMzgyNDUsMC40NTgzOCwsCjUsMi40NjYyOSwwLjQ1NTM0LDIuMzg5ODIsMC40NTc4NCwsCjYsMi4zNjYyNCwwLjQ2OTAyLDIuMzI5ODcsMC40NzEyLCwKNywyLjI3OTc5LDAuNDc1MDgsMi4zODI0NywwLjQ1ODksLAo4LDIuMjg4MzcsMC40NzQwMSwyLjI2MTA2LDAuNDY5OSwsCjksMi4zMzkzLDAuNDY4OTYsMi4yODc3OCwwLjQ3MDY2LCwKMTAsMi4yMjE4MSwwLjQ3Njg2LDIuMzMwMjksMC40Njg2NywsCjExLDIuMjAyOCwwLjQ5NjU2LDIuMjcwNjYsMC40NzA5NiwsCjEyLDIuMjY3OTEsMC40Nzc2NSwyLjI0MzMyLDAuNDg0MjgsLAoxMywyLjAwMzE5LDAuNTM4NzYsMi4zMDI2LDAuNDY4MDMsLAo=
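
The result file content printed above is base64-encoded. Decoded, it is a CSV of per-step metrics whose header begins with `step,train_loss`. A minimal sketch of the decoding step follows; the helper name `decode_result_file` and the tiny sample CSV are made up for illustration so the code runs without an API call.

```python
import base64
import csv
import io

def decode_result_file(encoded_text):
    """Decode a base64-encoded metrics CSV into a list of row dictionaries."""
    csv_text = base64.b64decode(encoded_text).decode("utf-8")
    return list(csv.DictReader(io.StringIO(csv_text)))

# Made-up two-step sample, encoded here so the sketch is self-contained.
sample_csv = "step,train_loss,valid_loss\n1,3.5,3.7\n2,3.1,2.8\n"
sample_encoded = base64.b64encode(sample_csv.encode("utf-8")).decode("ascii")

rows = decode_result_file(sample_encoded)
for row in rows:
    print(row["step"], row["train_loss"], row["valid_loss"])
```

In the notebook, the same helper could be applied to `response.text` inside `print_result_file_content` instead of printing the raw string.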


In [17]:
## Run inference with the fine-tuned model
def query(user_input):
    completion = client.chat.completions.create(
        model=model_name,
        messages=[
            {"role": "system", "content": "You are spirit of Aurobindo answer the user queries in his style."},
            {"role": "user", "content": user_input},
        ],
    )

    return completion.choices[0].message.content

In [18]:
response = query("What is the supermind?")
print(response)

The supramental consciousness is the divine life, and as such, it is a life not only free from all ignorance but conscious and conscious of all things as it establishes them. Being self-existence, it is the one existence, and therefore it sustains all existing things, and being eternal, it is to be identified with all things in their eternal truth.
