# Fine-tuning Mistral 7B into MistralBluff

The aim of this notebook is to fine-tune the pretrained model `mistral 7B v0.3` into a new model called `MistralBluff`. The new model is trained on the `data/3_data_prepared_for_fine_tuning` dataset, which is derived from raw poker hand histories. The LLM is trained to emulate what the player 'IlxxxlI' would do in a given situation. The data is already preprocessed: this notebook only converts it to the format the Mistral API expects, uploads it, and starts the fine-tuning job. The fine-tuning itself runs on Mistral's side, through the API.

## 0. Prerequisites

In [None]:
import pandas as pd
import json
from pathlib import Path
import os
from dotenv import load_dotenv

load_dotenv()

In [None]:
ROOT_DIR = Path('./../../').resolve()

CLEANED_DATA_DIR = ROOT_DIR / 'data/2_cleaned_data'
PREPARED_DATA_DIR = ROOT_DIR / 'data/3_prepared_data'

OUTPUT_FILE_NAME_TRAIN = 'anatole_data_train.json'
OUTPUT_FILE_NAME_TEST = 'anatole_data_test.json'

OUTPUT_FILE_MISTRAL_NAME_TRAIN = 'data_train.jsonl'
OUTPUT_FILE_MISTRAL_NAME_TEST = 'data_test.jsonl'

MISTRAL_API_KEY = os.getenv('MISTRAL_API_KEY')

## 1. Final formatting to fit the Mistral API

In [10]:
# Load the cleaned dataset and hold out 5% of the rows for evaluation.
df = pd.read_json(CLEANED_DATA_DIR / OUTPUT_FILE_NAME_TRAIN)
df_train = df.sample(frac=0.95, random_state=0)
df_eval = df.drop(df_train.index)


def to_chat_records(frame):
    """Convert instruction/output rows to the chat "messages" format."""
    return [
        {
            "messages": [
                {"role": "user", "content": row["instruction"]},
                {"role": "assistant", "content": row["output"]},
            ]
        }
        for _, row in frame.iterrows()
    ]


df_formatted = to_chat_records(df_train)
df_formatted2 = to_chat_records(df_eval)

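# Write one JSON object per line (JSONL): training split first, then the
# evaluation split, which is stored under the "test" file name.
with open(PREPARED_DATA_DIR / OUTPUT_FILE_MISTRAL_NAME_TRAIN, "w") as f:
    for line in df_formatted:
        json.dump(line, f)
        f.write("\n")
with open(PREPARED_DATA_DIR / OUTPUT_FILE_MISTRAL_NAME_TEST, "w") as f:
    for line in df_formatted2:
        json.dump(line, f)
        f.write("\n")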
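Before uploading, it can be worth checking that every line of a written file parses as JSON and contains the user/assistant pair built above. The following is a minimal sketch and not part of the original notebook: `validate_chat_jsonl` is a hypothetical helper, the sample records are made up, and the checks cover only the structure this notebook produces, not every rule the Mistral API may enforce.

```python
import json
import tempfile
from pathlib import Path


def validate_chat_jsonl(path):
    """Return (line_number, problem) tuples for a chat-format JSONL file."""
    problems = []
    with open(path, encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((line_number, f"invalid JSON: {exc.msg}"))
                continue
            messages = record.get("messages") if isinstance(record, dict) else None
            if not isinstance(messages, list):
                problems.append((line_number, "missing 'messages' list"))
                continue
            if not all(isinstance(m, dict) for m in messages):
                problems.append((line_number, "message is not an object"))
                continue
            roles = [m.get("role") for m in messages]
            if roles != ["user", "assistant"]:
                problems.append((line_number, f"unexpected roles: {roles}"))
            elif not all(isinstance(m.get("content"), str) for m in messages):
                problems.append((line_number, "non-string content"))
    return problems


# Demo on a temporary file with illustrative records (not from the dataset).
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.jsonl"
    good = {"messages": [{"role": "user", "content": "hi"},
                         {"role": "assistant", "content": "fold"}]}
    bad = {"messages": [{"role": "user", "content": "hi"}]}
    sample_path.write_text(
        json.dumps(good) + "\n" + json.dumps(bad) + "\n", encoding="utf-8"
    )
    sample_problems = validate_chat_jsonl(sample_path)

print(sample_problems)
```

In this notebook the helper would be called as `validate_chat_jsonl(PREPARED_DATA_DIR / OUTPUT_FILE_MISTRAL_NAME_TRAIN)`; an empty list means no problems were found.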

In [29]:
len(df)

31077
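As a rough check on the split sizes, the 31077 rows reported above can be divided by hand. The sketch assumes pandas computes the sample size as `round(frac * len(df))`; the variable names are illustrative.

```python
n_rows = 31077   # len(df) from the cell above
frac = 0.95      # fraction passed to df.sample

n_train = round(frac * n_rows)  # assumed pandas rounding behavior
n_eval = n_rows - n_train       # rows left after df.drop(df_train.index)

print(n_train, n_eval)  # 29523 1554
```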

## 2. Sending the data to the Mistral API

In [13]:
# MistralClient is the client class of the 0.x releases of the mistralai package.
from mistralai.client import MistralClient

client = MistralClient(api_key=MISTRAL_API_KEY)

# Upload both JSONL files; the returned file ids are used when creating the job.
with open(PREPARED_DATA_DIR / OUTPUT_FILE_MISTRAL_NAME_TRAIN, "rb") as f:
    data_train = client.files.create(file=(OUTPUT_FILE_MISTRAL_NAME_TRAIN, f))
with open(PREPARED_DATA_DIR / OUTPUT_FILE_MISTRAL_NAME_TEST, "rb") as f:
    data_eval = client.files.create(file=(OUTPUT_FILE_MISTRAL_NAME_TEST, f))

## 3. Starting the fine-tuning process

In [None]:
from mistralai.models.jobs import TrainingParameters

# Create the fine-tuning job from the uploaded training and evaluation files.
created_jobs = client.jobs.create(
    model="open-mistral-7b",
    training_files=[data_train.id],
    validation_files=[data_eval.id],
    hyperparameters=TrainingParameters(
        training_steps=10,
        learning_rate=0.0001,
    ),
)
created_jobs
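Once a job is created, it runs remotely, so the notebook has to poll to know when it finishes. The sketch below is one possible way to do that and is not part of the original notebook: `wait_for_status` is a hypothetical helper, and the status names (`QUEUED`, `RUNNING`, `SUCCESS`, `FAILED`, `CANCELLED`) are assumptions to be checked against the API documentation. A fake status source is used so the example runs without network access.

```python
import time


def wait_for_status(fetch_status, terminal_statuses, poll_seconds=30.0,
                    max_polls=120, sleep=time.sleep):
    """Poll fetch_status() until it returns a terminal status or polls run out."""
    status = None
    for attempt in range(max_polls):
        status = fetch_status()
        if status in terminal_statuses:
            return status
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"job still in status {status!r} after {max_polls} polls")


# Demo with a fake sequence of statuses (assumed names, no API call involved).
fake_statuses = iter(["QUEUED", "RUNNING", "RUNNING", "SUCCESS"])
final_status = wait_for_status(
    fetch_status=lambda: next(fake_statuses),
    terminal_statuses={"SUCCESS", "FAILED", "CANCELLED"},  # assumed status names
    poll_seconds=0,
)
print(final_status)  # SUCCESS
```

With the client used in this notebook, `fetch_status` might be `lambda: client.jobs.retrieve(created_jobs.id).status`, assuming the installed `mistralai` version exposes `jobs.retrieve` and a `status` field on the returned job.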

In [27]:
from mistralai.models.jobs import TrainingParameters

# With dry_run=True the API returns job metadata (token counts, expected
# duration) instead of a job object, as the output below shows.
dry_run_job = client.jobs.create(
    model="open-mistral-7b",
    training_files=[data_train.id],
    validation_files=[data_eval.id],
    hyperparameters=TrainingParameters(
        training_steps=150,
        learning_rate=0.0001,
    ),
    dry_run=True,
)
print(dry_run_job)

object='job.metadata' training_steps=50 train_tokens_per_step=131072 data_tokens=4455124 train_tokens=6553600 epochs=1.471 expected_duration_seconds=300
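The figures in the dry-run output are related by simple arithmetic, which helps when choosing `training_steps`. The sketch below reproduces `train_tokens` and `epochs` from the printed values; the relation `epochs = train_tokens / data_tokens` is inferred from these numbers, not taken from the API documentation. Note that the output reports `training_steps=50` while the cell above requests 150, so the printed output appears to come from a run of the cell with a different value.

```python
import math

# Values copied from the dry-run output above.
training_steps = 50
train_tokens_per_step = 131072
data_tokens = 4455124

train_tokens = training_steps * train_tokens_per_step
epochs = train_tokens / data_tokens

print(train_tokens)      # 6553600
print(round(epochs, 3))  # 1.471

# Smallest number of steps that covers the dataset at least once,
# under the same inferred relation.
steps_for_one_epoch = math.ceil(data_tokens / train_tokens_per_step)
print(steps_for_one_epoch)  # 34

# Under the same relation, 150 steps would be roughly 4.41 passes over the data.
epochs_at_150_steps = 150 * train_tokens_per_step / data_tokens
```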
