<img src="https://res.cloudinary.com/dbl53sidm/image/upload/v1696398508/mistral-7b-v0.1_opibjl.jpg" width="100%">

## Instruct Fine-tuning [Mistral 7B Instruct](https://mistral.ai/news/announcing-mistral-7b/) using QLoRA and Supervised Fine-tuning


## Meet Mistral 7B Instruct

The team at [MistralAI](https://mistral.ai/news/announcing-mistral-) has released Mistral 7B Instruct, an instruction-tuned language model that has performed strongly across a range of benchmarks, which makes it a good candidate for natural language generation and understanding tasks. This guide concentrates on fine-tuning the model to generate chess commentary, but the same methodology applies to other tasks.



## Prerequisites


Checking if your GPU is correctly detected:

In [None]:
!nvidia-smi

Tue Apr 16 18:14:17 2024       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.104.05             Driver Version: 535.104.05   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA L4                      Off | 00000000:00:03.0 Off |                    0 |
| N/A   42C    P8              16W /  72W |      1MiB / 23034MiB |      0%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

## Step 1 - Install necessary packages


In [None]:
!pip install -q -U bitsandbytes
!pip install -q -U git+https://github.com/huggingface/transformers.git
!pip install -q -U git+https://github.com/huggingface/peft.git
!pip install -q -U git+https://github.com/huggingface/accelerate.git
!pip install -q datasets scipy
!pip install -q trl



## Step 2 - Model loading
We'll load the model in 4-bit precision (NF4 quantization with double quantization, as used by QLoRA) to reduce memory usage.


In [None]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16
)
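
As a rough back-of-the-envelope check of why 4-bit loading helps, the sketch below estimates the memory taken by the weights alone at 16-bit and at 4-bit precision. The parameter count of about 7.24 billion is an assumption, and the estimate ignores quantization constants, layers that stay in higher precision, activations, and optimizer state, so real usage will be higher.

```python
def weight_memory_gib(num_params: int, bits_per_param: float) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30

# Assumption: roughly 7.24 billion parameters for Mistral 7B
num_params = 7_241_732_096

fp16_gib = weight_memory_gib(num_params, 16)  # 16-bit weights
nf4_gib = weight_memory_gib(num_params, 4)    # 4-bit weights

print(f"16-bit weights: ~{fp16_gib:.1f} GiB")
print(f"4-bit weights:  ~{nf4_gib:.1f} GiB")
```

The 4-bit figure is a quarter of the 16-bit one, which leaves far more of the GPU's memory for activations, gradients of the adapters, and optimizer state during training.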

Now we specify the model ID and then we load it with our previously defined quantization configuration.

In [None]:
model_id = "mistralai/Mistral-7B-Instruct-v0.1"

model = AutoModelForCausalLM.from_pretrained(model_id, quantization_config=bnb_config, device_map={"":0})
tokenizer = AutoTokenizer.from_pretrained(model_id, add_eos_token=True)

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

generation_config.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/1.47k [00:00<?, ?B/s]

tokenizer.model:   0%|          | 0.00/493k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/1.80M [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/72.0 [00:00<?, ?B/s]

In [None]:
def get_completion(query: str, model, tokenizer) -> str:
  device = "cuda:0"
  prompt_template = """
  <s>
  [INST]
  Below is an instruction that describes a task. Write a response that appropriately completes the request.
  {query}
  [/INST]
  </s>
  <s>

  """
  prompt = prompt_template.format(query=query)

  encodeds = tokenizer(prompt, return_tensors="pt", add_special_tokens=True)

  model_inputs = encodeds.to(device)


  generated_ids = model.generate(**model_inputs, max_new_tokens=1000, do_sample=True, pad_token_id=tokenizer.eos_token_id)
  decoded = tokenizer.batch_decode(generated_ids)
  return (decoded[0])

In [None]:
query = """
Generate a commentary providing an analysis into the following chess match. Assume the persona of a former chess grandmaster for the same. Here's the match title/tournament, other info: Magnus Carlsen's amazing positional decisions against Vladimir Kramnik . The chess moves of the match are provided as input.
1.d4 Nf6 2.Nf3 d5 3.Bf4 c5 4.e3 Nc6 5.Nbd2 cxd4 6.exd4 Qb6 7.Nb3 Bf5 8.Be2 e6 9.O-O Be7 10.a4 a6 11.a5 Qd8 12.Ne5 Rc8 13.c3 O-O 14.Re1 Ne4 15.Bd3 Nxe5 16.Bxe5 Bg6 17.Bf4 Nd6 18.Bxg6 hxg6 19.Bxd6 Bxd6 20.g3 b5 21.axb6 Qxb6 22.Re2 Rb8 23.Nc1 a5 24.Nd3 Qb5 25.h4 Rfc8 26.Rd2 Ra8 27.h5 gxh5 28.Qxh5 g6 29.Qh6 a4 30.Kg2 Bf8 31.Qf4 Bg7 32.Ne5 Qe8 33.Rd3 f6 34.Ng4 Qf8 35.Rf3 a3 36.bxa3 f5 37.Qg5 Kf7 38.Rh1 Rxa3 39.Rh7 Rcxc3 40.Nh6+ Ke8 41.Qxg6+ Kd8 1-0
"""
result = get_completion(query=query, model=model, tokenizer=tokenizer)
print(result)

A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.


<s> 
  <s> 
  [INST]
  Below is an instruction that describes a task. Write a response that appropriately completes the request.
  
Generate a commentary providing an analysis into the following chess match. Assume the persona of a former chess grandmaster for the same. Here's the match title/tournament, other info: Magnus Carlsen's amazing positional decisions against Vladimir Kramnik . The chess moves of the match are provided as input.
1.d4 Nf6 2.Nf3 d5 3.Bf4 c5 4.e3 Nc6 5.Nbd2 cxd4 6.exd4 Qb6 7.Nb3 Bf5 8.Be2 e6 9.O-O Be7 10.a4 a6 11.a5 Qd8 12.Ne5 Rc8 13.c3 O-O 14.Re1 Ne4 15.Bd3 Nxe5 16.Bxe5 Bg6 17.Bf4 Nd6 18.Bxg6 hxg6 19.Bxd6 Bxd6 20.g3 b5 21.axb6 Qxb6 22.Re2 Rb8 23.Nc1 a5 24.Nd3 Qb5 25.h4 Rfc8 26.Rd2 Ra8 27.h5 gxh5 28.Qxh5 g6 29.Qh6 a4 30.Kg2 Bf8 31.Qf4 Bg7 32.Ne5 Qe8 33.Rd3 f6 34.Ng4 Qf8 35.Rf3 a3 36.bxa3 f5 37.Qg5 Kf7 38.Rh1 Rxa3 39.Rh7 Rcxc3 40.Nh6+ Ke8 41.Qxg6+ Kd8 1-0

  [/INST]
  </s> 
  <s> 

  </s></s>
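
Note that the generation above is empty: the model emitted `</s>` straight away. One plausible cause is that the prompt template places `</s>` right after `[/INST]`, which in the Mistral instruct format marks the end of an answer rather than the start of one. The sketch below shows a prompt that leaves the assistant turn open. `build_inference_prompt` is a hypothetical helper, not part of the notebook, and it assumes the tokenizer is then called with `add_special_tokens=False` so that no extra `<s>` or `</s>` is added around it.

```python
PREFIX = ("Below is an instruction that describes a task. "
          "Write a response that appropriately completes the request.")

def build_inference_prompt(query: str) -> str:
    """Hypothetical helper: open the assistant turn instead of closing it."""
    # No trailing </s>: the model is expected to write the answer
    # and emit </s> itself once it is done.
    return f"<s>[INST] {PREFIX}\n{query.strip()} [/INST]"

# Illustrative query only, not taken from the dataset
prompt = build_inference_prompt("Summarise the game 1.e4 e5 2.Nf3 Nc6 in one sentence.")
print(prompt)
```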


## Step 3 - Load dataset for finetuning

### Formatting the Dataset

Now, let's format the dataset in the required [Mistral-7B-Instruct-v0.1 format](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1).

> Many tutorials and blogs skip over this part, but I feel this is a really important step.

We'll put each instruction and input pair between `[INST]` and `[/INST]`, with the output after that, like this:

```
<s>[INST] What is your favorite condiment? [/INST]
Well, I'm quite partial to a good squeeze of fresh lemon juice. It adds just the right amount of zesty flavor to whatever I'm cooking up in the kitchen!</s>
```

You can use the following code to load your dataset and add a `prompt` column in the correct format:

In [None]:
import pandas as pd
from datasets import Dataset

# Instruction preamble prepended to every example
prefix_text = 'Below is an instruction that describes a task. Write a response that ' \
               'appropriately completes the request.\n\n'

# Load the train and test sets from their JSON files
train_df = pd.read_json('/content/chess_inputs_final.json')
test_df = pd.read_json("/content/test.json")
train_data = Dataset.from_pandas(train_df)
test_data = Dataset.from_pandas(test_df)

def generate_prompt(data_point):
    # Wrap instruction + input in [INST] ... [/INST], followed by the expected output
    return f"""<s>[INST]{prefix_text} {data_point["instruction"]} here are the inputs {data_point["input"]} [/INST]{data_point["output"]}</s>"""

# Add the formatted text as a new "prompt" column on both sets
train_data = train_data.add_column("prompt", [generate_prompt(dp) for dp in train_data])
test_data = test_data.add_column("prompt", [generate_prompt(dp) for dp in test_data])
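
Because a malformed template is easy to miss, a quick sanity check on the formatted prompts can be useful. The sketch below restates the template so it runs on its own; `check_prompt_format` and the example record are hypothetical and are not part of the notebook or the dataset.

```python
PREFIX = ("Below is an instruction that describes a task. Write a response that "
          "appropriately completes the request.\n\n")

def generate_prompt(data_point: dict) -> str:
    """Same template as the notebook cell, restated so this snippet is self-contained."""
    return (f"<s>[INST]{PREFIX} {data_point['instruction']} here are the inputs "
            f"{data_point['input']} [/INST]{data_point['output']}</s>")

def check_prompt_format(prompt: str) -> bool:
    """Return True if the prompt has exactly one well-ordered [INST] ... [/INST] block."""
    return (
        prompt.startswith("<s>[INST]")
        and prompt.endswith("</s>")
        and prompt.count("[INST]") == 1
        and prompt.count("[/INST]") == 1
        and prompt.index("[INST]") < prompt.index("[/INST]")
    )

# Hypothetical record, for illustration only
example = {
    "instruction": "Generate a commentary on the following chess match.",
    "input": "1.e4 e5 2.Nf3 Nc6 1/2-1/2",
    "output": "A quiet opening from both players.",
}
prompt = generate_prompt(example)
print(check_prompt_format(prompt))
```

Running the same check over every row of the `prompt` column would flag records whose text happens to contain the `[INST]` markers themselves.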


We'll need to tokenize our data so the model can process it.


In [None]:
train_data = train_data.shuffle(seed=1234)  # Shuffle dataset here
train_data = train_data.map(lambda samples: tokenizer(samples["prompt"]), batched=True)
test_data = test_data.shuffle(seed=1234)  # Shuffle dataset here
test_data = test_data.map(lambda samples: tokenizer(samples["prompt"]), batched=True)

Map:   0%|          | 0/100 [00:00<?, ? examples/s]

Map:   0%|          | 0/32 [00:00<?, ? examples/s]

The training and test sets here are loaded from separate files (100 and 32 examples respectively), so no additional train/test split is needed.

### After formatting, we should get something like this

```json
{
"instruction":"Create a function to calculate the sum of a sequence of integers",
"input":"[1, 2, 3, 4, 5]",
"output":"# Python code def sum_sequence(sequence): sum = 0 for num in sequence: sum += num return sum",
"prompt":"<s>[INST] Create a function to calculate the sum of a sequence of integers. here are the inputs [1, 2, 3, 4, 5] [/INST]\n# Python code def sum_sequence(sequence): sum = 0 for num in sequence: sum += num return sum</s>"
}
```

When fine-tuning with SFT (**[Supervised Fine-tuning Trainer](https://huggingface.co/docs/trl/main/en/sft_trainer)**), we pass in only the `prompt` column of the dataset.

In [None]:
print(train_data)

Dataset({
    features: ['instruction', 'input', 'output', 'prompt', 'input_ids', 'attention_mask'],
    num_rows: 100
})


## Step 4 - Apply LoRA
Here comes the magic with PEFT! We first prepare the quantized model for training with the `prepare_model_for_kbit_training` method, then wrap it with low-rank adapters (LoRA) using the `get_peft_model` utility function.

In [None]:
from peft import LoraConfig, PeftModel, prepare_model_for_kbit_training, get_peft_model
model.gradient_checkpointing_enable()
model = prepare_model_for_kbit_training(model)

In [None]:
print(model)

MistralForCausalLM(
  (model): MistralModel(
    (embed_tokens): Embedding(32000, 4096)
    (layers): ModuleList(
      (0-31): 32 x MistralDecoderLayer(
        (self_attn): MistralSdpaAttention(
          (q_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (k_proj): Linear4bit(in_features=4096, out_features=1024, bias=False)
          (v_proj): Linear4bit(in_features=4096, out_features=1024, bias=False)
          (o_proj): Linear4bit(in_features=4096, out_features=4096, bias=False)
          (rotary_emb): MistralRotaryEmbedding()
        )
        (mlp): MistralMLP(
          (gate_proj): Linear4bit(in_features=4096, out_features=14336, bias=False)
          (up_proj): Linear4bit(in_features=4096, out_features=14336, bias=False)
          (down_proj): Linear4bit(in_features=14336, out_features=4096, bias=False)
          (act_fn): SiLU()
        )
        (input_layernorm): MistralRMSNorm()
        (post_attention_layernorm): MistralRMSNorm()
      )
    )

Use the following function to find the linear layers to target for fine-tuning.

From the QLoRA paper: "We find that the most critical LoRA hyperparameter is how many LoRA adapters are used in total and that LoRA on all linear transformer block layers is required to match full finetuning performance."

In [None]:
import bitsandbytes as bnb

def find_all_linear_names(model):
  # Collect the short names of all 4-bit linear layers, which LoRA will target
  cls = bnb.nn.Linear4bit
  lora_module_names = set()
  for name, module in model.named_modules():
    if isinstance(module, cls):
      names = name.split('.')
      lora_module_names.add(names[0] if len(names) == 1 else names[-1])
  # Exclude lm_head if present (needed for 16-bit)
  lora_module_names.discard('lm_head')
  return list(lora_module_names)

In [None]:
modules = find_all_linear_names(model)
print(modules)

['k_proj', 'o_proj', 'gate_proj', 'v_proj', 'up_proj', 'down_proj', 'q_proj']


In [None]:
from peft import LoraConfig, get_peft_model

lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=modules,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, lora_config)
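
To make the idea behind the adapters concrete, here is a minimal pure-Python sketch of a single LoRA-augmented linear layer: the frozen weight `W` is left untouched and a low-rank update `B @ A`, scaled by `lora_alpha / r`, is added on top. This is an illustration under simplifying assumptions (tiny toy dimensions, no dropout, no quantization), not PEFT's actual implementation; `matvec` and `lora_forward` are hypothetical helper names.

```python
import random

def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def lora_forward(W, A, B, x, r, lora_alpha):
    """Compute W x + (lora_alpha / r) * B (A x)."""
    base = matvec(W, x)                 # frozen base layer
    update = matvec(B, matvec(A, x))    # low-rank path: project down to r, then back up
    scale = lora_alpha / r
    return [b + scale * u for b, u in zip(base, update)]

random.seed(0)
d_in, d_out, r, lora_alpha = 6, 4, 2, 8  # toy sizes, not the notebook's r=8, lora_alpha=32

W = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]  # frozen, d_out x d_in
A = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(r)]      # trainable, r x d_in
B = [[0.0] * r for _ in range(d_out)]                                   # trainable, d_out x r, starts at zero
x = [random.gauss(0, 1) for _ in range(d_in)]

# With B initialised to zero the adapter contributes nothing at the start of training
print(lora_forward(W, A, B, x, r, lora_alpha) == matvec(W, x))

# Only A and B are trained: r * (d_in + d_out) values instead of d_in * d_out
print(r * (d_in + d_out), "adapter parameters vs", d_in * d_out, "in W")
```

With the configuration above (`r=8`, `lora_alpha=32`) the scaling factor would be 32 / 8 = 4.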

In [None]:
trainable, total = model.get_nb_trainable_parameters()
print(f"Trainable: {trainable} | total: {total} | Percentage: {trainable/total*100:.4f}%")


Trainable: 20971520 | total: 7262703616 | Percentage: 0.2888%
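
The trainable count can be reproduced by hand. Each adapted linear layer with `in_features` inputs and `out_features` outputs adds `r * (in_features + out_features)` LoRA parameters. The sketch below applies that to the layer shapes from the `print(model)` output above, taking `r=8` and 32 decoder layers from the notebook; it is a cross-check, not how PEFT computes the figure internally.

```python
# (in_features, out_features) of each Linear4bit layer in one decoder block,
# copied from the print(model) output
layer_shapes = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
    "gate_proj": (4096, 14336),
    "up_proj": (4096, 14336),
    "down_proj": (14336, 4096),
}
r = 8            # LoRA rank from lora_config
num_layers = 32  # number of MistralDecoderLayer blocks

# Each adapter holds A (r x in_features) and B (out_features x r)
per_block = sum(r * (d_in + d_out) for d_in, d_out in layer_shapes.values())
trainable = per_block * num_layers

total = 7262703616  # total parameter count reported by the notebook
print(trainable)                           # 20971520
print(f"{trainable / total * 100:.4f}%")   # 0.2888%
```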


## Step 5 - Run the training!

In [None]:
from huggingface_hub import notebook_login
notebook_login()


In [None]:
!cat /root/.cache/huggingface/token

Setting the training arguments:
* For demonstration purposes, we run only a few steps (100) to showcase how this integrates with existing tools in the HF ecosystem.

In [None]:
# from datasets import load_dataset
# data = load_dataset("TokenBender/code_instructions_122k_alpaca_style", split='train')
# data = data.train_test_split(test_size=0.1)
# train_data = data["train"]
# test_data = data["test"]

In [None]:
# import transformers

# tokenizer.pad_token = tokenizer.eos_token


# trainer = transformers.Trainer(
#     model=model,
#     train_dataset=train_data,
#     eval_dataset=test_data,
#     args=transformers.TrainingArguments(
#         per_device_train_batch_size=1,
#         gradient_accumulation_steps=4,
#         warmup_ratio=0.03,
#         max_steps=100,
#         learning_rate=2e-4,
#         fp16=True,
#         logging_steps=1,
#         output_dir="outputs_mistral_b_finance_finetuned_test",
#         optim="paged_adamw_8bit",
#         save_strategy="epoch",
#     ),
#     data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
# )


### Fine-Tuning with QLoRA and Supervised Fine-Tuning

We're ready to fine-tune our model using QLoRA. For this tutorial, we'll use the `SFTTrainer` from the `trl` library for supervised fine-tuning. Make sure you've installed the `trl` library as shown in Step 1.

In [None]:
#new code using SFTTrainer
import transformers

from trl import SFTTrainer

tokenizer.pad_token = tokenizer.eos_token
torch.cuda.empty_cache()

trainer = SFTTrainer(
    model=model,
    train_dataset=train_data,
    eval_dataset=test_data,
    dataset_text_field="prompt",
    peft_config=lora_config,
    args=transformers.TrainingArguments(
        per_device_train_batch_size=1,
        gradient_accumulation_steps=4,
        warmup_ratio=0.03,
        max_steps=100,
        learning_rate=2e-4,
        logging_steps=1,
        output_dir="outputs",
        optim="paged_adamw_8bit",
        save_strategy="epoch",
    ),
    data_collator=transformers.DataCollatorForLanguageModeling(tokenizer, mlm=False),
)

max_steps is given, it will override any value given in num_train_epochs



### Let's start the training process

In [None]:
model.config.use_cache = False  # silence the warnings. Please re-enable for inference!
trainer.train()




| Step | Training Loss |
|------|---------------|
| 1 | 2.9109 |
| 2 | 2.8451 |
| 3 | 2.7221 |
| 4 | 2.651 |
| 5 | 2.7205 |
| 6 | 2.511 |
| 7 | 2.5305 |
| 8 | 2.4329 |
| 9 | 2.4102 |
| 10 | 2.1742 |




TrainOutput(global_step=100, training_loss=1.9156808269023895, metrics={'train_runtime': 3482.2007, 'train_samples_per_second': 0.115, 'train_steps_per_second': 0.029, 'total_flos': 5.217341941717402e+16, 'train_loss': 1.9156808269023895, 'epoch': 4.0})
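
The `'epoch': 4.0` and throughput figures in the output follow from the training arguments. The short calculation below reproduces them, assuming a single GPU (as in the `nvidia-smi` output) and the 100 training rows reported by `print(train_data)`.

```python
per_device_train_batch_size = 1
gradient_accumulation_steps = 4
max_steps = 100
num_train_rows = 100         # from print(train_data)
train_runtime = 3482.2007    # seconds, from TrainOutput
num_gpus = 1                 # assumption: single-GPU run

# Samples consumed per optimizer step
effective_batch = per_device_train_batch_size * gradient_accumulation_steps * num_gpus
samples_seen = effective_batch * max_steps
epochs = samples_seen / num_train_rows

print(effective_batch, samples_seen, epochs)    # 4 400 4.0
print(round(samples_seen / train_runtime, 3))   # 0.115 samples per second
print(round(max_steps / train_runtime, 3))      # 0.029 steps per second
```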

### Share adapters on the 🤗 Hub

In [None]:
new_model = "Mistralai-Chess-Commentary-Instruct-Finetune-test" #Name of the model you will be pushing to huggingface model hub

In [None]:
trainer.model.save_pretrained(new_model)

In [None]:
import torch
torch.cuda.empty_cache()

In [None]:
base_model = AutoModelForCausalLM.from_pretrained(
    model_id,
    low_cpu_mem_usage=True,
    return_dict=True,
    torch_dtype=torch.float16,
    device_map={"": 0},
)
# Load the saved LoRA adapter on top of the fp16 base model and merge it into the base weights
merged_model = PeftModel.from_pretrained(base_model, new_model)
merged_model = merged_model.merge_and_unload()

# Save the merged model
merged_model.save_pretrained("merged_model", safe_serialization=True)
tokenizer.save_pretrained("merged_model")
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [None]:
!huggingface-cli login

In [None]:
# Push the model and tokenizer to the Hugging Face Model Hub
merged_model.push_to_hub("nigeldias27/Mistralai-Chess-Commentary-Instruct-Finetune-test", use_temp_dir=False)
tokenizer.push_to_hub("nigeldias27/Mistralai-Chess-Commentary-Instruct-Finetune-test", use_temp_dir=False)

HfHubHTTPError: 403 Client Error: Forbidden for url: https://huggingface.co/api/models/nigeldias27/Mistralai-Chess-Commentary-Instruct-Finetune-test/preupload/main (Request ID: Root=1-661ed6b1-2e491a4c1007ba9e4817101c;a6d1b0ae-8ccf-450d-9e66-06594d7f48ea)

Forbidden: you must use a write token to upload to a repository.

## Step 6 - Evaluate the model qualitatively: run inference!



In [None]:
def get_completion_merged(query: str, model, tokenizer) -> str:
  device = "cuda:0"

  prompt_template = """
  <s>
  [INST]
  Below is an instruction that describes a task. Write a response that appropriately completes the request.
  {query}
  [/INST]
  </s>


  """
  prompt = prompt_template.format(query=query)

  encodeds = tokenizer(prompt, return_tensors="pt", add_special_tokens=True)

  model_inputs = encodeds.to(device)

  generated_ids = model.generate(**model_inputs, max_new_tokens=1000, do_sample=True, pad_token_id=tokenizer.eos_token_id)
  decoded = tokenizer.batch_decode(generated_ids)
  return (decoded[0])

In [None]:
query = """
Generate a commentary providing an analysis into the following chess match. Assume the persona of a former chess grandmaster for the same. Here's the match title/tournament, other info: The intense clash of leaders | Nepo vs Gukesh | FIDE Candidates 2024. The chess moves of the match are provided as input.
1.d4 Nf6 2.c4 e6 3.g3 d5 4.Bg2 c5 5.cxd5 Nxd5 6.Nf3 cxd4 7.O-O Bc5 8.Nxd4 O-O 9.Nb3 Be7 10.e4 Nb4 11.Nc3 N8c6 12.Be3 Qxd1 13.Raxd1 Ne5 14.a3 Nbc6 15.f4 Nc4 16.Bc1 f6 17.Rfe1 Rb8 18.Bf1 Nd6 19.Nb5 Nxb5 20.Bxb5 Kf7 21.Be3 a6 22.Be2 b5 23.Nc5 Rd8 24.Rxd8 Bxd8 25.Rc1 Bb6 26.Kf2 Bxc5 27.Bxc5 Bb7 28.Ke3 Rd8 29.Bb6 Rd7 30.Rd1 Rxd1 31.Bxd1 g6 32.Bc5 h5 33.b3 Bc8 34.a4 e5 35.axb5 axb5 36.f5 Kg7 37.fxg6 Bg4 38.Bc2 Be6 39.Kd2 b4 40.Bd1 Kxg6 1/2-1/2
"""
result = get_completion_merged(query=query, model=merged_model, tokenizer=tokenizer)
print(result)

A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.


<s> 
  <s> 
  [INST]
  Below is an instruction that describes a task. Write a response that appropriately completes the request.
  
Generate a commentary providing an analysis into the following chess match. Assume the persona of a former chess grandmaster for the same. Here's the match title/tournament, other info: The intense clash of leaders | Nepo vs Gukesh | FIDE Candidates 2024. The chess moves of the match are provided as input.
1.d4 Nf6 2.c4 e6 3.g3 d5 4.Bg2 c5 5.cxd5 Nxd5 6.Nf3 cxd4 7.O-O Bc5 8.Nxd4 O-O 9.Nb3 Be7 10.e4 Nb4 11.Nc3 N8c6 12.Be3 Qxd1 13.Raxd1 Ne5 14.a3 Nbc6 15.f4 Nc4 16.Bc1 f6 17.Rfe1 Rb8 18.Bf1 Nd6 19.Nb5 Nxb5 20.Bxb5 Kf7 21.Be3 a6 22.Be2 b5 23.Nc5 Rd8 24.Rxd8 Bxd8 25.Rc1 Bb6 26.Kf2 Bxc5 27.Bxc5 Bb7 28.Ke3 Rd8 29.Bb6 Rd7 30.Rd1 Rxd1 31.Bxd1 g6 32.Bc5 h5 33.b3 Bc8 34.a4 e5 35.axb5 axb5 36.f5 Kg7 37.fxg6 Bg4 38.Bc2 Be6 39.Kd2 b4 40.Bd1 Kxg6 1/2-1/2

  [/INST]
  </s> 


  </s> [Music] [Music] [Applause] it's the all-important fourth round at the FID candidates 2024 

In [None]:
query='''
Generate a commentary providing an analysis into the following chess match. Assume the persona of a former chess grandmaster for the same. Here's the match title/tournament, other info: Gukesh vs Volotokin | London Chess Classic 2023. The chess moves of the match are provided as input.
"\n1.c4 Nf6 2.Nc3 c5 3.g3 d5 4.cxd5 Nxd5 5.Bg2 Nc7 6.b3 e5 7.Bb2 Be7 8.Rc1 O-O 9.Ne4 b6 10.Bxe5 Bb7 11.Ba1 Qc8 12.g4 Ne6 13.Nh3 Nc6 14.e3 Nb4 15.O-O Nxa2 16.Rb1 Nb4 17.f4 f6 18.Qf3 Bc6 19.f5 Nc7 20.g5 fxg5 21.Qg3 h6 22.f6 gxf6 23.Bxf6 Bxe4 24.Nxg5 Bxf6 25.Rxf6 Rxf6 26.Nxe4+ Kh8 27.Nxf6 Qf5 28.Rf1 1-0"
'''
result = get_completion_merged(query=query, model=merged_model, tokenizer=tokenizer)
print(result)

A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set `padding_side='left'` when initializing the tokenizer.


<s> 
  <s> 
  [INST]
  Below is an instruction that describes a task. Write a response that appropriately completes the request.
  
Generate a commentary providing an analysis into the following chess match. Assume the persona of a former chess grandmaster for the same. Here's the match title/tournament, other info: Gukesh vs Volotokin | London Chess Classic 2023. The chess moves of the match are provided as input.
"
1.c4 Nf6 2.Nc3 c5 3.g3 d5 4.cxd5 Nxd5 5.Bg2 Nc7 6.b3 e5 7.Bb2 Be7 8.Rc1 O-O 9.Ne4 b6 10.Bxe5 Bb7 11.Ba1 Qc8 12.g4 Ne6 13.Nh3 Nc6 14.e3 Nb4 15.O-O Nxa2 16.Rb1 Nb4 17.f4 f6 18.Qf3 Bc6 19.f5 Nc7 20.g5 fxg5 21.Qg3 h6 22.f6 gxf6 23.Bxf6 Bxe4 24.Nxg5 Bxf6 25.Rxf6 Rxf6 26.Nxe4+ Kh8 27.Nxf6 Qf5 28.Rf1 1-0"

  [/INST]
  </s> 


  </s> 
Gukesh takes on Anton Volonin round five of the London system and as we have seen many times in this tournament Shivam's opens the game with 1 C4 so does gukesh and also Volonin we know is a big admirer of gkes and would be more than happy to play ag

In [None]:
!zip -r chessModel.zip /content/Mistralai-Chess-Commentary-Instruct-Finetune-test/

  adding: content/Mistralai-Chess-Commentary-Instruct-Finetune-test/ (stored 0%)
  adding: content/Mistralai-Chess-Commentary-Instruct-Finetune-test/config.json (deflated 45%)
  adding: content/Mistralai-Chess-Commentary-Instruct-Finetune-test/adapter_model.safetensors (deflated 7%)
  adding: content/Mistralai-Chess-Commentary-Instruct-Finetune-test/model-00002-of-00003.safetensors (deflated 8%)
  adding: content/Mistralai-Chess-Commentary-Instruct-Finetune-test/README.md (deflated 66%)
  adding: content/Mistralai-Chess-Commentary-Instruct-Finetune-test/model.safetensors.index.json (deflated 95%)
  adding: content/Mistralai-Chess-Commentary-Instruct-Finetune-test/model-00001-of-00003.safetensors (deflated 9%)
  adding: content/Mistralai-Chess-Commentary-Instruct-Finetune-test/generation_config.json (deflated 22%)
  adding: content/Mistralai-Chess-Commentary-Instruct-Finetune-test/adapter_config.json (deflated 54%)
  adding: content/Mistralai-Chess-Commentary-Instruct-Finetune-test/mode