## **References:**
- **Notebook:** https://colab.research.google.com/drive/1vIrqH5uYDQwsJ4-OO3DErvuv4pBgVwk4?usp=sharing

- **Hugging Face:** https://huggingface.co/unsloth/gemma-2-9b-bnb-4bit

## *Installing the requirements*

In [None]:
%%capture
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes

## *Loading the Model*



In [None]:
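from unsloth import FastLanguageModel
import torch

max_seq_length = 2048  # maximum sequence length (in tokens) used for training
dtype = None           # None lets Unsloth auto-detect the dtype (float16 on a T4)
load_in_4bit = True    # load 4-bit quantized weights to reduce GPU memory use

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-2-9b",
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)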

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
==((====))==  Unsloth 2024.8: Fast Gemma2 patching. Transformers = 4.44.0.
   \\   /|    GPU: Tesla T4. Max memory: 14.748 GB. Platform = Linux.
O^O/ \_/ \    Pytorch: 2.3.1+cu121. CUDA = 7.5. CUDA Toolkit = 12.1.
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.26.post1. FA2 = False]
 "-____-"     Free Apache license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


In [None]:
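# Attach LoRA adapters to the attention and MLP projections of every layer
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,  # LoRA rank
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",],
    lora_alpha = 16,  # LoRA scaling factor
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",  # Unsloth's memory-saving checkpointing
    random_state = 3407,
    use_rslora = False,  # rank-stabilized LoRA disabled
    loftq_config = None,
)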

Unsloth 2024.8 patched 42 layers with 42 QKV layers, 42 O layers and 42 MLP layers.
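
The adapter size follows directly from the rank and the shapes of the adapted matrices: each adapted projection of shape `in x out` gains `r * (in + out)` trainable weights. The sketch below works this out for `r = 16` over 42 layers. The Gemma-2-9B dimensions used here (hidden size 3584, MLP size 14336, 16 query heads and 8 key/value heads with head dimension 256) are assumptions taken from the published model config, not values stated in this notebook.

```python
# Projection shapes as (in_features, out_features).
# ASSUMPTION: dimensions come from the published Gemma-2-9B config.
HIDDEN = 3584
INTERMEDIATE = 14336
Q_OUT = 16 * 256   # 16 attention heads x head_dim 256
KV_OUT = 8 * 256   # 8 key/value heads x head_dim 256
NUM_LAYERS = 42    # matches "patched 42 layers" in the output above

PROJECTIONS = {
    "q_proj": (HIDDEN, Q_OUT),
    "k_proj": (HIDDEN, KV_OUT),
    "v_proj": (HIDDEN, KV_OUT),
    "o_proj": (Q_OUT, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_trainable_params(rank, projections=PROJECTIONS, num_layers=NUM_LAYERS):
    """Count LoRA weights: each adapted matrix adds A (rank x in) and B (out x rank)."""
    per_layer = sum(rank * (d_in + d_out) for d_in, d_out in projections.values())
    return per_layer * num_layers


print(f"{lora_trainable_params(16):,}")  # 54,018,048
```

Under these assumptions the result agrees with the "Number of trainable parameters" that the trainer reports when training starts.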


## *Load the Dataset*

In [None]:
import pandas as pd
from datasets import Dataset

# Load the CSV files into pandas DataFrames
train_df = pd.read_csv('train_data.csv')
val_df = pd.read_csv('val_data.csv')

# Convert each DataFrame to a Hugging Face datasets Dataset
train_dataset = Dataset.from_pandas(train_df)
val_dataset = Dataset.from_pandas(val_df)

# Now you have two separate datasets for training and validation
print("Training dataset size:", len(train_dataset))
print("Validation dataset size:", len(val_dataset))

Training dataset size: 4164
Validation dataset size: 200
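
The formatting step later in this notebook indexes the columns `Judgement` and `Perspective-based Summary` by name, so a mismatch in the CSV header only surfaces as a `KeyError` during `map`. A minimal sketch of an early header check follows; `missing_columns` is a hypothetical helper, not part of the notebook, and it uses the standard `csv` module so it runs without pandas.

```python
import csv
import io

# Column names the formatting function reads from each row.
REQUIRED_COLUMNS = ("Judgement", "Perspective-based Summary")


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    header = next(csv.reader(csv_file), [])
    return [name for name in required if name not in header]


# Toy in-memory file standing in for train_data.csv.
toy_csv = io.StringIO("Judgement,Perspective-based Summary\ntext,summary\n")
print(missing_columns(toy_csv))
```

On the real files this could be called as `missing_columns(open("train_data.csv", newline="", encoding="utf-8"))`; an empty list means both columns are present.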


In [None]:
legal_prompt = """You are tasked with generating detailed summaries from the perspectives of both the prosecution and the defense for the following legal judgment text. Each summary should be comprehensive and highlight key aspects relevant to each side.\n\n

    ### judgement:\n{}\n\n
    ### summary:\n{}\n\n
    """
legal_prompt_prosecution = """You are required to generate a detailed summary of the legal judgment text from the perspective of the prosecution. Focus on key arguments, evidence, and legal points that strengthen the prosecution's case.

### Judgment:\n{}\n\n
### Prosecution Summary:\n{}\n\n
"""
legal_prompt_defence = """You are required to generate a detailed summary of the legal judgment text from the perspective of the defence. Focus on key arguments, evidence, and legal points that strengthen the defence's case.

### Judgment:\n{}\n\n
### Defence Summary:\n{}\n\n
"""

EOS_TOKEN = tokenizer.eos_token  # appended so the model learns where a summary ends

def formatting_prompts_func(examples):
    judgements = examples["Judgement"]
    summaries = examples["Perspective-based Summary"]
    # summaries = examples["prosecutor_pov"]
    # summaries = examples["defense_pov"]

    texts = []
    for judgement, summary in zip(judgements, summaries):
        text = legal_prompt.format(judgement, summary) + EOS_TOKEN
        # text = legal_prompt_prosecution.format(judgement, summary) + EOS_TOKEN
        # text = legal_prompt_defence.format(judgement, summary) + EOS_TOKEN
        texts.append(text)
    return {"text": texts}

# Apply the prompt template to both splits
train_dataset = train_dataset.map(formatting_prompts_func, batched = True)
val_dataset = val_dataset.map(formatting_prompts_func, batched = True)
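
With `batched = True`, `map` passes the function a dict that maps each column name to a list of values, and expects a dict of equal-length lists back. The sketch below illustrates this on a toy batch. The shortened template and the `"<eos>"` string are stand-ins (the notebook uses `legal_prompt` and `tokenizer.eos_token`), and `format_batch` is a hypothetical name.

```python
EOS_TOKEN = "<eos>"  # stand-in; the notebook reads this from tokenizer.eos_token

# Shortened stand-in for legal_prompt with the same two placeholders.
demo_prompt = "### judgement:\n{}\n\n### summary:\n{}\n\n"


def format_batch(examples, template=demo_prompt, eos_token=EOS_TOKEN):
    """Fill the template once per row and append the end-of-sequence token."""
    judgements = examples["Judgement"]
    summaries = examples["Perspective-based Summary"]
    texts = [template.format(j, s) + eos_token for j, s in zip(judgements, summaries)]
    return {"text": texts}


# A batch as `map(..., batched=True)` would pass it: column name -> list of values.
toy_batch = {
    "Judgement": ["The appeal is dismissed.", "The conviction is set aside."],
    "Perspective-based Summary": ["Toy summary A.", "Toy summary B."],
}
out = format_batch(toy_batch)
print(repr(out["text"][0]))
```

Each resulting string ends with the end-of-sequence token, which is what lets the fine-tuned model stop after the summary instead of generating indefinitely.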

## *Train the Model*

In [None]:
from trl import SFTTrainer
from transformers import TrainingArguments
from unsloth import is_bfloat16_supported

repository = "user_name/model_name"


trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = train_dataset,
    eval_dataset = val_dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    dataset_num_proc = 2,
    packing = False,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 4,
        warmup_steps = 5,
        max_steps = 60,
        learning_rate = 2e-4,
        fp16 = not is_bfloat16_supported(),
        bf16 = is_bfloat16_supported(),
        logging_steps = 1,
        optim = "adamw_8bit",
        weight_decay = 0.01,
        lr_scheduler_type = "linear",
        seed = 3407,
        output_dir = repository,
    ),
)

max_steps is given, it will override any value given in num_train_epochs
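
Because `max_steps = 60` caps the run, it helps to see how much of the training set those steps cover and how the learning rate evolves. The sketch below uses only the values from the `TrainingArguments` above and assumes a single GPU. `linear_lr` is a hypothetical helper that mirrors the usual linear warmup and linear decay rule, not code taken from the trainer.

```python
PER_DEVICE_BATCH = 2
GRAD_ACCUM = 4
MAX_STEPS = 60
WARMUP_STEPS = 5
PEAK_LR = 2e-4
NUM_TRAIN_EXAMPLES = 4164

# One optimizer step consumes per-device batch x accumulation steps (single GPU).
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM
examples_seen = effective_batch * MAX_STEPS
epoch_fraction = examples_seen / NUM_TRAIN_EXAMPLES


def linear_lr(step, peak=PEAK_LR, warmup=WARMUP_STEPS, total=MAX_STEPS):
    """Linear warmup from 0 to peak, then linear decay back to 0 at `total`."""
    if step < warmup:
        return peak * step / max(1, warmup)
    return peak * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch, examples_seen, f"{epoch_fraction:.1%}")
for step in (0, 5, 30, 60):
    print(step, linear_lr(step))
```

With these settings the run sees 480 examples, roughly 11.5% of one pass over the 4,164 training rows, so a full epoch would need a larger `max_steps` or `num_train_epochs` instead.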


In [None]:
#@title Show current memory stats
gpu_stats = torch.cuda.get_device_properties(0)
start_gpu_memory = round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)
max_memory = round(gpu_stats.total_memory / 1024 / 1024 / 1024, 3)
print(f"GPU = {gpu_stats.name}. Max memory = {max_memory} GB.")
print(f"{start_gpu_memory} GB of memory reserved.")

GPU = Tesla T4. Max memory = 14.748 GB.
6.576 GB of memory reserved.


In [None]:
trainer_stats = trainer.train()

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs = 1
   \\   /|    Num examples = 4,164 | Num Epochs = 1
O^O/ \_/ \    Batch size per device = 2 | Gradient Accumulation steps = 4
\        /    Total batch size = 8 | Total steps = 60
 "-____-"     Number of trainable parameters = 54,018,048


| Step | Training Loss |
|------|---------------|
| 1 | 1.6473 |
| 2 | 1.6851 |
| 3 | 1.8058 |
| 4 | 1.7261 |
| 5 | 1.6205 |
| 6 | 1.4661 |
| 7 | 1.587 |
| 8 | 1.5461 |
| 9 | 1.5529 |
| 10 | 1.4444 |
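
Once training has finished, the memory counters read before training can be compared with the new peak to estimate how much memory the fine-tuning itself needed. The following is a minimal sketch: `training_memory_report` is a hypothetical helper, and the peak value of 10.0 GB in the example call is purely illustrative, not a measurement from this run.

```python
def training_memory_report(start_reserved_gb, peak_reserved_gb, total_gb):
    """Summarise peak GPU memory and the share attributable to training."""
    training_gb = round(peak_reserved_gb - start_reserved_gb, 3)
    return {
        "peak_reserved_gb": peak_reserved_gb,
        "training_gb": training_gb,
        "peak_pct_of_total": round(100 * peak_reserved_gb / total_gb, 1),
        "training_pct_of_total": round(100 * training_gb / total_gb, 1),
    }


# 6.576 and 14.748 come from the memory stats printed before training;
# 10.0 is an ILLUSTRATIVE peak, not a measured value.
report = training_memory_report(6.576, 10.0, 14.748)
print(report)
```

In the notebook, the real peak could be read with `round(torch.cuda.max_memory_reserved() / 1024 / 1024 / 1024, 3)` after `trainer.train()` returns, and passed in together with `start_gpu_memory` and `max_memory`.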


## *Saving and loading the model*

In [None]:
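# Push the model and tokenizer to the Hugging Face Hub.
# Trainer.push_to_hub() takes a commit message as its first argument, not a repo id;
# the target repo name is derived from output_dir unless hub_model_id is set.
trainer.create_model_card()
trainer.push_to_hub()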
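
To load the pushed adapter again and generate a summary, something along these lines should work. It is a sketch under stated assumptions: `build_inference_prompt`, `load_finetuned`, and `generate_summary` are hypothetical helper names, the demo template is a shortened stand-in for `legal_prompt`, and the loading and generation functions need a CUDA GPU with Unsloth installed, so they are defined here but not called.

```python
def build_inference_prompt(template, judgement):
    """Keep the template up to the summary slot, so the model writes the summary."""
    head, _, _ = template.rpartition("{}")  # drop the last placeholder and its tail
    return head.format(judgement)


def load_finetuned(repo_id, max_seq_length=2048):
    """Load the fine-tuned model from the Hub (requires a CUDA GPU and Unsloth)."""
    from unsloth import FastLanguageModel  # imported lazily on purpose

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=repo_id,
        max_seq_length=max_seq_length,
        dtype=None,
        load_in_4bit=True,
    )
    FastLanguageModel.for_inference(model)  # switch Unsloth to inference mode
    return model, tokenizer


def generate_summary(model, tokenizer, template, judgement, max_new_tokens=512):
    """Generate a summary and return only the newly generated text."""
    prompt = build_inference_prompt(template, judgement)
    inputs = tokenizer([prompt], return_tensors="pt").to("cuda")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, use_cache=True)
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


# Shortened stand-in template with the same two placeholders as legal_prompt.
demo_template = "### judgement:\n{}\n\n### summary:\n{}\n\n"
print(repr(build_inference_prompt(demo_template, "The appeal is dismissed.")))
```

Cutting the template at the summary slot matters because the training template continues after that slot; leaving the trailing text in the prompt would present the model with an already "finished" example.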