# Jupyter Notebook: Tech Challenge - Fine-Tuning a Foundation Model

## Introduction

This notebook implements the phase 3 Tech Challenge: fine-tuning a foundation model on the "AmazonTitles-1.3MM" dataset. The goal is to train the model to answer user questions based on the product titles and descriptions in the dataset.

## Workflow

1. **Dataset Selection and Preparation**: We use a sample of the "AmazonTitles-1.3MM" dataset, which contains Amazon product titles and descriptions.
2. **Calling the Foundation Model**: We load the base model and test its behavior before fine-tuning.
3. **Running the Fine-Tuning**: We fine-tune the model using PEFT (Parameter-Efficient Fine-Tuning) techniques.
4. **Model Evaluation**: We compare the model's output before and after fine-tuning.

In [1]:
# ============================
# 1. Install Dependencies
# ============================
!pip install -q unsloth accelerate peft trl bitsandbytes transformers datasets

# Make sure the latest bitsandbytes release is installed
!pip uninstall -y bitsandbytes
!pip install bitsandbytes --prefer-binary --upgrade --no-cache-dir
import bitsandbytes as bnb
print(f'bitsandbytes version: {bnb.__version__}')

## 2. Dataset Selection and Preparation

We use the "AmazonTitles-1.3MM" dataset, which contains real textual user queries and the associated titles of relevant Amazon products, along with their descriptions. For this project we use a sample of the dataset (trn-amostra.jsonl).

### Dataset Structure
- **uid**: Unique item ID
- **title**: Product title
- **content**: Product description
- **target_ind** and **target_rel**: Indices and relevance scores for retrieval models

### Data Preparation
1. Load the dataset from the JSONL file
2. Filter out entries with an empty description
3. Build a prompt/response structure for fine-tuning
4. Convert to the Hugging Face Dataset format
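
The first three preparation steps can be sketched in plain Python on an in-memory sample, which makes the filtering rule easy to inspect before running it on the real file. The two records and the helper name `build_pairs` are hypothetical and only mirror the field layout described above; this is an illustration, not the notebook's actual loader.

```python
import json

# Hypothetical two-record sample in the same JSONL layout (uid, title, content).
SAMPLE_JSONL = """\
{"uid": "a1", "title": "Girls Ballet Tutu Neon Pink", "content": "High quality 3 layer ballet tutu. 12 inches in length"}
{"uid": "a2", "title": "Untitled Item", "content": "   "}
"""

def build_pairs(jsonl_text):
    """Parse JSONL, drop rows with a blank description, build prompt/response pairs."""
    pairs = []
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue  # skip blank lines between records
        record = json.loads(line)
        content = record.get("content") or ""
        if not content.strip():
            continue  # step 2: filter out entries with an empty description
        pairs.append({
            "prompt": f"What is the description of product {record['title']} ?",
            "response": content,
        })
    return pairs

pairs = build_pairs(SAMPLE_JSONL)
print(len(pairs), pairs[0]["prompt"])
```

For step 4, a list of dictionaries like `pairs` could be handed to `datasets.Dataset.from_list`, which should also avoid carrying a pandas index column into the dataset.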

In [2]:
# ============================
# 2. Load and Prepare the Dataset
# ============================
import json
import pandas as pd
from datasets import Dataset

# Path to the dataset (adjust as needed)
dataset_path = "trn-amostra.jsonl"

# Load the data: one JSON object per line, skipping blank lines
with open(dataset_path, 'r', encoding='utf-8') as f:
    data = [json.loads(line) for line in f if line.strip()]

# Build a DataFrame
df = pd.DataFrame(data)

# Filter out entries with an empty description
df = df[df["content"].str.strip() != ""]

# Build the prompt/response structure
df_prompt = pd.DataFrame({
    "prompt": df["title"].apply(lambda x: f"What is the description of product {x} ?"),
    "response": df["content"]
})

# Convert to a Hugging Face Dataset
hf_dataset = Dataset.from_pandas(df_prompt)
print(f'Dataset size: {len(hf_dataset)}')
print(f'Dataset sample: {hf_dataset[0]}')

Dataset size: 7642
Dataset sample: {'prompt': 'What is the description of product Girls Ballet Tutu Neon Pink ?', 'response': 'High quality 3 layer ballet tutu. 12 inches in length', '__index_level_0__': 0}


## 3. Calling the Foundation Model

We use Unsloth's Llama-3.2-1B-Instruct, an optimized build of the 1-billion-parameter Llama 3.2 model. It is suited to instruction-following tasks and is efficient to fine-tune on resource-constrained hardware.

### Model Configuration
- **Base Model**: unsloth/Llama-3.2-1B-Instruct-bnb-4bit
- **Quantization**: 4-bit, to reduce memory usage
- **Maximum Sequence Length**: 2048 tokens
- **Fine-Tuning Technique**: LoRA (Low-Rank Adaptation), a PEFT (Parameter-Efficient Fine-Tuning) method

### LoRA Parameters
- **r**: 16 (rank of the adaptation matrices)
- **lora_alpha**: 16 (adaptation scaling factor)
- **lora_dropout**: 0 (no dropout)
- **Target Modules**: Attention projections (q_proj, k_proj, v_proj, o_proj) and MLP projections (gate_proj, up_proj, down_proj)
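
As a rough check on what these settings imply, the sketch below counts the trainable parameters that LoRA adds. The rank and target modules are the ones listed above. The layer shapes are an assumption taken from the published Llama 3.2 1B configuration rather than from this notebook, and `lora_params` is a hypothetical helper.

```python
# LoRA adds two matrices per adapted linear layer: A (r x d_in) and B (d_out x r),
# so each adapted layer contributes r * (d_in + d_out) trainable parameters.

R = 16           # LoRA rank, from the notebook configuration
NUM_LAYERS = 16  # matches "patched 16 layers" in the Unsloth log

# ASSUMPTION: shapes from the published Llama 3.2 1B configuration
# (hidden size 2048, MLP size 8192, 8 key/value heads of dimension 64).
HIDDEN = 2048
KV_DIM = 8 * 64
MLP = 8192

# (d_in, d_out) for every target module in one transformer block
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, MLP),
    "up_proj": (HIDDEN, MLP),
    "down_proj": (MLP, HIDDEN),
}

def lora_params(d_in, d_out, r=R):
    """Trainable parameters added by one LoRA adapter of rank r."""
    return r * (d_in + d_out)

per_block = sum(lora_params(*shape) for shape in MODULE_SHAPES.values())
total = per_block * NUM_LAYERS
print(f"per block: {per_block:,}  total: {total:,}")
```

Under these assumptions the total comes to 11,272,192, which agrees with the trainable-parameter count printed in the Unsloth training banner later in this notebook. That banner divides by a rounded 1,000,000,000 base parameters to arrive at 1.13%.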

In [3]:
# ============================
# 3. Configure the Prompt and the Model
# ============================
from unsloth import FastLanguageModel

# Define the Alpaca-style template
alpaca_prompt = """### Instruction: Provide the exact product description based on the title.
### Input: {}
### Response: {}"""

# Format a batch of examples into full training texts
def formatting_prompts_func(examples):
    texts = [
        alpaca_prompt.format(prompt, response)
        for prompt, response in zip(examples["prompt"], examples["response"])
    ]
    return {"text": texts}

# Apply the formatting to the dataset
hf_dataset = hf_dataset.map(formatting_prompts_func, batched=True)

# Load the model and tokenizer
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-1B-Instruct-bnb-4bit",
    max_seq_length=2048,
    dtype=None,  # None lets Unsloth pick the dtype automatically
    load_in_4bit=True,
)

# Configure LoRA for PEFT
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
)
print('Model configured with LoRA for fine-tuning.')

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!

==((====))==  Unsloth 2025.3.19: Fast Llama patching. Transformers: 4.50.0.
   \\   /|    NVIDIA A100-SXM4-40GB. Num GPUs = 1. Max memory: 39.557 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.6.0+cu124. CUDA: 8.0. CUDA Toolkit: 12.4. Triton: 3.2.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.29.post3. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth

Unsloth 2025.3.19 patched 16 layers with 16 QKV layers, 16 O layers and 16 MLP layers.

Model configured with LoRA for fine-tuning.
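
The template above formats each training text without an end-of-sequence marker, so the training data gives the model no explicit signal for where a description ends. A common pattern in Alpaca-style fine-tuning is to append the tokenizer's EOS token to every example. The sketch below illustrates that variant; `format_with_eos` is a hypothetical helper and `"<EOS>"` is a stand-in string, since the real token would come from `tokenizer.eos_token`. Whether this changes the results of this notebook has not been tested here.

```python
ALPACA_PROMPT = """### Instruction: Provide the exact product description based on the title.
### Input: {}
### Response: {}"""

def format_with_eos(examples, eos_token):
    """Format a batch and terminate every training text with the EOS token."""
    texts = [
        ALPACA_PROMPT.format(prompt, response) + eos_token
        for prompt, response in zip(examples["prompt"], examples["response"])
    ]
    return {"text": texts}

# Stand-in batch and EOS string for illustration only; in the notebook the
# token would be read from tokenizer.eos_token after the tokenizer is loaded.
batch = {
    "prompt": ["What is the description of product Girls Ballet Tutu Neon Pink ?"],
    "response": ["High quality 3 layer ballet tutu. 12 inches in length"],
}
formatted = format_with_eos(batch, eos_token="<EOS>")
print(formatted["text"][0])
```

In the notebook this would mean running the `map` call after `from_pretrained`, for example `hf_dataset.map(lambda ex: format_with_eos(ex, tokenizer.eos_token), batched=True)`.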


In [4]:
# ============================
# 3.3 Test the Model Before Fine-Tuning
# ============================
import torch

# Test examples
test_prompts = [
    "What is the description of product Girls Ballet Tutu ?",
    "What is the description of product The Wall: Images and Offerings from the Vietnam Veterans Memorial ?"
]

# Switch the model to inference mode
FastLanguageModel.for_inference(model)

# Generate responses
print('\nModel test before fine-tuning:')
for prompt in test_prompts:
    inputs = tokenizer(
        alpaca_prompt.format(prompt, ""),  # Leave the response empty for the model to fill in
        return_tensors="pt"
    ).to("cuda" if torch.cuda.is_available() else "cpu")

    outputs = model.generate(
        **inputs,
        max_new_tokens=300,
        do_sample=False,  # Greedy decoding; temperature and top_p have no effect here, so they are omitted
        pad_token_id=tokenizer.eos_token_id
    )
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Keep only the text after the response marker
    response_clean = response.split("### Response:")[1].strip() if "### Response:" in response else response
    print(f'Question: {prompt}')
    print(f'Response before fine-tuning: {response_clean}\n')


Model test before fine-tuning:
Question: What is the description of product Girls Ballet Tutu ?
Response before fine-tuning: A girls' ballet tutu is a type of undergarment worn by ballet dancers to provide a comfortable and secure fit. It is typically made of a lightweight, breathable material such as cotton or silk, and is designed to be worn under a skirt or leotard. The tutu is usually adorned with intricate designs or patterns, and may feature a variety of embellishments such as sequins, beads, or appliques. Girls' ballet tutus are a popular choice for ballet classes, performances, and competitions, as they provide a comfortable and secure fit while also allowing for a full range of motion. They are also a great way to add a touch of elegance and sophistication to a ballet-inspired outfit.

Question: What is the description of product The Wall: Images and Offerings from the Vietnam Veterans Memorial ?
Response before fine-tuning: The Wall: Images and Offerings from the Vietnam Veterans M

## 4. Running the Fine-Tuning

We fine-tune the model with the TRL (Transformer Reinforcement Learning) library, using its SFTTrainer (Supervised Fine-Tuning Trainer).

### Training Parameters
- **Batch Size**: 2 per device
- **Gradient Accumulation Steps**: 4 (an effective batch size of 8)
- **Learning Rate**: 2e-4
- **Warmup Steps**: 5
- **Max Steps**: 60 (capped for quick tests)
- **Optimizer**: 8-bit AdamW (for memory efficiency)
- **Precision**: BF16 (bfloat16)
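
To make the learning-rate settings above concrete, the sketch below evaluates a linear warmup followed by a linear decay over the 60 steps. The notebook does not set a scheduler explicitly, so this assumes the `TrainingArguments` default (`lr_scheduler_type="linear"`); `linear_warmup_decay_lr` is a hypothetical helper written for illustration, not a library function.

```python
def linear_warmup_decay_lr(step, base_lr=2e-4, warmup_steps=5, total_steps=60):
    """Learning rate in effect after `step` optimizer updates have been taken.

    ASSUMPTION: linear warmup then linear decay to zero, mirroring the default
    linear schedule; the notebook itself does not name a scheduler.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Inspect a few points: start, during warmup, peak, mid-run, and the end
for step in (0, 1, 5, 30, 60):
    print(step, f"{linear_warmup_decay_lr(step):.2e}")
```

With 5 warmup steps out of 60, the rate reaches its 2e-4 peak almost immediately and spends the remaining 55 steps decaying, so most of this short run trains below the nominal learning rate.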

Fine-tuning uses PEFT (Parameter-Efficient Fine-Tuning) with LoRA, which trains only a small fraction of the model's parameters (about 1.13% of the total, according to the training log).

In [5]:
# ============================
# 4. Run the Fine-Tuning
# ============================
from trl import SFTTrainer
from transformers import TrainingArguments

# Configure the trainer
trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=hf_dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=5,
        max_steps=60,  # Capped for quick tests; raise it or switch to epochs for a full run
        learning_rate=2e-4,
        logging_steps=1,
        output_dir="outputs",
        optim="adamw_8bit",
        seed=3407,
        fp16=False,
        bf16=True,
    ),
)

# Start training
print('Starting fine-tuning...')
trainer_stats = trainer.train()

# Save the trained model (for a PEFT model this writes the LoRA adapter weights)
model.save_pretrained("fine_tuned_model")
tokenizer.save_pretrained("fine_tuned_model")
print('Fine-tuning finished and model saved.')
print(f'Training statistics: {trainer_stats}')

Starting fine-tuning...


==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 7,642 | Num Epochs = 1 | Total steps = 60
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 11,272,192/1,000,000,000 (1.13% trained)

Unsloth: Will smartly offload gradients to save VRAM!


Step,Training Loss
1,3.6033
2,3.514
3,3.576
4,3.5942
5,3.2977
6,3.4882
7,3.495
8,3.1866
9,3.0535
10,3.1873


Fine-tuning finished and model saved.
Training statistics: TrainOutput(global_step=60, training_loss=2.8738850712776185, metrics={'train_runtime': 99.6006, 'train_samples_per_second': 4.819, 'train_steps_per_second': 0.602, 'total_flos': 968738489253888.0, 'train_loss': 2.8738850712776185})
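
The run statistics above can be cross-checked with a little arithmetic. The sketch below uses only numbers that appear in the training configuration and the `TrainOutput`; the variable names are chosen here for illustration.

```python
# Values copied from the training configuration and the TrainOutput above
per_device_batch = 2
grad_accum_steps = 4
num_gpus = 1
max_steps = 60
dataset_size = 7642
train_runtime_s = 99.6006
samples_per_second = 4.819

# Examples consumed per optimizer step, and over the whole run
effective_batch = per_device_batch * grad_accum_steps * num_gpus
examples_seen = effective_batch * max_steps
epoch_fraction = examples_seen / dataset_size

# Cross-check against the throughput reported by the trainer
examples_from_throughput = samples_per_second * train_runtime_s

print(f"effective batch size: {effective_batch}")
print(f"examples seen: {examples_seen} ({epoch_fraction:.1%} of the dataset)")
print(f"examples implied by throughput: {examples_from_throughput:.0f}")
```

Both routes give about 480 examples, so this run covers roughly 6% of the 7,642-example sample, well short of one full epoch.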


## 5. Model Evaluation

After fine-tuning, we evaluate the model on test examples to check how well it generates accurate descriptions from product titles, comparing the generated responses with the actual descriptions in the dataset.

### Evaluation Metrics
- **Response Quality**: We assess qualitatively whether the generated responses are coherent and relevant to the product titles.
- **Before/After Comparison**: We compare the model's responses before and after fine-tuning to check whether response quality improved.

The expectation is that the fine-tuned model generates more accurate and relevant descriptions than the base model. In the 60-step run recorded below, however, the post-fine-tuning response for the first test prompt falls into a repetitive loop, so this comparison does not yet demonstrate that improvement.

In [6]:
# ============================
# 5. Evaluate the Model After Fine-Tuning
# ============================
import torch

# Switch the model to inference mode
FastLanguageModel.for_inference(model)

# Test examples
test_prompts = [
    "What is the description of product Girls Ballet Tutu ?",
    "What is the description of product The Wall: Images and Offerings from the Vietnam Veterans Memorial ?"
]

# Generate responses
print('\nModel test after fine-tuning:')
for prompt in test_prompts:
    inputs = tokenizer(
        alpaca_prompt.format(prompt, ""),  # Leave the response empty for the model to fill in
        return_tensors="pt"
    ).to("cuda" if torch.cuda.is_available() else "cpu")

    outputs = model.generate(
        **inputs,
        max_new_tokens=300,
        do_sample=False,  # Greedy decoding; temperature and top_p have no effect here, so they are omitted
        pad_token_id=tokenizer.eos_token_id
    )
    response = tokenizer.decode(outputs[0], skip_special_tokens=True)
    # Keep only the text after the response marker
    response_clean = response.split("### Response:")[1].strip() if "### Response:" in response else response
    print(f'Question: {prompt}')
    print(f'Response after fine-tuning: {response_clean}\n')


Model test after fine-tuning:
Question: What is the description of product Girls Ballet Tutu ?
Response after fine-tuning: "The most wonderful ballet tutu is a thing of beauty. It is a thing of magic. It is a thing of wonder. It is a thing of joy. It is a thing of magic, and it is a thing of wonder, and it is a thing of joy. It is a thing of magic, and it is a thing of wonder, and it is a thing of joy. It is a thing of magic, and it is a thing of wonder, and it is a thing of joy. It is a thing of magic, and it is a thing of wonder, and it is a thing of joy. It is a thing of magic, and it is a thing of wonder, and it is a thing of joy. It is a thing of magic, and it is a thing of wonder, and it is a thing of joy. It is a thing of magic, and it is a thing of wonder, and it is a thing of joy. It is a thing of magic, and it is a thing of wonder, and it is a thing of joy. It is a thing of magic, and it is a thing of wonder, and it is a thing of joy. It is a thing of magic, and it is a thi
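
The response above repeats the same clause many times. One simple way to flag that kind of loop automatically is a distinct-n ratio, the share of unique word n-grams in a text. The sketch below is a minimal illustration; `distinct_n` is a hypothetical helper and the looping sample is a shortened stand-in, not the model output itself.

```python
def distinct_n(text, n=2):
    """Share of unique word n-grams in `text` (1.0 means no n-gram repeats)."""
    words = text.lower().split()
    ngrams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    if not ngrams:
        return 0.0
    return len(set(ngrams)) / len(ngrams)

# A varied description versus a short stand-in for a looping generation
varied = "High quality 3 layer ballet tutu. 12 inches in length"
looping = "It is a thing of magic, and it is a thing of wonder. " * 5

print(f"distinct-2, varied text:  {distinct_n(varied):.2f}")
print(f"distinct-2, looping text: {distinct_n(looping):.2f}")
```

A low ratio suggests degenerate repetition. Generation options such as `repetition_penalty` or `no_repeat_ngram_size` in `model.generate` may reduce it, though that has not been tried in this notebook.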

## 6. Conclusion

In this notebook, we walked through fine-tuning a foundation model (Llama-3.2-1B-Instruct) on the AmazonTitles-1.3MM dataset, with the aim of generating product descriptions from their titles.

### Key Takeaways
- PEFT techniques such as LoRA make it possible to fine-tune large models efficiently.
- Proper data preparation is crucial to successful fine-tuning.
- The fine-tuned model is expected to generate better product descriptions than the base model, although the short 60-step run recorded here is not enough to show it.

### Next Steps
- Experiment with different fine-tuning parameters to further improve performance.
- Train on the full dataset.
- Implement quantitative evaluation metrics (BLEU, ROUGE, etc.).
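
As a starting point for the quantitative metrics mentioned above, the sketch below computes a unigram-overlap F1 between a generated description and its reference. It is a simplified, ROUGE-1-style score with whitespace tokenization only; `unigram_f1` and the candidate string are hypothetical, and a dedicated ROUGE or BLEU package would be preferable for reported numbers.

```python
from collections import Counter

def unigram_f1(candidate, reference):
    """Unigram-overlap F1 between two strings (a simplified ROUGE-1-style score)."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per word (clipped matches)
    overlap = sum((cand_counts & ref_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

reference = "High quality 3 layer ballet tutu. 12 inches in length"
candidate = "A high quality ballet tutu with 3 layers"  # hypothetical model output

print(f"unigram F1: {unigram_f1(candidate, reference):.2f}")
```

Averaging this score over a held-out set, once for the base model and once for the fine-tuned model, would turn the before/after comparison into a number.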