* TECH CHALLENGE - PHASE 03: FINE-TUNING AMAZON PRODUCTS
* Version based on the professor's notebook (Unsloth)

* 1: SETUP AND DEPENDENCY INSTALLATION

In [1]:
!pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
!pip install --no-deps xformers "trl<0.9.0" peft accelerate bitsandbytes
!pip install transformers datasets gradio

Collecting unsloth@ git+https://github.com/unslothai/unsloth.git (from unsloth[colab-new]@ git+https://github.com/unslothai/unsloth.git)
  Cloning https://github.com/unslothai/unsloth.git to /tmp/pip-install-x0z1tvne/unsloth_76469f1d13f34e0c92d15c6c8817d559
  Running command git clone --filter=blob:none --quiet https://github.com/unslothai/unsloth.git /tmp/pip-install-x0z1tvne/unsloth_76469f1d13f34e0c92d15c6c8817d559
  Resolved https://github.com/unslothai/unsloth.git to commit 1b8269f794e5ff824167b794689ba27f5dc500bc
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting unsloth_zoo>=2025.9.7 (from unsloth@ git+https://github.com/unslothai/unsloth.git->unsloth[colab-new]@ git+https://github.com/unslothai/unsloth.git)
  Downloading unsloth_zoo-2025.9.7-py3-none-any.whl.metadata (31 kB)
Collecting tyro (from unsloth@ git+https://github.com/unslothai/unsloth.git-

In [2]:
from unsloth import FastLanguageModel, is_bfloat16_supported
import torch
import json
from datasets import load_dataset, Dataset
from trl import SFTTrainer
from transformers import TrainingArguments, TextStreamer
import gradio as gr

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!


In [3]:
print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

PyTorch version: 2.8.0+cu126
CUDA available: True


* 2: AMAZON DATASET PREPARATION

In [4]:
!gunzip trn.json.gz

In [5]:
try:
    with open('/content/trn.json', 'r') as f:
        test_line = f.readline()
    print("Dataset found!")
except FileNotFoundError:
    print("Error while trying to load the file!")

Dataset found!
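
As an alternative to decompressing first, the first record can be inspected straight from the compressed archive. This is a minimal sketch, assuming the archive is gzip-compressed JSON Lines (one JSON object per line, which is what the formatting step in the next section expects). `peek_first_record` is a hypothetical helper name, not part of the original notebook.

```python
import gzip
import json


def peek_first_record(gz_path):
    """Return the first JSON object from a gzip-compressed JSON Lines file."""
    # "rt" opens the archive in text mode so lines come back as str, not bytes.
    with gzip.open(gz_path, "rt", encoding="utf-8") as f:
        first_line = f.readline()
    if not first_line.strip():
        raise ValueError(f"{gz_path} is empty")
    return json.loads(first_line)
```

Calling `peek_first_record("trn.json.gz")` before the archive is unpacked and printing the keys of the result is a quick way to confirm that the `title` and `content` fields are present.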


* 3: DATA FORMATTING FOR FINE-TUNING

In [6]:
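def format_amazon_dataset_expanded(json_path="/content/trn.json", max_examples=150):
    # Note: max_examples caps the number of JSON lines read, not the number of
    # training examples produced; lines missing a title or content are skipped.

    print(f"Formatting {max_examples} fine-tuning products...")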

    training_examples = []

    with open(json_path, 'r', encoding='utf-8') as f:
        for i, line in enumerate(f):
            if i >= max_examples:
                break

            try:
                item = json.loads(line)
                title = item.get('title', '').strip()
                content = item.get('content', '').strip()

                if not title or not content:
                    continue

                # More question variations for better generalization
                questions = [
                    f"Tell me about this product: {title}",
                    f"What are the characteristics of: {title}",
                    f"Describe in detail: {title}",
                    f"What can you tell me about: {title}",
                    f"I need information about: {title}"
                ]

                # 3 examples per product (only the first three question templates are used)
                for question in questions[:3]:
                    example = {
                        "instruction": "DESCRIBE THIS AMAZON PRODUCT",
                        "input": question,
                        "output": content
                    }
                    training_examples.append(example)

            except json.JSONDecodeError:
                continue

    print(f"Expanded dataset: {len(training_examples)} created examples")
    return training_examples

In [7]:
amazon_training_data = format_amazon_dataset_expanded(max_examples=200)

Formatting 200 fine-tuning products...
Expanded dataset: 294 created examples
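
The counts above follow from how the function iterates: `max_examples=200` limits the lines read, and each line with both a title and content yields three examples, so 294 examples implies 98 usable records among the first 200 lines. The sketch below reproduces that arithmetic and adds a hypothetical variant, `take_valid_records`, that caps the number of valid products rather than lines. The variant is an assumption about what might be wanted, not the notebook's behaviour.

```python
import json

QUESTIONS_PER_PRODUCT = 3  # the notebook uses questions[:3]


def usable_products(total_examples, per_product=QUESTIONS_PER_PRODUCT):
    """Number of records that survived filtering, given the example count."""
    return total_examples // per_product


def take_valid_records(lines, max_products):
    """Collect up to max_products records that have both title and content."""
    records = []
    for line in lines:
        if len(records) >= max_products:
            break
        try:
            item = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip malformed lines, as the notebook does
        if not isinstance(item, dict):
            continue
        title = (item.get("title") or "").strip()
        content = (item.get("content") or "").strip()
        if title and content:
            records.append({"title": title, "content": content})
    return records


print(usable_products(294))  # 98
```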


In [8]:
with open('/content/amazon_formatted_data_expanded.json', 'w', encoding='utf-8') as f:
    json.dump(amazon_training_data, f, ensure_ascii=False, indent=2)

print(f"Expanded data saved: {len(amazon_training_data)} examples")
print(f"Example:")
print(json.dumps(amazon_training_data[0], indent=2, ensure_ascii=False))

Expanded data saved: 294 examples
Example:
{
  "instruction": "DESCRIBE THIS AMAZON PRODUCT",
  "input": "Tell me about this product: Girls Ballet Tutu Neon Pink",
  "output": "High quality 3 layer ballet tutu. 12 inches in length"
}


* 4: BASE MODEL CONFIGURATION

In [9]:
max_seq_length = 2048
dtype = None
load_in_4bit = True

print("Loading Llama-3-8B model...")

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/llama-3-8b-bnb-4bit",
    max_seq_length=max_seq_length,
    dtype=dtype,
    load_in_4bit=load_in_4bit,
)

print("Model loaded!")
print(f"Parameters: {model.num_parameters():,}")

Loading Llama-3-8B model...
==((====))==  Unsloth 2025.9.6: Fast Llama patching. Transformers: 4.55.4.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.8.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.4.0
\        /    Bfloat16 = FALSE. FA [Xformers = 0.0.32.post2. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!


model.safetensors:   0%|          | 0.00/5.70G [00:00<?, ?B/s]

generation_config.json:   0%|          | 0.00/198 [00:00<?, ?B/s]

tokenizer_config.json: 0.00B [00:00, ?B/s]

tokenizer.json: 0.00B [00:00, ?B/s]

special_tokens_map.json:   0%|          | 0.00/350 [00:00<?, ?B/s]

Model loaded!
Parameters: 8,030,261,248


* 5: TESTING THE MODEL BEFORE FINE-TUNING

In [10]:
FastLanguageModel.for_inference(model)

alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# Simple prompt that the base model can handle
test_input_simple = "Describe Sony WH-1000XM4 headphones:"

# The tokenizer expects a batch, so the prompt is passed as a one-element list
inputs = tokenizer([test_input_simple], return_tensors="pt").to("cuda")  # "pt" returns PyTorch tensors

print("Testing BASE model...")
outputs = model.generate(
    **inputs,
    max_new_tokens=100,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

response = tokenizer.batch_decode(outputs)[0]
print(response.replace("<|begin_of_text|>", ""))

Testing BASE model...
Describe Sony WH-1000XM4 headphones: The Sony WH-1000XM4 headphones are the best noise-canceling headphones. It is the latest model of the Sony WH-1000XM3. It is one of the best headphones in the market. The Sony WH-1000XM4 is the upgraded version of the Sony WH-1000XM3. It is the best noise-canceling headphones. The Sony WH-1000XM4 headphones are available in various colors. The Sony WH-1000XM4 headphones are available
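
The `temperature=0.7` and `do_sample=True` arguments make generation stochastic: logits are divided by the temperature before the softmax, so values below 1 sharpen the distribution toward the most likely token. This is a minimal pure-Python sketch of that scaling using made-up logits. The real model works over its full vocabulary and may apply further filtering.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # hypothetical values for three candidate tokens
p_default = softmax_with_temperature(logits, temperature=1.0)
p_sharp = softmax_with_temperature(logits, temperature=0.7)
print(p_default)
print(p_sharp)
```

With the lower temperature, the top token receives a larger share of the probability mass.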


* 6: FINE-TUNING CONFIGURATION WITH LoRA

In [11]:
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
    lora_dropout=0,
    bias="none",
    use_gradient_checkpointing="unsloth",
    random_state=3407,
    use_rslora=False,
    loftq_config=None,
)

print("LoRA configured!")
# print_trainable_parameters() prints its own summary and returns None,
# so call it directly instead of interpolating it into an f-string.
model.print_trainable_parameters()

Unsloth 2025.9.6 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.


LoRA configured!
trainable params: 41,943,040 || all params: 8,072,204,288 || trainable%: 0.5196
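
The trainable-parameter count reported above can be checked by hand. Each adapted linear layer gains two low-rank matrices, A (r x in_features) and B (out_features x r), so it adds r * (in_features + out_features) parameters. The sketch below assumes the published Llama-3-8B dimensions (hidden size 4096, MLP size 14336, 8 key/value heads of dimension 128, 32 layers). These figures are not stated in the notebook and are used here as assumptions.

```python
R = 16                # LoRA rank, as configured in the notebook
HIDDEN = 4096         # assumed hidden size of Llama-3-8B
INTERMEDIATE = 14336  # assumed MLP intermediate size
KV_DIM = 8 * 128      # assumed: 8 key/value heads x head dimension 128
NUM_LAYERS = 32       # matches the "patched 32 layers" message

# (in_features, out_features) for each target module in one decoder layer
MODULE_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, r=R):
    """Parameters added by one LoRA adapter: A is r x in, B is out x r."""
    return r * (in_features + out_features)


per_layer = sum(lora_params(i, o) for i, o in MODULE_SHAPES.values())
total = per_layer * NUM_LAYERS
print(f"{total:,}")  # 41,943,040
```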


* 7: DATA PREPARATION FOR TRAINING

In [12]:
EOS_TOKEN = tokenizer.eos_token

def formatting_prompts_func(examples):
    instructions = examples["instruction"]
    inputs = examples["input"]
    outputs = examples["output"]
    texts = []

    for instruction, input_text, output in zip(instructions, inputs, outputs):
        text = alpaca_prompt.format(instruction, input_text, output) + EOS_TOKEN
        texts.append(text)

    return {"text": texts}

dataset = Dataset.from_list(amazon_training_data)
print(f"Dataset: {len(dataset)} examples")

dataset = dataset.map(formatting_prompts_func, batched=True)
print("Formatted dataset!")

Dataset: 294 examples


Map:   0%|          | 0/294 [00:00<?, ? examples/s]

Formatted dataset!
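
To see what a single training string looks like, the template can be filled by hand. This minimal sketch uses the first example printed in section 3. The EOS string `<|end_of_text|>` is taken from the generation output that appears further down in this notebook and is hard-coded here only so the snippet runs without a tokenizer.

```python
ALPACA_PROMPT = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

# Assumption: stands in for tokenizer.eos_token so no model download is needed.
EOS_TOKEN = "<|end_of_text|>"

example = {
    "instruction": "DESCRIBE THIS AMAZON PRODUCT",
    "input": "Tell me about this product: Girls Ballet Tutu Neon Pink",
    "output": "High quality 3 layer ballet tutu. 12 inches in length",
}

# Same formatting rule as formatting_prompts_func, applied to one record.
text = ALPACA_PROMPT.format(
    example["instruction"], example["input"], example["output"]
) + EOS_TOKEN
print(text)
```

Appending the EOS token matters: without it the model is never shown where a response ends and tends to keep generating.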


* 8: TRAINING CONFIGURATION AND EXECUTION

In [13]:
print("Setting up expanded training...")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=max_seq_length,
    dataset_num_proc=2,
    packing=False,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,
        warmup_steps=15,
        max_steps=100,
        learning_rate=2e-4,
        fp16=not is_bfloat16_supported(),
        bf16=is_bfloat16_supported(),
        logging_steps=1,
        optim="adamw_8bit",
        weight_decay=0.01,
        lr_scheduler_type="linear",
        seed=3407,
        output_dir="outputs",
        report_to="none",
    ),
)

print("Starting expanded training...")

Setting up expanded training...


Map (num_proc=2):   0%|          | 0/294 [00:00<?, ? examples/s]

Starting expanded training...
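
With `warmup_steps=15`, `max_steps=100`, and `lr_scheduler_type="linear"`, the learning rate ramps up linearly to `2e-4` and then decays linearly to zero. The plain-Python sketch below shows that shape. It follows the usual linear-warmup, linear-decay rule and should be read as an approximation of what the trainer applies, not a copy of its code.

```python
def linear_lr(step, base_lr=2e-4, warmup_steps=15, total_steps=100):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Warmup: scale from 0 up to base_lr over the first warmup_steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay: scale from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 5, 15, 50, 100):
    print(step, linear_lr(step))
```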


In [14]:
trainer_stats = trainer.train()

print("Training completed!")
print(f"Final Loss: {trainer_stats.training_loss:.4f}")

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 294 | Num Epochs = 3 | Total steps = 100
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 41,943,040 of 8,072,204,288 (0.52% trained)


Unsloth: Will smartly offload gradients to save VRAM!


Step,Training Loss
1,3.189
2,2.9836
3,2.4705
4,2.856
5,2.957
6,2.7015
7,2.5931
8,2.486
9,2.4597
10,2.3128


Training completed!
Final Loss: 1.2448
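
The figures in the training banner can be reproduced by hand from the arguments passed to the trainer. A minimal sketch of that arithmetic, using only numbers shown above:

```python
import math

num_examples = 294
per_device_batch = 2
grad_accum = 4
num_gpus = 1
max_steps = 100

# Examples consumed per optimizer step.
effective_batch = per_device_batch * grad_accum * num_gpus
# Optimizer steps needed to see every example once (fractional).
steps_per_epoch = num_examples / effective_batch
# Passes over the dataset completed in max_steps.
epochs_covered = max_steps / steps_per_epoch
examples_seen = max_steps * effective_batch

print(effective_batch)
print(round(steps_per_epoch, 2))
print(round(epochs_covered, 2), math.ceil(epochs_covered))
print(examples_seen)
```

So 100 steps cover a little under three passes over the 294 examples, which the banner rounds up to 3 epochs. The value printed as "Final Loss" comes from `trainer_stats.training_loss`, which, as far as the Trainer API documents it, is the training loss averaged over the run rather than the loss of the last step.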


* 9: TESTING THE MODEL AFTER FINE-TUNING

In [15]:
FastLanguageModel.for_inference(model)

test_questions = [
    "Describe Sony WH-1000XM4 headphones",
    "What are the features of: Samsung Galaxy S21 Ultra",
    "What can you tell me about: Nike Air Max 270"
]

for question in test_questions:
    print(f"\n{'='*60}")
    print(f"QUESTION: {question}")

    test_prompt = alpaca_prompt.format(
        "DESCRIBE THIS AMAZON PRODUCT",
        question,
        ""
    )

    inputs = tokenizer([test_prompt], return_tensors="pt").to("cuda")

    print("RESPONSE AFTER FINE-TUNING:")
    text_streamer = TextStreamer(tokenizer, skip_prompt=True)

    outputs = model.generate(
        **inputs,
        streamer=text_streamer,
        max_new_tokens=150,
        temperature=0.7,
        do_sample=True,
        pad_token_id=tokenizer.eos_token_id
    )

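    # outputs[0] includes the prompt tokens, so this prints the prompt plus
    # the response that the streamer already displayed above.
    full_response = tokenizer.decode(outputs[0], skip_special_tokens=True)

    print(full_response)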


QUESTION: Describe Sony WH-1000XM4 headphones
RESPONSE AFTER FINE-TUNING:
The WH-1000XM4 Wireless Noise Cancelling Headphones improve on already excellent predecessors. They're lighter, and offer improved sound, better voice-calling performance and longer battery life. They're also packed with features, including multipoint Bluetooth pairing, Amazon's Alexa and Google Assistant voice control, and an app that lets you personalize your experience. If you're looking for a top-flight pair of noise-cancelling headphones, Sony's WH-1000XM4 should be on your shortlist.<|end_of_text|>
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
DESCRIBE THIS AMAZON PRODUCT

### Input:
Describe Sony WH-1000XM4 headphones

### Response:
The WH-1000XM4 Wireless Noise Cancelling Headphones improve on already excellent predecessors. They're lighter, and offer improved sound, better vo
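
Because `outputs[0]` holds the prompt tokens followed by the generated tokens, decoding it reprints the whole prompt, which is why the response appears twice above. One option is to slice off the prompt tokens before decoding (`outputs[0][inputs["input_ids"].shape[1]:]`). Another is to split the decoded string on the response marker. This is a minimal sketch of the second option. `extract_response` is a hypothetical helper, not part of the original notebook.

```python
RESPONSE_MARKER = "### Response:"
EOS_TEXT = "<|end_of_text|>"


def extract_response(full_text):
    """Return only the text after the last response marker, without EOS."""
    # rpartition returns the whole string as the tail when the marker is absent.
    _, _, tail = full_text.rpartition(RESPONSE_MARKER)
    return tail.replace(EOS_TEXT, "").strip()


decoded = (
    "### Instruction:\nDESCRIBE THIS AMAZON PRODUCT\n\n"
    "### Input:\nTell me about: Nike Air Max 270\n\n"
    "### Response:\nThe Nike Air Max 270 is a modern re-interpretation "
    "of the revolutionary Air Max concept.<|end_of_text|>"
)
print(extract_response(decoded))
```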

* 10: SAVING THE MODEL

In [16]:
print("Saving expanded model...")

model.save_pretrained("amazon_product_model_expanded")
tokenizer.save_pretrained("amazon_product_model_expanded")

Saving expanded model...


('amazon_product_model_expanded/tokenizer_config.json',
 'amazon_product_model_expanded/special_tokens_map.json',
 'amazon_product_model_expanded/tokenizer.json')

* 11: MANUAL MODEL RUN

In [17]:
test_prompt = alpaca_prompt.format(
    "DESCRIBE THIS AMAZON PRODUCT",
    "Tell me about: Nike Air Max 270",
    ""
)

inputs = tokenizer([test_prompt], return_tensors="pt").to("cuda")

print("RESPONSE AFTER FINE-TUNING:")
text_streamer = TextStreamer(tokenizer, skip_prompt=True)

outputs = model.generate(
    **inputs,
    streamer=text_streamer,
    max_new_tokens=150,
    temperature=0.7,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id
)

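# outputs[0] includes the prompt tokens, so this prints the prompt plus
# the response that the streamer already displayed above.
full_response = tokenizer.decode(outputs[0], skip_special_tokens=True)

print(full_response)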

RESPONSE AFTER FINE-TUNING:
The Nike Air Max 270 is a modern re-interpretation of the revolutionary Air Max concept. The Air Max 270 combines the iconic look of the 1990s with a new, modern design.<|end_of_text|>
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
DESCRIBE THIS AMAZON PRODUCT

### Input:
Tell me about: Nike Air Max 270

### Response:
The Nike Air Max 270 is a modern re-interpretation of the revolutionary Air Max concept. The Air Max 270 combines the iconic look of the 1990s with a new, modern design.
