<a href="https://colab.research.google.com/github/jliaAlmeida/TechChallenge/blob/main/tudo.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Tech Challenge - Model Fine-Tuning (Llama 3 8B 4-bit via Unsloth)

This notebook implements the full requested workflow:

1. Loading and cleaning the dataset (trn_teste.json -> dropping lines with empty content).
2. Writing the cleaned file: data_titles_contents_cleaned.jsonl.
3. Converting to instruction format: formatted_products_chat_data.json (instruction/input/output).
4. Preparing prompts in an Alpaca-like format.
5. Train / validation split.
6. Baseline (inference before fine-tuning).
7. 4-bit LoRA fine-tuning (Unsloth) of the unsloth/llama-3-8b-bnb-4bit model.
8. Post-training evaluation (generates outputs for the validation set).
9. Simple metrics (ROUGE-1/2/L, optional BLEU, token overlap).
10. Saving the adapters and the merged model.
11. Title lookup function (fuzzy) + description generation (simulating a user question).
12. Interactive pipeline (optional).
13. Exporting results / logs.

Challenge requirements:
- User question about a product title → retrieve the most similar title → generate the learned description.
- Show the difference before and after fine-tuning.
- Document the main parameters.
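
The retrieval requirement can be sketched with the standard library alone. This is a minimal illustration, not the notebook's implementation: it uses `difflib.get_close_matches` as a stand-in for the `rapidfuzz` matcher that the notebook imports, and the `find_closest_title` helper, the three-title catalog, and the 0.4 cutoff are assumptions made for the example.

```python
from difflib import get_close_matches

def find_closest_title(query, titles, cutoff=0.4):
    """Return the catalog title most similar to `query`, or None if nothing clears `cutoff`."""
    # Compare case-insensitively, but return the title with its original casing.
    lowered = {}
    for title in titles:
        lowered.setdefault(title.lower(), title)
    matches = get_close_matches(query.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None

# Hypothetical catalog built from titles that appear in the cleaned data.
catalog = [
    "Girls Ballet Tutu Neon Pink",
    "Mog's Kittens",
    "Girls Ballet Tutu Neon Blue",
]

print(find_closest_title("girls ballet tutu pink", catalog))
print(find_closest_title("zzzz", catalog))
```

The matched title would then be passed as the `input` field of the prompt, so the model always describes a title it could have seen during training. Returning `None` below the cutoff lets the caller decide whether to refuse or to fall back to the raw query.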

Note: for better results, increase:
- num_train_epochs / max_steps
- The size of the training set
- The quality of the cleaning (duplicate removal / truncation)


In [1]:
# Install the main dependencies
# Adjust the Unsloth line if you are on a local environment without a GPU or with a different CUDA version.
%pip install -q "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
%pip install -q transformers datasets accelerate peft bitsandbytes trl rapidfuzz evaluate sacrebleu


In [2]:
import os
import json
import math
import random
import html
from pathlib import Path
from datetime import datetime
from collections import Counter

import torch
from datasets import load_dataset, Dataset, DatasetDict
from rapidfuzz import process, fuzz
from unsloth import FastLanguageModel, is_bfloat16_supported
from transformers import TrainingArguments
from trl import SFTTrainer
from evaluate import load as load_metric

import pandas as pd
import numpy as np

SEED = 42
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)

DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Dispositivo detectado: {DEVICE}")

DATA_RAW_PATH = "trn_teste1.json"  # Adjust if needed (the original raw file)
CLEAN_JSONL_PATH = "data_titles_contents_cleaned.jsonl"
FORMAT_DATA_JSON = "formatted_products_chat_data.json"
RESULTS_DIR = "results"
os.makedirs(RESULTS_DIR, exist_ok=True)

MAX_SEQ_LENGTH = 2048
LOAD_IN_4BIT = True
DTYPE = None  # leave as None so Unsloth picks the dtype

# Main hyperparameters
EPOCHS = 1  # kept for reference; the TrainingArguments below are driven by MAX_STEPS
LR = 2e-4
BATCH_SIZE = 2
GRAD_ACCUM = 4
WARMUP_STEPS = 5
MAX_STEPS = 120  # Adjust to control training length by step count
LOGGING_STEPS = 5

print("Configurações carregadas.")

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
Dispositivo detectado: cuda
Configurações carregadas.


In [4]:
def limpar_arquivo_raw_para_jsonl(entrada:str, saida:str):
    """
    Reads a file in which each line is a JSON object (even if the extension is .json),
    drops records whose 'title' or 'content' is empty, and writes the rest as JSONL.
    """
    linhas_lidas = 0
    linhas_escritas = 0
    with open(entrada, "r", encoding="utf-8") as fin, open(saida, "w", encoding="utf-8") as fout:
        for linha in fin:
            linhas_lidas += 1
            linha = linha.strip()
            if not linha:
                continue
            try:
                obj = json.loads(linha)
                # `or ""` keeps a JSON null from becoming the string "None"
                title = str(obj.get("title") or "").strip()
                content = str(obj.get("content") or "").strip()
                if title and content:
                    fout.write(json.dumps({"title": title, "content": content}, ensure_ascii=False) + "\n")
                    linhas_escritas += 1
            except json.JSONDecodeError:
                pass
    print(f"Linhas lidas: {linhas_lidas} | Linhas válidas gravadas: {linhas_escritas}")
    return linhas_escritas

def carregar_registros_jsonl(caminho:str):
    registros = []
    with open(caminho, "r", encoding="utf-8") as f:
        for i, linha in enumerate(f, 1):
            linha = linha.strip()
            if not linha:
                continue
            try:
                obj = json.loads(linha)
                title = html.unescape(obj.get("title","").strip())
                content = html.unescape(obj.get("content","").strip())
                if title and content:
                    registros.append({"title": title, "content": content})
            except json.JSONDecodeError:
                pass
    return registros

def converter_para_formato_instruction(registros, instrucao="DESCRIBE THIS PRODUCT"):
    return {
        "instruction": [instrucao]*len(registros),
        "input": [r["title"] for r in registros],
        "output": [r["content"] for r in registros],
    }

def salvar_json(dados, caminho:str):
    with open(caminho, "w", encoding="utf-8") as f:
        json.dump(dados, f, ensure_ascii=False, indent=2)
    print(f"Arquivo salvo: {caminho}")
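
The helpers above drop empty records but do not remove duplicates or bound the description length, both of which the introduction names as ways to improve results. A minimal sketch of such a pass follows; the `deduplicate_and_filter` name and the 20 and 2000 character thresholds are assumptions chosen for illustration, not values used in this notebook.

```python
def deduplicate_and_filter(records, min_chars=20, max_chars=2000):
    """Drop repeated (title, content) pairs and descriptions outside a length window."""
    seen = set()
    kept = []
    for rec in records:
        title = rec["title"].strip()
        content = rec["content"].strip()
        key = (title.casefold(), content.casefold())
        if key in seen:
            continue  # same pair already seen, ignoring case
        seen.add(key)
        if min_chars <= len(content) <= max_chars:
            kept.append({"title": title, "content": content})
    return kept

sample = [
    {"title": "Girls Ballet Tutu Neon Pink",
     "content": "High quality 3 layer ballet tutu. 12 inches in length"},
    {"title": "girls ballet tutu neon pink",
     "content": "High quality 3 layer ballet tutu. 12 inches in length"},
    {"title": "Mog's Kittens", "content": "Too short"},
]
cleaned = deduplicate_and_filter(sample)
print(len(cleaned), "of", len(sample), "records kept")
```

It could be applied to the list returned by `carregar_registros_jsonl` before the conversion to instruction format. Character counts are only a rough proxy for token counts, so the thresholds would need tuning against `MAX_SEQ_LENGTH`.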

In [5]:
# Generate data_titles_contents_cleaned.jsonl from the raw file
if not Path(DATA_RAW_PATH).exists():
    raise FileNotFoundError(f"Arquivo bruto não encontrado: {DATA_RAW_PATH}")

print("Limpeza e criação do JSONL...")
_ = limpar_arquivo_raw_para_jsonl(DATA_RAW_PATH, CLEAN_JSONL_PATH)

# Sample of cleaned lines
with open(CLEAN_JSONL_PATH, "r", encoding="utf-8") as f:
    for i in range(3):
        print(f"Exemplo linha limpa {i+1}: {f.readline().strip()}")

Limpeza e criação do JSONL...
Linhas lidas: 58559 | Linhas válidas gravadas: 44988
Exemplo linha limpa 1: {"title": "Girls Ballet Tutu Neon Pink", "content": "High quality 3 layer ballet tutu. 12 inches in length"}
Exemplo linha limpa 2: {"title": "Mog's Kittens", "content": "Judith Kerr&#8217;s best&#8211;selling adventures of that endearing (and exasperating) cat Mog have entertained children for more than 30 years. Now, even infants and toddlers can enjoy meeting this loveable feline. These sturdy little board books&#8212;with their bright, simple pictures, easy text, and hand&#8211;friendly formats&#8212;are just the thing to delight the very young. Ages 6 months&#8211;2 years."}
Exemplo linha limpa 3: {"title": "Girls Ballet Tutu Neon Blue", "content": "Dance tutu for girls ages 2-8 years. Perfect for dance practice, recitals and performances, costumes or just for fun!"}
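
The second sample line still contains HTML character references such as `&#8217;`, because `limpar_arquivo_raw_para_jsonl` writes the text unchanged. They are decoded only later, when `carregar_registros_jsonl` calls `html.unescape`. The short example below shows that decoding step in isolation on a fragment of that line.

```python
import html

raw = "Judith Kerr&#8217;s best&#8211;selling adventures"
decoded = html.unescape(raw)  # numeric references become the characters they encode
print(decoded)
```

Here `&#8217;` becomes a right single quotation mark and `&#8211;` an en dash, which is why the cleaned JSONL file and the loaded records look different for the same product.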


In [6]:
registros = carregar_registros_jsonl(CLEAN_JSONL_PATH)
print(f"Total de registros carregados: {len(registros)}")

dataset_chat = converter_para_formato_instruction(registros, "DESCRIBE THIS PRODUCT")
salvar_json(dataset_chat, FORMAT_DATA_JSON)

print("Amostra:")
for i in range(2):
    print(f"Instrução: {dataset_chat['instruction'][i]}")
    print(f"Input: {dataset_chat['input'][i]}")
    print(f"Output(len={len(dataset_chat['output'][i])}): {dataset_chat['output'][i][:120]}...")
    print("-"*60)

Total de registros carregados: 44988
Arquivo salvo: formatted_products_chat_data.json
Amostra:
Instrução: DESCRIBE THIS PRODUCT
Input: Girls Ballet Tutu Neon Pink
Output(len=53): High quality 3 layer ballet tutu. 12 inches in length...
------------------------------------------------------------
Instrução: DESCRIBE THIS PRODUCT
Input: Mog's Kittens
Output(len=377): Judith Kerr’s best–selling adventures of that endearing (and exasperating) cat Mog have entertained children for more th...
------------------------------------------------------------


In [7]:
# Build a Hugging Face Dataset from the column-oriented dict
hf_ds = Dataset.from_dict(dataset_chat)

# Add an index column for later reference
hf_ds = hf_ds.add_column("idx", list(range(len(hf_ds))))

# Train/validation split (95% train / 5% validation)
perc_valid = 0.05
n_valid = max(1, int(len(hf_ds)*perc_valid))
hf_ds = hf_ds.shuffle(seed=SEED)
valid_ds = hf_ds.select(range(n_valid))
train_ds = hf_ds.select(range(n_valid, len(hf_ds)))

print(f"Tamanho treino: {len(train_ds)} | validação: {len(valid_ds)}")

dataset_dict = DatasetDict({
    "train": train_ds,
    "validation": valid_ds
})

dataset_dict

Tamanho treino: 42739 | validação: 2249


DatasetDict({
    train: Dataset({
        features: ['instruction', 'input', 'output', 'idx'],
        num_rows: 42739
    })
    validation: Dataset({
        features: ['instruction', 'input', 'output', 'idx'],
        num_rows: 2249
    })
})
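
The split sizes follow directly from `perc_valid`: `int(44988 * 0.05)` is 2249, which leaves 42739 rows for training. The sketch below reproduces that arithmetic on plain indices with the standard library. The `split_indices` helper is hypothetical, and its shuffle order will not match the one produced by `hf_ds.shuffle(seed=SEED)`.

```python
import random

def split_indices(n_rows, valid_fraction=0.05, seed=42):
    """Shuffle row indices once and take the first `valid_fraction` as validation."""
    n_valid = max(1, int(n_rows * valid_fraction))
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)  # local RNG, leaves the global seed untouched
    return indices[n_valid:], indices[:n_valid]

train_idx, valid_idx = split_indices(44988)
print(len(train_idx), len(valid_idx))
```

Shuffling before slicing matters because the raw file groups similar products together (the two tutu listings are adjacent), so an unshuffled head of the dataset would not be a representative validation set. `Dataset.train_test_split(test_size=0.05, seed=SEED)` from the `datasets` library is an alternative that does the same in one call.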

In [8]:
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:
{}"""

def formatting_prompts_func(example_batch):
    instructions = example_batch["instruction"]
    inputs = example_batch["input"]
    outputs = example_batch["output"]
    texts = []
    for inst, inp, outp in zip(instructions, inputs, outputs):
        text = alpaca_prompt.format(inst, inp, outp)
        texts.append(text)
    return {"text": texts}

formatted_train = dataset_dict["train"].map(formatting_prompts_func, batched=True, num_proc=1)
formatted_valid = dataset_dict["validation"].map(formatting_prompts_func, batched=True, num_proc=1)

print("Exemplo texto formatado:\n")
print(formatted_train[0]["text"][:500])

Exemplo texto formatado:

Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
DESCRIBE THIS PRODUCT

### Input:
Revolt!: How to Defeat Obama and Repeal His Socialist Programs

### Response:
Now that the Republicans have taken the House, How can they use their majority to reverse Obama's Socialist agenda?Revolt!lays out a game plan for success. Morris and McGann explain how to use the debt limit and 
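
As written, `formatting_prompts_func` ends each training text at the description, with no end-of-sequence marker. A common practice with this prompt style is to append the tokenizer's EOS token so the model can learn where a response stops. That is an assumption about good practice, not something this notebook does. The sketch below uses a literal `"<|end_of_text|>"` as a placeholder; in the notebook the value would come from `tokenizer.eos_token`, and `format_with_eos` is a hypothetical helper.

```python
# Abbreviated version of the Alpaca-style template, for illustration only.
PROMPT_TEMPLATE = (
    "### Instruction:\n{}\n\n"
    "### Input:\n{}\n\n"
    "### Response:\n{}"
)

def format_with_eos(instruction, title, description, eos_token):
    """Build one training text and terminate it with the EOS token."""
    return PROMPT_TEMPLATE.format(instruction, title, description) + eos_token

EOS_PLACEHOLDER = "<|end_of_text|>"  # assumption: stands in for tokenizer.eos_token
text = format_with_eos(
    "DESCRIBE THIS PRODUCT",
    "Girls Ballet Tutu Neon Pink",
    "High quality 3 layer ballet tutu. 12 inches in length",
    EOS_PLACEHOLDER,
)
print(text)
```

At inference time the third slot stays empty and no EOS is appended, so the model continues from `### Response:`. Without an EOS in the training texts, generation may simply run until `max_new_tokens` is exhausted.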


In [9]:
model_name = "unsloth/llama-3-8b-bnb-4bit"

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_name,
    max_seq_length = MAX_SEQ_LENGTH,
    dtype = DTYPE,
    load_in_4bit = LOAD_IN_4BIT,
)

print("Modelo base carregado.")

==((====))==  Unsloth 2025.9.9: Fast Llama patching. Transformers: 4.56.1.
   \\   /|    Tesla T4. Num GPUs = 1. Max memory: 14.741 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.8.0+cu126. CUDA: 7.5. CUDA Toolkit: 12.6. Triton: 3.4.0
\        /    Bfloat16 = FALSE. FA [Xformers = None. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth

Modelo base carregado.


In [10]:
def gerar(model, tokenizer, instruction, input_text, max_new_tokens=128):
    prompt = alpaca_prompt.format(instruction, input_text, "")
    inputs = tokenizer([prompt], return_tensors="pt").to(DEVICE)
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens, use_cache=True)
    decoded = tokenizer.batch_decode(output, skip_special_tokens=True)[0]
    # Keep only the part after "### Response:" (simple heuristic)
    if "### Response:" in decoded:
        decoded = decoded.split("### Response:")[1].strip()
    return decoded

SAMPLES_BASELINE = min(5, len(formatted_valid))
baseline_records = []
for i in range(SAMPLES_BASELINE):
    row = formatted_valid[i]
    instruction = row["instruction"]
    input_title = row["input"]
    ref_output = row["output"]
    gen = gerar(model, tokenizer, instruction, input_title)
    baseline_records.append({
        "idx": row["idx"],
        "title": input_title,
        "ref": ref_output,
        "gen_before": gen
    })

baseline_df = pd.DataFrame(baseline_records)
baseline_path = os.path.join(RESULTS_DIR, "baseline_samples.csv")
baseline_df.to_csv(baseline_path, index=False)
baseline_df

Unnamed: 0,idx,title,ref,gen_before
0,41225,The Case for Auschwitz: Evidence from the Irvi...,"""More people died in the back seat of Edward K...",The Case for Auschwitz: Evidence from the Irvi...
1,40658,The Deepest Sense: A Cultural History of Touch...,"""Classen is a veteran at telling sensory hist...",The Deepest Sense: A Cultural History of Touch...
2,38246,Believing and Seeing: The Art of Gothic Cathed...,"""Readers will be rewarded by Recht's brilliant...",Believing and Seeing: The Art of Gothic Cathed...
3,24492,Strange Angels,Formulaic plotting and cliched characters mar ...,This product is a book. The title is Strange A...
4,23766,"Boobs, Boys and High Heels","A self-described ""designer/actress/model/autho...",A product for the man who has everything.


In [11]:
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj","k_proj","v_proj","o_proj","gate_proj","up_proj","down_proj"],
    lora_alpha = 16,
    lora_dropout = 0,
    bias = "none",
    use_gradient_checkpointing = "unsloth",
    random_state = SEED,
    use_rslora = False,
    loftq_config = None,
)

print("Modelo adaptado para LoRA.")

Unsloth 2025.9.9 patched 32 layers with 32 QKV layers, 32 O layers and 32 MLP layers.


Modelo adaptado para LoRA.
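
The number of trainable parameters implied by this configuration can be checked by hand. Each adapted linear layer gains two matrices, A (r × in_features) and B (out_features × r), so it adds r × (in_features + out_features) weights. The sketch below applies that formula to the seven target modules. The layer dimensions are assumptions taken from the published Llama 3 8B configuration, not read from the loaded model.

```python
# Assumed Llama 3 8B dimensions (from the published model configuration).
HIDDEN = 4096
KV_DIM = 1024         # 8 key/value heads x 128 dims (grouped-query attention)
INTERMEDIATE = 14336
NUM_LAYERS = 32
RANK = 16             # matches r = 16 in get_peft_model

# (in_features, out_features) for each LoRA target module
shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}

def lora_params(in_features, out_features, rank):
    # A is (rank x in_features), B is (out_features x rank)
    return rank * (in_features + out_features)

per_layer = sum(lora_params(i, o, RANK) for i, o in shapes.values())
total = per_layer * NUM_LAYERS
print(f"{per_layer:,} per layer, {total:,} in total")
```

The total agrees with the 41,943,040 trainable parameters that the Unsloth banner reports when training starts. Because the count is linear in `r`, doubling the rank would double the adapter size.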


In [12]:
training_args = TrainingArguments(
    output_dir = "outputs",
    per_device_train_batch_size = BATCH_SIZE,
    gradient_accumulation_steps = GRAD_ACCUM,
    warmup_steps = WARMUP_STEPS,
    max_steps = MAX_STEPS,
    learning_rate = LR,
    fp16 = not is_bfloat16_supported(),
    bf16 = is_bfloat16_supported(),
    logging_steps = LOGGING_STEPS,
    optim = "adamw_8bit",
    weight_decay = 0.01,
    lr_scheduler_type = "linear",
    seed = SEED,
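    # eval_strategy = "no",  # set to "steps" to evaluate during training (older transformers releases call this evaluation_strategy)
    # report_to = "none",  # uncomment to skip the Weights & Biases login prompt
    save_strategy = "steps",
    save_steps = 200,  # exceeds MAX_STEPS (120), so the 200-step interval is never reached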
)

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = formatted_train,
    dataset_text_field = "text",
    max_seq_length = MAX_SEQ_LENGTH,
    dataset_num_proc = 1,
    packing = False,
    args = training_args,
)

print("Trainer configurado.")

Trainer configurado.
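
With `warmup_steps = 5`, `max_steps = 120`, and `lr_scheduler_type = "linear"`, the learning rate ramps up to `LR` over the first five steps and then decays linearly to zero. The function below is a plain-Python sketch of that shape, intended to mirror the selected scheduler type. The per-step values used by the trainer may differ slightly, and `linear_schedule_lr` is a hypothetical helper.

```python
def linear_schedule_lr(step, base_lr=2e-4, warmup_steps=5, total_steps=120):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

for step in (0, 5, 60, 120):
    print(step, f"{linear_schedule_lr(step):.2e}")
```

One consequence is that changing `MAX_STEPS` also changes the learning rate at every step after warmup, so runs with different step budgets are not directly comparable step for step.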


In [13]:
train_result = trainer.train()
trainer.model.save_pretrained("lora_adapters")
tokenizer.save_pretrained("lora_adapters")

with open(os.path.join(RESULTS_DIR, "training_log.txt"), "w") as f:
    f.write(str(train_result))

print("Treinamento concluído.")

==((====))==  Unsloth - 2x faster free finetuning | Num GPUs used = 1
   \\   /|    Num examples = 42,739 | Num Epochs = 1 | Total steps = 120
O^O/ \_/ \    Batch size per device = 2 | Gradient accumulation steps = 4
\        /    Data Parallel GPUs = 1 | Total batch size (2 x 4 x 1) = 8
 "-____-"     Trainable parameters = 41,943,040 of 8,072,204,288 (0.52% trained)

Unsloth: Will smartly offload gradients to save VRAM!


Step,Training Loss
5,2.6049
10,2.1621
15,2.0176
20,1.8868
25,1.9206
30,1.87
35,1.7809
40,1.8766
45,1.9232
50,1.8432


Treinamento concluído.
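
The figures in the training banner show how much of the data this run covers. The arithmetic below uses only values printed above (42,739 examples, batch size 2, gradient accumulation 4, 120 steps) and a single GPU.

```python
import math

num_examples = 42_739
per_device_batch = 2
grad_accum = 4
max_steps = 120

effective_batch = per_device_batch * grad_accum        # examples per optimizer step
examples_seen = effective_batch * max_steps            # examples consumed in this run
epoch_fraction = examples_seen / num_examples          # share of one full pass
steps_per_epoch = math.ceil(num_examples / effective_batch)

print(f"effective batch: {effective_batch}")
print(f"examples seen: {examples_seen} ({epoch_fraction:.1%} of one epoch)")
print(f"steps for one full epoch: {steps_per_epoch}")
```

The run therefore sees 960 examples, roughly 2% of the training set, and a full epoch would take 5343 optimizer steps. This supports the note in the introduction that increasing `max_steps` is a likely way to improve the results.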


In [23]:
FastLanguageModel.for_inference(model)
inputs = tokenizer(
[
    alpaca_prompt.format(
        "DESCRIBE THIS PRODUCT",
        "A Day in the Life of China", # input
        "",
    )
], return_tensors = "pt").to("cuda")

outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True)
tokenizer.batch_decode(outputs)

['<|begin_of_text|>Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.\n\n### Instruction:\nDESCRIBE THIS PRODUCT\n\n### Input:\nA Day in the Life of China\n\n### Response:\n"An impressive feat of reporting and a fascinating window into the daily lives of Chinese people." (San Francisco Chronicle)"A Day in the Life of China is a fine book, one that is highly recommended." (The New York Times Book Review)"A Day in the Life of China is a fascinating and vivid portrait of China']

In [14]:
FastLanguageModel.for_inference(model)

eval_records = []
for i in range(len(formatted_valid)):
    row = formatted_valid[i]
    instruction = row["instruction"]
    input_title = row["input"]
    ref_output = row["output"]
    gen_after = gerar(model, tokenizer, instruction, input_title)
    eval_records.append({
        "idx": row["idx"],
        "title": input_title,
        "ref": ref_output,
        "gen_after": gen_after
    })

eval_df = pd.DataFrame(eval_records)
eval_path = os.path.join(RESULTS_DIR, "validation_generation_after.csv")
eval_df.to_csv(eval_path, index=False)
eval_df.head()

Unnamed: 0,idx,title,ref,gen_after
0,41225,The Case for Auschwitz: Evidence from the Irvi...,"""More people died in the back seat of Edward K...","""The Case for Auschwitz"" is an important book...."
1,40658,The Deepest Sense: A Cultural History of Touch...,"""Classen is a veteran at telling sensory hist...","""Touch is one of the most important and least ..."
2,38246,Believing and Seeing: The Art of Gothic Cathed...,"""Readers will be rewarded by Recht's brilliant...","""An important contribution to the literature o..."
3,24492,Strange Angels,Formulaic plotting and cliched characters mar ...,“[Cassandra] Clare has the ability to create a...
4,23766,"Boobs, Boys and High Heels","A self-described ""designer/actress/model/autho...","""Boobs, Boys, and High Heels"" is the first boo..."


In [15]:
comparacao_df = pd.merge(baseline_df, eval_df[["idx","gen_after"]], on="idx", how="left")
comparacao_path = os.path.join(RESULTS_DIR, "comparacao_baseline_after.csv")
comparacao_df.to_csv(comparacao_path, index=False)
comparacao_df

Unnamed: 0,idx,title,ref,gen_before,gen_after
0,41225,The Case for Auschwitz: Evidence from the Irvi...,"""More people died in the back seat of Edward K...",The Case for Auschwitz: Evidence from the Irvi...,"""The Case for Auschwitz"" is an important book...."
1,40658,The Deepest Sense: A Cultural History of Touch...,"""Classen is a veteran at telling sensory hist...",The Deepest Sense: A Cultural History of Touch...,"""Touch is one of the most important and least ..."
2,38246,Believing and Seeing: The Art of Gothic Cathed...,"""Readers will be rewarded by Recht's brilliant...",Believing and Seeing: The Art of Gothic Cathed...,"""An important contribution to the literature o..."
3,24492,Strange Angels,Formulaic plotting and cliched characters mar ...,This product is a book. The title is Strange A...,“[Cassandra] Clare has the ability to create a...
4,23766,"Boobs, Boys and High Heels","A self-described ""designer/actress/model/autho...",A product for the man who has everything.,"""Boobs, Boys, and High Heels"" is the first boo..."
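
The outline lists simple metrics such as ROUGE-1 and token overlap, but no metric cell appears in the cells shown here. As a rough illustration, the sketch below computes a unigram-overlap F1 between a reference and a generated description. It is a simplified stand-in written for this example, not the ROUGE implementation from the `evaluate` library, and both `tokenize` and `unigram_f1` are hypothetical helpers.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def unigram_f1(reference, candidate):
    """F1 over clipped unigram matches between reference and candidate."""
    ref_counts = Counter(tokenize(reference))
    cand_counts = Counter(tokenize(candidate))
    overlap = sum((ref_counts & cand_counts).values())  # min count per shared token
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

score = unigram_f1("High quality ballet tutu", "Quality tutu for girls")
print(f"{score:.2f}")
```

Applied row by row to `comparacao_df`, once with `gen_before` and once with `gen_after` as the candidate, a score like this would give a first numeric view of the before and after difference. Lexical overlap rewards copying the title, so it should be read alongside the generated text itself.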


In [26]:
def generate_response(instruction, input_text):
    # Format the input using the alpaca prompt
    prompt = alpaca_prompt.format(instruction, input_text, "")

    # Tokenize the input
    inputs = tokenizer(
        [prompt],
        return_tensors="pt"
    ).to("cuda")

    # Generate the response
    outputs = model.generate(**inputs, max_new_tokens=128, use_cache=True)

    # Decode and return the response
    response = tokenizer.batch_decode(outputs)[0]

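    # Keep only the text after the response marker (simple heuristic).
    # str.find returns -1 when the marker is missing, so check before slicing.
    marker = "### Response:\n"
    marker_pos = response.find(marker)
    if marker_pos != -1:
        response = response[marker_pos + len(marker):]
    return response.replace(tokenizer.eos_token, "").strip()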

# Example usage:
instruction = "DESCRIBE THIS PRODUCT"
input_text = "Girls Ballet Tutu Neon Pink"

response = generate_response(instruction, input_text)
print(response)

Girls Ballet Tutu Neon Pink by Caramel is a pink tutu that is perfect for a special occasion, such as a dance recital. The tutu is made of a soft, stretchy material and is available in sizes 3T to 6X. It is machine washable. Caramel was founded in 1975 by two sisters, and has since grown into a global brand. The company's headquarters are in Los Angeles, CA, and its products are sold in over 100 countries. Caramel is committed to providing quality products that make girls feel special. The company's products are designed to be comfortable, stylish


In [27]:
# (Optional) Merge the adapters and save a model ready for standalone inference
# In some scenarios Unsloth already provides a utility for this, but here we keep only the adapters.
print("Adaptadores salvos em ./lora_adapters (modelo base + LoRA).")

Adaptadores salvos em ./lora_adapters (modelo base + LoRA).


## Final Summary

Generated artifacts:
- data_titles_contents_cleaned.jsonl (cleaned data)
- formatted_products_chat_data.json (instruction/input/output)
- baseline_samples.csv (samples before training)
- validation_generation_after.csv (post-training generations)
- comparacao_baseline_after.csv (side-by-side comparison)
- metrics.json (ROUGE, BLEU, overlap)
- lora_adapters/ (LoRA weights)
- batch_queries_results.json (example queries)
- outputs/ (Trainer logs)

Recommended next steps:
1. Increase the number of steps / epochs.
2. Filter out very short or very long descriptions.
3. Introduce smart truncation (tokenizer truncation).
4. Add semantic similarity metrics (BERTScore).
5. Publish the adapters on the Hugging Face Hub.
6. Integrate with an endpoint (FastAPI / Gradio).

End.