# Fine-Tuning DeepSeek Coder with LoRA and 8-bit Quantization

This document describes how to fine-tune a language model (`deepseek-coder-1.3b-base`) to generate SQL queries, using:

- **LoRA (Low-Rank Adaptation)** to reduce the number of trainable parameters.
- **8-bit quantization** through *BitsAndBytes* (`bnb_config`), to reduce the memory footprint and make training on a GPU feasible.

The steps are:
1. Connect to the database and retrieve its schema.
2. Load and prepare a *dataset*.
3. Define a *prompt* with clear instructions for the SQL generation task.
4. Train the model while running a hyperparameter search.
5. Run **two** *final fine-tunings* with the selected configurations.

In [2]:
# Database connection
from langchain_community.utilities import SQLDatabase

usuario = 'postgres'
password = 'place_rag_password'
host = 'localhost'     # or the IP/URL of your server
puerto = '5432'        # default PostgreSQL port
base_datos = 'place_rag_db'

# Build the connection URL
uri = f"postgresql+psycopg2://{usuario}:{password}@{host}:{puerto}/{base_datos}"

# Instantiate SQLDatabase from the URI
db = SQLDatabase.from_uri(uri)

import os
# Load the HF API key from the environment variables
api_key = os.environ.get("HF_API_KEY")
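
The connection cell hardcodes the database credentials. As a minimal sketch of an alternative, the URI could be assembled from environment variables, in the same way the HF key is read. The variable names (`PG_USER`, `PG_PASSWORD`, `PG_HOST`, `PG_PORT`, `PG_DATABASE`) and the helper `build_postgres_uri` are assumptions made for this example, not part of the original notebook.

```python
import os
from urllib.parse import quote_plus


def build_postgres_uri(env=os.environ):
    """Build a SQLAlchemy-style PostgreSQL URI from (hypothetical) environment variables."""
    usuario = env.get("PG_USER", "postgres")
    password = env.get("PG_PASSWORD", "")
    host = env.get("PG_HOST", "localhost")
    puerto = env.get("PG_PORT", "5432")
    base_datos = env.get("PG_DATABASE", "place_rag_db")
    # quote_plus escapes characters such as '@' or ':' that would otherwise break the URI
    return (
        f"postgresql+psycopg2://{quote_plus(usuario)}:{quote_plus(password)}"
        f"@{host}:{puerto}/{base_datos}"
    )


# A plain dict stands in for os.environ so the example does not depend on the machine
uri_ejemplo = build_postgres_uri({"PG_PASSWORD": "p@ss:word"})
print(uri_ejemplo)
```

The resulting string can be passed to `SQLDatabase.from_uri` exactly like the `uri` built in the cell above.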

## 2. Loading and preparing the *dataset*

A CSV file (`datasets/sampled_place_dataset_large.csv`) is loaded. It contains pairs of "Pregunta" (question) and "Consulta" (SQL query) that are used to train and evaluate the model. The unnamed index column is then renamed to `Indice`.

In [9]:
# Load the dataset
import pandas as pd

full_dataset = pd.read_csv("datasets/sampled_place_dataset_large.csv")
full_dataset = full_dataset.rename(columns={'Unnamed: 0': 'Indice'})

# Preview the first rows
full_dataset.head()

| Unnamed: 0 | Pregunta | Consulta | Tabla | Valores | Categoría |
|---|---|---|---|---|---|
| 0 | Quiero ver todas las licitaciones de Principad... | SELECT * FROM expedientes JOIN entidades ON ex... | expedientes | {'region': 'Principado de Asturias'} | region |
| 1 | ¿Qué expedientes están registrados en Teruel? | SELECT * FROM expedientes JOIN entidades ON ex... | expedientes | {'region': 'Teruel'} | region |
| 2 | Quiero ver todas las licitaciones de Girona. | SELECT * FROM expedientes JOIN entidades ON ex... | expedientes | {'region': 'Girona'} | region |
| 3 | Solicito información de licitaciones en Soria. | SELECT * FROM expedientes JOIN entidades ON ex... | expedientes | {'region': 'Soria'} | region |
| 4 | ¿Qué contratos existen ahora mismo para la reg... | SELECT * FROM expedientes JOIN entidades ON ex... | expedientes | {'region': 'Comunitat Valenciana'} | region |


## 3. Defining the *system prompt*

Next, we define the *prompt* that the model will receive. The goal is for the model to generate syntactically correct SQL that respects the table and column names available in the database schema.

A helper function, `extraer_query_sql`, is also added. When needed, it uses a regular expression to extract the final query (from `SELECT *` up to the first `;`) from a given text.

In [4]:
system_prompt = f"""
Dada una pregunta de entrada, crea una consulta de postgresql sintácticamente correcta.
Usa solo los nombres de las columnas que puedes ver en la descripción del esquema.
No consultes columnas que no existen.
Utiliza únicamente las siguientes tablas: 'entidades', 'expedientes', 'paises', 'regiones'
Esquema de la base de datos:
{db.table_info}
"""

import re

def extraer_query_sql(texto):
    patron = re.compile(
        r"SELECT \*(?:.|\n)*?;"
    )
    consulta = patron.findall(texto)
    return consulta
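
To illustrate how that regular expression behaves, here is a small sketch that applies the same pattern to a made-up model output. The text, table name, and column name are invented for the example. Note that `findall` returns a list, so a caller would typically take the first element.

```python
import re

# Same pattern as extraer_query_sql: lazily match from "SELECT *" to the first ";"
patron = re.compile(r"SELECT \*(?:.|\n)*?;")

# Hypothetical model output; table and column names are invented for the example
texto = "Respuesta: SELECT *\nFROM expedientes\nWHERE importe > 1000; texto sobrante; SELECT 1;"

consultas = patron.findall(texto)
# Take the first match, or None when the text contains no matching query
primera = consultas[0] if consultas else None
print(primera)
```

Because the pattern requires the literal `SELECT *`, the trailing `SELECT 1;` is not matched. This is consistent with the instruction in the prompt to always start the query with `SELECT *` and end it with `;`.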

### Creating the instruction column

The final *prompt* is built by concatenating the `system_prompt` with the question and the expected answer (the SQL query). Before that, the dataset is reordered so that the categories are interleaved.

In [12]:
def reordenar_dataframe_por_categoria(df, col_categoria='Categoría'):
    """
    Reorder the DataFrame 'df' into blocks so that each block of 35 rows
    contains exactly one row from each category.
    """
    categorias = df[col_categoria].unique()

    # Check that there are exactly 35 categories
    if len(categorias) != 35:
        raise ValueError(f"Se esperaban 35 categorías únicas, pero se encontraron {len(categorias)}.")

    # Group rows by category
    grupos_por_categoria = {
        cat: g.reset_index(drop=True)
        for cat, g in df.groupby(col_categoria)
    }
    # Check that every category has exactly 63 rows
    for cat, subdf in grupos_por_categoria.items():
        if len(subdf) != 63:
            raise ValueError(
                f"La categoría '{cat}' no tiene exactamente 63 filas. "
                f"Encontradas: {len(subdf)}."
            )
    # Build the new row order
    nuevo_orden = []
    for i in range(63):
        for cat in categorias:
            # Take row i of category cat
            fila = grupos_por_categoria[cat].iloc[i]
            # Append it to the list that will become the new DataFrame
            nuevo_orden.append(fila)

    # Turn the list of rows into a DataFrame and reset the index
    df_reordenado = pd.DataFrame(nuevo_orden).reset_index(drop=True)

    return df_reordenado
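
The reordering is a round-robin over categories. The following pure-Python sketch shows the same idea on toy data, without pandas. The helper `interleave_by_category` and the second category name (`fecha`) are hypothetical and serve only to illustrate the resulting order.

```python
from collections import defaultdict


def interleave_by_category(rows, key):
    """Round-robin reorder: each block of len(categories) rows holds one row per category."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    sizes = {len(g) for g in groups.values()}
    if len(sizes) != 1:
        raise ValueError("All categories must have the same number of rows.")
    per_category = sizes.pop()
    # Take the i-th row of every category before moving on to row i + 1
    return [groups[cat][i] for i in range(per_category) for cat in groups]


# Toy data: two categories with two rows each ("fecha" is an invented category)
rows = [
    {"Categoría": "region", "Pregunta": "r0"},
    {"Categoría": "region", "Pregunta": "r1"},
    {"Categoría": "fecha", "Pregunta": "f0"},
    {"Categoría": "fecha", "Pregunta": "f1"},
]
ordered = interleave_by_category(rows, "Categoría")
print([r["Pregunta"] for r in ordered])
```

With this ordering, any slice whose boundaries are multiples of the number of categories contains every category equally often. That property is what makes a positional train/evaluation split balanced.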

In [None]:
dataset_ordenado = reordenar_dataframe_por_categoria(full_dataset)

# .copy() avoids pandas' SettingWithCopyWarning when the new column is added below
train_df = dataset_ordenado.iloc[:1680].copy()      # 1680 rows (48 blocks of 35)
eval_df = dataset_ordenado.iloc[1680:2030].copy()   # 350 rows (10 blocks of 35)

train_df["Instrucciones"] = system_prompt + " Pregunta: " + train_df["Pregunta"] + " Comienza la query siempre por SELECT * y termínala siempre por ; Respuesta: " + train_df["Consulta"]
eval_df["Instrucciones"] = system_prompt + " Pregunta: " + eval_df["Pregunta"] + " Comienza la query siempre por SELECT * y termínala siempre por ; Respuesta: " + eval_df["Consulta"]

# Print one example of the instructions
print(eval_df["Instrucciones"].iloc[0])


Dada una pregunta de entrada, crea una consulta de postgresql sintácticamente correcta.
Usa solo los nombres de las columnas que puedes ver en la descripción del esquema.
No consultes columnas que no existen.
Utiliza únicamente las siguientes tablas: 'entidades', 'expedientes', 'paises', 'regiones'
Esquema de la base de datos:

CREATE TABLE documentos (
	document_reference_id VARCHAR, 
	document_uri VARCHAR NOT NULL, 
	document_type VARCHAR, 
	contract_id VARCHAR, 
	CONSTRAINT documentos_pkey PRIMARY KEY (document_uri), 
	CONSTRAINT documentos_contract_id_fkey FOREIGN KEY(contract_id) REFERENCES expedientes (contract_folder_id)
)

/*
3 rows from documentos table:
document_reference_id	document_uri	document_type	contract_id
PCAP 50 equipos trabajo en movilidad Anexo I acuerdo MP.pdf	https://contrataciondelestado.es/wps/wcm/connect/PLACE_es/Site/area/docAccCmpnt?srv=cmpnt&cmpntname=	Pliego Administrativo	2023/20
PPT Suministro 50 portatiles Anexo II acuerdo MP.pdf	https://contratacionde



## 4. Tokenizing the dataset

This section converts the text (the instructions) into sequences of *token* ids, using the *tokenizer* of the base model. The tokenization settings are:

- `max_length`: the maximum number of tokens per example.
- `truncation`: truncates examples that exceed the maximum.
- `return_tensors=None`: returns plain Python lists, which is the form `Dataset.map` stores.
- The labels (`labels`) are set to a copy of `input_ids`, since this is causal language model training.
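
The settings above can be sketched without downloading a model. In the example below, `toy_tokenize` is a stand-in for the real tokenizer (it simply maps each word to its length), and `pad_batch` imitates, in simplified form, what a collator such as `DataCollatorForSeq2Seq` does with `pad_to_multiple_of=8`: inputs are right-padded and labels are padded with `-100` so that padded positions are ignored by the loss. Both helper names are assumptions made for this illustration.

```python
def toy_tokenize(text, max_length):
    """Stand-in for a real tokenizer: one integer id per whitespace-separated word."""
    input_ids = [len(word) for word in text.split()][:max_length]  # truncation
    return {
        "input_ids": input_ids,
        "attention_mask": [1] * len(input_ids),
        "labels": input_ids.copy(),  # causal LM: the targets are the inputs themselves
    }


def pad_batch(examples, pad_id=0, label_pad_id=-100, multiple=8):
    """Right-pad every example to the batch maximum, rounded up to a multiple."""
    longest = max(len(e["input_ids"]) for e in examples)
    target = -(-longest // multiple) * multiple  # ceiling to the next multiple
    batch = {"input_ids": [], "attention_mask": [], "labels": []}
    for e in examples:
        gap = target - len(e["input_ids"])
        batch["input_ids"].append(e["input_ids"] + [pad_id] * gap)
        batch["attention_mask"].append(e["attention_mask"] + [0] * gap)
        batch["labels"].append(e["labels"] + [label_pad_id] * gap)
    return batch


examples = [
    toy_tokenize("SELECT * FROM expedientes ;", max_length=4),
    toy_tokenize("SELECT 1 ;", max_length=4),
]
batch = pad_batch(examples)
print(batch["labels"])
```

Copying `input_ids` into `labels` means the loss is computed over the whole sequence, prompt included. Masking the prompt tokens would be a different design choice, which the notebook does not make.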

In [20]:
from transformers import AutoTokenizer
from datasets import Dataset

train_ds = Dataset.from_pandas(train_df)
eval_ds = Dataset.from_pandas(eval_df)

tokenizer = AutoTokenizer.from_pretrained(
    "deepseek-ai/deepseek-coder-1.3b-base",
    trust_remote_code=True,
)
tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "right"

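def tokenize(data_point):
    # return_tensors=None yields plain Python lists, so there is nothing to move
    # to the GPU here; the Trainer moves each batch to the device later.
    result = tokenizer(
        data_point['Instrucciones'],
        truncation=True,
        max_length=2256,
        return_tensors=None,
    )
    result["labels"] = result["input_ids"].copy()
    return result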

train_tokenized = train_ds.map(tokenize)
eval_tokenized = eval_ds.map(tokenize)

Map: 100%|██████████| 1680/1680 [00:10<00:00, 157.64 examples/s]
Map: 100%|██████████| 350/350 [00:02<00:00, 159.79 examples/s]


## 5. Model definition and LoRA configuration

The function `model_init()` is defined to:
1. Load the base model `deepseek-ai/deepseek-coder-1.3b-base` with 8-bit quantization.
2. Disable `use_cache`.
3. Prepare the model for k-bit training (`prepare_model_for_kbit_training`).
4. Apply *LoRA* (PEFT) so that only a small subset of parameters is trained.

In [21]:
from transformers import (
    DataCollatorForSeq2Seq, 
    Trainer, 
    TrainingArguments,
    BitsAndBytesConfig,
    AutoModelForCausalLM
)
from peft import (
    get_peft_model,
    LoraConfig,
    TaskType,
    prepare_model_for_kbit_training
)

def model_init():
    model_name = "deepseek-ai/deepseek-coder-1.3b-base"
    bnb_config = BitsAndBytesConfig(load_in_8bit=True)

    # Load the base model with 8-bit quantization
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",
        trust_remote_code=True,
        quantization_config=bnb_config
    )

    model.config.use_cache = False

    # Prepare the model for k-bit training
    model = prepare_model_for_kbit_training(model)

    # Define the LoRA configuration
    peft_config = LoraConfig(
        task_type=TaskType.CAUSAL_LM,
        inference_mode=False,
        r=4,
        lora_alpha=64,
        lora_dropout=0.1,
        target_modules=["q_proj", "v_proj"]
    )

    # Apply LoRA to the model
    model = get_peft_model(model, peft_config)

    return model
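
To give a feel for why LoRA reduces the number of trainable parameters, here is a small arithmetic sketch. For one adapted weight matrix of shape `d_out x d_in`, LoRA trains two low-rank factors with `r * (d_in + d_out)` parameters in total, and the update is scaled by `lora_alpha / r`. The layer size used below (2048) is an assumption chosen for illustration, not a value read from the model configuration.

```python
def lora_extra_params(d_in, d_out, r):
    """Parameters added by one LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


r, lora_alpha = 4, 64      # values used in peft_config
scaling = lora_alpha / r   # the low-rank update is multiplied by alpha / r

# Hypothetical layer shape, chosen only for illustration
d_in = d_out = 2048
per_matrix = lora_extra_params(d_in, d_out, r)
full_matrix = d_in * d_out

print(f"scaling = {scaling}")
print(f"LoRA parameters per adapted matrix = {per_matrix}")
print(f"parameters in the full matrix      = {full_matrix}")
```

Under these assumed dimensions, each adapted projection trains a few thousand parameters instead of several million. On the real model, calling `model.print_trainable_parameters()` on the PEFT model reports the actual counts.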

## 6. Hyperparameter search (Grid / Random Search)

The following cell shows an example hyperparameter search that tries several combinations of:

- `learning_rates`
- `batch_sizes`
- `num_epochs`
- `gradient_steps`

The full grid is shuffled and only the first `max_trials` combinations are used. Each one is trained and evaluated in order to find the best configuration.

> *Note:* This section may take a long time to run, depending on the available compute.

In [None]:
import itertools
import random
from datetime import datetime
import torch

# Define the hyperparameter ranges
learning_rates = [1e-5, 5e-5, 1e-4, 5e-4, 1e-3]
batch_sizes = [2, 4, 8]
num_epochs = [3, 5, 7]
gradient_steps = [1, 2, 4]

# Build every combination of the grid and shuffle it
combinations = list(itertools.product(learning_rates, batch_sizes, num_epochs, gradient_steps))
random.shuffle(combinations)

# Limit the number of combinations to try
max_trials = 10
train_results = []
for i, (lr, bs, epochs, gs) in enumerate(combinations[:max_trials]):
    try:
        print(f"Prueba {i+1}: lr={lr}, batch_size={bs}, epochs={epochs}, grad_steps={gs}")
        training_arguments = TrainingArguments(
            run_name=f"""deepseek-coder-1.3b-base-{datetime.now().strftime("%Y-%m-%d-%H-%M-%S")}""",
            output_dir="logs",
            num_train_epochs=epochs,
            per_device_train_batch_size=bs,
            gradient_accumulation_steps=gs,
            optim="paged_adamw_32bit",
            save_steps=0,
            logging_steps=10,
            learning_rate=lr,
            fp16=True,
            bf16=False,
            group_by_length=True,
            logging_strategy="steps",
            evaluation_strategy='steps',
            eval_steps=10,
            save_strategy="no",
            gradient_checkpointing=False,
        )
        model = model_init()
        trainer = Trainer(
            model = model,
            args=training_arguments,
            train_dataset=train_tokenized,
            eval_dataset=eval_tokenized,
            data_collator=DataCollatorForSeq2Seq(
                tokenizer, pad_to_multiple_of=8, return_tensors="pt", padding=True
            )
        )
        trainer.train()
        eval_results = trainer.evaluate()
        train_results.append(eval_results)
        print(f"Resultados: {eval_results}")
        del model  # drop the model to free memory before the next trial
        torch.cuda.empty_cache()  # clear the CUDA cache for the same reason
    except Exception as e:
        # Report the failure instead of silently skipping the trial
        print(f"Prueba {i+1}: error -> {e!r}")
        continue

    # Results can be saved here and compared later

Prueba 1: lr=0.0005, batch_size=2, epochs=3, grad_steps=1

Step,Training Loss,Validation Loss
10,0.941,0.451502
20,0.2357,0.089987
30,0.0456,0.010881
40,0.0058,0.002092
50,0.0025,0.001431
60,0.0017,0.001058
70,0.0013,0.000958
80,0.0012,0.000799
90,0.0012,0.000645
100,0.0007,0.000533


Trainer.tokenizer is now deprecated. You should use Trainer.processing_class instead.

Resultados: {'eval_loss': 0.00013899854093324393, 'eval_runtime': 18.5299, 'eval_samples_per_second': 1.889, 'eval_steps_per_second': 0.27, 'epoch': 3.0}
Prueba 2: lr=1e-05, batch_size=4, epochs=3, grad_steps=4




Step,Training Loss,Validation Loss
10,1.2928,1.284956
20,1.2803,1.277062
30,1.2729,1.270911




Resultados: {'eval_loss': 1.2702876329421997, 'eval_runtime': 32.0598, 'eval_samples_per_second': 1.092, 'eval_steps_per_second': 0.156, 'epoch': 3.0}
Prueba 3: lr=1e-05, batch_size=2, epochs=3, grad_steps=2


Step,Training Loss,Validation Loss
10,1.2928,1.283032
20,1.2769,1.268428
30,1.2626,1.257502
40,1.2488,1.241922
50,1.2354,1.227177
60,1.2224,1.218079
70,1.2107,1.205883
80,1.1993,1.195691
90,1.1894,1.184501
100,1.1806,1.17775



Resultados: {'eval_loss': 1.165602684020996, 'eval_runtime': 17.3807, 'eval_samples_per_second': 2.014, 'eval_steps_per_second': 0.288, 'epoch': 3.0}
Prueba 4: lr=0.001, batch_size=8, epochs=7, grad_steps=4


Step,Training Loss,Validation Loss
10,0.75,0.206476
20,0.0967,0.015538
30,0.0062,0.001859




Resultados: {'eval_loss': 0.0016281469725072384, 'eval_runtime': 20.9302, 'eval_samples_per_second': 1.672, 'eval_steps_per_second': 0.239, 'epoch': 5.909090909090909}
Prueba 5: lr=0.0001, batch_size=8, epochs=3, grad_steps=4


Step,Training Loss,Validation Loss
10,1.2419,1.188464




Resultados: {'eval_loss': 1.168552279472351, 'eval_runtime': 18.5396, 'eval_samples_per_second': 1.888, 'eval_steps_per_second': 0.27, 'epoch': 2.5454545454545454}
Prueba 6: lr=5e-05, batch_size=4, epochs=3, grad_steps=1


Step,Training Loss,Validation Loss
10,1.2652,1.224209
20,1.1854,1.136633
30,1.089,1.029133
40,0.9696,0.893556
50,0.8179,0.722118
60,0.6454,0.552513
70,0.4867,0.408044
80,0.3573,0.297006
90,0.2633,0.221951
100,0.1974,0.168752



Resultados: {'eval_loss': 0.10392394661903381, 'eval_runtime': 27.3433, 'eval_samples_per_second': 1.28, 'eval_steps_per_second': 0.183, 'epoch': 3.0}
Prueba 7: lr=0.001, batch_size=4, epochs=3, grad_steps=2


Step,Training Loss,Validation Loss
10,0.7501,0.203085
20,0.0956,0.015497
30,0.0057,0.001592
40,0.0014,0.001001
50,0.001,0.000803
60,0.0008,0.000732




Resultados: {'eval_loss': 0.000716207199729979, 'eval_runtime': 33.0473, 'eval_samples_per_second': 1.059, 'eval_steps_per_second': 0.151, 'epoch': 3.0}
Prueba 8: lr=1e-05, batch_size=2, epochs=5, grad_steps=2


Step,Training Loss,Validation Loss
10,1.2924,1.284037
20,1.2766,1.271344
30,1.2612,1.252926
40,1.2454,1.23863
50,1.2294,1.222125
60,1.2133,1.203915
70,1.1965,1.186942
80,1.1788,1.169181
90,1.1612,1.152619
100,1.1432,1.133582



Resultados: {'eval_loss': 0.9971861839294434, 'eval_runtime': 17.6145, 'eval_samples_per_second': 1.987, 'eval_steps_per_second': 0.284, 'epoch': 5.0}
Prueba 9: lr=1e-05, batch_size=8, epochs=5, grad_steps=1


Step,Training Loss,Validation Loss
10,1.2927,1.285448
20,1.2768,1.268563
30,1.263,1.255463
40,1.2499,1.24359
50,1.2376,1.232252
60,1.2264,1.221224
70,1.2164,1.212377
80,1.208,1.20466
90,1.2014,1.198533
100,1.1968,1.194547



Resultados: {'eval_loss': 1.194079875946045, 'eval_runtime': 20.8887, 'eval_samples_per_second': 1.676, 'eval_steps_per_second': 0.239, 'epoch': 5.0}
Prueba 10: lr=1e-05, batch_size=8, epochs=3, grad_steps=2


Step,Training Loss,Validation Loss
10,1.2934,1.287669
20,1.2816,1.278562
30,1.2745,1.27299




Resultados: {'eval_loss': 1.2739086151123047, 'eval_runtime': 652.8946, 'eval_samples_per_second': 0.054, 'eval_steps_per_second': 0.008, 'epoch': 3.0}
Prueba 11: lr=0.001, batch_size=8, epochs=3, grad_steps=2
Prueba 12: lr=0.0001, batch_size=8, epochs=3, grad_steps=1
Prueba 13: lr=1e-05, batch_size=8, epochs=7, grad_steps=1
Prueba 14: lr=1e-05, batch_size=4, epochs=7, grad_steps=2
Prueba 15: lr=0.0001, batch_size=8, epochs=7, grad_steps=4
Prueba 16: lr=0.0005, batch_size=2, epochs=5, grad_steps=1
Prueba 17: lr=1e-05, batch_size=2, epochs=3, grad_steps=4
Prueba 18: lr=0.0001, batch_size=8, epochs=7, grad_steps=2
Prueba 19: lr=0.0005, batch_size=2, epochs=7, grad_steps=4
Prueba 20: lr=0.0001, batch_size=2, epochs=7, grad_steps=2
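
As a sketch of how the results could be compared afterwards, the final `eval_loss` of each completed trial can be paired with its hyperparameters and sorted. The values below are copied from the log above. Trials 11 to 20 printed no results, so they are left out. The dictionary layout is an assumption made for this example, since the notebook itself only appends the raw `eval_results`.

```python
# (lr, batch_size, epochs, grad_steps) -> final eval_loss, copied from the trial log
trials = {
    (5e-4, 2, 3, 1): 0.00013899854093324393,
    (1e-5, 4, 3, 4): 1.2702876329421997,
    (1e-5, 2, 3, 2): 1.165602684020996,
    (1e-3, 8, 7, 4): 0.0016281469725072384,
    (1e-4, 8, 3, 4): 1.168552279472351,
    (5e-5, 4, 3, 1): 0.10392394661903381,
    (1e-3, 4, 3, 2): 0.000716207199729979,
    (1e-5, 2, 5, 2): 0.9971861839294434,
    (1e-5, 8, 5, 1): 1.194079875946045,
    (1e-5, 8, 3, 2): 1.2739086151123047,
}

# Sort from lowest to highest validation loss
ranking = sorted(trials.items(), key=lambda item: item[1])
best_config, best_loss = ranking[0]

for (lr, bs, epochs, gs), loss in ranking[:3]:
    print(f"lr={lr}, batch_size={bs}, epochs={epochs}, grad_steps={gs} -> eval_loss={loss:.6f}")
```

In this log, the lowest validation loss belongs to `lr=0.0005`, `batch_size=2`, `epochs=3`, `grad_steps=1`, and the runs with `lr=1e-05` barely move away from their initial loss.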


## 7. Final *Fine-Tuning*

The previous section tried different hyperparameters while tracking the training and validation *loss*. Here we run the **final training** with a specific configuration.

### 7.1 Fine-Tuning 1

This run uses the configuration of trial 1 ("Prueba 1"), which reached the lowest validation loss in the search above:

- `learning_rate = 0.0005`
- `batch_size = 2`
- `num_epochs = 3`
- `gradient_accumulation_steps = 1`

In [28]:
from datetime import datetime
from transformers import TrainingArguments, TrainerCallback, Trainer
from transformers import DataCollatorForSeq2Seq

# Callback that stops training once eval_loss drops below a threshold (0.001 by default)
class EarlyStoppingBelowThresholdCallback(TrainerCallback):
    def __init__(self, threshold=0.001):
        self.threshold = threshold

    def on_evaluate(self, args, state, control, **kwargs):
        eval_loss = kwargs["metrics"].get("eval_loss")
        if eval_loss is not None and eval_loss < self.threshold:
            print(f"Deteniendo entrenamiento: eval_loss < {self.threshold}")
            control.should_training_stop = True

learning_rate_1 = 0.0005
batch_size_1 = 2
num_epochs_1 = 3
gradient_steps_1 = 1

# Configure the training arguments
training_arguments_1 = TrainingArguments(
    run_name=f"deepseek-coder-1.3b-base-{datetime.now().strftime('%Y-%m-%d-%H-%M-%S')}",
    output_dir="logs",
    num_train_epochs=num_epochs_1,
    per_device_train_batch_size=batch_size_1,
    gradient_accumulation_steps=gradient_steps_1,
    optim="paged_adamw_32bit",
    save_steps=0,
    logging_steps=10,
    learning_rate=learning_rate_1,
    fp16=True,
    bf16=False,
    group_by_length=True,
    logging_strategy="steps",
    evaluation_strategy='steps',
    eval_steps=10,
    save_strategy="no",
    gradient_checkpointing=False,
)

# Initialize the model with LoRA
model_1 = model_init()

# Define the Trainer
trainer_1 = Trainer(
    model = model_1,
    args=training_arguments_1,
    train_dataset=train_tokenized,
    eval_dataset=eval_tokenized,
    data_collator=DataCollatorForSeq2Seq(
        tokenizer, pad_to_multiple_of=8, return_tensors="pt", padding=True
    ),
    # Add the early-stopping callback
    callbacks=[EarlyStoppingBelowThresholdCallback(threshold=0.001)],
)

print("\n--- Entrenando con Fine-Tuning 1 ---\n")
# Train
trainer_1.train()

# Evaluate
eval_results_1 = trainer_1.evaluate()
print(f"Resultados Fine-Tuning 1: {eval_results_1}")

# Save the model and the tokenizer
model_1_path = f"models/deepseek-coder-ft1-{datetime.now().strftime('%Y-%m-%d-%H-%M-%S')}"
trainer_1.save_model(model_1_path)
tokenizer_1_path = f"models/tokenizer-deepseek-coder-ft1-{datetime.now().strftime('%Y-%m-%d-%H-%M-%S')}"
tokenizer.save_pretrained(tokenizer_1_path)

print(f"\nModelo Fine-Tuning 1 guardado en: {model_1_path}")
print(f"Tokenizer guardado en: {tokenizer_1_path}")




--- Entrenando con Fine-Tuning 1 ---





Step,Training Loss,Validation Loss
10,0.9413,0.437795
20,0.218,0.064055
30,0.0282,0.004631
40,0.005,0.003525
50,0.005,0.001576
60,0.0018,0.001036
70,0.0012,0.002234
80,0.0015,0.000788




Deteniendo entrenamiento: eval_loss < 0.001




Deteniendo entrenamiento: eval_loss < 0.001
Resultados Fine-Tuning 1: {'eval_loss': 0.0007878464530222118, 'eval_runtime': 1100.6203, 'eval_samples_per_second': 0.318, 'eval_steps_per_second': 0.04, 'epoch': 0.09523809523809523}

Modelo Fine-Tuning 1 guardado en: models/deepseek-coder-ft1-2025-02-02-20-12-35
Tokenizer guardado en: models/tokenizer-deepseek-coder-ft1-2025-02-02-20-12-36
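
The `epoch` value in the results can be cross-checked with a short calculation. The sketch below assumes a single GPU (`n_devices = 1`), which is not stated in the notebook. Under that assumption, the step at which the callback fired corresponds to the reported fraction of an epoch.

```python
import math

train_examples = 1680   # size of train_df
batch_size = 2          # per_device_train_batch_size
grad_accum = 1          # gradient_accumulation_steps
n_devices = 1           # assumption: a single GPU

# Optimizer steps needed to see the whole training set once
steps_per_epoch = math.ceil(train_examples / (batch_size * grad_accum * n_devices))

stopped_at_step = 80    # last row of the loss table before the callback fired
epoch_fraction = stopped_at_step / steps_per_epoch

print(f"steps per epoch = {steps_per_epoch}")
print(f"epoch at stop   = {epoch_fraction:.6f}")
```

The result agrees with the logged `'epoch': 0.0952...`, which means the threshold callback ended training after roughly a tenth of the first epoch rather than after the three configured epochs.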
