# Movie genre classifier


This notebook develops a multi-label classifier for film genres from a dataset with the variables year, plot, and genres. The workflow covers preparing, cleaning, and normalizing the text, followed by vectorization with NLP techniques. Pretrained models from the Hugging Face ecosystem, based on transformer architectures, are then combined with a neural-network classification head. The goal is to train and evaluate a model that can assign multiple genres to each synopsis, maximizing performance metrics such as the F1-score in a supervised learning setting.

## Loading libraries

In [1]:
import warnings
warnings.filterwarnings('ignore')

In [2]:
# Library imports
import ast
import os

import numpy as np
import pandas as pd
import tensorflow as tf
from nltk.stem import WordNetLemmatizer
from nltk.stem.snowball import SnowballStemmer
from scipy.spatial.distance import cosine
from sklearn.ensemble import RandomForestClassifier, RandomForestRegressor
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.metrics import accuracy_score, r2_score, roc_auc_score
from sklearn.model_selection import GridSearchCV, cross_val_score, train_test_split
from sklearn.multiclass import OneVsRestClassifier
from sklearn.preprocessing import MultiLabelBinarizer

## Loading the dataset

In [3]:
# Load the data from the zipped .csv files
dataTraining = pd.read_csv('https://github.com/albahnsen/MIAD_ML_and_NLP/raw/main/datasets/dataTraining.zip', encoding='UTF-8', index_col=0)
dataTesting = pd.read_csv('https://github.com/albahnsen/MIAD_ML_and_NLP/raw/main/datasets/dataTesting.zip', encoding='UTF-8', index_col=0)

In [4]:
# Preview the training data
dataTraining.head()

Unnamed: 0,year,title,plot,genres,rating
3107,2003,Most,most is the story of a single father who takes...,"['Short', 'Drama']",8.0
900,2008,How to Be a Serial Killer,a serial killer decides to teach the secrets o...,"['Comedy', 'Crime', 'Horror']",5.6
6724,1941,A Woman's Face,"in sweden , a female blackmailer with a disfi...","['Drama', 'Film-Noir', 'Thriller']",7.2
4704,1954,Executive Suite,"in a friday afternoon in new york , the presi...",['Drama'],7.4
2582,1990,Narrow Margin,"in los angeles , the editor of a publishing h...","['Action', 'Crime', 'Thriller']",6.6


In [5]:
dataTesting.head()

Unnamed: 0,year,title,plot
1,1999,Message in a Bottle,"who meets by fate , shall be sealed by fate ...."
4,1978,Midnight Express,"the true story of billy hayes , an american c..."
5,1996,Primal Fear,martin vail left the chicago da ' s office to ...
6,1950,Crisis,husband and wife americans dr . eugene and mr...
7,1959,The Tingler,the coroner and scientist dr . warren chapin ...


## Applying NLP techniques

In [13]:
import nltk

# Download the stop-word list and the WordNet data used by the lemmatizer
nltk.download('stopwords')
nltk.download('wordnet')

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to /root/nltk_data...


True

In [6]:
lemmatize = WordNetLemmatizer()

In [10]:
import nltk
from nltk.tokenize import TreebankWordTokenizer
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

# Instantiate the tools
tokenizer = TreebankWordTokenizer()
lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words('english'))

# Cleaning and lemmatization: lowercase, keep alphabetic non-stop-word tokens, lemmatize as verbs
def lemmatize_and_clean(text):
    tokens = tokenizer.tokenize(text.lower())
    return [lemmatizer.lemmatize(token, pos='v') for token in tokens if token.isalpha() and token not in stop_words]


In [11]:
print(stop_words)

{'as', "he'd", 'what', "hasn't", "he'll", "haven't", 'against', 'needn', "it'll", 'themselves', 'ours', 've', 'from', 'ourselves', 'by', 'not', 'only', 'hasn', "i'll", 'myself', "it'd", "it's", 'doesn', 's', 'you', "hadn't", 'than', 'those', "you'd", 'yourselves', "we've", 'now', 'before', 'once', 'shan', 'but', "won't", 'ain', 'yourself', 'doing', "aren't", "needn't", 'so', 'same', 't', 'when', 'how', 'himself', "they'd", 'your', 'out', 'its', 'am', 'haven', "you're", "they'll", "we'll", 'should', 'on', "didn't", 'couldn', 'or', 'our', 'has', "mustn't", 'have', 'yours', 'through', 'just', 'can', 'was', "isn't", 'won', 'into', "that'll", 'below', 'this', 'weren', 'him', 'had', 'with', 'shouldn', 'wasn', 'if', 'are', 'again', 'll', 'she', 'i', 'don', 'why', 'a', 'mightn', 'an', 'and', 'most', 'does', 'having', 'herself', 'more', 'own', 'such', 'while', 'over', "don't", 'any', 'each', 'few', 'the', 'who', "i've", 'until', 'isn', "she's", 'under', "they've", 'is', 'because', 'my', "we're"

In [14]:
vect = CountVectorizer(analyzer=lemmatize_and_clean)
X = vect.fit_transform(dataTraining['plot'])
X.shape

(7895, 32411)

The matrix has 7,895 rows, one per synopsis in dataTraining['plot'].

The final vocabulary contains 32,411 unique terms, which become the dimensions of the vector space.

Each synopsis is represented as a sparse vector of length 32,411, where each component holds the frequency of the corresponding term.
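
To make this representation concrete, the following pure-Python sketch builds a small document-term count matrix. It only illustrates the idea and is not scikit-learn's implementation; the two toy synopses and the helper name `count_matrix` are invented for the example.

```python
from collections import Counter

def count_matrix(token_lists):
    """Build a sorted vocabulary and one dense count vector per document."""
    vocabulary = sorted({tok for tokens in token_lists for tok in tokens})
    column = {term: j for j, term in enumerate(vocabulary)}
    rows = []
    for tokens in token_lists:
        row = [0] * len(vocabulary)
        for term, n in Counter(tokens).items():
            row[column[term]] = n
        rows.append(row)
    return vocabulary, rows

# Two toy, already tokenized synopses (hypothetical data)
docs = [["killer", "teach", "killer"], ["editor", "teach"]]
vocabulary, matrix = count_matrix(docs)
print(vocabulary)  # ['editor', 'killer', 'teach']
print(matrix)      # [[0, 2, 1], [1, 0, 1]]
```

With the real data the same structure has 7,895 rows and 32,411 columns, which is why it is stored as a sparse matrix.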

In [15]:
dataTraining['genres'] = dataTraining['genres'].map(lambda x: ast.literal_eval(x) if isinstance(x, str) else x)
le = MultiLabelBinarizer()
y = le.fit_transform(dataTraining['genres'])
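
As an illustration of what the label encoding produces, here is a minimal pure-Python sketch of multi-hot encoding. It mirrors the idea behind `MultiLabelBinarizer` (sorted classes, one binary column per genre) but is not its implementation; `multi_hot` is a hypothetical helper and the three genre strings are taken from the preview of the training data.

```python
import ast

def multi_hot(label_lists):
    """Return the sorted class list and one 0/1 row per sample."""
    classes = sorted({g for labels in label_lists for g in labels})
    matrix = [[1 if c in labels else 0 for c in classes] for labels in label_lists]
    return classes, matrix

# Genres arrive as strings, so they are parsed first with ast.literal_eval
raw = ["['Short', 'Drama']", "['Comedy', 'Crime', 'Horror']", "['Drama']"]
parsed = [ast.literal_eval(s) for s in raw]

classes, y_toy = multi_hot(parsed)
print(classes)  # ['Comedy', 'Crime', 'Drama', 'Horror', 'Short']
print(y_toy)    # [[0, 0, 1, 0, 1], [1, 1, 0, 1, 0], [0, 0, 1, 0, 0]]
```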

In [16]:
# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

In [19]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense, Dropout, LeakyReLU
from tensorflow.keras.optimizers import Adam
from tensorflow.keras import backend as K
from tensorflow.keras.metrics import AUC

# Convert to a dense matrix for Keras
# Note: X and y are the full matrices here, so the rows of X_test are also seen during training
X_train = X.toarray()
y_train = y
# 0. Clear the session
K.clear_session()

# Parameters
vocab_size = 10000
max_len = 200
embedding_dim = 128
num_classes = y_train.shape[1]
learning_rate = 0.001
optimizer     = Adam(learning_rate=learning_rate)
epochs        = 10
batch_size    = 32
l2_lambda     = 1e-4

# Model
model = Sequential()
model.add(Embedding(input_dim=vocab_size, output_dim=embedding_dim, input_length=max_len))
model.add(LSTM(64, return_sequences=False))
model.add(Dropout(0.5))
model.add(Dense(64))
model.add(LeakyReLU(alpha=0.01))
model.add(Dense(num_classes, activation='sigmoid'))

model.compile(
    loss='binary_crossentropy',
    optimizer=Adam(learning_rate=0.001),
    metrics=[AUC(name='auc')]
)

model.build(input_shape=(None, max_len))
model.summary()
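
A Keras `Embedding` layer looks up one vector per integer token id, so it is typically fed fixed-length sequences of ids rather than term counts. The sketch below shows one way a tokenized synopsis could be turned into such a sequence. It is plain Python for illustration only: the helper names, the reserved ids (0 for padding, 1 for unknown tokens), and the toy data are assumptions and are not part of the notebook.

```python
from collections import Counter

PAD_ID, UNK_ID = 0, 1  # assumed conventions for this sketch

def build_index(token_lists, vocab_size):
    """Map the most frequent tokens to ids 2..vocab_size-1."""
    freq = Counter(tok for tokens in token_lists for tok in tokens)
    # Most frequent first; ties broken alphabetically so the result is deterministic
    ordered = sorted(freq, key=lambda t: (-freq[t], t))
    return {tok: i + 2 for i, tok in enumerate(ordered[: vocab_size - 2])}

def encode(tokens, index, max_len):
    """Replace tokens by ids, truncate to max_len, and right-pad with PAD_ID."""
    ids = [index.get(tok, UNK_ID) for tok in tokens][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))

docs = [["killer", "teach", "secret"], ["editor", "teach"]]
index = build_index(docs, vocab_size=5)
encoded = [encode(d, index, max_len=4) for d in docs]
print(encoded)  # [[4, 2, 1, 0], [3, 2, 0, 0]]
```

Sequences built this way would have shape `(n_samples, max_len)` with ids below `vocab_size`, which matches the `input_dim` and `max_len` values declared for the model.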


In [21]:
# Training with EarlyStopping
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.metrics import roc_auc_score

history = model.fit(
    X_train, y_train,
    validation_split=0.2,
    epochs=epochs,
    batch_size=batch_size,
    callbacks=[
        EarlyStopping(
            monitor='val_loss',
            patience=5,
            restore_best_weights=True
        )
    ],
    verbose=1
)

# Final evaluation: predicted probabilities
# Convert X_test to a dense array to avoid the SparseToDense error
y_pred_proba = model.predict(X_test.toarray())

# ROC AUC (macro and micro)
roc_auc_macro = roc_auc_score(y_test, y_pred_proba, average='macro')
roc_auc_micro = roc_auc_score(y_test, y_pred_proba, average='micro')

print(f'ROC AUC (macro): {roc_auc_macro:.4f}')
print(f'ROC AUC (micro): {roc_auc_micro:.4f}')

Epoch 1/10
198/198 ━━━━━━━━━━━━━━━━━━━━ 183s 923ms/step - auc: 0.7879 - loss: 0.2952 - val_auc: 0.7875 - val_loss: 0.2955
Epoch 2/10
198/198 ━━━━━━━━━━━━━━━━━━━━ 178s 899ms/step - auc: 0.7898 - loss: 0.2933 - val_auc: 0.7873 - val_loss: 0.2953
Epoch 3/10
198/198 ━━━━━━━━━━━━━━━━━━━━ 178s 898ms/step - auc: 0.7899 - loss: 0.2944 - val_auc: 0.7879 - val_loss: 0.2948
Epoch 4/10
198/198 ━━━━━━━━━━━━━━━━━━━━ 178s 898ms/step - auc: 0.7888 - loss: 0.2957 - val_auc: 0.7890 - val_loss: 0.2947
Epoch 5/10
198/198 ━━━━━━━━━━━━━━━━━━━━ 177s 897ms/step - auc: 0.7910 - loss: 0.2939 - val_auc: 0.7887 - val_loss: 0.2951
Epoch 6/10
198/198 ━━━━━━━━━━━━━━━━━━━━ 177s 896ms/step - auc: 0.7887 - loss: 0.2945 - val_auc: 0.7888 - val_loss: 0.2945
Epoch 7/10
198/198 ━━━━━━━━━━━━━━━━━━━━

Pretrained models are advantageous because they reuse linguistic representations learned from large corpora, which typically yields better performance than traditional approaches trained from scratch, especially when labeled data are limited. For that reason, the rest of the notebook uses the Hugging Face ecosystem.

## NEURAL NETWORKS + TRANSFORMER

In [None]:
import numpy as np
import pandas as pd
import torch
from datasets import Dataset
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from nltk.tokenize import TreebankWordTokenizer
from sklearn.metrics import log_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MultiLabelBinarizer
from transformers import BertForSequenceClassification, BertTokenizerFast, Trainer, TrainingArguments


In [22]:


# ---------------------
# 1. Cleaning and lemmatization for another model
# ---------------------
tokenizer_nltk = TreebankWordTokenizer()
lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words('english'))

def lemmatize_and_clean(text):
    tokens = tokenizer_nltk.tokenize(text.lower())
    return " ".join([lemmatizer.lemmatize(token, pos='v') for token in tokens if token.isalpha() and token not in stop_words])

# Optional: clean the text before BERT (not required)
dataTraining['plot_clean'] = dataTraining['plot'].apply(lemmatize_and_clean)



In [None]:
# ---------------------
# 2. Label encoding (genres)
# ---------------------
import ast  # literal_eval is a safer alternative to eval()
import numpy as np

dataTraining['genres'] = dataTraining['genres'].map(
    lambda x: ast.literal_eval(x) if isinstance(x, str) else x
)

mlb = MultiLabelBinarizer()
# Store the multi-hot labels as float32 lists, as required by the multi-label loss
dataTraining['label'] = mlb.fit_transform(dataTraining['genres']).astype(np.float32).tolist()
dataTraining = dataTraining.drop_duplicates(subset='plot').reset_index(drop=True)

# ---------------------
# 3. Train/test split
# ---------------------
X_train, X_test, y_train, y_test = train_test_split(
    dataTraining['plot'],  # raw text; use 'plot_clean' for the lemmatized version
    dataTraining['label'],
    test_size=0.2,
    random_state=42
)


In [None]:
train_dataset = Dataset.from_dict({'text': list(X_train), 'labels': list(y_train)})
test_dataset  = Dataset.from_dict({'text': list(X_test),  'labels': list(y_test)})

In [None]:
# ---------------------
# 4. BERT tokenization
# ---------------------
bert_tokenizer = BertTokenizerFast.from_pretrained('bert-base-uncased')

def tokenize_function(examples):
    return bert_tokenizer(examples["text"], padding="max_length", truncation=True, max_length=256)

tokenized_train = train_dataset.map(tokenize_function, batched=True)
tokenized_test  = test_dataset.map(tokenize_function, batched=True)

Map:   0%|          | 0/6315 [00:00<?, ? examples/s]

Map:   0%|          | 0/1579 [00:00<?, ? examples/s]
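
The `padding="max_length"` and `truncation=True` options give every example the same length. The sketch below shows the idea in plain Python under simplifying assumptions: it works on made-up token ids, ignores the special tokens that the real tokenizer keeps at both ends, and `pad_or_truncate` is a hypothetical helper rather than part of the `transformers` API.

```python
def pad_or_truncate(token_ids, max_length, pad_id=0):
    """Cut or right-pad a list of ids and build the matching attention mask."""
    ids = token_ids[:max_length]
    mask = [1] * len(ids)          # 1 marks real tokens
    n_pad = max_length - len(ids)
    return ids + [pad_id] * n_pad, mask + [0] * n_pad  # 0 marks padding

short_ids, short_mask = pad_or_truncate([5, 6, 7], max_length=5)
long_ids, long_mask = pad_or_truncate([5, 6, 7, 8, 9, 10], max_length=4)
print(short_ids, short_mask)  # [5, 6, 7, 0, 0] [1, 1, 1, 0, 0]
print(long_ids, long_mask)    # [5, 6, 7, 8] [1, 1, 1, 1]
```

The attention mask is what lets the model ignore the padded positions.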

In [None]:
# ---------------------
# 5. Multi-label BERT model
# ---------------------
num_labels = len(mlb.classes_)
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=num_labels, problem_type="multi_label_classification")





Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


In [None]:
# ---------------------
# 6. Custom metric
# ---------------------
def compute_metrics(pred):
    logits, labels = pred
    # Multi-label outputs need a sigmoid applied to the logits
    probs = 1 / (1 + np.exp(-logits))

    try:
        auc = roc_auc_score(labels, probs, average="macro")
    except ValueError:
        auc = float('nan')  # roc_auc_score raises when a label has only one class present

    loss = log_loss(labels, probs)

    return {
        'roc_auc_macro': auc,
        'log_loss': loss
    }
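
To show what the macro ROC AUC measures, here is a small pure-Python sketch. It applies a sigmoid to each logit, computes the AUC of each label as the fraction of positive/negative pairs ranked correctly, and averages over labels. It is an illustration with invented data, not scikit-learn's implementation, and the helper names are hypothetical.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def binary_auc(y_true, scores):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined when only one class is present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_auc(labels, logits):
    """Unweighted mean of the per-label AUC values."""
    n_labels = len(labels[0])
    per_label = []
    for j in range(n_labels):
        y_j = [row[j] for row in labels]
        p_j = [sigmoid(row[j]) for row in logits]
        per_label.append(binary_auc(y_j, p_j))
    return sum(per_label) / n_labels

# Four samples, two labels (toy data)
labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
logits = [[2.0, -1.0], [-0.5, 0.3], [1.5, -2.0], [0.0, 1.0]]
print(macro_auc(labels, logits))  # 0.625 (label 0 scores 1.0, label 1 scores 0.25)
```

Because the macro average weights every genre equally, rare genres influence the score as much as frequent ones.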


In [None]:
# ---------------------
# 7. Training configuration
# ---------------------
training_args = TrainingArguments(
    output_dir="./bert_genre_model_optuna_v2",
    learning_rate=3.1449904984156505e-05,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=20,
    weight_decay=0.02345047974845148,
    warmup_steps=564,
    eval_strategy="epoch",
    save_strategy="epoch",
    logging_dir='./logs',
    logging_steps=10,
    load_best_model_at_end=True,
    metric_for_best_model="roc_auc_macro"
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_test,
    tokenizer=bert_tokenizer,
    compute_metrics=compute_metrics
)
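
The `learning_rate` and `warmup_steps` arguments interact through the learning-rate schedule. By default the `Trainer` uses a linear schedule with warmup: the rate rises linearly from zero during the warmup steps and then decays linearly to zero at the end of training. The sketch below reproduces that shape in plain Python. The function name is hypothetical, and the value of 790 steps per epoch is an assumption (6,315 training examples in batches of 8 on a single device, without gradient accumulation).

```python
def linear_warmup_decay(step, base_lr, warmup_steps, total_steps):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

base_lr = 3.1449904984156505e-05  # learning_rate from the TrainingArguments
warmup_steps = 564                # warmup_steps from the TrainingArguments
steps_per_epoch = 790             # assumption: ceil(6315 / 8)
total_steps = steps_per_epoch * 20

# Start, mid-warmup, peak, mid-decay, end of training
for step in (0, 282, 564, 8182, total_steps):
    print(step, linear_warmup_decay(step, base_lr, warmup_steps, total_steps))
```

Under these assumptions the warmup covers less than one epoch, so most of the training runs on the decaying part of the schedule.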



In [None]:
# ---------------------
# 8. Training
# ---------------------
trainer.train()

# ---------------------
# 9. Final evaluation
# ---------------------
trainer.evaluate()




Epoch,Training Loss,Validation Loss,Roc Auc Macro,Log Loss,Runtime,Samples Per Second,Steps Per Second
1,0.2315,0.224052,0.847354,3.124958,6.3435,248.915,31.213
2,0.1952,0.190065,0.888339,2.71976,6.3363,249.199,31.249
3,0.137,0.181561,0.903923,2.50583,6.3536,248.521,31.164
4,0.1061,0.184287,0.90374,2.930735,6.342,248.976,31.221
5,0.0855,0.19555,0.906501,2.723488,6.33,249.445,31.279
6,0.0759,0.204314,0.90423,2.941021,6.3436,248.913,31.213
7,0.048,0.213102,0.904092,3.050194,6.3349,249.254,31.255
8,0.0293,0.221337,0.907905,3.090428,6.3441,248.893,31.21
9,0.03,0.227105,0.90711,3.263512,6.3571,248.384,31.146
10,0.0175,0.243107,0.905769,3.335074,6.3616,248.206,31.124


{'eval_loss': 0.22133655846118927,
 'eval_roc_auc_macro': 0.9079048683106317,
 'eval_log_loss': 3.09042823599602,
 'eval_runtime': 6.3974,
 'eval_samples_per_second': 246.82,
 'eval_steps_per_second': 30.95,
 'epoch': 20.0}

In [None]:
predictions = trainer.predict(tokenized_test)
y_pred_proba = torch.sigmoid(torch.tensor(predictions.predictions)).numpy()


In [None]:
# Step 1: Make sure there are no null values in the 'plot' column
dataTesting['plot'] = dataTesting['plot'].fillna('')

# Step 2: Tokenize the test text
tokenized_dataTesting = bert_tokenizer(
    list(dataTesting['plot']),
    padding=True,
    truncation=True,
    max_length=512,
    return_tensors="pt"
)


In [None]:
import torch

# 1. Detect the device (GPU or CPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 2. Move the model to the device
model.to(device)

# 3. Move the tokenized tensors to the same device
tokenized_dataTesting = tokenized_dataTesting.to(device)


In [None]:
import torch
from torch.utils.data import DataLoader, TensorDataset

# Wrap the tokenized test tensors in a TensorDataset, keeping them on the device
test_input_ids = tokenized_dataTesting['input_ids'].to(device)
test_attention_mask = tokenized_dataTesting['attention_mask'].to(device)
# Only input_ids and attention_mask are passed; token_type_ids fall back to the model default
test_dataset_tensor = TensorDataset(test_input_ids, test_attention_mask)

# Batch size for prediction; reduce it if an out-of-memory error occurs
prediction_batch_size = 64

# Create a DataLoader
test_dataloader = DataLoader(test_dataset_tensor, batch_size=prediction_batch_size)

# List to store predictions
all_predictions = []

# Predict in batches, with dropout disabled and without tracking gradients
model.eval()
with torch.no_grad():
    for batch in test_dataloader:
        # Unpack the batch (the tensors are already on the device)
        batch_input_ids, batch_attention_mask = batch

        # Forward pass for the current batch
        outputs = model(input_ids=batch_input_ids, attention_mask=batch_attention_mask)

        # Apply the sigmoid and store the probabilities
        batch_probs = torch.sigmoid(outputs.logits).cpu().numpy()
        all_predictions.append(batch_probs)

# Concatenate the predictions from all batches
y_pred_proba = np.concatenate(all_predictions, axis=0)



In [None]:
# Step 4: Column names for the results DataFrame
cols = ['p_Action', 'p_Adventure', 'p_Animation', 'p_Biography', 'p_Comedy', 'p_Crime', 'p_Documentary', 'p_Drama', 'p_Family',
        'p_Fantasy', 'p_Film-Noir', 'p_History', 'p_Horror', 'p_Music', 'p_Musical', 'p_Mystery', 'p_News', 'p_Romance',
        'p_Sci-Fi', 'p_Short', 'p_Sport', 'p_Thriller', 'p_War', 'p_Western']

# Step 5: Trim dataTesting so that the indices match, if necessary.
# This should be a no-op when y_pred_proba has the same length as dataTesting;
# it is kept as a safeguard.
dataTesting_aligned = dataTesting.iloc[:len(y_pred_proba)]

# Step 6: Build the DataFrame of probabilities
res = pd.DataFrame(y_pred_proba, index=dataTesting_aligned.index, columns=cols)
res.index.name = 'ID'

# Step 7 (optional): Inspect the result
print(res.head())

    p_Action  p_Adventure  p_Animation  p_Biography  p_Comedy   p_Crime  \
ID                                                                        
1   0.003666     0.007259     0.000678     0.001306  0.000589  0.007125   
4   0.096450     0.004693     0.003053     0.938300  0.068513  0.973406   
5   0.006944     0.001181     0.002746     0.034599  0.000663  0.972333   
6   0.451207     0.069753     0.000248     0.000624  0.001610  0.011990   
7   0.001709     0.002856     0.002071     0.016185  0.014372  0.002712   

    p_Documentary   p_Drama  p_Family  p_Fantasy  ...  p_Musical  p_Mystery  \
ID                                                ...                         
1        0.000448  0.953342  0.001331   0.038539  ...   0.008203   0.314237   
4        0.014501  0.995690  0.006145   0.002335  ...   0.019394   0.001893   
5        0.011166  0.996515  0.002098   0.003489  ...   0.002344   0.473030   
6        0.000138  0.965599  0.000165   0.003346  ...   0.000870   0.006078   


In [None]:
res.to_csv("pred_genres_text_RDS.csv", index_label='ID')
res.head()

ID,p_Action,p_Adventure,p_Animation,p_Biography,p_Comedy,p_Crime,p_Documentary,p_Drama,p_Family,p_Fantasy,...,p_Musical,p_Mystery,p_News,p_Romance,p_Sci-Fi,p_Short,p_Sport,p_Thriller,p_War,p_Western
1,0.003666,0.007259,0.000678,0.001306,0.000589,0.007125,0.000448,0.953342,0.001331,0.038539,...,0.008203,0.314237,0.000795,0.976204,0.005423,0.001258,0.000438,0.899192,0.007881,0.003512
4,0.09645,0.004693,0.003053,0.9383,0.068513,0.973406,0.014501,0.99569,0.006145,0.002335,...,0.019394,0.001893,0.00706,0.040046,0.004602,0.009837,0.024653,0.040639,0.004927,0.007282
5,0.006944,0.001181,0.002746,0.034599,0.000663,0.972333,0.011166,0.996515,0.002098,0.003489,...,0.002344,0.47303,0.002996,0.002923,0.002967,0.006004,0.001584,0.986482,0.004587,0.003598
6,0.451207,0.069753,0.000248,0.000624,0.00161,0.01199,0.000138,0.965599,0.000165,0.003346,...,0.00087,0.006078,0.000161,0.100906,0.001582,0.000484,0.000162,0.977483,0.043121,0.002333
7,0.001709,0.002856,0.002071,0.016185,0.014372,0.002712,0.005404,0.081729,0.00256,0.010231,...,0.008549,0.040971,0.002951,0.106505,0.852708,0.014283,0.006552,0.085953,0.002748,0.0051
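
The file stores one probability per genre. If hard genre labels were needed, one simple option would be to keep every genre whose probability reaches a cutoff. The sketch below applies that rule to a subset of the values printed above for ID 4. The 0.5 threshold and the helper `genres_above` are assumptions for illustration; the notebook itself only exports probabilities.

```python
def genres_above(probabilities, threshold=0.5):
    """Return the genre names whose probability reaches the threshold."""
    prefix = "p_"
    return [
        name[len(prefix):] if name.startswith(prefix) else name
        for name, p in probabilities.items()
        if p >= threshold
    ]

# Subset of the columns shown for ID 4
row_id_4 = {
    "p_Action": 0.09645,
    "p_Biography": 0.9383,
    "p_Comedy": 0.068513,
    "p_Crime": 0.973406,
    "p_Drama": 0.99569,
    "p_Thriller": 0.040639,
}
print(genres_above(row_id_4))  # ['Biography', 'Crime', 'Drama']
```

A single global threshold is the simplest choice; per-genre thresholds tuned on validation data are a common alternative when a metric such as the F1-score is the target.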


In [None]:
from google.colab import files
files.download("pred_genres_text_RDS.csv")


### Refinement
Hyperparameter tuning

In [None]:
from transformers import BertForSequenceClassification

def model_init():
    return BertForSequenceClassification.from_pretrained(
        'bert-base-uncased',
        num_labels=len(mlb.classes_),       # number of genres
        problem_type="multi_label_classification"
    )


In [None]:
from transformers import Trainer, TrainingArguments
from transformers.trainer_utils import IntervalStrategy
import optuna

def hp_space(trial):
    return {
        "learning_rate": trial.suggest_float("learning_rate", 1e-5, 5e-5, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 3, 6),
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [8, 16, 32]),
        "weight_decay": trial.suggest_float("weight_decay", 0.0, 0.3),
        "warmup_steps": trial.suggest_int("warmup_steps", 0, 1000),
    }
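
The learning rate is sampled with `log=True`, which means values are drawn uniformly on a logarithmic scale, so each order of magnitude is equally likely. The sketch below shows that idea with the standard library only. It illustrates the sampling rule and is not Optuna's sampler; `sample_log_uniform` and the fixed seed are assumptions for the example.

```python
import math
import random

def sample_log_uniform(rng, low, high):
    """Draw uniformly in log space, then map back with exp."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [sample_log_uniform(rng, 1e-5, 5e-5) for _ in range(1000)]

# About half of the draws fall below the geometric mean of the bounds
geo_mean = math.sqrt(1e-5 * 5e-5)
share_below = sum(s < geo_mean for s in samples) / len(samples)
print(min(samples), max(samples), share_below)
```

With plain uniform sampling, the midpoint would be the arithmetic mean (3e-5) instead, so small learning rates would be explored less often.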

In [None]:
training_args = TrainingArguments(
    output_dir="./bert_genre_optuna",
    eval_strategy=IntervalStrategy.EPOCH,
    save_strategy=IntervalStrategy.EPOCH,
    logging_dir="./logs",
    logging_steps=10,
    load_best_model_at_end=True,
    metric_for_best_model="roc_auc_macro",
    greater_is_better=True,
)


In [None]:
trainer = Trainer(
    model_init=model_init,  # a fresh model is created for every trial
    args=training_args,
    train_dataset=tokenized_train,
    eval_dataset=tokenized_test,
    compute_metrics=compute_metrics,
    tokenizer=bert_tokenizer
)
best_run = trainer.hyperparameter_search(
    direction="maximize",
    backend="optuna",
    hp_space=hp_space,
    n_trials=10,            # number of trials
    study_name="hp_optuna",
    storage=None,           # or a SQLite URI to persist the search
    load_if_exists=True
)

print("Best hyperparameters found:", best_run.hyperparameters)
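
The search is called without a `compute_objective`, so the value reported for each trial appears to be the sum of the custom evaluation metrics: for trial 0 in the log, 0.906271 (ROC AUC) plus 2.682702 (log loss) gives about 3.589. Maximizing that sum rewards a higher log loss. The sketch below reproduces the sum and shows an objective based on the ROC AUC alone. `summed_objective` is a hypothetical re-creation written for this illustration, not the library's code, and passing `compute_objective=auc_objective` to `hyperparameter_search` is a suggestion rather than what the notebook ran.

```python
def summed_objective(metrics):
    """Sum every metric except the loss, the epoch, and the speed statistics."""
    ignored = ("eval_loss", "epoch")
    speed_suffixes = ("_runtime", "_per_second")
    kept = {
        name: value
        for name, value in metrics.items()
        if name not in ignored and not name.endswith(speed_suffixes)
    }
    return sum(kept.values())

def auc_objective(metrics):
    """Objective that depends only on the macro ROC AUC."""
    return metrics["eval_roc_auc_macro"]

# Last-epoch metrics of trial 0, copied from the log
trial0 = {
    "eval_loss": 0.183186,
    "eval_roc_auc_macro": 0.906271,
    "eval_log_loss": 2.682702,
    "eval_runtime": 6.2423,
    "eval_samples_per_second": 252.953,
    "eval_steps_per_second": 31.719,
    "epoch": 6.0,
}
print(summed_objective(trial0))  # about 3.588973
print(auc_objective(trial0))     # 0.906271
```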


[I 2025-05-26 17:19:10,899] A new study created in memory with name: hp_optuna

Epoch,Training Loss,Validation Loss,Roc Auc Macro,Log Loss,Runtime,Samples Per Second,Steps Per Second
1,0.2371,0.222444,0.846627,3.093286,6.1628,256.213,32.128
2,0.1965,0.190838,0.891308,2.768471,6.1707,255.886,32.087
3,0.1382,0.179521,0.906315,2.491331,6.1965,254.823,31.954
4,0.1097,0.182588,0.905463,2.905177,6.2234,253.718,31.815
5,0.1002,0.181432,0.907424,2.642548,6.1892,255.122,31.991
6,0.0916,0.183186,0.906271,2.682702,6.2423,252.953,31.719


[I 2025-05-26 17:28:00,561] Trial 0 finished with value: 3.5889723619735943 and parameters: {'learning_rate': 3.1449904984156505e-05, 'num_train_epochs': 6, 'per_device_train_batch_size': 8, 'weight_decay': 0.02345047974845148, 'warmup_steps': 564}. Best is trial 0 with value: 3.5889723619735943.

0,1
eval/log_loss,2.6827
eval/loss,0.18319
eval/roc_auc_macro,0.90627
eval/runtime,6.2423
eval/samples_per_second,252.953
eval/steps_per_second,31.719
total_flos,4985623555153920.0
train/epoch,6.0
train/global_step,4740.0
train/grad_norm,0.84036


Epoch,Training Loss,Validation Loss,Roc Auc Macro,Log Loss,Runtime,Samples Per Second,Steps Per Second
1,0.2296,0.222427,0.853789,3.043476,6.2106,254.244,31.881
2,0.2008,0.194286,0.888775,2.812272,6.1933,254.952,31.97
3,0.1602,0.187313,0.898263,2.647278,6.1862,255.246,32.007


[I 2025-05-26 17:32:06,739] Trial 1 finished with value: 3.5455410974931523 and parameters: {'learning_rate': 2.2593484742042505e-05, 'num_train_epochs': 3, 'per_device_train_batch_size': 8, 'weight_decay': 0.28555931778374605, 'warmup_steps': 167}. Best is trial 0 with value: 3.5889723619735943.

0,1
eval/log_loss,2.64728
eval/loss,0.18731
eval/roc_auc_macro,0.89826
eval/runtime,6.1862
eval/samples_per_second,255.246
eval/steps_per_second,32.007
total_flos,2492811777576960.0
train/epoch,3.0
train/global_step,2370.0
train/grad_norm,1.53009


Epoch,Training Loss,Validation Loss,Roc Auc Macro,Log Loss,Runtime,Samples Per Second,Steps Per Second
1,0.2949,0.295312,0.641197,4.132791,6.1925,254.986,31.974
2,0.2196,0.221927,0.849756,2.970719,6.1981,254.757,31.945
3,0.1898,0.196208,0.888449,2.58456,6.1906,255.064,31.984
4,0.138,0.184962,0.900057,2.753807,6.1774,255.609,32.052
5,0.1235,0.180187,0.903904,2.541317,6.196,254.842,31.956
6,0.1182,0.181265,0.90261,2.650752,6.2179,253.946,31.844


[I 2025-05-26 17:39:29,296] Trial 2 finished with value: 3.5533627455737213 and parameters: {'learning_rate': 3.3765330803788265e-05, 'num_train_epochs': 6, 'per_device_train_batch_size': 16, 'weight_decay': 0.0014922761821634211, 'warmup_steps': 893}. Best is trial 0 with value: 3.5889723619735943.

0,1
eval/log_loss,2.65075
eval/loss,0.18127
eval/roc_auc_macro,0.90261
eval/runtime,6.2179
eval/samples_per_second,253.946
eval/steps_per_second,31.844
total_flos,4985623555153920.0
train/epoch,6.0
train/global_step,2370.0
train/grad_norm,0.57879


Epoch,Training Loss,Validation Loss,Roc Auc Macro,Log Loss,Runtime,Samples Per Second,Steps Per Second
1,0.2837,0.283587,0.682911,3.97327,6.2064,254.415,31.903
2,0.2104,0.212223,0.863272,2.888537,6.1999,254.682,31.936
3,0.1816,0.189796,0.893132,2.573087,6.2928,250.922,31.465
4,0.1273,0.182529,0.902042,2.734643,6.2421,252.958,31.72
5,0.11,0.179849,0.903605,2.563795,6.2286,253.509,31.789
6,0.1067,0.180971,0.903238,2.668581,6.2936,250.891,31.461


[I 2025-05-26 17:46:52,810] Trial 3 finished with value: 3.5718193859166694 and parameters: {'learning_rate': 4.005864681248332e-05, 'num_train_epochs': 6, 'per_device_train_batch_size': 16, 'weight_decay': 0.13235853361009453, 'warmup_steps': 798}. Best is trial 0 with value: 3.5889723619735943.

0,1
eval/log_loss,2.66858
eval/loss,0.18097
eval/roc_auc_macro,0.90324
eval/runtime,6.2936
eval/samples_per_second,250.891
eval/steps_per_second,31.461
total_flos,4985623555153920.0
train/epoch,6.0
train/global_step,2370.0
train/grad_norm,0.63377


Epoch,Training Loss,Validation Loss,Roc Auc Macro,Log Loss,Runtime,Samples Per Second,Steps Per Second
1,0.2628,0.270045,0.746896,3.922199,6.2073,254.379,31.898
2,0.2178,0.205739,0.873783,2.864446,6.2279,253.538,31.793
3,0.1731,0.192722,0.892069,2.709204,6.2147,254.076,31.86


[I 2025-05-26 17:50:59,817] Trial 4 finished with value: 3.60127248817985 and parameters: {'learning_rate': 2.0083818050590054e-05, 'num_train_epochs': 3, 'per_device_train_batch_size': 8, 'weight_decay': 0.10136103065810532, 'warmup_steps': 992}. Best is trial 4 with value: 3.60127248817985.

0,1
eval/log_loss,2.7092
eval/loss,0.19272
eval/roc_auc_macro,0.89207
eval/runtime,6.2147
eval/samples_per_second,254.076
eval/steps_per_second,31.86
total_flos,2492811777576960.0
train/epoch,3.0
train/global_step,2370.0
train/grad_norm,1.46166


Epoch,Training Loss,Validation Loss,Roc Auc Macro,Log Loss,Runtime,Samples Per Second,Steps Per Second
1,0.238,0.231311,0.835601,3.204718,6.2682,251.908,31.588
2,0.2058,0.198089,0.882837,2.798812,6.2317,253.381,31.773
3,0.1602,0.187977,0.89874,2.583803,6.3915,247.046,30.979
4,0.1415,0.183794,0.899596,2.644717,6.2421,252.959,31.72


[I 2025-05-26 17:56:28,618] Trial 5 finished with value: 3.5443136630978627 and parameters: {'learning_rate': 1.9807425470547966e-05, 'num_train_epochs': 4, 'per_device_train_batch_size': 8, 'weight_decay': 0.05941190264260788, 'warmup_steps': 331}. Best is trial 4 with value: 3.60127248817985.


Metric,Value
eval/log_loss,2.64472
eval/loss,0.18379
eval/roc_auc_macro,0.8996
eval/runtime,6.2421
eval/samples_per_second,252.959
eval/steps_per_second,31.72
total_flos,3323749036769280.0
train/epoch,4.0
train/global_step,3160.0
train/grad_norm,2.30395


Epoch,Training Loss,Validation Loss,Roc Auc Macro,Log Loss,Runtime,Samples Per Second,Steps Per Second
1,0.2683,0.2672,0.748737,3.842223,6.1819,255.421,32.029


[I 2025-05-26 17:57:42,174] Trial 6 pruned. 


Metric,Value
eval/log_loss,3.84222
eval/loss,0.2672
eval/roc_auc_macro,0.74874
eval/runtime,6.1819
eval/samples_per_second,255.421
eval/steps_per_second,32.029
train/epoch,1.0
train/global_step,395.0
train/grad_norm,0.42489
train/learning_rate,3e-05


Epoch,Training Loss,Validation Loss,Roc Auc Macro,Log Loss,Runtime,Samples Per Second,Steps Per Second
1,0.2238,0.229075,0.846007,3.168189,6.1908,255.057,31.983


[I 2025-05-26 17:58:55,777] Trial 7 pruned. 


Metric,Value
eval/log_loss,3.16819
eval/loss,0.22908
eval/roc_auc_macro,0.84601
eval/runtime,6.1908
eval/samples_per_second,255.057
eval/steps_per_second,31.983
train/epoch,1.0
train/global_step,395.0
train/grad_norm,0.73373
train/learning_rate,3e-05


Epoch,Training Loss,Validation Loss,Roc Auc Macro,Log Loss,Runtime,Samples Per Second,Steps Per Second
1,0.2129,0.202299,0.878883,2.804229,6.2149,254.068,31.859


[I 2025-05-26 18:00:17,950] Trial 8 pruned. 


Metric,Value
eval/log_loss,2.80423
eval/loss,0.2023
eval/roc_auc_macro,0.87888
eval/runtime,6.2149
eval/samples_per_second,254.068
eval/steps_per_second,31.859
train/epoch,1.0
train/global_step,790.0
train/grad_norm,1.09242
train/learning_rate,4e-05


Epoch,Training Loss,Validation Loss,Roc Auc Macro,Log Loss,Runtime,Samples Per Second,Steps Per Second
1,0.2804,0.280329,0.700256,3.980766,6.1818,255.426,32.029
2,0.2293,0.230455,0.840178,3.400738,6.2406,253.022,31.728
3,0.2034,0.211604,0.873057,2.941467,6.2406,253.021,31.728
4,0.178,0.198758,0.881549,2.873212,6.2294,253.475,31.785
5,0.1682,0.19399,0.888929,2.766629,6.2386,253.101,31.738
6,0.1612,0.192532,0.888837,2.743866,6.2631,252.112,31.614


[I 2025-05-26 18:07:12,657] Trial 9 finished with value: 3.6327026218329994 and parameters: {'learning_rate': 2.745429372377183e-05, 'num_train_epochs': 6, 'per_device_train_batch_size': 32, 'weight_decay': 0.17875462925006133, 'warmup_steps': 90}. Best is trial 9 with value: 3.6327026218329994.


Best hyperparameters found: {'learning_rate': 2.745429372377183e-05, 'num_train_epochs': 6, 'per_device_train_batch_size': 32, 'weight_decay': 0.17875462925006133, 'warmup_steps': 90}
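
The trial values reported by Optuna above coincide with the sum of the final-epoch ROC AUC and log loss (for trial 4, 0.892069 + 2.709204 = 3.601273). Because the best trial is the one with the largest value, a higher log loss raises a trial's score. As a result trial 9 is selected even though trial 5 has both a higher ROC AUC and a lower log loss. The sketch below reproduces this effect with the logged numbers and compares it with a single-metric objective. The `eval_`-prefixed key names and the function `roc_auc_objective` are assumptions, since the search call itself is not shown in this section.

```python
# Final-epoch metrics copied from the trial logs above.
# The "eval_" key prefix is an assumption about how the metrics are named.
trials = {
    4: {"eval_roc_auc_macro": 0.892069, "eval_log_loss": 2.709204},
    5: {"eval_roc_auc_macro": 0.899596, "eval_log_loss": 2.644717},
    9: {"eval_roc_auc_macro": 0.888837, "eval_log_loss": 2.743866},
}


def summed_objective(metrics):
    # Adds every evaluation metric; this reproduces the values printed by Optuna
    return sum(metrics.values())


def roc_auc_objective(metrics):
    # Hypothetical alternative: score a trial by macro ROC AUC only
    return metrics["eval_roc_auc_macro"]


# In both cases the trial with the largest objective value wins
best_by_sum = max(trials, key=lambda t: summed_objective(trials[t]))
best_by_auc = max(trials, key=lambda t: roc_auc_objective(trials[t]))

print("Best trial by summed objective:", best_by_sum)
print("Best trial by ROC AUC only:", best_by_auc)
```

If the search was launched with `Trainer.hyperparameter_search`, a function such as `roc_auc_objective` could be supplied through its `compute_objective` argument together with `direction="maximize"`, so that the selected trial reflects a single metric with a clear direction.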


### Logistic regression

In [None]:
from sklearn.feature_extraction.text import TfidfVectorizer

# lemmatize_and_clean is passed as a callable analyzer, so scikit-learn ignores ngram_range
vect = TfidfVectorizer(analyzer=lemmatize_and_clean, max_features=10000, ngram_range=(1, 2))
X = vect.fit_transform(dataTraining['plot'])
X.shape

(7895, 10000)

In [None]:
# Parse genre lists stored as strings, then encode them as a binary indicator matrix
dataTraining['genres'] = dataTraining['genres'].map(lambda x: ast.literal_eval(x) if isinstance(x, str) else x)
le = MultiLabelBinarizer()
y = le.fit_transform(dataTraining['genres'])
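
As a small illustration of the encoding step, the sketch below applies `MultiLabelBinarizer` to three made-up genre lists (not taken from the dataset). Each row gets one column per genre, with a 1 where the genre is present.

```python
from sklearn.preprocessing import MultiLabelBinarizer

# Toy genre lists, invented for illustration
genres = [["Drama", "Romance"], ["Action"], ["Action", "Drama"]]

mlb = MultiLabelBinarizer()
Y = mlb.fit_transform(genres)

print(mlb.classes_)              # column order: classes sorted alphabetically
print(Y)                         # one row per movie, one column per genre
print(mlb.inverse_transform(Y))  # back to tuples of genre names
```

The column order in `classes_` is the one later used for `target_names` in the classification report.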

In [None]:
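from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsRestClassifier

# Assumed split: X_train/y_train are not defined above for these TF-IDF features
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# One binary logistic regression per genre
model = OneVsRestClassifier(LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)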
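
`OneVsRestClassifier` fits one independent binary classifier per label, so `predict_proba` returns one probability per genre and the rows do not need to sum to 1. The sketch below, on synthetic data generated with `make_multilabel_classification` (an assumption made for illustration, not the movie dataset), shows this structure and how lowering the decision threshold can only add positive predictions.

```python
from sklearn.datasets import make_multilabel_classification
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Synthetic multilabel data standing in for the TF-IDF features and genre matrix
X_demo, y_demo = make_multilabel_classification(
    n_samples=200, n_features=20, n_classes=4, random_state=0
)

ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000))
ovr.fit(X_demo, y_demo)

# One fitted binary estimator per label
print("Binary estimators:", len(ovr.estimators_))

# Independent per-label probabilities, shape (n_samples, n_labels)
proba = ovr.predict_proba(X_demo)
print("Probability matrix shape:", proba.shape)

# A lower threshold turns more probabilities into positive predictions
pred_05 = (proba > 0.5).astype(int)
pred_03 = (proba > 0.3).astype(int)
print("Positives at 0.5:", pred_05.sum(), "| positives at 0.3:", pred_03.sum())
```

For rare labels, a threshold below 0.5 trades precision for recall; a suitable value would normally be chosen on a validation split rather than on the test data.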


In [None]:
from sklearn.metrics import classification_report, roc_auc_score

y_pred_prob = model.predict_proba(X_test)
y_pred_bin = (y_pred_prob > 0.5).astype(int)  # Adjustable threshold

print(classification_report(y_test, y_pred_bin, target_names=le.classes_))
print("ROC AUC macro:", roc_auc_score(y_test, y_pred_prob, average="macro"))
print("ROC AUC micro:", roc_auc_score(y_test, y_pred_prob, average="micro"))


              precision    recall  f1-score   support

      Action       0.76      0.11      0.19       423
   Adventure       0.96      0.07      0.13       340
   Animation       0.00      0.00      0.00        99
   Biography       0.00      0.00      0.00       130
      Comedy       0.76      0.47      0.58      1028
       Crime       0.88      0.21      0.34       468
 Documentary       0.95      0.16      0.27       129
       Drama       0.68      0.72      0.70      1283
      Family       0.00      0.00      0.00       252
     Fantasy       1.00      0.02      0.04       243
   Film-Noir       0.00      0.00      0.00        57
     History       0.00      0.00      0.00        80
      Horror       0.87      0.09      0.16       300
       Music       1.00      0.04      0.08       123
     Musical       0.00      0.00      0.00        97
     Mystery       1.00      0.02      0.04       242
        News       0.00      0.00      0.00         3
     Romance       0.81    