#Celda 0 — Título y descripción (Markdown)

# 05 - Sequence Models (aplicado al dataset de partidos internacionales)
Este cuaderno explora múltiples arquitecturas de *sequence models* aplicadas al dataset:
"International football results (1872–2017)" (Kaggle).

Secciones:
- 5.0 Cross-validation in time series
- 5.1 Recurrent Neural Networks (SimpleRNN)
- 5.2 LSTM and GRU
- 5.3 Truncated BPTT (demostración)
- 5.4 Text processing (breve, aplicado a columnas text)
- 5.5 Sequence generation (ejemplo didáctico)
- 5.6 Bidirectional RNNs
- 5.7 ELMo (opcional, TF-Hub)
- 5.8 Transformer (basado en MultiHeadAttention)
- 5.9 CNN-LSTM architectures

El notebook prepara los datos, genera secuencias por partido (últimos k partidos por equipo) y entrena modelos comparables.


#Celda 1 — Librerías, montaje Drive y (opcional) Kaggle

In [1]:
# 1. Librerías y montaje de Drive
import os, glob, json
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint, ReduceLROnPlateau

from sklearn.model_selection import train_test_split, TimeSeriesSplit
from sklearn.metrics import classification_report, confusion_matrix

# Montar Drive (necesario para guardar/leer .npy y modelos)
from google.colab import drive
drive.mount('/content/drive')

# Opcional: instalar kaggle si necesitas descargar directamente el dataset
!pip install -q kaggle


Mounted at /content/drive


#Celda 2 — Descargar dataset desde Kaggle (si no lo tienes)

In [3]:
# DESCARGA DESDE KAGGLE (ejecuta solo si no tienes results.csv en Drive)
# 1) Sube kaggle.json a Colab
# 2) Ejecuta este bloque una vez

# Copiar kaggle.json a la ubicación .kaggle si lo subiste en la raíz de Colab
if os.path.exists("kaggle.json"):
    os.makedirs("/root/.kaggle", exist_ok=True)
    !cp kaggle.json /root/.kaggle/
    !chmod 600 /root/.kaggle/kaggle.json

# Descargar y descomprimir (si no existe ya)
data_zip = "/content/data/international-football-results-from-1872-to-2017.zip"
if not os.path.exists("/content/data/results.csv"):
    !kaggle datasets download -d martj42/international-football-results-from-1872-to-2017 -p /content/data
    !unzip -o /content/data/international-football-results-from-1872-to-2017.zip -d /content/data


Dataset URL: https://www.kaggle.com/datasets/martj42/international-football-results-from-1872-to-2017
License(s): CC0-1.0
Downloading international-football-results-from-1872-to-2017.zip to /content/data
  0% 0.00/1.16M [00:00<?, ?B/s]
100% 1.16M/1.16M [00:00<00:00, 865MB/s]
Archive:  /content/data/international-football-results-from-1872-to-2017.zip
  inflating: /content/data/former_names.csv  
  inflating: /content/data/goalscorers.csv  
  inflating: /content/data/results.csv  
  inflating: /content/data/shootouts.csv  


#Celda 3 — Cargar CSV y vista rápida

In [4]:
# Cargar CSV (ajusta la ruta si la tienes en Drive)
csv_paths = [
    "/content/data/results.csv",
    "/content/drive/MyDrive/results.csv",
    "/content/drive/MyDrive/football_project/results.csv"
]
csv_path = None
for p in csv_paths:
    if os.path.exists(p):
        csv_path = p
        break
if csv_path is None:
    raise FileNotFoundError("No se encontró results.csv. Descárgalo o coloca una copia en Drive.")
print("Usando:", csv_path)

df = pd.read_csv(csv_path, parse_dates=['date'])
print(df.shape)
df.head()


Usando: /content/data/results.csv
(48532, 9)


Unnamed: 0,date,home_team,away_team,home_score,away_score,tournament,city,country,neutral
0,1872-11-30,Scotland,England,0,0,Friendly,Glasgow,Scotland,False
1,1873-03-08,England,Scotland,4,2,Friendly,London,England,False
2,1874-03-07,Scotland,England,2,1,Friendly,Glasgow,Scotland,False
3,1875-03-06,England,Scotland,2,2,Friendly,London,England,False
4,1876-03-04,Scotland,England,3,0,Friendly,Glasgow,Scotland,False


#Celda 4 — Definir etiqueta y y limpieza breve

In [5]:
# Crear target:
# 0 = HomeWin, 1 = Draw, 2 = AwayWin
def result_label(row):
    if row['home_score'] > row['away_score']:
        return 0
    elif row['home_score'] == row['away_score']:
        return 1
    else:
        return 2

df = df.dropna(subset=['home_score','away_score','date','home_team','away_team']).copy()
df['result'] = df.apply(result_label, axis=1)
df['year'] = df['date'].dt.year

print("Clases (counts):")
print(df['result'].value_counts())


Clases (counts):
result
0    23797
2    13700
1    11035
Name: count, dtype: int64


#Celda 5 — Construir secuencias por partido (ventana k)

In [6]:
# Idea: para cada partido, construimos una secuencia que concatena los últimos k goal_diffs
# de home y de away. Otras features pueden añadirse.
k = 6  # ventana (ajusta si deseas)
# Ordenar por fecha
df = df.sort_values('date').reset_index(drop=True)

# Calculamos goal_diff por partido desde la perspectiva de cada equipo
df['home_goal_diff'] = df['home_score'] - df['away_score']
df['away_goal_diff'] = -df['home_goal_diff']  # perspectiva visitante

# Construir historial por equipo
teams = pd.unique(pd.concat([df['home_team'], df['away_team']]))
history = {t: [] for t in teams}

seqs = []
statics = []
labels = []
indices_kept = []

# Para reproducibilidad: iteramos en orden cronológico
for idx, row in df.iterrows():
    home = row['home_team']
    away = row['away_team']
    date = row['date']
    # last k diffs for home and away
    h_hist = history[home][-k:]
    a_hist = history[away][-k:]
    # pad left with zeros if insuficientes
    h_hist_pad = [0.0]*(k - len(h_hist)) + h_hist
    a_hist_pad = [0.0]*(k - len(a_hist)) + a_hist
    # concatenar -> shape (2*k,)
    seq = np.array(h_hist_pad + a_hist_pad, dtype=np.float32)
    # static features: year (num), neutral (if exists), tournament (encoded later)
    neutral = 0
    if 'neutral' in df.columns:
        neutral = int(bool(row['neutral']))
    # country maybe useful: encode later; for now include year and neutral
    static = np.array([row['year'], neutral], dtype=np.float32)
    seqs.append(seq)
    statics.append(static)
    labels.append(row['result'])
    indices_kept.append(idx)
    # update history: append goal diff to both teams' histories
    history[home].append(row['home_goal_diff'])
    history[away].append(row['away_goal_diff'])

X_seq = np.array(seqs)   # shape (N, 2k)
X_static = np.array(statics)
y = np.array(labels, dtype=np.int32)

# reshape X_seq para modelos que esperan (N, timesteps, channels)
X_seq = X_seq.reshape(X_seq.shape[0], 2*k, 1)

print("Formas resultantes:", X_seq.shape, X_static.shape, y.shape)


Formas resultantes: (48532, 12, 1) (48532, 2) (48532,)


#Celda 6 — Escalado, codificación y split temporal

In [7]:
# Escalar X_static
from sklearn.preprocessing import StandardScaler
scaler_static = StandardScaler()
X_static_scaled = scaler_static.fit_transform(X_static)

# Train/test split por tiempo: usar corte por año para respetar la secuencia temporal
# Ejemplo: entrenar con partidos <= 2010, test > 2010  (ajusta según tamaño)
cut_year = 2010
mask_train = df.loc[indices_kept, 'year'] <= cut_year
# si esto produce sets vacíos, fallback a split aleatorio
if mask_train.sum() < 100:
    Xs_train, Xs_test, Xst_train, Xst_test, y_train, y_test = train_test_split(
        X_seq, X_static_scaled, y, test_size=0.2, random_state=42, stratify=y)
else:
    Xs_train = X_seq[mask_train.values]
    Xs_test  = X_seq[~mask_train.values]
    Xst_train = X_static_scaled[mask_train.values]
    Xst_test  = X_static_scaled[~mask_train.values]
    y_train = y[mask_train.values]
    y_test = y[~mask_train.values]

print("Train/test sizes:", Xs_train.shape[0], Xs_test.shape[0])


Train/test sizes: 34441 14091


#Celda 7 — Guardar arrays a Drive para reproducibilidad

In [8]:
artifacts_dir = "/content/drive/MyDrive/football_project/artifacts"
os.makedirs(artifacts_dir, exist_ok=True)
np.save(os.path.join(artifacts_dir, "X_seq_5k.npy"), X_seq)
np.save(os.path.join(artifacts_dir, "X_static_5k.npy"), X_static_scaled)
np.save(os.path.join(artifacts_dir, "y_5k.npy"), y)
print("Guardado en:", artifacts_dir)


Guardado en: /content/drive/MyDrive/football_project/artifacts


##=== 5.0 Cross-validation in time series (celda explicativa + código) ===

In [9]:
# 5.0 Cross-validation for time series: ejemplo con TimeSeriesSplit
from sklearn.model_selection import TimeSeriesSplit
tscv = TimeSeriesSplit(n_splits=5)
# Demostración rápida con el dataset de training (usa indices)
for fold, (train_idx, val_idx) in enumerate(tscv.split(Xs_train)):
    print(f"Fold {fold}: train {len(train_idx)} val {len(val_idx)}")
# Nota: para modelos DL se suele usar validación temporal manual (ej. corte por fecha)


Fold 0: train 5741 val 5740
Fold 1: train 11481 val 5740
Fold 2: train 17221 val 5740
Fold 3: train 22961 val 5740
Fold 4: train 28701 val 5740


##=== 5.1 SimpleRNN (celda modelo + entrenamiento) ===

In [10]:
# 5.1 SimpleRNN baseline
from tensorflow.keras.layers import SimpleRNN

n_classes = len(np.unique(y))
input_seq = keras.Input(shape=Xs_train.shape[1:], name="seq_in")
x = SimpleRNN(64)(input_seq)
x = layers.Dropout(0.3)(x)

input_static = keras.Input(shape=(Xst_train.shape[1],), name="static_in")
s = layers.Dense(32, activation="relu")(input_static)

concat = layers.concatenate([x, s])
out = layers.Dense(64, activation="relu")(concat)
out = layers.Dense(n_classes, activation="softmax")(out)

model_rnn = keras.Model([input_seq, input_static], out)
model_rnn.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model_rnn.summary()

cb = [EarlyStopping(patience=5, restore_best_weights=True),
      ModelCheckpoint("/content/drive/MyDrive/football_project/models/rnn_best.keras", save_best_only=True)]

hist_rnn = model_rnn.fit([Xs_train, Xst_train], y_train, validation_split=0.2, epochs=30, batch_size=128, callbacks=cb)


Epoch 1/30
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 5ms/step - accuracy: 0.4822 - loss: 1.0303 - val_accuracy: 0.5237 - val_loss: 0.9958
Epoch 2/30
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 4ms/step - accuracy: 0.5341 - loss: 0.9818 - val_accuracy: 0.5263 - val_loss: 0.9879
Epoch 3/30
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 5ms/step - accuracy: 0.5351 - loss: 0.9773 - val_accuracy: 0.5272 - val_loss: 0.9863
Epoch 4/30
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 5ms/step - accuracy: 0.5350 - loss: 0.9740 - val_accuracy: 0.5298 - val_loss: 0.9845
Epoch 5/30
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 8ms/step - accuracy: 0.5396 - loss: 0.9757 - val_accuracy: 0.5262 - val_loss: 0.9853
Epoch 6/30
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 4ms/step - accuracy: 0.5403 - loss: 0.9736 - val_accuracy: 0.5268 - val_loss: 0.9875
Epoch 7/30
[1m216/216[0m 

##=== 5.2 LSTM y GRU (dos modelos) ===

In [11]:
# 5.2 LSTM
from tensorflow.keras.layers import LSTM, GRU

# LSTM model
seq_in = keras.Input(shape=Xs_train.shape[1:], name="seq")
x = LSTM(64)(seq_in)
x = layers.Dropout(0.3)(x)
stat_in = keras.Input(shape=(Xst_train.shape[1],), name="stat")
s = layers.Dense(32, activation="relu")(stat_in)
c = layers.concatenate([x, s])
h = layers.Dense(64, activation="relu")(c)
out = layers.Dense(n_classes, activation="softmax")(h)
model_lstm = keras.Model([seq_in, stat_in], out)
model_lstm.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model_lstm.summary()

cb_lstm = [EarlyStopping(patience=5, restore_best_weights=True),
           ModelCheckpoint("/content/drive/MyDrive/football_project/models/lstm_best.keras", save_best_only=True)]
hist_lstm = model_lstm.fit([Xs_train, Xst_train], y_train, validation_split=0.2, epochs=50, batch_size=128, callbacks=cb_lstm)

# GRU model (más ligero)
seq_in2 = keras.Input(shape=Xs_train.shape[1:], name="seq2")
x2 = GRU(64)(seq_in2)
x2 = layers.Dropout(0.3)(x2)
stat_in2 = keras.Input(shape=(Xst_train.shape[1],), name="stat2")
s2 = layers.Dense(32, activation="relu")(stat_in2)
c2 = layers.concatenate([x2, s2])
h2 = layers.Dense(64, activation="relu")(c2)
out2 = layers.Dense(n_classes, activation="softmax")(h2)
model_gru = keras.Model([seq_in2, stat_in2], out2)
model_gru.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model_gru.summary()

cb_gru = [EarlyStopping(patience=5, restore_best_weights=True),
          ModelCheckpoint("/content/drive/MyDrive/football_project/models/gru_best.keras", save_best_only=True)]
hist_gru = model_gru.fit([Xs_train, Xst_train], y_train, validation_split=0.2, epochs=50, batch_size=128, callbacks=cb_gru)


Epoch 1/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 13ms/step - accuracy: 0.5100 - loss: 1.0076 - val_accuracy: 0.5191 - val_loss: 0.9870
Epoch 2/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 11ms/step - accuracy: 0.5356 - loss: 0.9753 - val_accuracy: 0.5246 - val_loss: 0.9839
Epoch 3/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 13ms/step - accuracy: 0.5346 - loss: 0.9773 - val_accuracy: 0.5253 - val_loss: 0.9825
Epoch 4/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 14ms/step - accuracy: 0.5350 - loss: 0.9737 - val_accuracy: 0.5261 - val_loss: 0.9822
Epoch 5/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 11ms/step - accuracy: 0.5344 - loss: 0.9777 - val_accuracy: 0.5246 - val_loss: 0.9850
Epoch 6/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 11ms/step - accuracy: 0.5325 - loss: 0.9751 - val_accuracy: 0.5226 - val_loss: 0.9834
Epoch 7/50
[1m216/216

Epoch 1/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 15ms/step - accuracy: 0.5109 - loss: 1.0074 - val_accuracy: 0.5239 - val_loss: 0.9864
Epoch 2/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 14ms/step - accuracy: 0.5319 - loss: 0.9840 - val_accuracy: 0.5216 - val_loss: 0.9879
Epoch 3/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 15ms/step - accuracy: 0.5320 - loss: 0.9781 - val_accuracy: 0.5246 - val_loss: 0.9853
Epoch 4/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 13ms/step - accuracy: 0.5372 - loss: 0.9775 - val_accuracy: 0.5236 - val_loss: 0.9841
Epoch 5/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 13ms/step - accuracy: 0.5333 - loss: 0.9786 - val_accuracy: 0.5252 - val_loss: 0.9829
Epoch 6/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 14ms/step - accuracy: 0.5362 - loss: 0.9761 - val_accuracy: 0.5237 - val_loss: 0.9856
Epoch 7/50
[1m216/216

##=== 5.3 Truncated BPTT (demostración conceptual) ===

In [12]:
# 5.3 Truncated BPTT: demostración simple.
# En secuencias largas se puede dividir en "segments" y hacer stateful training.
# Aquí haremos una demo conceptual con LSTM stateful usando batch_size fijo.

# ADVERTENCIA: stateful=True requiere manejo cuidadoso de batches y reset_states()
# Este bloque es didáctico y no se recomienda ejecutarlo en producción sin adaptar los datos.

print("Truncated BPTT: revisión teórica y opciones en Keras (stateful LSTM, sequences windows).")


Truncated BPTT: revisión teórica y opciones en Keras (stateful LSTM, sequences windows).


##=== 5.4 Text processing (breve demo) ===

In [13]:
# 5.4 Procesamiento de texto aplicado (ej: torneo o city como texto)
from tensorflow.keras.preprocessing.text import Tokenizer
texts = df['tournament'].fillna("NA").astype(str).values[:2000]  # demo subset
tok = Tokenizer(num_words=5000, oov_token="<OOV>")
tok.fit_on_texts(texts)
seqs_txt = tok.texts_to_sequences(texts)
from tensorflow.keras.preprocessing.sequence import pad_sequences
padded = pad_sequences(seqs_txt, maxlen=10)
print("Ejemplo tokenizado y padded shape:", padded.shape)


Ejemplo tokenizado y padded shape: (2000, 10)


##=== 5.5 Sequence generation (ejemplo simple de char-RNN) ===

In [14]:
# 5.5 Generación de secuencias - ejemplo didáctico (caracteres)
# Dataset pequeño de demostración
sample_text = "football machine learning example"
chars = sorted(list(set(sample_text)))
char2idx = {c:i for i,c in enumerate(chars)}
idx2char = {i:c for c,i in char2idx.items()}

# preparar dataset (char to char)
seq_len = 10
Xg, yg = [], []
for i in range(len(sample_text)-seq_len):
    s = sample_text[i:i+seq_len]
    t = sample_text[i+seq_len]
    Xg.append([char2idx[c] for c in s])
    yg.append(char2idx[t])
Xg = np.array(Xg); yg = np.array(yg)
# pequeño modelo char-RNN
model_char = keras.Sequential([
    layers.Embedding(input_dim=len(chars), output_dim=8, input_length=seq_len),
    layers.LSTM(64),
    layers.Dense(len(chars), activation='softmax')
])
model_char.compile(loss='sparse_categorical_crossentropy', optimizer='adam')
model_char.fit(Xg, yg, epochs=50, verbose=0)
print("Char-RNN entrenado (demo).")




Char-RNN entrenado (demo).


##=== 5.6 Bidirectional RNNs ===

In [15]:
# 5.6 Bidirectional LSTM
from tensorflow.keras.layers import Bidirectional
seq_in = keras.Input(shape=Xs_train.shape[1:], name="seq_bi")
x = Bidirectional(LSTM(64))(seq_in)
x = layers.Dropout(0.3)(x)
stat_in = keras.Input(shape=(Xst_train.shape[1],), name="stat_bi")
s = layers.Dense(32, activation='relu')(stat_in)
c = layers.concatenate([x, s])
o = layers.Dense(n_classes, activation='softmax')(layers.Dense(64, activation='relu')(c))

model_bi = keras.Model([seq_in, stat_in], o)
model_bi.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model_bi.summary()

cb_bi = [EarlyStopping(patience=5, restore_best_weights=True),
         ModelCheckpoint("/content/drive/MyDrive/football_project/models/bi_lstm_best.keras", save_best_only=True)]
hist_bi = model_bi.fit([Xs_train, Xst_train], y_train, validation_split=0.2, epochs=30, batch_size=128, callbacks=cb_bi)


Epoch 1/30
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 23ms/step - accuracy: 0.5209 - loss: 0.9980 - val_accuracy: 0.5181 - val_loss: 0.9889
Epoch 2/30
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 18ms/step - accuracy: 0.5350 - loss: 0.9735 - val_accuracy: 0.5217 - val_loss: 0.9839
Epoch 3/30
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 17ms/step - accuracy: 0.5397 - loss: 0.9728 - val_accuracy: 0.5245 - val_loss: 0.9871
Epoch 4/30
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 22ms/step - accuracy: 0.5338 - loss: 0.9760 - val_accuracy: 0.5221 - val_loss: 0.9858
Epoch 5/30
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 17ms/step - accuracy: 0.5331 - loss: 0.9772 - val_accuracy: 0.5202 - val_loss: 0.9860
Epoch 6/30
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 19ms/step - accuracy: 0.5378 - loss: 0.9715 - val_accuracy: 0.5230 - val_loss: 0.9842
Epoch 7/30
[1m216/216

##=== 5.7 ELMo (opcional) ===

In [16]:
# 5.7 ELMo (opcional) - requiere internet y tensorflow_hub
# NOTA: TF-Hub y ELMo pueden requerir versiones específicas de TF y mucho tiempo/espacio.
print("ELMo es opcional. Para usarlo, instala tensorflow-hub y carga el modelo contextual.")
# Ejemplo (no ejecutarlo si no tienes recursos):
# !pip install -q tensorflow-hub
# import tensorflow_hub as hub
# elmo = hub.load("https://tfhub.dev/google/elmo/3")
# emb = elmo.signatures['default'](tf.constant(["This is a test"]))['elmo']


ELMo es opcional. Para usarlo, instala tensorflow-hub y carga el modelo contextual.


##=== 5.8 Transformer (mini-implementación) ===

In [17]:
# 5.8 Transformer-like small model usando MultiHeadAttention
from tensorflow.keras.layers import MultiHeadAttention, LayerNormalization, Add

def transformer_block(x, head_size=32, num_heads=2, ff_dim=64, dropout=0.1):
    attn_out = MultiHeadAttention(num_heads=num_heads, key_dim=head_size)(x, x)
    attn_out = Dropout(dropout)(attn_out)
    out1 = LayerNormalization(epsilon=1e-6)(Add()([x, attn_out]))
    ff = layers.Dense(ff_dim, activation="relu")(out1)
    ff = layers.Dense(x.shape[-1])(ff)
    ff = Dropout(dropout)(ff)
    return LayerNormalization(epsilon=1e-6)(Add()([out1, ff]))

# Preparar input: transformer expects shape (batch, timesteps, features)
# Xs_train currently (batch, timesteps, 1) - good
seq_input = keras.Input(shape=Xs_train.shape[1:], name="tr_seq")  # (timesteps, 1)
x = layers.Dense(32)(seq_input)  # project to d_model=32
x = transformer_block(x, head_size=16, num_heads=2, ff_dim=64)
x = layers.GlobalAveragePooling1D()(x)

static_in = keras.Input(shape=(Xst_train.shape[1],), name="tr_stat")
s = layers.Dense(32, activation="relu")(static_in)
concat = layers.concatenate([x, s])
h = layers.Dense(64, activation="relu")(concat)
out = layers.Dense(n_classes, activation="softmax")(h)
model_tr = keras.Model([seq_input, static_in], out)
model_tr.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model_tr.summary()

cb_tr = [EarlyStopping(patience=5, restore_best_weights=True),
         ModelCheckpoint("/content/drive/MyDrive/football_project/models/transformer_best.keras", save_best_only=True)]
hist_tr = model_tr.fit([Xs_train, Xst_train], y_train, validation_split=0.2, epochs=50, batch_size=128, callbacks=cb_tr)


NameError: name 'Dropout' is not defined

##=== 5.9 CNN-LSTM (arquitectura combinada) ===

In [18]:
# 5.9 CNN-LSTM: Conv1D -> MaxPool -> LSTM -> concat static
seq_in = keras.Input(shape=Xs_train.shape[1:], name="cnn_lstm_seq")
c = layers.Conv1D(32, 3, activation="relu", padding="same")(seq_in)
c = layers.MaxPooling1D(2)(c)
c = layers.Conv1D(64, 3, activation="relu", padding="same")(c)
c = layers.LSTM(64)(c)
c = layers.Dropout(0.3)(c)

stat_in = keras.Input(shape=(Xst_train.shape[1],), name="cnn_lstm_stat")
s = layers.Dense(32, activation="relu")(stat_in)
m = layers.concatenate([c, s])
h = layers.Dense(64, activation="relu")(m)
out = layers.Dense(n_classes, activation="softmax")(h)

model_cnn_lstm = keras.Model([seq_in, stat_in], out)
model_cnn_lstm.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model_cnn_lstm.summary()

cb_cnnlstm = [EarlyStopping(patience=5, restore_best_weights=True),
              ModelCheckpoint("/content/drive/MyDrive/football_project/models/cnn_lstm_best.keras", save_best_only=True)]
hist_cnnlstm = model_cnn_lstm.fit([Xs_train, Xst_train], y_train, validation_split=0.2, epochs=50, batch_size=128, callbacks=cb_cnnlstm)


Epoch 1/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 13ms/step - accuracy: 0.5091 - loss: 1.0114 - val_accuracy: 0.5187 - val_loss: 0.9881
Epoch 2/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 10ms/step - accuracy: 0.5368 - loss: 0.9755 - val_accuracy: 0.5239 - val_loss: 0.9880
Epoch 3/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 14ms/step - accuracy: 0.5343 - loss: 0.9776 - val_accuracy: 0.5239 - val_loss: 0.9894
Epoch 4/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 10ms/step - accuracy: 0.5327 - loss: 0.9766 - val_accuracy: 0.5249 - val_loss: 0.9824
Epoch 5/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 10ms/step - accuracy: 0.5415 - loss: 0.9703 - val_accuracy: 0.5275 - val_loss: 0.9798
Epoch 6/50
[1m216/216[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 11ms/step - accuracy: 0.5451 - loss: 0.9658 - val_accuracy: 0.5298 - val_loss: 0.9797
Epoch 7/50
[1m216/216

##Celda final — Comparativa rápida de resultados (imprime accuracy de test donde se haya guardado)

In [19]:
# Cargar modelos guardados (si existen) y evaluar en test
models_to_check = {
    "RNN": "/content/drive/MyDrive/football_project/models/rnn_best.keras",
    "LSTM": "/content/drive/MyDrive/football_project/models/lstm_best.keras",
    "GRU": "/content/drive/MyDrive/football_project/models/gru_best.keras",
    "BiLSTM": "/content/drive/MyDrive/football_project/models/bi_lstm_best.keras",
    "TRANSFORMER": "/content/drive/MyDrive/football_project/models/transformer_best.keras",
    "CNN-LSTM": "/content/drive/MyDrive/football_project/models/cnn_lstm_best.keras"
}

for name, path in models_to_check.items():
    if os.path.exists(path):
        m = keras.models.load_model(path)
        loss, acc = m.evaluate([Xs_test, Xst_test], y_test, verbose=0)
        print(f"{name}: Test acc = {acc:.4f}")
    else:
        print(f"{name}: no encontrado en {path}")


RNN: Test acc = 0.5313
LSTM: Test acc = 0.5279
GRU: Test acc = 0.5292
BiLSTM: Test acc = 0.5280
TRANSFORMER: no encontrado en /content/drive/MyDrive/football_project/models/transformer_best.keras
CNN-LSTM: Test acc = 0.5321
