# Training a Simple Model for TFLite
## A model compatible with tflite-runtime (no Flex operations)

This notebook trains a model **without LSTM layers** so that it converts to TFLite using only built-in operations:

**Features:**
- ✅ Dense layers only (no LSTM)
- ✅ Runs on tflite-runtime without the Flex delegate
- ✅ Does not require full TensorFlow in production
- ✅ Uses the same 4 features as the LSTM model (ts, hr, p0, hora_decimal)

**Trade-off:**
- ⚠️ Likely less accurate than an LSTM (it does not capture complex temporal patterns)
- ✅ Much lighter and faster

In [1]:
# Required imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Flatten
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from tensorflow.keras.optimizers import Adam
import tensorflow as tf
import joblib
import warnings
warnings.filterwarnings('ignore')

print(f"TensorFlow version: {tf.__version__}")

TensorFlow version: 2.20.0


## 1. Load and explore the data

In [2]:
# Load the full dataset
csv_path = "dataset_ml.csv"
df = pd.read_csv(csv_path, sep=";", decimal=".", encoding="utf-8")

# Strip whitespace from the column names
df.columns = df.columns.str.strip()

print(f"✅ Data loaded")
print(f"   Total records: {len(df):,}")
print(f"   Columns: {list(df.columns)}")
print(f"\n📊 First rows:")
df.head(10)

✅ Data loaded
   Total records: 3,549,541
   Columns: ['momento', 'ts', 'td', 'tMin12Horas', 'tMax12Horas', 'tMin24Horas', 'hr', 'p0', 'qfe1', 'qfe2', 'qff', 'qnh', 'tPromedio24h', 'deltaTemp1h', 'deltaPresion1h', 'humedadRelativaCambio']

📊 First rows:


| | momento | ts | td | tMin12Horas | tMax12Horas | tMin24Horas | hr | p0 | qfe1 | qfe2 | qff | qnh | tPromedio24h | deltaTemp1h | deltaPresion1h | humedadRelativaCambio |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2019-01-01 01:00:00 | 21.9 | 3.5 | 21.0 | 32.3 | 32.9 | 30.0 | 950.9 | 951.0 | 951.184 | 1010.0 | 1011.9 | 22.683607 | -1.7 | 0.5 | 1.2 |
| 1 | 2019-01-01 01:01:00 | 21.8 | 3.5 | 21.0 | 32.3 | 32.9 | 30.0 | 950.9 | 951.0 | 951.184 | 1010.0 | 1011.9 | 22.669355 | -1.8 | 0.5 | 1.2 |
| 2 | 2019-01-01 01:02:00 | 21.8 | 3.5 | 21.0 | 32.3 | 32.9 | 30.0 | 950.9 | 951.2 | 951.184 | 1010.2 | 1012.1 | 22.655556 | -1.8 | 0.5 | 1.0 |
| 3 | 2019-01-01 01:03:00 | 21.8 | 3.6 | 21.0 | 32.3 | 32.9 | 30.1 | 950.9 | 951.1 | 951.184 | 1010.1 | 1012.0 | 22.642187 | -1.7 | 0.5 | 1.0 |
| 4 | 2019-01-01 01:04:00 | 21.8 | 3.5 | 21.0 | 32.3 | 32.9 | 30.1 | 950.9 | 951.1 | 951.184 | 1010.1 | 1012.0 | 22.629231 | -1.7 | 0.5 | 0.9 |
| 5 | 2019-01-01 01:05:00 | 21.7 | 3.5 | 21.0 | 32.3 | 32.9 | 30.1 | 950.9 | 951.1 | 951.184 | 1010.1 | 1012.0 | 22.615152 | -1.7 | 0.4 | 1.0 |
| 6 | 2019-01-01 01:06:00 | 21.7 | 3.3 | 21.0 | 32.3 | 32.9 | 29.7 | 950.9 | 951.0 | 951.184 | 1010.1 | 1011.9 | 22.601493 | -1.7 | 0.4 | 0.8 |
| 7 | 2019-01-01 01:07:00 | 21.8 | 3.3 | 21.0 | 32.3 | 32.9 | 29.7 | 951.0 | 951.1 | 951.184 | 1010.1 | 1012.0 | 22.589706 | -1.6 | 0.6 | 0.7 |
| 8 | 2019-01-01 01:08:00 | 21.8 | 3.3 | 21.0 | 32.3 | 32.9 | 29.9 | 951.0 | 951.1 | 951.184 | 1010.1 | 1012.0 | 22.578261 | -1.6 | 0.5 | 1.0 |
| 9 | 2019-01-01 01:09:00 | 21.7 | 3.5 | 21.0 | 32.3 | 32.9 | 30.1 | 951.0 | 951.2 | 951.184 | 1010.2 | 1012.1 | 22.565714 | -1.6 | 0.5 | 1.1 |


## 2. Prepare the data

In [3]:
# Convert the timestamp to a decimal hour (e.g. 01:30 -> 1.5)
df["momento"] = pd.to_datetime(df["momento"], errors='coerce')
df["hora_decimal"] = df["momento"].dt.hour + df["momento"].dt.minute / 60.0
df = df.drop(columns=["momento"])

# Select features (same as the LSTM model)
feature_cols = ['ts', 'hr', 'p0', 'hora_decimal']

# Check that every feature column is present
missing = [col for col in feature_cols if col not in df.columns]
if missing:
    print(f"❌ Missing columns: {missing}")
else:
    print(f"✅ All columns available")

# Extract and clean: force numeric types, then drop rows with NaN or infinite values
df_selected = df[feature_cols].copy()
df_selected = df_selected.apply(pd.to_numeric, errors='coerce')
df_selected = df_selected.replace([np.inf, -np.inf], np.nan).dropna()

print(f"\n✅ Data prepared")
print(f"   Records after cleaning: {len(df_selected):,}")
print(f"   Features: {feature_cols}")

data = df_selected.values
print(f"   Shape: {data.shape}")

✅ All columns available

✅ Data prepared
   Records after cleaning: 3,549,541
   Features: ['ts', 'hr', 'p0', 'hora_decimal']
   Shape: (3549541, 4)


In [4]:
# Scale the data
# Note: the scaler is fitted on the full dataset, before the train/validation split
scaler = StandardScaler()
data_scaled = scaler.fit_transform(data)

print("✅ Data scaled with StandardScaler")
print(f"   Shape: {data_scaled.shape}")

# Save the scaler
scaler_tflite_path = "scaler_4_features_tflite.pkl"
joblib.dump(scaler, scaler_tflite_path)
print(f"\n💾 Scaler saved: {scaler_tflite_path}")

✅ Data scaled with StandardScaler
   Shape: (3549541, 4)

💾 Scaler saved: scaler_4_features_tflite.pkl
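
The cell above computes the scaling statistics from every row, including rows that later land in the validation split. As a minimal sketch of one alternative, the statistics can be computed from the training portion only and then applied to all rows. The example uses plain NumPy on synthetic data; `ajustar_escalado` and `aplicar_escalado` are hypothetical helper names, the 0.8 ratio mirrors the split used in section 4, and the population standard deviation (`ddof=0`) is chosen because that is what `StandardScaler` uses.

```python
import numpy as np

def ajustar_escalado(datos_train):
    """Return per-column mean and standard deviation (ddof=0) of the training rows."""
    media = datos_train.mean(axis=0)
    escala = datos_train.std(axis=0)
    escala[escala == 0.0] = 1.0  # constant columns: avoid dividing by zero
    return media, escala

def aplicar_escalado(datos, media, escala):
    """Standardize every row with statistics computed elsewhere."""
    return (datos - media) / escala

# Synthetic stand-in for the (ts, hr, p0, hora_decimal) matrix; values are made up
rng = np.random.default_rng(0)
datos = rng.normal(loc=[20.0, 50.0, 950.0, 12.0],
                   scale=[8.0, 20.0, 5.0, 7.0],
                   size=(1000, 4))

n_train = int(len(datos) * 0.8)
media, escala = ajustar_escalado(datos[:n_train])   # training rows only
datos_scaled = aplicar_escalado(datos, media, escala)

print(datos_scaled.shape)
```

With this variant the training rows have zero mean and unit variance exactly, while the validation rows are only approximately standardized, which is the situation the deployed model will also face.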


## 3. Create sequences (no LSTM, but keeping the temporal structure)

In [5]:
def crear_secuencias_flatten(datos, n_pasos, columna_objetivo=0):
    """
    Build flattened sequences for a Dense model (no LSTM).
    
    Args:
        datos: Scaled array (n_samples, n_features)
        n_pasos: Number of timesteps per window
        columna_objetivo: Index of the column to predict (0 = ts)
    
    Returns:
        X: Flattened array (n_sequences, n_pasos * n_features)
        y: Target values (n_sequences,)
    """
    X, y = [], []
    for i in range(n_pasos, len(datos)):
        # Take the window of the previous n_pasos rows and flatten it
        ventana = datos[i - n_pasos:i].flatten()
        X.append(ventana)
        # The target is the row immediately after the window
        y.append(datos[i, columna_objetivo])
    return np.array(X), np.array(y)

# Configuration
n_pasos = 24  # Same as the LSTM model
columna_ts = 0

X, y = crear_secuencias_flatten(data_scaled, n_pasos, columna_ts)

print(f"✅ Sequences created (flattened for Dense)")
print(f"   X shape: {X.shape} (sequences, flattened_features)")
print(f"   y shape: {y.shape}")
print(f"   Features per sample: {X.shape[1]} = {n_pasos} timesteps × 4 features")

✅ Sequences created (flattened for Dense)
   X shape: (3549517, 96) (sequences, flattened_features)
   y shape: (3549517,)
   Features per sample: 96 = 24 timesteps × 4 features
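
The Python loop above visits each of the roughly 3.5 million rows one at a time. As a sketch of a vectorized alternative, NumPy's `sliding_window_view` can build the same windows without an explicit loop. `crear_secuencias_vectorizado` is a hypothetical name, not part of the notebook, and the function assumes `datos` has more rows than `n_pasos`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def crear_secuencias_vectorizado(datos, n_pasos, columna_objetivo=0):
    """Vectorized equivalent of the loop-based window builder (sketch)."""
    n_features = datos.shape[1]
    # Shape: (n_samples - n_pasos + 1, n_features, n_pasos); the window axis comes last
    ventanas = sliding_window_view(datos, window_shape=n_pasos, axis=0)
    # Drop the final window (no following row to predict) and put timesteps before features
    ventanas = ventanas[:-1].transpose(0, 2, 1)
    # Flatten each window in the same timestep-major order as ndarray.flatten()
    X = ventanas.reshape(len(ventanas), n_pasos * n_features)
    y = datos[n_pasos:, columna_objetivo].copy()
    return X, y

# Small check against the loop-based construction
datos = np.arange(40, dtype=float).reshape(10, 4)
X, y = crear_secuencias_vectorizado(datos, n_pasos=3)
X_ref = np.array([datos[i - 3:i].flatten() for i in range(3, len(datos))])
y_ref = np.array([datos[i, 0] for i in range(3, len(datos))])

print(X.shape)                    # (7, 12)
print(np.array_equal(X, X_ref))   # True
print(np.array_equal(y, y_ref))   # True
```

The final `reshape` still materializes the full `X` array, so peak memory is similar to the loop version; the gain is in construction time.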


## 4. Split the data

In [6]:
# 80/20 chronological split (no shuffling: the last 20% of the windows is used for validation)
split_ratio = 0.8
n_train = int(len(X) * split_ratio)

X_train, X_val = X[:n_train], X[n_train:]
y_train, y_val = y[:n_train], y[n_train:]

print(f"✅ Split complete")
print(f"   Training:   {len(X_train):,} ({split_ratio*100:.0f}%)")
print(f"   Validation: {len(X_val):,} ({(1-split_ratio)*100:.0f}%)")

✅ Split complete
   Training:   2,839,613 (80%)
   Validation: 709,904 (20%)


## 5. Build the Dense model (compatible with pure TFLite)

In [7]:
# Fully Dense model (no LSTM)
model = Sequential([
    Dense(128, activation='relu', input_shape=(X.shape[1],)),
    Dropout(0.2),
    Dense(64, activation='relu'),
    Dropout(0.2),
    Dense(32, activation='relu'),
    Dense(1)  # Output: temperature (scaled ts)
])

model.compile(
    optimizer=Adam(learning_rate=0.001),
    loss='mse',
    metrics=['mae']
)

print("✅ Dense model built (TFLite built-in ops only)")
model.summary()

✅ Dense model built (TFLite built-in ops only)
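
The saved output does not include the table printed by `model.summary()`. The parameter count can still be worked out by hand: each Dense layer has `inputs × units` weights plus `units` biases, and Dropout layers have no parameters. The following sketch does that arithmetic for the architecture above; `contar_parametros_dense` is a hypothetical helper name.

```python
def contar_parametros_dense(n_entradas, unidades_por_capa):
    """Count weights and biases for a stack of fully connected layers."""
    detalle = []
    for unidades in unidades_por_capa:
        detalle.append(n_entradas * unidades + unidades)  # weights + biases
        n_entradas = unidades  # this layer's output feeds the next layer
    return detalle, sum(detalle)

# 96 inputs = 24 timesteps x 4 features; layer widths as in the model above
detalle, total = contar_parametros_dense(96, [128, 64, 32, 1])

print(detalle)  # [12416, 8256, 2080, 33]
print(total)    # 22785
print(f"{total * 4 / 1024:.1f} KB as float32")  # 89.0 KB as float32
```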


## 6. Train the model (2 epochs as a quick test)

In [8]:
# Callbacks
# With only 2 epochs, patience=10 never stops training early;
# the best weights are still restored at the end (see the log below)
early_stop = EarlyStopping(
    monitor='val_loss',
    patience=10,
    restore_best_weights=True,
    verbose=1
)

checkpoint = ModelCheckpoint(
    'modelo_simple_tflite_best.h5',
    monitor='val_loss',
    save_best_only=True,
    verbose=1
)

# Training
print("🚀 Starting training (2 epochs)...\n")
history = model.fit(
    X_train, y_train,
    validation_data=(X_val, y_val),
    epochs=2,  # Only 2 epochs as a test
    batch_size=32,
    callbacks=[early_stop, checkpoint],
    verbose=1
)

print("\n✅ Training complete")

🚀 Starting training (2 epochs)...

Epoch 1/2
Epoch 1: val_loss improved from None to 0.08933, saving model to modelo_simple_tflite_best.h5
88738/88738 ━━━━━━━━━━━━━━━━━━━━ 141s 2ms/step - loss: 0.0029 - mae: 0.0348 - val_loss: 0.0893 - val_mae: 0.2702
Epoch 2/2
Epoch 2: val_loss did not improve from 0.08933
88738/88738 ━━━━━━━━━━━━━━━━━━━━ 149s 2ms/step - loss: 0.0013 - mae: 0.0237 - val_loss: 0.0957 - val_mae: 0.2843
Restoring model weights from the end of the best epoch: 1.

✅ Training complete


## 7. Evaluate the model

In [9]:
# Evaluate on the validation set (metrics are in scaled units)
loss, mae = model.evaluate(X_val, y_val, verbose=0)
print(f"📊 VALIDATION RESULTS")
print(f"   Loss (MSE): {loss:.6f}")
print(f"   MAE:        {mae:.6f}")

# Predictions
y_pred_scaled = model.predict(X_val, verbose=0)

# Undo the scaling: the scaler expects 4 columns, so place the values
# in column 0 (ts) of a dummy array and keep only that column afterwards
dummy_pred = np.zeros((len(y_pred_scaled), 4))
dummy_pred[:, 0] = y_pred_scaled.flatten()
y_pred_real = scaler.inverse_transform(dummy_pred)[:, 0]

dummy_true = np.zeros((len(y_val), 4))
dummy_true[:, 0] = y_val
y_true_real = scaler.inverse_transform(dummy_true)[:, 0]

mae_real = np.mean(np.abs(y_pred_real - y_true_real))
print(f"\n🌡️  MAE on the real scale: {mae_real:.3f} °C")

📊 VALIDATION RESULTS
   Loss (MSE): 0.089333
   MAE:        0.270160

🌡️  MAE on the real scale: 2.109 °C
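
Because standardization is applied per column, undoing it for the target alone only needs that column's mean and scale: `y_real = y_scaled * scale + mean`. It follows that the MAE in °C is the scaled MAE multiplied by the scale of `ts`; the two values reported above (2.109 and 0.270160) suggest a scale of about 7.8 °C. The sketch below checks both statements on made-up numbers. The `media` and `escala` arrays are hypothetical, not the notebook's fitted values, and `invertir_columna` is a hypothetical helper name.

```python
import numpy as np

# Hypothetical statistics for (ts, hr, p0, hora_decimal); NOT the fitted values
media = np.array([18.0, 55.0, 950.0, 12.0])
escala = np.array([7.8, 22.0, 4.5, 6.9])

def invertir_columna(valores_scaled, media, escala, columna=0):
    """Undo standardization for a single column: x = z * scale + mean."""
    return np.asarray(valores_scaled) * escala[columna] + media[columna]

y_true_scaled = np.array([-0.5, 0.0, 1.2])
y_pred_scaled = np.array([-0.3, 0.1, 0.9])

# Dummy-array route, written out as the per-column arithmetic of an inverse transform
dummy = np.zeros((len(y_pred_scaled), 4))
dummy[:, 0] = y_pred_scaled
y_pred_dummy = (dummy * escala + media)[:, 0]

# Direct route using only column 0
y_pred_real = invertir_columna(y_pred_scaled, media, escala)
y_true_real = invertir_columna(y_true_scaled, media, escala)

mae_scaled = np.mean(np.abs(y_pred_scaled - y_true_scaled))
mae_real = np.mean(np.abs(y_pred_real - y_true_real))

print(np.allclose(y_pred_dummy, y_pred_real))        # True
print(np.isclose(mae_real, mae_scaled * escala[0]))  # True
```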


## 8. Save the Keras model

In [10]:
# Save the model as .h5
model_h5_path = "modelo_simple_tflite.h5"
model.save(model_h5_path)
print(f"✅ Model saved: {model_h5_path}")

✅ Model saved: modelo_simple_tflite.h5


## 9. Convert to TFLite (built-in ops only)

In [11]:
# Convert to pure TFLite (no Flex ops)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
# Optimize.DEFAULT without a representative dataset applies dynamic-range quantization
converter.optimizations = [tf.lite.Optimize.DEFAULT]
# Restrict the converter to built-in ops so conversion fails if a Flex op would be needed
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS]

tflite_model = converter.convert()

# Save the .tflite file
tflite_path = "modelo_simple_tflite.tflite"
with open(tflite_path, 'wb') as f:
    f.write(tflite_model)

# Compare file sizes (the .h5 file typically also stores optimizer state,
# so the reduction is not due to quantization alone)
import os
size_h5 = os.path.getsize(model_h5_path) / 1024  # KB
size_tflite = os.path.getsize(tflite_path) / 1024  # KB

print(f"✅ Conversion to TFLite complete!")
print(f"   📦 .h5:     {size_h5:.2f} KB")
print(f"   📦 .tflite: {size_tflite:.2f} KB")
print(f"   🎯 Reduction: {((size_h5 - size_tflite) / size_h5 * 100):.1f}%")
print(f"\n🎉 Model runs on tflite-runtime alone (without TensorFlow)")

INFO:tensorflow:Assets written to: C:\Users\benja\AppData\Local\Temp\tmp_hjcscbi\assets

Saved artifact at 'C:\Users\benja\AppData\Local\Temp\tmp_hjcscbi'. The following endpoints are available:

* Endpoint 'serve'
  args_0 (POSITIONAL_ONLY): TensorSpec(shape=(None, 96), dtype=tf.float32, name='keras_tensor')
Output Type:
  TensorSpec(shape=(None, 1), dtype=tf.float32, name=None)
Captures:
  1695795968528: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1695795969872: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1695795968336: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1695795969488: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1695795969680: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1695795968144: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1695795968912: TensorSpec(shape=(), dtype=tf.resource, name=None)
  1695795969104: TensorSpec(shape=(), dtype=tf.resource, name=None)
✅ Conversion to TFLite complete!
   📦 .h5:     304.48 KB
   📦 .tflite: 27.89 KB
   🎯 Reduction: 90.8%

🎉 Model runs on tflite-runtime alone (without TensorFlow)

## 10. Verify TFLite compatibility

In [12]:
# Test the TFLite model
interpreter = tf.lite.Interpreter(model_path=tflite_path)
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

print("✅ TFLite model verified")
print(f"\n📋 Model details:")
print(f"   Input shape:  {input_details[0]['shape']}")
print(f"   Output shape: {output_details[0]['shape']}")
print(f"   Input dtype:  {input_details[0]['dtype']}")

# Run a test prediction on the first validation sample
test_input = X_val[0:1].astype(np.float32)
interpreter.set_tensor(input_details[0]['index'], test_input)
interpreter.invoke()
tflite_pred = interpreter.get_tensor(output_details[0]['index'])[0][0]

# Compare with the Keras prediction (a small difference is expected after quantization)
keras_pred = model.predict(test_input, verbose=0)[0][0]

print(f"\n🔮 Test prediction (scaled units):")
print(f"   Keras:  {keras_pred:.6f}")
print(f"   TFLite: {tflite_pred:.6f}")
print(f"   Diff:   {abs(keras_pred - tflite_pred):.6f}")
print(f"\n✅ Model working correctly!")

✅ TFLite model verified

📋 Model details:
   Input shape:  [ 1 96]
   Output shape: [1 1]
   Input dtype:  <class 'numpy.float32'>

🔮 Test prediction (scaled units):
   Keras:  -0.548105
   TFLite: -0.547578
   Diff:   0.000528

✅ Model working correctly!


## 11. Final summary

In [13]:
print("="*60)
print("🎉 TFLITE-COMPATIBLE MODEL CREATED SUCCESSFULLY")
print("="*60)
print(f"\n📦 Generated files:")
print(f"   1. {model_h5_path}")
print(f"   2. {tflite_path}")
print(f"   3. {scaler_tflite_path}")
print(f"\n✅ Characteristics:")
print(f"   • Architecture: Dense (no LSTM)")
print(f"   • Compatible: tflite-runtime (without TensorFlow)")
print(f"   • Features: 4 (ts, hr, p0, hora_decimal)")
print(f"   • Timesteps: {n_pasos}")
print(f"   • MAE: {mae_real:.3f} °C")
print(f"\n🚀 Ready to use on a Raspberry Pi with only tflite-runtime")
print("="*60)

🎉 TFLITE-COMPATIBLE MODEL CREATED SUCCESSFULLY

📦 Generated files:
   1. modelo_simple_tflite.h5
   2. modelo_simple_tflite.tflite
   3. scaler_4_features_tflite.pkl

✅ Characteristics:
   • Architecture: Dense (no LSTM)
   • Compatible: tflite-runtime (without TensorFlow)
   • Features: 4 (ts, hr, p0, hora_decimal)
   • Timesteps: 24
   • MAE: 2.109 °C

🚀 Ready to use on a Raspberry Pi with only tflite-runtime
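
For reference, inference on the device could look roughly like the sketch below. It is not part of the notebook: `preparar_entrada`, `predecir_ts` and `cargar_interprete` are hypothetical helper names, and `media` and `escala` are assumed to be the scaler's `mean_` and `scale_` arrays exported as plain NumPy arrays. Loading the `.pkl` scaler with joblib would require scikit-learn on the device, which the plain arrays avoid. The window must hold the 24 most recent rows, oldest first, in the column order (ts, hr, p0, hora_decimal).

```python
import numpy as np

N_PASOS = 24     # timesteps per window, as in training
N_FEATURES = 4   # ts, hr, p0, hora_decimal

def preparar_entrada(ventana, media, escala):
    """Scale a (24, 4) window and flatten it to the (1, 96) float32 model input."""
    ventana = np.asarray(ventana, dtype=np.float64)
    if ventana.shape != (N_PASOS, N_FEATURES):
        raise ValueError(f"expected shape {(N_PASOS, N_FEATURES)}, got {ventana.shape}")
    scaled = (ventana - media) / escala
    # Row-major flattening matches ndarray.flatten() used when building the training set
    return scaled.reshape(1, N_PASOS * N_FEATURES).astype(np.float32)

def predecir_ts(interpreter, entrada, media, escala):
    """Run one inference and convert the scaled output back to degrees Celsius."""
    input_details = interpreter.get_input_details()
    output_details = interpreter.get_output_details()
    interpreter.set_tensor(input_details[0]["index"], entrada)
    interpreter.invoke()
    pred_scaled = float(interpreter.get_tensor(output_details[0]["index"])[0][0])
    return pred_scaled * escala[0] + media[0]  # undo scaling for column 0 (ts)

def cargar_interprete(ruta_modelo="modelo_simple_tflite.tflite"):
    """Load the model with tflite-runtime (imported lazily; full TensorFlow is not needed)."""
    from tflite_runtime.interpreter import Interpreter
    interpreter = Interpreter(model_path=ruta_modelo)
    interpreter.allocate_tensors()
    return interpreter
```

On the device, `cargar_interprete()` would be called once at start-up, and each new prediction would then be `predecir_ts(interpreter, preparar_entrada(ventana, media, escala), media, escala)`.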
