# Action PixelBytes: Catching Insights in Unified Multimodal Sequences

## Description

**Action-PixelBytes** is a model designed to generate text, images, pixel-by-pixel animations, and action-states simultaneously, all as sequences. The project explores a unified embedding that supports coherent multimodal generation and makes it easier for different forms of data to interact.

## Dataset

This project uses the **PixelBytes-PokemonSprites** dataset. Unlike the previous version, **PixelBytes-Pokemon**, it is structured on the fly into a unified format: the data is prepared so that it can be fed directly to model training, which simplifies the embedding process.

## Tokenizer

The tokenizer prepares the data for the model. Here is how it handles each data type:

### Text Processing
Text is first normalized and lowercased, then encoded as ASCII so that all characters are handled uniformly. This simplifies the text before it is turned into a token sequence.
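
The three text steps can be sketched with the standard library alone. This is a minimal illustration, not the tokenizer's actual code: the function name `text_to_ascii_tokens` is hypothetical, NFKD is one assumed choice of normalization, and raw ASCII codes stand in for whatever token ids the real vocabulary assigns.

```python
import unicodedata


def text_to_ascii_tokens(text):
    """Normalize, lowercase, and ASCII-encode text into integer tokens."""
    # NFKD splits accented letters into a base letter plus a combining mark
    decomposed = unicodedata.normalize("NFKD", text)
    # Lowercase, then drop every character that has no ASCII representation
    ascii_bytes = decomposed.lower().encode("ascii", errors="ignore")
    # Assumption: one token per ASCII byte (the real vocabulary may differ)
    return list(ascii_bytes)


tokens = text_to_ascii_tokens("Pokémon")
print(tokens)
```

With this scheme the accent on "é" is dropped rather than replaced, so "Pokémon" and "pokemon" map to the same token sequence.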

### Image Processing
Images, including GIFs, are processed in several steps. Each image is converted to the LAB color space, in which distances between colors track perceived differences more closely than in RGB. Each frame of an animated image is then quantized against a predefined color palette, which reduces the complexity of the data while preserving the essential information.
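
As a rough sketch of this step, the code below converts sRGB pixels to CIELAB with the standard D65 formulas and maps each pixel to the nearest palette entry. The three-color `PALETTE` and both function names are hypothetical; the palette actually used by the tokenizer is not shown in this notebook.

```python
def srgb_to_lab(rgb):
    """Convert an 8-bit sRGB triple to CIELAB (D65 white point)."""
    def to_linear(c):
        c = c / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    r, g, b = (to_linear(c) for c in rgb)
    # Linear RGB -> XYZ, each channel divided by the D65 reference white
    x = (0.4124564 * r + 0.3575761 * g + 0.1804375 * b) / 0.95047
    y = (0.2126729 * r + 0.7151522 * g + 0.0721750 * b) / 1.00000
    z = (0.0193339 * r + 0.1191920 * g + 0.9503041 * b) / 1.08883

    def f(t):
        delta = 6 / 29
        return t ** (1 / 3) if t > delta ** 3 else t / (3 * delta ** 2) + 4 / 29

    fx, fy, fz = f(x), f(y), f(z)
    return (116 * fy - 16, 500 * (fx - fy), 200 * (fy - fz))


def quantize_to_palette(pixels, palette):
    """Map each RGB pixel to the index of the nearest palette color in LAB space."""
    palette_lab = [srgb_to_lab(color) for color in palette]

    def nearest(pixel):
        lab = srgb_to_lab(pixel)
        return min(
            range(len(palette_lab)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(lab, palette_lab[i])),
        )

    return [nearest(p) for p in pixels]


PALETTE = [(0, 0, 0), (255, 255, 255), (255, 0, 0)]  # hypothetical 3-color palette
indices = quantize_to_palette([(20, 20, 20), (240, 240, 240), (250, 10, 10)], PALETTE)
print(indices)
```

For a GIF, the same quantization would simply be applied to every frame in turn, so each frame becomes a grid of palette indices.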

### Action-State Processing
Action-states are normalized so that all values share the same scale, which makes them easier to compare and analyze. They are then quantized against a predefined set of action states, which helps the model relate different actions to one another.
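
One plausible reading of "normalize, then quantize" is min-max scaling followed by snapping to the nearest predefined level. The sketch below assumes a `[-1, 1]` target range and a five-value `LEVELS` list; both are illustrative choices, not values taken from the tokenizer.

```python
def quantize_states(values, low, high, levels):
    """Min-max scale values to [-1, 1], then snap each to the nearest level index."""
    if high <= low:
        raise ValueError("high must be greater than low")
    indices = []
    for v in values:
        clipped = min(max(v, low), high)          # keep outliers inside the range
        scaled = 2.0 * (clipped - low) / (high - low) - 1.0
        nearest = min(range(len(levels)), key=lambda i: abs(levels[i] - scaled))
        indices.append(nearest)
    return indices


LEVELS = [-1.0, -0.5, 0.0, 0.5, 1.0]  # hypothetical set of action-state levels
state_ids = quantize_states([0.0, 5.0, 10.0, 12.0], low=0.0, high=10.0, levels=LEVELS)
print(state_ids)
```

Clipping before scaling means an out-of-range reading such as `12.0` lands on the top level instead of producing an index error.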

### Sequence Creation
Sequences are built from spatial and temporal context: for each sequence, the model takes into account not only the current input but also the preceding inputs. Each entry therefore groups several elements that carry information relevant to the task.
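
To make "spatial and temporal context" concrete, here is a sketch that builds one multi-element entry per pixel of a short animation. The layout is an assumption made for illustration: six elements per entry (chosen to echo `PXBY_DIM = 6` in the configuration cell), made of the same pixel in the previous frame, four already-visited neighbors in the current frame, and the current value. The tokenizer's real neighborhood may be arranged differently.

```python
def context_tokens(frames, pad=0):
    """Return one 6-element context per pixel, in frame-then-raster order."""
    def at(t, y, x):
        # Positions before the first frame or outside the grid read as `pad`
        if t < 0 or y < 0 or x < 0:
            return pad
        frame = frames[t]
        if y >= len(frame) or x >= len(frame[y]):
            return pad
        return frame[y][x]

    contexts = []
    for t, frame in enumerate(frames):
        for y, row in enumerate(frame):
            for x, value in enumerate(row):
                contexts.append((
                    at(t - 1, y, x),      # temporal: same pixel, previous frame
                    at(t, y - 1, x - 1),  # spatial: up-left
                    at(t, y - 1, x),      # spatial: up
                    at(t, y - 1, x + 1),  # spatial: up-right
                    at(t, y, x - 1),      # spatial: left
                    value,                # the current element
                ))
    return contexts


frames = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]  # two 2x2 frames of token ids
ctx = context_tokens(frames)
print(ctx[-1])
```

Only neighbors that precede the current pixel in raster order are used, so the same context can be rebuilt step by step during autoregressive generation.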

## Project Status

The code is still under development. The main tokenizer and data-processing features are implemented, but improvements and optimizations are still to come. The goal is to make the model more robust and efficient for multimodal generation.

## Next Steps

- Finalize the model architecture.
- Implement training and evaluation.
- Optimize model performance.
- Run thorough tests on different types of multimodal data.


In [1]:
#!pip install -q mamba-ssm causal-conv1d ## for GPU (Mambapy included)
!pip install -q git+https://github.com/fabienfrfr/PixelBytes.git@main

In [2]:
# only in kaggle for HF
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
hf_token = user_secrets.get_secret("HF_TOKEN")
# no warning msg during train
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
# our approach
from pixelbytes import *

In [3]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.cuda.amp import autocast, GradScaler

def count_parameters_in_k(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1000

In [4]:
from datasets import load_dataset
#hf_dataset = load_dataset("ffurfaro/PixelBytes-PokemonAll")['train'].train_test_split(test_size=0.1, seed=42)
hf_dataset = load_dataset("ffurfaro/PixelBytes-OptimalControl")['train'].train_test_split(test_size=0.1, seed=42)
train_ds, val_ds = hf_dataset['train'], hf_dataset['test']

Downloading readme:   0%|          | 0.00/373 [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/8.79M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/7058 [00:00<?, ? examples/s]

In [5]:
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
DATA_REDUCTION = {"image":2, "audio":2} # matters for the modality proportions in AR training (audio overfitting risk)
tokenizer = ActionPixelBytesTokenizer(data_slicing=DATA_REDUCTION)
# Parameters
VOCAB_SIZE = tokenizer.vocab_size
EMBED_SIZE = 128
HIDDEN_SIZE = 512
NUM_LAYERS = 2
PXBY_DIM = 6 # tokenizer
AR = True
DIFFUSION = True
BIDIRECTION = True
MODEL_TYPE = "lstm"
BATCH_SIZE = 32
EPOCHS = 100
LEARNING_RATE = 0.001
ACCUMULATION_STEPS = 4
SEQ_LENGTH = 1024
STRIDE = 512

config = ModelConfig(vocab_size=VOCAB_SIZE, embed_size=EMBED_SIZE, hidden_size=HIDDEN_SIZE, bidirectionnal=BIDIRECTION,
                          num_layers=NUM_LAYERS, pxby_dim=PXBY_DIM, auto_regressive=AR, diffusion=DIFFUSION, model_type=MODEL_TYPE)
config

ModelConfig {
  "auto_regressive": true,
  "bidirection": true,
  "diffusion": true,
  "embed_size": 126,
  "hidden_size": 256,
  "num_diffusion_steps": 5,
  "num_layers": 2,
  "pxby_dim": 6,
  "pxby_emb": 21,
  "transformers_version": "4.44.0",
  "vocab_size": 151
}
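
`BATCH_SIZE = 32` combined with `ACCUMULATION_STEPS = 4` suggests an effective batch of 128 samples per optimizer step. The sketch below assumes that `train_model` uses conventional gradient accumulation (each micro-batch loss scaled by `1 / ACCUMULATION_STEPS`, one optimizer step every four batches); the notebook does not show its internals. It uses a toy one-parameter model in plain Python to check that accumulated micro-batch gradients match the full-batch gradient.

```python
def grad_mse(w, batch):
    """Gradient of the mean squared error for the 1-D model y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def accumulated_grad(w, micro_batches):
    """Sum micro-batch gradients, each scaled by 1 / number of steps."""
    steps = len(micro_batches)
    total = 0.0
    for mb in micro_batches:
        total += grad_mse(w, mb) / steps  # same effect as loss / accumulation_steps
    return total


BATCH_SIZE = 32
ACCUMULATION_STEPS = 4
effective_batch = BATCH_SIZE * ACCUMULATION_STEPS  # samples per optimizer step

# Toy data for y = 3x, split into equally sized micro-batches
data = [(float(i), 3.0 * i) for i in range(effective_batch)]
micro = [data[i:i + BATCH_SIZE] for i in range(0, len(data), BATCH_SIZE)]

full = grad_mse(0.5, data)
acc = accumulated_grad(0.5, micro)
print(effective_batch, full, acc)
```

The equality holds because all micro-batches have the same size; with a ragged last batch the two gradients would differ slightly.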

In [6]:
# Model initialization
model = aPxBySequenceModel(config).to(DEVICE)
print(f"The model has {count_parameters_in_k(model):.2f}k trainable parameters.")
# Training setup
optimizer = optim.AdamW(model.parameters(), lr=LEARNING_RATE)
criterion = nn.CrossEntropyLoss(ignore_index=-100)
scaler = GradScaler() if torch.cuda.is_available() else None

The model has 2831.34k trainable parameters.


In [7]:
# Data preparation
from torch.utils.data import DataLoader  # explicit import in case pixelbytes does not re-export it

def dataloading(ds):
    dataset = TokenPxByDataset(ds, tokenizer, SEQ_LENGTH, STRIDE)
    return DataLoader(dataset, batch_size=BATCH_SIZE, collate_fn=collate_fn, shuffle=True)

train_dataloader, val_dataloader = dataloading(train_ds), dataloading(val_ds)
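
`TokenPxByDataset` receives `SEQ_LENGTH` and `STRIDE`, but its internals are not shown here. A common interpretation of such a pair is a sliding window: fixed-length slices whose start offsets advance by the stride. The helper below, `window_starts`, is a hypothetical illustration of that reading, not the dataset's code.

```python
def window_starts(n_tokens, seq_length, stride):
    """Start offsets of fixed-length windows that advance by `stride` tokens."""
    if n_tokens <= seq_length:
        return [0]  # a single (possibly padded) window
    return list(range(0, n_tokens - seq_length + 1, stride))


SEQ_LENGTH, STRIDE = 1024, 512
starts = window_starts(2048, SEQ_LENGTH, STRIDE)
overlap = SEQ_LENGTH - STRIDE  # tokens shared by consecutive windows
print(starts, overlap)
```

With a stride of half the window length, every token away from the edges appears in two windows. In this sketch, tokens past the last full window are dropped; a real dataset might pad them instead.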

In [8]:
# Training
model.train_model(train_dataloader, val_dataloader, optimizer, criterion, DEVICE, scaler, EPOCHS, ACCUMULATION_STEPS)

Evaluating: 100%|██████████| 23/23 [00:02<00:00,  9.15it/s]


Validation Loss: 5.0129, Validation Accuracy: 0.0026


Training: 100%|██████████| 199/199 [00:34<00:00,  5.75it/s]


Epoch 1/100, Train Loss: 1.6482, Train Accuracy: 0.7015


Training: 100%|██████████| 199/199 [00:32<00:00,  6.19it/s]


Epoch 2/100, Train Loss: 1.1524, Train Accuracy: 0.7252


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 3/100, Train Loss: 1.0775, Train Accuracy: 0.7344


Training: 100%|██████████| 199/199 [00:32<00:00,  6.21it/s]


Epoch 4/100, Train Loss: 0.9711, Train Accuracy: 0.7470


Training: 100%|██████████| 199/199 [00:31<00:00,  6.25it/s]


Epoch 5/100, Train Loss: 0.9075, Train Accuracy: 0.7581


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.38it/s]


Validation Loss: 0.8845, Validation Accuracy: 0.7644
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:33<00:00,  5.92it/s]


Epoch 6/100, Train Loss: 0.8614, Train Accuracy: 0.7714


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 7/100, Train Loss: 0.7908, Train Accuracy: 0.7867


Training: 100%|██████████| 199/199 [00:32<00:00,  6.16it/s]


Epoch 8/100, Train Loss: 0.7308, Train Accuracy: 0.8000


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 9/100, Train Loss: 0.6401, Train Accuracy: 0.8292


Training: 100%|██████████| 199/199 [00:31<00:00,  6.25it/s]


Epoch 10/100, Train Loss: 0.5493, Train Accuracy: 0.8572


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.31it/s]


Validation Loss: 0.5021, Validation Accuracy: 0.8738
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:31<00:00,  6.26it/s]


Epoch 11/100, Train Loss: 0.4589, Train Accuracy: 0.8882


Training: 100%|██████████| 199/199 [00:31<00:00,  6.22it/s]


Epoch 12/100, Train Loss: 0.3844, Train Accuracy: 0.9114


Training: 100%|██████████| 199/199 [00:31<00:00,  6.25it/s]


Epoch 13/100, Train Loss: 0.3288, Train Accuracy: 0.9261


Training: 100%|██████████| 199/199 [00:31<00:00,  6.24it/s]


Epoch 14/100, Train Loss: 0.2872, Train Accuracy: 0.9345


Training: 100%|██████████| 199/199 [00:31<00:00,  6.24it/s]


Epoch 15/100, Train Loss: 0.2573, Train Accuracy: 0.9390


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.29it/s]


Validation Loss: 0.2480, Validation Accuracy: 0.9403
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 16/100, Train Loss: 0.2389, Train Accuracy: 0.9410


Training: 100%|██████████| 199/199 [00:32<00:00,  6.19it/s]


Epoch 17/100, Train Loss: 0.2236, Train Accuracy: 0.9426


Training: 100%|██████████| 199/199 [00:31<00:00,  6.27it/s]


Epoch 18/100, Train Loss: 0.2132, Train Accuracy: 0.9436


Training: 100%|██████████| 199/199 [00:31<00:00,  6.24it/s]


Epoch 19/100, Train Loss: 0.2051, Train Accuracy: 0.9442


Training: 100%|██████████| 199/199 [00:31<00:00,  6.24it/s]


Epoch 20/100, Train Loss: 0.1969, Train Accuracy: 0.9451


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.26it/s]


Validation Loss: 0.1967, Validation Accuracy: 0.9450
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 21/100, Train Loss: 0.1912, Train Accuracy: 0.9456


Training: 100%|██████████| 199/199 [00:31<00:00,  6.22it/s]


Epoch 22/100, Train Loss: 0.1864, Train Accuracy: 0.9461


Training: 100%|██████████| 199/199 [00:31<00:00,  6.24it/s]


Epoch 23/100, Train Loss: 0.1826, Train Accuracy: 0.9464


Training: 100%|██████████| 199/199 [00:31<00:00,  6.27it/s]


Epoch 24/100, Train Loss: 0.1778, Train Accuracy: 0.9470


Training: 100%|██████████| 199/199 [00:31<00:00,  6.22it/s]


Epoch 25/100, Train Loss: 0.1745, Train Accuracy: 0.9473


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.29it/s]


Validation Loss: 0.1734, Validation Accuracy: 0.9474
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:32<00:00,  6.21it/s]


Epoch 26/100, Train Loss: 0.1715, Train Accuracy: 0.9477


Training: 100%|██████████| 199/199 [00:32<00:00,  6.20it/s]


Epoch 27/100, Train Loss: 0.1689, Train Accuracy: 0.9479


Training: 100%|██████████| 199/199 [00:32<00:00,  6.16it/s]


Epoch 28/100, Train Loss: 0.1659, Train Accuracy: 0.9484


Training: 100%|██████████| 199/199 [00:32<00:00,  6.22it/s]


Epoch 29/100, Train Loss: 0.1642, Train Accuracy: 0.9485


Training: 100%|██████████| 199/199 [00:32<00:00,  6.22it/s]


Epoch 30/100, Train Loss: 0.1615, Train Accuracy: 0.9490


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.27it/s]


Validation Loss: 0.1589, Validation Accuracy: 0.9491
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:31<00:00,  6.28it/s]


Epoch 31/100, Train Loss: 0.1591, Train Accuracy: 0.9494


Training: 100%|██████████| 199/199 [00:31<00:00,  6.24it/s]


Epoch 32/100, Train Loss: 0.1575, Train Accuracy: 0.9495


Training: 100%|██████████| 199/199 [00:31<00:00,  6.25it/s]


Epoch 33/100, Train Loss: 0.1551, Train Accuracy: 0.9499


Training: 100%|██████████| 199/199 [00:31<00:00,  6.29it/s]


Epoch 34/100, Train Loss: 0.1537, Train Accuracy: 0.9502


Training: 100%|██████████| 199/199 [00:32<00:00,  6.19it/s]


Epoch 35/100, Train Loss: 0.1516, Train Accuracy: 0.9505


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.25it/s]


Validation Loss: 0.1514, Validation Accuracy: 0.9506
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 36/100, Train Loss: 0.1505, Train Accuracy: 0.9506


Training: 100%|██████████| 199/199 [00:31<00:00,  6.28it/s]


Epoch 37/100, Train Loss: 0.1485, Train Accuracy: 0.9510


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 38/100, Train Loss: 0.1472, Train Accuracy: 0.9512


Training: 100%|██████████| 199/199 [00:32<00:00,  6.20it/s]


Epoch 39/100, Train Loss: 0.1457, Train Accuracy: 0.9515


Training: 100%|██████████| 199/199 [00:32<00:00,  6.21it/s]


Epoch 40/100, Train Loss: 0.1439, Train Accuracy: 0.9519


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.24it/s]


Validation Loss: 0.1432, Validation Accuracy: 0.9520
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:32<00:00,  6.19it/s]


Epoch 41/100, Train Loss: 0.1425, Train Accuracy: 0.9521


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 42/100, Train Loss: 0.1415, Train Accuracy: 0.9523


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 43/100, Train Loss: 0.1397, Train Accuracy: 0.9527


Training: 100%|██████████| 199/199 [00:31<00:00,  6.26it/s]


Epoch 44/100, Train Loss: 0.1391, Train Accuracy: 0.9528


Training: 100%|██████████| 199/199 [00:31<00:00,  6.32it/s]


Epoch 45/100, Train Loss: 0.1377, Train Accuracy: 0.9531


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.29it/s]


Validation Loss: 0.1382, Validation Accuracy: 0.9531
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:32<00:00,  6.16it/s]


Epoch 46/100, Train Loss: 0.1363, Train Accuracy: 0.9534


Training: 100%|██████████| 199/199 [00:32<00:00,  6.17it/s]


Epoch 47/100, Train Loss: 0.1354, Train Accuracy: 0.9536


Training: 100%|██████████| 199/199 [00:32<00:00,  6.21it/s]


Epoch 48/100, Train Loss: 0.1345, Train Accuracy: 0.9538


Training: 100%|██████████| 199/199 [00:31<00:00,  6.22it/s]


Epoch 49/100, Train Loss: 0.1333, Train Accuracy: 0.9541


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 50/100, Train Loss: 0.1319, Train Accuracy: 0.9545


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.27it/s]


Validation Loss: 0.1318, Validation Accuracy: 0.9544
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:31<00:00,  6.24it/s]


Epoch 51/100, Train Loss: 0.1313, Train Accuracy: 0.9546


Training: 100%|██████████| 199/199 [00:32<00:00,  6.10it/s]


Epoch 52/100, Train Loss: 0.1303, Train Accuracy: 0.9549


Training: 100%|██████████| 199/199 [00:31<00:00,  6.29it/s]


Epoch 53/100, Train Loss: 0.1295, Train Accuracy: 0.9550


Training: 100%|██████████| 199/199 [00:31<00:00,  6.24it/s]


Epoch 54/100, Train Loss: 0.1285, Train Accuracy: 0.9553


Training: 100%|██████████| 199/199 [00:32<00:00,  6.16it/s]


Epoch 55/100, Train Loss: 0.1279, Train Accuracy: 0.9554


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.28it/s]


Validation Loss: 0.1271, Validation Accuracy: 0.9556
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:31<00:00,  6.22it/s]


Epoch 56/100, Train Loss: 0.1269, Train Accuracy: 0.9556


Training: 100%|██████████| 199/199 [00:32<00:00,  6.22it/s]


Epoch 57/100, Train Loss: 0.1261, Train Accuracy: 0.9559


Training: 100%|██████████| 199/199 [00:31<00:00,  6.26it/s]


Epoch 58/100, Train Loss: 0.1258, Train Accuracy: 0.9560


Training: 100%|██████████| 199/199 [00:31<00:00,  6.22it/s]


Epoch 59/100, Train Loss: 0.1249, Train Accuracy: 0.9562


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 60/100, Train Loss: 0.1236, Train Accuracy: 0.9566


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.26it/s]


Validation Loss: 0.1228, Validation Accuracy: 0.9567
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 61/100, Train Loss: 0.1232, Train Accuracy: 0.9567


Training: 100%|██████████| 199/199 [00:31<00:00,  6.26it/s]


Epoch 62/100, Train Loss: 0.1229, Train Accuracy: 0.9567


Training: 100%|██████████| 199/199 [00:31<00:00,  6.26it/s]


Epoch 63/100, Train Loss: 0.1220, Train Accuracy: 0.9569


Training: 100%|██████████| 199/199 [00:31<00:00,  6.30it/s]


Epoch 64/100, Train Loss: 0.1213, Train Accuracy: 0.9572


Training: 100%|██████████| 199/199 [00:31<00:00,  6.22it/s]


Epoch 65/100, Train Loss: 0.1205, Train Accuracy: 0.9574


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.24it/s]


Validation Loss: 0.1204, Validation Accuracy: 0.9574
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 66/100, Train Loss: 0.1199, Train Accuracy: 0.9576


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 67/100, Train Loss: 0.1196, Train Accuracy: 0.9576


Training: 100%|██████████| 199/199 [00:31<00:00,  6.22it/s]


Epoch 68/100, Train Loss: 0.1189, Train Accuracy: 0.9578


Training: 100%|██████████| 199/199 [00:31<00:00,  6.24it/s]


Epoch 69/100, Train Loss: 0.1184, Train Accuracy: 0.9580


Training: 100%|██████████| 199/199 [00:32<00:00,  6.18it/s]


Epoch 70/100, Train Loss: 0.1180, Train Accuracy: 0.9580


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.26it/s]


Validation Loss: 0.1180, Validation Accuracy: 0.9580
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:32<00:00,  6.22it/s]


Epoch 71/100, Train Loss: 0.1175, Train Accuracy: 0.9582


Training: 100%|██████████| 199/199 [00:32<00:00,  6.19it/s]


Epoch 72/100, Train Loss: 0.1168, Train Accuracy: 0.9584


Training: 100%|██████████| 199/199 [00:31<00:00,  6.26it/s]


Epoch 73/100, Train Loss: 0.1165, Train Accuracy: 0.9585


Training: 100%|██████████| 199/199 [00:31<00:00,  6.29it/s]


Epoch 74/100, Train Loss: 0.1159, Train Accuracy: 0.9586


Training: 100%|██████████| 199/199 [00:31<00:00,  6.25it/s]


Epoch 75/100, Train Loss: 0.1152, Train Accuracy: 0.9588


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.23it/s]


Validation Loss: 0.1140, Validation Accuracy: 0.9589
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:32<00:00,  6.14it/s]


Epoch 76/100, Train Loss: 0.1153, Train Accuracy: 0.9588


Training: 100%|██████████| 199/199 [00:32<00:00,  6.20it/s]


Epoch 77/100, Train Loss: 0.1147, Train Accuracy: 0.9589


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 78/100, Train Loss: 0.1139, Train Accuracy: 0.9592


Training: 100%|██████████| 199/199 [00:31<00:00,  6.23it/s]


Epoch 79/100, Train Loss: 0.1136, Train Accuracy: 0.9592


Training: 100%|██████████| 199/199 [00:32<00:00,  6.13it/s]


Epoch 80/100, Train Loss: 0.1130, Train Accuracy: 0.9595


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.23it/s]


Validation Loss: 0.1122, Validation Accuracy: 0.9594
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:32<00:00,  6.21it/s]


Epoch 81/100, Train Loss: 0.1126, Train Accuracy: 0.9595


Training: 100%|██████████| 199/199 [00:32<00:00,  6.17it/s]


Epoch 82/100, Train Loss: 0.1123, Train Accuracy: 0.9596


Training: 100%|██████████| 199/199 [00:31<00:00,  6.27it/s]


Epoch 83/100, Train Loss: 0.1118, Train Accuracy: 0.9598


Training: 100%|██████████| 199/199 [00:32<00:00,  6.15it/s]


Epoch 84/100, Train Loss: 0.1115, Train Accuracy: 0.9599


Training: 100%|██████████| 199/199 [00:32<00:00,  6.11it/s]


Epoch 85/100, Train Loss: 0.1107, Train Accuracy: 0.9601


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.24it/s]


Validation Loss: 0.1103, Validation Accuracy: 0.9599
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:32<00:00,  6.20it/s]


Epoch 86/100, Train Loss: 0.1105, Train Accuracy: 0.9602


Training: 100%|██████████| 199/199 [00:32<00:00,  6.17it/s]


Epoch 87/100, Train Loss: 0.1101, Train Accuracy: 0.9603


Training: 100%|██████████| 199/199 [00:32<00:00,  6.22it/s]


Epoch 88/100, Train Loss: 0.1098, Train Accuracy: 0.9603


Training: 100%|██████████| 199/199 [00:32<00:00,  6.13it/s]


Epoch 89/100, Train Loss: 0.1095, Train Accuracy: 0.9604


Training: 100%|██████████| 199/199 [00:32<00:00,  6.22it/s]


Epoch 90/100, Train Loss: 0.1086, Train Accuracy: 0.9607


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.23it/s]


Validation Loss: 0.1083, Validation Accuracy: 0.9604
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:32<00:00,  6.15it/s]


Epoch 91/100, Train Loss: 0.1088, Train Accuracy: 0.9606


Training: 100%|██████████| 199/199 [00:32<00:00,  6.14it/s]


Epoch 92/100, Train Loss: 0.1089, Train Accuracy: 0.9605


Training: 100%|██████████| 199/199 [00:32<00:00,  6.19it/s]


Epoch 93/100, Train Loss: 0.1078, Train Accuracy: 0.9609


Training: 100%|██████████| 199/199 [00:32<00:00,  6.19it/s]


Epoch 94/100, Train Loss: 0.1075, Train Accuracy: 0.9610


Training: 100%|██████████| 199/199 [00:32<00:00,  6.15it/s]


Epoch 95/100, Train Loss: 0.1068, Train Accuracy: 0.9612


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.26it/s]


Validation Loss: 0.1076, Validation Accuracy: 0.9609
Model saved to /kaggle/working/lstm_autoregressive_best


Training: 100%|██████████| 199/199 [00:31<00:00,  6.24it/s]


Epoch 96/100, Train Loss: 0.1070, Train Accuracy: 0.9611


Training: 100%|██████████| 199/199 [00:32<00:00,  6.14it/s]


Epoch 97/100, Train Loss: 0.1063, Train Accuracy: 0.9612


Training: 100%|██████████| 199/199 [00:32<00:00,  6.19it/s]


Epoch 98/100, Train Loss: 0.1062, Train Accuracy: 0.9613


Training: 100%|██████████| 199/199 [00:32<00:00,  6.21it/s]


Epoch 99/100, Train Loss: 0.1057, Train Accuracy: 0.9614


Training: 100%|██████████| 199/199 [00:32<00:00,  6.07it/s]


Epoch 100/100, Train Loss: 0.1057, Train Accuracy: 0.9614


Evaluating: 100%|██████████| 23/23 [00:02<00:00, 11.26it/s]

Validation Loss: 0.1052, Validation Accuracy: 0.9614
Model saved to /kaggle/working/lstm_autoregressive_best
Model saved to /kaggle/working/lstm_autoregressive_last
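
The log is consistent with a keep-best policy: validation runs every five epochs, a `_best` checkpoint is written when the validation loss improves, and a `_last` checkpoint is written once at the end. The sketch below illustrates that policy under those assumptions. The function name and the loss values are made up for the example and do not come from `train_model`.

```python
def checkpoint_epochs(evaluations):
    """Return the epochs at which a 'best' checkpoint would be written."""
    best_loss = float("inf")
    saved = []
    for epoch, val_loss in evaluations:
        if val_loss < best_loss:  # strict improvement only
            best_loss = val_loss
            saved.append(epoch)
    return saved


# Illustrative (epoch, validation loss) pairs, not taken from the run above
evals = [(5, 0.50), (10, 0.40), (15, 0.42), (20, 0.39)]
saved_at = checkpoint_epochs(evals)
print(saved_at)
```

Under this policy the `_best` directory always holds the weights with the lowest validation loss seen so far, while `_last` reflects the final epoch regardless of its score.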





In [9]:
import re
from huggingface_hub import HfApi, create_repo, whoami

def push_model_to_hub(repo_name, model_dir, token, subfolder=None):
    """Upload a saved model directory to a Hugging Face model repo."""
    api = HfApi(token=token)
    # Sanitize the subfolder name; skip when None (upload to the repo root)
    if subfolder is not None:
        subfolder = re.sub(r'[^a-zA-Z0-9]+', '_', subfolder).strip('_').lower()

    try:
        create_repo(repo_name, token=token, repo_type="model", exist_ok=True)
        username = whoami(token=token)['name']
        repo_id = f"{username}/{repo_name}"
        print(f"Repository '{repo_id}' created or already exists.")
    except Exception as e:
        print(f"Error creating repository: {e}")
        return

    api.upload_folder(
        folder_path=model_dir,
        repo_id=repo_id,
        repo_type="model",
        path_in_repo=subfolder,
        ignore_patterns=[".*"],  # skip hidden files
        create_pr=False  # commit directly to the main branch
    )
    print(f"Model pushed successfully to {repo_name}, subfolder: {subfolder}")
!ls

lstm_autoregressive_best  lstm_autoregressive_last  training_metrics.csv


In [10]:
# save model
#push_model_to_hub("aPixelBytes-PokemonLSTM", "lstm_autoregressive_last", hf_token, subfolder="lstm_autoregressive2_last")
#push_model_to_hub("aPixelBytes-PokemonLSTM", "lstm_autoregressive_best", hf_token, subfolder="lstm_autoregressive2_best")
#push_model_to_hub("aPixelBytes-OptimalControl", "lstm_autoregressive_last", hf_token, subfolder="lstm_autoregressive_last")
#push_model_to_hub("aPixelBytes-OptimalControl", "lstm_autoregressive_best", hf_token, subfolder="lstm_autoregressive_best")
push_model_to_hub("aPixelBytes-OptimalControl", "lstm_autoregressive_last", hf_token, subfolder="bilstm_ARDM_last")
push_model_to_hub("aPixelBytes-OptimalControl", "lstm_autoregressive_best", hf_token, subfolder="bilstm_ARDM_best")

Repository 'ffurfaro/aPixelBytes-OptimalControl' created or already exists.


model.safetensors:   0%|          | 0.00/11.3M [00:00<?, ?B/s]

Model pushed successfully to aPixelBytes-OptimalControl, subfolder: bilstm_ardm_last
Repository 'ffurfaro/aPixelBytes-OptimalControl' created or already exists.
Model pushed successfully to aPixelBytes-OptimalControl, subfolder: bilstm_ardm_best


In [None]:
# Generation test
test_input = next(iter(val_dataloader))['input_ids'][:1].to(DEVICE)
generated = model.generate(test_input, max_length=100)
print("Generated sequence:", generated)

# Training


model.train_model(train_dataloader, val_dataloader, optimizer, criterion, DEVICE, scaler, EPOCHS, ACCUMULATION_STEPS)
Evaluating: 100%|██████████| 107/107 [00:12<00:00,  8.51it/s]
Validation Loss: 5.0095, Validation Accuracy: 0.0343
Training: 100%|██████████| 1030/1030 [02:05<00:00,  8.21it/s]
Epoch 1/100, Train Loss: 1.3395, Train Accuracy: 0.6827
Training: 100%|██████████| 1030/1030 [02:08<00:00,  8.04it/s]
Epoch 2/100, Train Loss: 0.6533, Train Accuracy: 0.8272
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.97it/s]
Epoch 3/100, Train Loss: 0.5496, Train Accuracy: 0.8461
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.96it/s]
Epoch 4/100, Train Loss: 0.5046, Train Accuracy: 0.8543
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.95it/s]
Epoch 5/100, Train Loss: 0.4811, Train Accuracy: 0.8584
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.69it/s]
Validation Loss: 0.4970, Validation Accuracy: 0.8551
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.95it/s]
Epoch 6/100, Train Loss: 0.4554, Train Accuracy: 0.8637
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.97it/s]
Epoch 7/100, Train Loss: 0.4366, Train Accuracy: 0.8676
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.97it/s]
Epoch 8/100, Train Loss: 0.4214, Train Accuracy: 0.8712
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.93it/s]
Epoch 9/100, Train Loss: 0.4086, Train Accuracy: 0.8744
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.93it/s]
Epoch 10/100, Train Loss: 0.4001, Train Accuracy: 0.8767
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.68it/s]
Validation Loss: 0.4442, Validation Accuracy: 0.8665
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.93it/s]
Epoch 11/100, Train Loss: 0.4114, Train Accuracy: 0.8739
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.92it/s]
Epoch 12/100, Train Loss: 0.3986, Train Accuracy: 0.8782
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.77it/s]
Epoch 13/100, Train Loss: 0.3906, Train Accuracy: 0.8806
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.78it/s]
Epoch 14/100, Train Loss: 0.3767, Train Accuracy: 0.8838
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.78it/s]
Epoch 15/100, Train Loss: 0.3667, Train Accuracy: 0.8866
Evaluating: 100%|██████████| 107/107 [00:15<00:00,  7.11it/s]
Validation Loss: 0.4228, Validation Accuracy: 0.8730
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.78it/s]
Epoch 16/100, Train Loss: 0.3594, Train Accuracy: 0.8887
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.79it/s]
Epoch 17/100, Train Loss: 0.3547, Train Accuracy: 0.8901
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.78it/s]
Epoch 18/100, Train Loss: 0.3473, Train Accuracy: 0.8923
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.78it/s]
Epoch 19/100, Train Loss: 0.3431, Train Accuracy: 0.8938
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.77it/s]
Epoch 20/100, Train Loss: 0.3380, Train Accuracy: 0.8952
Evaluating: 100%|██████████| 107/107 [00:14<00:00,  7.14it/s]
Validation Loss: 0.4059, Validation Accuracy: 0.8782
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.72it/s]
Epoch 21/100, Train Loss: 0.3315, Train Accuracy: 0.8972
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.76it/s]
Epoch 22/100, Train Loss: 0.3269, Train Accuracy: 0.8987
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.87it/s]
Epoch 23/100, Train Loss: 0.3225, Train Accuracy: 0.9001
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 24/100, Train Loss: 0.3179, Train Accuracy: 0.9015
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.88it/s]
Epoch 25/100, Train Loss: 0.3153, Train Accuracy: 0.9025
Evaluating: 100%|██████████| 107/107 [00:14<00:00,  7.62it/s]
Validation Loss: 0.4018, Validation Accuracy: 0.8815
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 26/100, Train Loss: 0.3134, Train Accuracy: 0.9033
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 27/100, Train Loss: 0.3072, Train Accuracy: 0.9051
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 28/100, Train Loss: 0.3038, Train Accuracy: 0.9062
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 29/100, Train Loss: 0.3060, Train Accuracy: 0.9057
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 30/100, Train Loss: 0.2966, Train Accuracy: 0.9086
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.67it/s]
Validation Loss: 0.3882, Validation Accuracy: 0.8862
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 31/100, Train Loss: 0.2935, Train Accuracy: 0.9095
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 32/100, Train Loss: 0.2909, Train Accuracy: 0.9103
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 33/100, Train Loss: 0.2883, Train Accuracy: 0.9111
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 34/100, Train Loss: 0.2852, Train Accuracy: 0.9120
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 35/100, Train Loss: 0.2829, Train Accuracy: 0.9127
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.65it/s]
Validation Loss: 0.3858, Validation Accuracy: 0.8879
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 36/100, Train Loss: 0.2801, Train Accuracy: 0.9137
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 37/100, Train Loss: 0.2783, Train Accuracy: 0.9142
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 38/100, Train Loss: 0.2753, Train Accuracy: 0.9152
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.86it/s]
Epoch 39/100, Train Loss: 0.2754, Train Accuracy: 0.9153
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.73it/s]
Epoch 40/100, Train Loss: 0.2724, Train Accuracy: 0.9161
Evaluating: 100%|██████████| 107/107 [00:14<00:00,  7.14it/s]
Validation Loss: 0.3877, Validation Accuracy: 0.8886
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.73it/s]
Epoch 41/100, Train Loss: 0.2698, Train Accuracy: 0.9170
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.74it/s]
Epoch 42/100, Train Loss: 0.2839, Train Accuracy: 0.9130
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.75it/s]
Epoch 43/100, Train Loss: 0.2672, Train Accuracy: 0.9179
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.76it/s]
Epoch 44/100, Train Loss: 0.2652, Train Accuracy: 0.9184
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.76it/s]
Epoch 45/100, Train Loss: 0.2636, Train Accuracy: 0.9190
Evaluating: 100%|██████████| 107/107 [00:15<00:00,  7.12it/s]
Validation Loss: 0.3898, Validation Accuracy: 0.8888
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.74it/s]
Epoch 46/100, Train Loss: 0.2620, Train Accuracy: 0.9194
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.76it/s]
Epoch 47/100, Train Loss: 0.2603, Train Accuracy: 0.9200
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.70it/s]
Epoch 48/100, Train Loss: 0.2655, Train Accuracy: 0.9184
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.78it/s]
Epoch 49/100, Train Loss: 0.2571, Train Accuracy: 0.9210
Training: 100%|██████████| 1030/1030 [02:11<00:00,  7.85it/s]
Epoch 50/100, Train Loss: 0.2561, Train Accuracy: 0.9214
Evaluating: 100%|██████████| 107/107 [00:14<00:00,  7.60it/s]
Validation Loss: 0.3978, Validation Accuracy: 0.8887
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 51/100, Train Loss: 0.2562, Train Accuracy: 0.9212
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 52/100, Train Loss: 0.2532, Train Accuracy: 0.9223
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 53/100, Train Loss: 0.2546, Train Accuracy: 0.9218
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 54/100, Train Loss: 0.2512, Train Accuracy: 0.9229
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 55/100, Train Loss: 0.2495, Train Accuracy: 0.9234
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.70it/s]
Validation Loss: 0.4078, Validation Accuracy: 0.8869
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 56/100, Train Loss: 0.2493, Train Accuracy: 0.9235
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 57/100, Train Loss: 0.2477, Train Accuracy: 0.9240
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 58/100, Train Loss: 0.2463, Train Accuracy: 0.9245
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 59/100, Train Loss: 0.2547, Train Accuracy: 0.9220
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 60/100, Train Loss: 0.2438, Train Accuracy: 0.9253
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.65it/s]
Validation Loss: 0.4084, Validation Accuracy: 0.8883
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 61/100, Train Loss: 0.2430, Train Accuracy: 0.9255
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 62/100, Train Loss: 0.2424, Train Accuracy: 0.9257
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 63/100, Train Loss: 0.2415, Train Accuracy: 0.9260
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 64/100, Train Loss: 0.2405, Train Accuracy: 0.9263
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 65/100, Train Loss: 0.2394, Train Accuracy: 0.9267
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.70it/s]
Validation Loss: 0.4150, Validation Accuracy: 0.8877
Training: 100%|██████████| 1030/1030 [02:11<00:00,  7.86it/s]
Epoch 66/100, Train Loss: 0.2384, Train Accuracy: 0.9270
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.75it/s]
Epoch 67/100, Train Loss: 0.2385, Train Accuracy: 0.9269
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.77it/s]
Epoch 68/100, Train Loss: 0.2368, Train Accuracy: 0.9276
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.75it/s]
Epoch 69/100, Train Loss: 0.2359, Train Accuracy: 0.9279
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.70it/s]
Epoch 70/100, Train Loss: 0.2670, Train Accuracy: 0.9185
Evaluating: 100%|██████████| 107/107 [00:15<00:00,  7.10it/s]
Validation Loss: 0.4157, Validation Accuracy: 0.8872
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.74it/s]
Epoch 71/100, Train Loss: 0.2360, Train Accuracy: 0.9279
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.73it/s]
Epoch 72/100, Train Loss: 0.2346, Train Accuracy: 0.9284
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.73it/s]
Epoch 73/100, Train Loss: 0.2337, Train Accuracy: 0.9286
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.73it/s]
Epoch 74/100, Train Loss: 0.2332, Train Accuracy: 0.9288
Training: 100%|██████████| 1030/1030 [02:14<00:00,  7.67it/s]
Epoch 75/100, Train Loss: 0.2320, Train Accuracy: 0.9292
Evaluating: 100%|██████████| 107/107 [00:14<00:00,  7.16it/s]
Validation Loss: 0.4266, Validation Accuracy: 0.8876
Training: 100%|██████████| 1030/1030 [02:11<00:00,  7.82it/s]
Epoch 76/100, Train Loss: 0.2317, Train Accuracy: 0.9292
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.86it/s]
Epoch 77/100, Train Loss: 0.2309, Train Accuracy: 0.9295
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.87it/s]
Epoch 78/100, Train Loss: 0.2304, Train Accuracy: 0.9297
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 79/100, Train Loss: 0.2314, Train Accuracy: 0.9294
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 80/100, Train Loss: 0.2283, Train Accuracy: 0.9305
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.66it/s]
Validation Loss: 0.4402, Validation Accuracy: 0.8862
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 81/100, Train Loss: 0.2324, Train Accuracy: 0.9290
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 82/100, Train Loss: 0.2266, Train Accuracy: 0.9311
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.88it/s]
Epoch 83/100, Train Loss: 0.2263, Train Accuracy: 0.9311
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.87it/s]
Epoch 84/100, Train Loss: 0.2262, Train Accuracy: 0.9311
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 85/100, Train Loss: 0.2256, Train Accuracy: 0.9313
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.65it/s]
Validation Loss: 0.4557, Validation Accuracy: 0.8851
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 86/100, Train Loss: 0.2255, Train Accuracy: 0.9313
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 87/100, Train Loss: 0.2242, Train Accuracy: 0.9318
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 88/100, Train Loss: 0.2237, Train Accuracy: 0.9320
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 89/100, Train Loss: 0.2239, Train Accuracy: 0.9319
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.88it/s]
Epoch 90/100, Train Loss: 0.2226, Train Accuracy: 0.9323
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.67it/s]
Validation Loss: 0.4570, Validation Accuracy: 0.8849
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 91/100, Train Loss: 0.2248, Train Accuracy: 0.9317
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 92/100, Train Loss: 0.2220, Train Accuracy: 0.9326
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.80it/s]
Epoch 93/100, Train Loss: 0.2211, Train Accuracy: 0.9329
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.76it/s]
Epoch 94/100, Train Loss: 0.2207, Train Accuracy: 0.9330
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.73it/s]
Epoch 95/100, Train Loss: 0.2200, Train Accuracy: 0.9332
Evaluating: 100%|██████████| 107/107 [00:15<00:00,  7.10it/s]
Validation Loss: 0.4651, Validation Accuracy: 0.8841
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.75it/s]
Epoch 96/100, Train Loss: 0.2195, Train Accuracy: 0.9334
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.75it/s]
Epoch 97/100, Train Loss: 0.2921, Train Accuracy: 0.9125
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.75it/s]
Epoch 98/100, Train Loss: 0.2313, Train Accuracy: 0.9291
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.76it/s]
Epoch 99/100, Train Loss: 0.2237, Train Accuracy: 0.9319
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.77it/s]
Epoch 100/100, Train Loss: 0.2211, Train Accuracy: 0.9329
Evaluating: 100%|██████████| 107/107 [00:15<00:00,  7.09it/s]
Validation Loss: 0.4519, Validation Accuracy: 0.8852
Model saved to /kaggle/working/lstm_autoregressive_last