# Action PixelBytes: Catching Insights in Unified Multimodal Sequences

## Description

**Action-PixelBytes** is a model designed to generate text, images, pixel-by-pixel animations, and action-states simultaneously as sequences. The project explores a unified embedding that enables coherent multimodal generation and makes it easier for different kinds of data to interact.

## Dataset

This project uses the **PixelBytes-PokemonSprites** dataset. Unlike the previous version, **PixelBytes-Pokemon**, this version is structured on the fly into a unified format. The data are therefore prepared so that they can be used directly to train the model, which simplifies the embedding process.

## Tokenizer

The tokenizer prepares the data for the model. Here is how it handles each type of data:

### Text Processing
The text is first normalized and converted to lowercase. It is then encoded as ASCII so that all characters are handled uniformly. This step simplifies the text before it is turned into a sequence of tokens.
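
The text steps above can be sketched with the standard library alone. This is a minimal illustration, not the tokenizer's actual code: the NFKD normalization form, the choice to drop non-ASCII characters, and the helper name `text_to_ascii_bytes` are all assumptions.

```python
import unicodedata

def text_to_ascii_bytes(text):
    """Normalize, lowercase, and ASCII-encode text into one-byte tokens."""
    # Assumption: NFKD splits an accented character into a base letter plus
    # a combining mark, so the ASCII step keeps the letter and drops the mark.
    normalized = unicodedata.normalize("NFKD", text).lower()
    # Assumption: characters with no ASCII form are silently dropped.
    ascii_bytes = normalized.encode("ascii", errors="ignore")
    # One token per byte, in the same style as entries such as b'\n' or b'0'.
    return [bytes([b]) for b in ascii_bytes]

tokens = text_to_ascii_bytes("Pokémon\n")
print(tokens)
```

Each resulting one-byte token can then be looked up in a byte vocabulary to obtain an integer id.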

### Image Processing
Images, including GIFs, are processed in several steps. Each image is first converted to the LAB color space, in which distances between colors track perceived differences more closely than in RGB. Each frame of an animated image is then quantized against a predefined color palette, which reduces the complexity of the data while preserving the essential information.
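
The quantization step can be illustrated as a nearest-color lookup. The sketch below assumes the pixels have already been converted to (L, a, b) triples and uses squared Euclidean distance; the palette values and the helper name `quantize_frame` are hypothetical and are not taken from the PixelBytes code.

```python
def quantize_frame(frame, palette):
    """Map each pixel (a 3-tuple) to the index of its nearest palette color."""
    def nearest(pixel):
        # Squared Euclidean distance is enough to rank the candidates.
        return min(
            range(len(palette)),
            key=lambda i: sum((p - c) ** 2 for p, c in zip(pixel, palette[i])),
        )
    return [[nearest(pixel) for pixel in row] for row in frame]

# Hypothetical 3-color palette and a 2x2 frame, both given as (L, a, b) triples.
palette = [(0.0, 0.0, 0.0), (50.0, 60.0, 40.0), (100.0, 0.0, 0.0)]
frame = [
    [(2.0, 1.0, -1.0), (98.0, 0.5, 0.5)],
    [(48.0, 55.0, 42.0), (10.0, 0.0, 0.0)],
]
indices = quantize_frame(frame, palette)
print(indices)
```

For an animated image, the same function would be applied to every frame in turn, giving one grid of palette indices per frame.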

### Action-State Processing
Action-states are normalized so that all values are on the same scale, which makes them easier to compare and analyze. The states are then quantized against a predefined set of action-states, which helps the model capture the relationships between different actions.
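
A small sketch of normalization followed by quantization is given below. It assumes min-max scaling to [-1, 1] and evenly spaced levels; the actual scaling and the layout of the predefined action-state set may differ, and `make_levels`, `normalize`, and `quantize` are hypothetical helper names.

```python
def make_levels(n):
    """Return n evenly spaced levels covering [-1, 1] (assumed layout)."""
    return [-1.0 + 2.0 * i / (n - 1) for i in range(n)]

def normalize(values):
    """Min-max scale values to [-1, 1]; a constant signal maps to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def quantize(values, levels):
    """Replace each value by the index of the closest level."""
    return [
        min(range(len(levels)), key=lambda i: abs(levels[i] - v))
        for v in values
    ]

levels = make_levels(5)            # 5 levels keep the example readable
states = [0.0, 2.5, 5.0, 10.0]     # raw action-state values
normalized = normalize(states)
indices = quantize(normalized, levels)
print(normalized, indices)
```

With more levels the quantization error shrinks, at the cost of a larger vocabulary.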

### Sequence Creation
Sequences are built using a spatial and temporal context: for each sequence, the model takes into account not only the current input but also the preceding ones. Each input is therefore made up of several elements that carry the information relevant to the task.
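
One way to picture a spatial and temporal context is to attach to every cell of a frame its already-seen neighbors and the same cell in the previous frame. The neighborhood below (up-left, up, up-right, left, previous frame, current value) is an assumption chosen only for illustration; the exact context used by the tokenizer is not specified here, and `PAD` and `context_vectors` are hypothetical names.

```python
PAD = 0  # hypothetical padding token for positions outside the frame

def context_vectors(frames):
    """For each cell, gather [up-left, up, up-right, left, previous-frame, current]."""
    def at(frame, r, c):
        # Out-of-range positions and a missing previous frame yield PAD.
        if frame is None or r < 0 or c < 0 or c >= len(frame[0]):
            return PAD
        return frame[r][c]

    out = []
    for t, frame in enumerate(frames):
        prev = frames[t - 1] if t > 0 else None
        for r, row in enumerate(frame):
            for c, value in enumerate(row):
                out.append([
                    at(frame, r - 1, c - 1),  # spatial: up-left
                    at(frame, r - 1, c),      # spatial: up
                    at(frame, r - 1, c + 1),  # spatial: up-right
                    at(frame, r, c - 1),      # spatial: left
                    at(prev, r, c),           # temporal: same cell, previous frame
                    value,                    # current token
                ])
    return out

# Two 2x2 frames of token ids.
frames = [
    [[1, 2], [3, 4]],
    [[5, 6], [7, 8]],
]
vectors = context_vectors(frames)
for v in vectors:
    print(v)
```

Every input to the model is then a short vector rather than a single token, which is what lets one sequence carry both spatial and temporal information.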

## Project Status

The code is still under development. The main features of the tokenizer and of the data processing are implemented, but improvements and optimizations are still to come. The goal is to make the model more robust and efficient for multimodal generation.

## Next Steps

- Finalize the model architecture.
- Implement training and evaluation.
- Optimize the model's performance.
- Run thorough tests on different types of multimodal data.


In [1]:
#!pip install -q mamba-ssm causal-conv1d ## for GPU (Mambapy included)
!pip install -q git+https://github.com/fabienfrfr/PixelBytes.git@main

In [2]:
# only in kaggle for HF
from kaggle_secrets import UserSecretsClient
user_secrets = UserSecretsClient()
hf_token = user_secrets.get_secret("HF_TOKEN")
# no warning msg during train
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)
# our approach
from pixelbytes import *

In [3]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.cuda.amp import autocast, GradScaler

def count_parameters_in_k(model):
    return sum(p.numel() for p in model.parameters() if p.requires_grad) / 1000

In [4]:
from datasets import load_dataset
#hf_dataset = load_dataset("ffurfaro/PixelBytes-PokemonAll")['train'].train_test_split(test_size=0.1, seed=42)
hf_dataset = load_dataset("ffurfaro/PixelBytes-OptimalControl")['train'].train_test_split(test_size=0.1, seed=42)
train_ds, val_ds = hf_dataset['train'], hf_dataset['test']

Downloading readme:   0%|          | 0.00/372 [00:00<?, ?B/s]

Downloading data:   0%|          | 0.00/12.2M [00:00<?, ?B/s]

Generating train split:   0%|          | 0/9370 [00:00<?, ? examples/s]

In [5]:
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
DATA_REDUCTION = {"image":2, "audio":1} # controls the proportion of each data type in autoregressive (AR) training (risk of overfitting on audio)
#tokenizer = ActionPixelBytesTokenizer(data_slicing=DATA_REDUCTION)
MIN_BYTES = [b'\x00', b'\t', b'\n', b'0', b'1']
MIN_PALETTE = generate_palette(num_colors=5)
ACTION_STATE= generate_action_space(141)
tokenizer = ActionPixelBytesTokenizer(BYTES=MIN_BYTES, PALETTE=MIN_PALETTE, ACTION_STATE=ACTION_STATE, data_slicing=DATA_REDUCTION)
# Model parameter
VOCAB_SIZE = tokenizer.vocab_size
EMBED_SIZE = 128
HIDDEN_SIZE = 256 #512
NUM_LAYERS = 2
PXBY_DIM = 6 # tokenizer
OBJECTIVE = 2 # 0=predict, 1=autoregressive, 2=diffusion
BIDIRECTION = True
MODEL_TYPE = "lstm"
# Train parameter
BATCH_SIZE = 32
EPOCHS = 100
LEARNING_RATE = 0.001
DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
ACCUMULATION_STEPS = 4
SEQ_LENGTH = 512 # 1024
STRIDE = 256 # 512

config = ModelConfig(vocab_size=VOCAB_SIZE, embed_size=EMBED_SIZE, hidden_size=HIDDEN_SIZE, bidirection=BIDIRECTION,
                          num_layers=NUM_LAYERS, pxby_dim=PXBY_DIM, objective=OBJECTIVE, model_type=MODEL_TYPE)
config

ModelConfig {
  "bidirection": true,
  "custom_model": null,
  "embed_size": 126,
  "hidden_size": 128,
  "num_diffusion_steps": 5,
  "num_layers": 2,
  "objective": "diffusion",
  "pxby_dim": 6,
  "pxby_emb": 21,
  "transformers_version": "4.44.0",
  "vocab_size": 151
}

In [6]:
# Data preparation
from torch.utils.data import DataLoader
train_token_ds = TokenPxByDataset(train_ds, tokenizer, SEQ_LENGTH, STRIDE)
val_token_ds = TokenPxByDataset(val_ds, tokenizer, SEQ_LENGTH, STRIDE)
# dataloading
train_dataloader = DataLoader(train_token_ds, batch_size=BATCH_SIZE, collate_fn=collate_fn, shuffle=True)
val_dataloader = DataLoader(val_token_ds, batch_size=BATCH_SIZE, collate_fn=collate_fn, shuffle=True)

In [7]:
unbalance_problem = True # True for audio-only or text-only data; set to False for video-only data (the weighting is not adapted to datasets with few images and would need tuning)

In [11]:
# Model initialization
model = aPxBySequenceModel(config).to(DEVICE)
print(f"The model has {count_parameters_in_k(model):.2f}k trainable parameters.")
# Training parameters
weights = torch.ones(config.vocab_size, device=DEVICE) # num_classes
if unbalance_problem :
    #unique_classes, counts = torch.unique(train_token_ds.tokenized_data[0][0]["input_ids"], return_counts=True)
    #unique_classes, counts = torch.unique(train_token_ds.tokenized_data[0][0]["labels"], return_counts=True) # if predictive
    #weights[0] = 1./(2*counts[0]/counts[1:].float().max())
    weights[0] = 1./120 # approximate (but uniform)
    print('[INFO] Minimize calculus weight of 0')
else : print('[INFO] Verify if balanced problem')
optimizer = optim.AdamW(model.parameters(), lr=LEARNING_RATE)
criterion = nn.CrossEntropyLoss(weight=weights, ignore_index=-100)
scaler = GradScaler() if torch.cuda.is_available() else None

The model has 893.42k trainable parameters.
[INFO] Minimize calculus weight of 0
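
To see what setting the weight of class 0 to 1/120 does to the loss, the weighted mean that a class-weighted cross-entropy computes can be reproduced in plain Python. This is an illustrative sketch with made-up logits, not the training code; it leaves out the `ignore_index` handling, and `weighted_cross_entropy` is a hypothetical helper name.

```python
import math

def weighted_cross_entropy(logits, targets, weights):
    """Weighted mean of per-sample negative log-likelihoods."""
    total, norm = 0.0, 0.0
    for row, y in zip(logits, targets):
        # Numerically stable log-sum-exp of the logits.
        m = max(row)
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        nll = log_z - row[y]
        # Each sample contributes in proportion to the weight of its target class.
        total += weights[y] * nll
        norm += weights[y]
    return total / norm

# Made-up example: an easy class-0 sample and a hard class-1 sample.
logits = [[2.0, 0.0, 0.0], [0.0, 0.0, 2.0]]
targets = [0, 1]
uniform = weighted_cross_entropy(logits, targets, [1.0, 1.0, 1.0])
downweighted = weighted_cross_entropy(logits, targets, [1.0 / 120, 1.0, 1.0])
print(f"uniform={uniform:.4f} downweighted={downweighted:.4f}")
```

With the reduced weight, samples whose target is class 0 barely move the loss, so the gradient is driven by the less frequent classes.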


In [12]:
# Training
model.train_model(train_dataloader, val_dataloader, EPOCHS, optimizer, criterion, ACCUMULATION_STEPS)

Evaluating: 100%|██████████| 206/206 [00:04<00:00, 45.46it/s]


Validation Loss: 5.0165, Validation Accuracy: 0.0080


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.42it/s]


Epoch 1/100, Train Loss: 1.9344, Train Accuracy: 0.8459


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.45it/s]


Epoch 2/100, Train Loss: 0.8384, Train Accuracy: 0.9151


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.42it/s]


Epoch 3/100, Train Loss: 0.6836, Train Accuracy: 0.9256


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.41it/s]


Epoch 4/100, Train Loss: 0.5754, Train Accuracy: 0.9343


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.41it/s]


Epoch 5/100, Train Loss: 0.4781, Train Accuracy: 0.9425


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.22it/s]


Validation Loss: 0.4524, Validation Accuracy: 0.9450
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.37it/s]


Epoch 6/100, Train Loss: 0.4181, Train Accuracy: 0.9481


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.37it/s]


Epoch 7/100, Train Loss: 0.3759, Train Accuracy: 0.9523


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.36it/s]


Epoch 8/100, Train Loss: 0.3472, Train Accuracy: 0.9552


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.40it/s]


Epoch 9/100, Train Loss: 0.3277, Train Accuracy: 0.9571


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.31it/s]


Epoch 10/100, Train Loss: 0.3108, Train Accuracy: 0.9589


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.04it/s]


Validation Loss: 0.3149, Validation Accuracy: 0.9592
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.50it/s]


Epoch 11/100, Train Loss: 0.2955, Train Accuracy: 0.9607


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.38it/s]


Epoch 12/100, Train Loss: 0.2898, Train Accuracy: 0.9614


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.38it/s]


Epoch 13/100, Train Loss: 0.2798, Train Accuracy: 0.9626


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.42it/s]


Epoch 14/100, Train Loss: 0.2709, Train Accuracy: 0.9637


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.42it/s]


Epoch 15/100, Train Loss: 0.2654, Train Accuracy: 0.9644


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 44.86it/s]


Validation Loss: 0.2647, Validation Accuracy: 0.9652
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.51it/s]


Epoch 16/100, Train Loss: 0.2611, Train Accuracy: 0.9650


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.47it/s]


Epoch 17/100, Train Loss: 0.2548, Train Accuracy: 0.9657


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.46it/s]


Epoch 18/100, Train Loss: 0.2510, Train Accuracy: 0.9662


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.53it/s]


Epoch 19/100, Train Loss: 0.2476, Train Accuracy: 0.9665


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.49it/s]


Epoch 20/100, Train Loss: 0.2445, Train Accuracy: 0.9670


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.34it/s]


Validation Loss: 0.2540, Validation Accuracy: 0.9669
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.42it/s]


Epoch 21/100, Train Loss: 0.2413, Train Accuracy: 0.9675


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.51it/s]


Epoch 22/100, Train Loss: 0.2405, Train Accuracy: 0.9675


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.48it/s]


Epoch 23/100, Train Loss: 0.2368, Train Accuracy: 0.9679


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.46it/s]


Epoch 24/100, Train Loss: 0.2333, Train Accuracy: 0.9683


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.41it/s]


Epoch 25/100, Train Loss: 0.2326, Train Accuracy: 0.9684


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.26it/s]


Validation Loss: 0.2396, Validation Accuracy: 0.9678
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.41it/s]


Epoch 26/100, Train Loss: 0.2284, Train Accuracy: 0.9689


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.45it/s]


Epoch 27/100, Train Loss: 0.2305, Train Accuracy: 0.9687


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.46it/s]


Epoch 28/100, Train Loss: 0.2270, Train Accuracy: 0.9692


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.49it/s]


Epoch 29/100, Train Loss: 0.2247, Train Accuracy: 0.9694


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.49it/s]


Epoch 30/100, Train Loss: 0.2236, Train Accuracy: 0.9695


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.16it/s]


Validation Loss: 0.2241, Validation Accuracy: 0.9699
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.49it/s]


Epoch 31/100, Train Loss: 0.2227, Train Accuracy: 0.9697


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.44it/s]


Epoch 32/100, Train Loss: 0.2212, Train Accuracy: 0.9699


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.45it/s]


Epoch 33/100, Train Loss: 0.2182, Train Accuracy: 0.9703


Training: 100%|██████████| 1852/1852 [01:51<00:00, 16.57it/s]


Epoch 34/100, Train Loss: 0.2197, Train Accuracy: 0.9701


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.49it/s]


Epoch 35/100, Train Loss: 0.2165, Train Accuracy: 0.9705


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.37it/s]


Validation Loss: 0.2211, Validation Accuracy: 0.9701
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:51<00:00, 16.58it/s]


Epoch 36/100, Train Loss: 0.2159, Train Accuracy: 0.9706


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.46it/s]


Epoch 37/100, Train Loss: 0.2168, Train Accuracy: 0.9705


Training: 100%|██████████| 1852/1852 [01:51<00:00, 16.56it/s]


Epoch 38/100, Train Loss: 0.2145, Train Accuracy: 0.9708


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.45it/s]


Epoch 39/100, Train Loss: 0.2140, Train Accuracy: 0.9709


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.48it/s]


Epoch 40/100, Train Loss: 0.2136, Train Accuracy: 0.9709


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.22it/s]


Validation Loss: 0.2190, Validation Accuracy: 0.9707
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.49it/s]


Epoch 41/100, Train Loss: 0.2104, Train Accuracy: 0.9714


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.41it/s]


Epoch 42/100, Train Loss: 0.2117, Train Accuracy: 0.9712


Training: 100%|██████████| 1852/1852 [01:51<00:00, 16.55it/s]


Epoch 43/100, Train Loss: 0.2109, Train Accuracy: 0.9713


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.48it/s]


Epoch 44/100, Train Loss: 0.2105, Train Accuracy: 0.9714


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.52it/s]


Epoch 45/100, Train Loss: 0.2070, Train Accuracy: 0.9718


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.54it/s]


Validation Loss: 0.2116, Validation Accuracy: 0.9715
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.51it/s]


Epoch 46/100, Train Loss: 0.2085, Train Accuracy: 0.9716


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.50it/s]


Epoch 47/100, Train Loss: 0.2049, Train Accuracy: 0.9722


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.47it/s]


Epoch 48/100, Train Loss: 0.2115, Train Accuracy: 0.9715


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.52it/s]


Epoch 49/100, Train Loss: 0.2068, Train Accuracy: 0.9718


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.47it/s]


Epoch 50/100, Train Loss: 0.2066, Train Accuracy: 0.9719


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 45.98it/s]


Validation Loss: 0.2130, Validation Accuracy: 0.9712


Training: 100%|██████████| 1852/1852 [01:51<00:00, 16.56it/s]


Epoch 51/100, Train Loss: 0.2057, Train Accuracy: 0.9721


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.44it/s]


Epoch 52/100, Train Loss: 0.2065, Train Accuracy: 0.9719


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.49it/s]


Epoch 53/100, Train Loss: 0.2037, Train Accuracy: 0.9723


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.49it/s]


Epoch 54/100, Train Loss: 0.2063, Train Accuracy: 0.9720


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.53it/s]


Epoch 55/100, Train Loss: 0.2043, Train Accuracy: 0.9722


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.32it/s]


Validation Loss: 0.2044, Validation Accuracy: 0.9724
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.42it/s]


Epoch 56/100, Train Loss: 0.2020, Train Accuracy: 0.9725


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.36it/s]


Epoch 57/100, Train Loss: 0.2033, Train Accuracy: 0.9723


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.43it/s]


Epoch 58/100, Train Loss: 0.2019, Train Accuracy: 0.9724


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.41it/s]


Epoch 59/100, Train Loss: 0.2011, Train Accuracy: 0.9726


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.42it/s]


Epoch 60/100, Train Loss: 0.2001, Train Accuracy: 0.9727


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.44it/s]


Validation Loss: 0.2055, Validation Accuracy: 0.9725


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.48it/s]


Epoch 61/100, Train Loss: 0.2002, Train Accuracy: 0.9727


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.28it/s]


Epoch 62/100, Train Loss: 0.1974, Train Accuracy: 0.9731


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.44it/s]


Epoch 63/100, Train Loss: 0.2003, Train Accuracy: 0.9727


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.33it/s]


Epoch 64/100, Train Loss: 0.1979, Train Accuracy: 0.9730


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.33it/s]


Epoch 65/100, Train Loss: 0.1974, Train Accuracy: 0.9731


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 45.91it/s]


Validation Loss: 0.2090, Validation Accuracy: 0.9720


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.30it/s]


Epoch 66/100, Train Loss: 0.1981, Train Accuracy: 0.9730


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.38it/s]


Epoch 67/100, Train Loss: 0.1985, Train Accuracy: 0.9729


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.50it/s]


Epoch 68/100, Train Loss: 0.1974, Train Accuracy: 0.9731


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.46it/s]


Epoch 69/100, Train Loss: 0.1943, Train Accuracy: 0.9734


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.46it/s]


Epoch 70/100, Train Loss: 0.1982, Train Accuracy: 0.9727


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.30it/s]


Validation Loss: 0.2011, Validation Accuracy: 0.9725
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.45it/s]


Epoch 71/100, Train Loss: 0.1972, Train Accuracy: 0.9730


Training: 100%|██████████| 1852/1852 [01:53<00:00, 16.37it/s]


Epoch 72/100, Train Loss: 0.1969, Train Accuracy: 0.9731


Training: 100%|██████████| 1852/1852 [01:51<00:00, 16.56it/s]


Epoch 73/100, Train Loss: 0.1971, Train Accuracy: 0.9731


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.53it/s]


Epoch 74/100, Train Loss: 0.1982, Train Accuracy: 0.9730


Training: 100%|██████████| 1852/1852 [01:51<00:00, 16.58it/s]


Epoch 75/100, Train Loss: 0.1951, Train Accuracy: 0.9733


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.24it/s]


Validation Loss: 0.2056, Validation Accuracy: 0.9725


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.50it/s]


Epoch 76/100, Train Loss: 0.1942, Train Accuracy: 0.9734


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.42it/s]


Epoch 77/100, Train Loss: 0.1955, Train Accuracy: 0.9734


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.44it/s]


Epoch 78/100, Train Loss: 0.1949, Train Accuracy: 0.9734


Training: 100%|██████████| 1852/1852 [01:51<00:00, 16.58it/s]


Epoch 79/100, Train Loss: 0.1945, Train Accuracy: 0.9734


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.49it/s]


Epoch 80/100, Train Loss: 0.1939, Train Accuracy: 0.9735


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.29it/s]


Validation Loss: 0.2004, Validation Accuracy: 0.9729
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.51it/s]


Epoch 81/100, Train Loss: 0.1966, Train Accuracy: 0.9731


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.51it/s]


Epoch 82/100, Train Loss: 0.1912, Train Accuracy: 0.9738


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.47it/s]


Epoch 83/100, Train Loss: 0.1933, Train Accuracy: 0.9735


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.41it/s]


Epoch 84/100, Train Loss: 0.1951, Train Accuracy: 0.9733


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.48it/s]


Epoch 85/100, Train Loss: 0.1915, Train Accuracy: 0.9737


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.28it/s]


Validation Loss: 0.1951, Validation Accuracy: 0.9734
Model saved to /kaggle/working/lstm_diffusion_best


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.40it/s]


Epoch 86/100, Train Loss: 0.1940, Train Accuracy: 0.9735


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.48it/s]


Epoch 87/100, Train Loss: 0.1929, Train Accuracy: 0.9735


Training: 100%|██████████| 1852/1852 [01:51<00:00, 16.54it/s]


Epoch 88/100, Train Loss: 0.1916, Train Accuracy: 0.9737


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.48it/s]


Epoch 89/100, Train Loss: 0.1964, Train Accuracy: 0.9732


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.50it/s]


Epoch 90/100, Train Loss: 0.1924, Train Accuracy: 0.9736


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.35it/s]


Validation Loss: 0.2001, Validation Accuracy: 0.9730


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.45it/s]


Epoch 91/100, Train Loss: 0.1922, Train Accuracy: 0.9736


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.42it/s]


Epoch 92/100, Train Loss: 0.1928, Train Accuracy: 0.9736


Training: 100%|██████████| 1852/1852 [01:51<00:00, 16.56it/s]


Epoch 93/100, Train Loss: 0.1904, Train Accuracy: 0.9739


Training: 100%|██████████| 1852/1852 [01:51<00:00, 16.56it/s]


Epoch 94/100, Train Loss: 0.1903, Train Accuracy: 0.9739


Training: 100%|██████████| 1852/1852 [01:52<00:00, 16.48it/s]


Epoch 95/100, Train Loss: 0.1925, Train Accuracy: 0.9737


Evaluating: 100%|██████████| 206/206 [00:04<00:00, 46.34it/s]


Validation Loss: 0.1975, Validation Accuracy: 0.9732





In [None]:
import re
from huggingface_hub import HfApi, create_repo, whoami

def push_model_to_hub(repo_name, model_dir, token, subfolder=None):
    """Upload the folder `model_dir` to the Hub repo `repo_name`, optionally under `subfolder`."""
    api = HfApi(token=token)
    # Sanitize the subfolder name; with None the files go to the repository root
    if subfolder is not None:
        subfolder = re.sub(r'[^a-zA-Z0-9]+', '_', subfolder).strip('_').lower()

    try:
        create_repo(repo_name, token=token, repo_type="model", exist_ok=True)
        username = whoami(token=token)['name']
        repo_id = f"{username}/{repo_name}"
        print(f"Repository '{repo_id}' created or already exists.")
    except Exception as e:
        print(f"Error creating repository: {e}")
        return

    api.upload_folder(
        folder_path=model_dir,
        repo_id=repo_id,
        repo_type="model",
        path_in_repo=subfolder,
        ignore_patterns=[".*"],  # skip hidden files
        create_pr=False  # commit directly to the main branch
    )
    print(f"Model pushed successfully to {repo_name}, subfolder: {subfolder}")
!ls

In [16]:
# save model
#push_model_to_hub("aPixelBytes-Pokemon", "lstm_autoregressive_last", hf_token, subfolder="lstm_autoregressive2_last")
#push_model_to_hub("aPixelBytes-Pokemon", "lstm_autoregressive_best", hf_token, subfolder="lstm_autoregressive2_best")
#push_model_to_hub("aPixelBytes-OptimalControl", "lstm_autoregressive_last", hf_token, subfolder="lstm_autoregressive_last")
#push_model_to_hub("aPixelBytes-OptimalControl", "lstm_autoregressive_best", hf_token, subfolder="lstm_autoregressive_best")
push_model_to_hub("aPixelBytes-OptimalControl", "lstm_diffusion_last", hf_token, subfolder="bilstm_diffusion_last")
push_model_to_hub("aPixelBytes-OptimalControl", "lstm_diffusion_best", hf_token, subfolder="bilstm_diffusion_best")

No files have been modified since last commit. Skipping to prevent empty commit.


Repository 'ffurfaro/aPixelBytes-OptimalControl' created or already exists.
Model pushed successfully to aPixelBytes-OptimalControl, subfolder: bilstm_diffusion_last
Repository 'ffurfaro/aPixelBytes-OptimalControl' created or already exists.


No files have been modified since last commit. Skipping to prevent empty commit.


Model pushed successfully to aPixelBytes-OptimalControl, subfolder: bilstm_diffusion_best


In [None]:
# Generation test
test_input = next(iter(val_dataloader))['input_ids'][:1].to(DEVICE)
generated = model.generate(test_input, max_length=100)
print("Generated sequence:", generated)

# Training

lstm_autoregressive_best  lstm_autoregressive_last  training_metrics.csv

model.train_model(train_dataloader, val_dataloader, optimizer, criterion, DEVICE, scaler, EPOCHS, ACCUMULATION_STEPS)
Evaluating: 100%|██████████| 107/107 [00:12<00:00,  8.51it/s]
Validation Loss: 5.0095, Validation Accuracy: 0.0343
Training: 100%|██████████| 1030/1030 [02:05<00:00,  8.21it/s]
Epoch 1/100, Train Loss: 1.3395, Train Accuracy: 0.6827
Training: 100%|██████████| 1030/1030 [02:08<00:00,  8.04it/s]
Epoch 2/100, Train Loss: 0.6533, Train Accuracy: 0.8272
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.97it/s]
Epoch 3/100, Train Loss: 0.5496, Train Accuracy: 0.8461
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.96it/s]
Epoch 4/100, Train Loss: 0.5046, Train Accuracy: 0.8543
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.95it/s]
Epoch 5/100, Train Loss: 0.4811, Train Accuracy: 0.8584
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.69it/s]
Validation Loss: 0.4970, Validation Accuracy: 0.8551
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.95it/s]
Epoch 6/100, Train Loss: 0.4554, Train Accuracy: 0.8637
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.97it/s]
Epoch 7/100, Train Loss: 0.4366, Train Accuracy: 0.8676
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.97it/s]
Epoch 8/100, Train Loss: 0.4214, Train Accuracy: 0.8712
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.93it/s]
Epoch 9/100, Train Loss: 0.4086, Train Accuracy: 0.8744
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.93it/s]
Epoch 10/100, Train Loss: 0.4001, Train Accuracy: 0.8767
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.68it/s]
Validation Loss: 0.4442, Validation Accuracy: 0.8665
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:09<00:00,  7.93it/s]
Epoch 11/100, Train Loss: 0.4114, Train Accuracy: 0.8739
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.92it/s]
Epoch 12/100, Train Loss: 0.3986, Train Accuracy: 0.8782
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.77it/s]
Epoch 13/100, Train Loss: 0.3906, Train Accuracy: 0.8806
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.78it/s]
Epoch 14/100, Train Loss: 0.3767, Train Accuracy: 0.8838
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.78it/s]
Epoch 15/100, Train Loss: 0.3667, Train Accuracy: 0.8866
Evaluating: 100%|██████████| 107/107 [00:15<00:00,  7.11it/s]
Validation Loss: 0.4228, Validation Accuracy: 0.8730
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.78it/s]
Epoch 16/100, Train Loss: 0.3594, Train Accuracy: 0.8887
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.79it/s]
Epoch 17/100, Train Loss: 0.3547, Train Accuracy: 0.8901
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.78it/s]
Epoch 18/100, Train Loss: 0.3473, Train Accuracy: 0.8923
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.78it/s]
Epoch 19/100, Train Loss: 0.3431, Train Accuracy: 0.8938
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.77it/s]
Epoch 20/100, Train Loss: 0.3380, Train Accuracy: 0.8952
Evaluating: 100%|██████████| 107/107 [00:14<00:00,  7.14it/s]
Validation Loss: 0.4059, Validation Accuracy: 0.8782
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.72it/s]
Epoch 21/100, Train Loss: 0.3315, Train Accuracy: 0.8972
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.76it/s]
Epoch 22/100, Train Loss: 0.3269, Train Accuracy: 0.8987
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.87it/s]
Epoch 23/100, Train Loss: 0.3225, Train Accuracy: 0.9001
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 24/100, Train Loss: 0.3179, Train Accuracy: 0.9015
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.88it/s]
Epoch 25/100, Train Loss: 0.3153, Train Accuracy: 0.9025
Evaluating: 100%|██████████| 107/107 [00:14<00:00,  7.62it/s]
Validation Loss: 0.4018, Validation Accuracy: 0.8815
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 26/100, Train Loss: 0.3134, Train Accuracy: 0.9033
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 27/100, Train Loss: 0.3072, Train Accuracy: 0.9051
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 28/100, Train Loss: 0.3038, Train Accuracy: 0.9062
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 29/100, Train Loss: 0.3060, Train Accuracy: 0.9057
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 30/100, Train Loss: 0.2966, Train Accuracy: 0.9086
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.67it/s]
Validation Loss: 0.3882, Validation Accuracy: 0.8862
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 31/100, Train Loss: 0.2935, Train Accuracy: 0.9095
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 32/100, Train Loss: 0.2909, Train Accuracy: 0.9103
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 33/100, Train Loss: 0.2883, Train Accuracy: 0.9111
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 34/100, Train Loss: 0.2852, Train Accuracy: 0.9120
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 35/100, Train Loss: 0.2829, Train Accuracy: 0.9127
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.65it/s]
Validation Loss: 0.3858, Validation Accuracy: 0.8879
Model saved to /kaggle/working/lstm_autoregressive_best
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 36/100, Train Loss: 0.2801, Train Accuracy: 0.9137
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 37/100, Train Loss: 0.2783, Train Accuracy: 0.9142
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 38/100, Train Loss: 0.2753, Train Accuracy: 0.9152
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.86it/s]
Epoch 39/100, Train Loss: 0.2754, Train Accuracy: 0.9153
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.73it/s]
Epoch 40/100, Train Loss: 0.2724, Train Accuracy: 0.9161
Evaluating: 100%|██████████| 107/107 [00:14<00:00,  7.14it/s]
Validation Loss: 0.3877, Validation Accuracy: 0.8886
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.73it/s]
Epoch 41/100, Train Loss: 0.2698, Train Accuracy: 0.9170
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.74it/s]
Epoch 42/100, Train Loss: 0.2839, Train Accuracy: 0.9130
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.75it/s]
Epoch 43/100, Train Loss: 0.2672, Train Accuracy: 0.9179
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.76it/s]
Epoch 44/100, Train Loss: 0.2652, Train Accuracy: 0.9184
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.76it/s]
Epoch 45/100, Train Loss: 0.2636, Train Accuracy: 0.9190
Evaluating: 100%|██████████| 107/107 [00:15<00:00,  7.12it/s]
Validation Loss: 0.3898, Validation Accuracy: 0.8888
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.74it/s]
Epoch 46/100, Train Loss: 0.2620, Train Accuracy: 0.9194
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.76it/s]
Epoch 47/100, Train Loss: 0.2603, Train Accuracy: 0.9200
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.70it/s]
Epoch 48/100, Train Loss: 0.2655, Train Accuracy: 0.9184
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.78it/s]
Epoch 49/100, Train Loss: 0.2571, Train Accuracy: 0.9210
Training: 100%|██████████| 1030/1030 [02:11<00:00,  7.85it/s]
Epoch 50/100, Train Loss: 0.2561, Train Accuracy: 0.9214
Evaluating: 100%|██████████| 107/107 [00:14<00:00,  7.60it/s]
Validation Loss: 0.3978, Validation Accuracy: 0.8887
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 51/100, Train Loss: 0.2562, Train Accuracy: 0.9212
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 52/100, Train Loss: 0.2532, Train Accuracy: 0.9223
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 53/100, Train Loss: 0.2546, Train Accuracy: 0.9218
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 54/100, Train Loss: 0.2512, Train Accuracy: 0.9229
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 55/100, Train Loss: 0.2495, Train Accuracy: 0.9234
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.70it/s]
Validation Loss: 0.4078, Validation Accuracy: 0.8869
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 56/100, Train Loss: 0.2493, Train Accuracy: 0.9235
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 57/100, Train Loss: 0.2477, Train Accuracy: 0.9240
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 58/100, Train Loss: 0.2463, Train Accuracy: 0.9245
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 59/100, Train Loss: 0.2547, Train Accuracy: 0.9220
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 60/100, Train Loss: 0.2438, Train Accuracy: 0.9253
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.65it/s]
Validation Loss: 0.4084, Validation Accuracy: 0.8883
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 61/100, Train Loss: 0.2430, Train Accuracy: 0.9255
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 62/100, Train Loss: 0.2424, Train Accuracy: 0.9257
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 63/100, Train Loss: 0.2415, Train Accuracy: 0.9260
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 64/100, Train Loss: 0.2405, Train Accuracy: 0.9263
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.91it/s]
Epoch 65/100, Train Loss: 0.2394, Train Accuracy: 0.9267
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.70it/s]
Validation Loss: 0.4150, Validation Accuracy: 0.8877
Training: 100%|██████████| 1030/1030 [02:11<00:00,  7.86it/s]
Epoch 66/100, Train Loss: 0.2384, Train Accuracy: 0.9270
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.75it/s]
Epoch 67/100, Train Loss: 0.2385, Train Accuracy: 0.9269
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.77it/s]
Epoch 68/100, Train Loss: 0.2368, Train Accuracy: 0.9276
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.75it/s]
Epoch 69/100, Train Loss: 0.2359, Train Accuracy: 0.9279
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.70it/s]
Epoch 70/100, Train Loss: 0.2670, Train Accuracy: 0.9185
Evaluating: 100%|██████████| 107/107 [00:15<00:00,  7.10it/s]
Validation Loss: 0.4157, Validation Accuracy: 0.8872
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.74it/s]
Epoch 71/100, Train Loss: 0.2360, Train Accuracy: 0.9279
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.73it/s]
Epoch 72/100, Train Loss: 0.2346, Train Accuracy: 0.9284
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.73it/s]
Epoch 73/100, Train Loss: 0.2337, Train Accuracy: 0.9286
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.73it/s]
Epoch 74/100, Train Loss: 0.2332, Train Accuracy: 0.9288
Training: 100%|██████████| 1030/1030 [02:14<00:00,  7.67it/s]
Epoch 75/100, Train Loss: 0.2320, Train Accuracy: 0.9292
Evaluating: 100%|██████████| 107/107 [00:14<00:00,  7.16it/s]
Validation Loss: 0.4266, Validation Accuracy: 0.8876
Training: 100%|██████████| 1030/1030 [02:11<00:00,  7.82it/s]
Epoch 76/100, Train Loss: 0.2317, Train Accuracy: 0.9292
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.86it/s]
Epoch 77/100, Train Loss: 0.2309, Train Accuracy: 0.9295
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.87it/s]
Epoch 78/100, Train Loss: 0.2304, Train Accuracy: 0.9297
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 79/100, Train Loss: 0.2314, Train Accuracy: 0.9294
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 80/100, Train Loss: 0.2283, Train Accuracy: 0.9305
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.66it/s]
Validation Loss: 0.4402, Validation Accuracy: 0.8862
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 81/100, Train Loss: 0.2324, Train Accuracy: 0.9290
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 82/100, Train Loss: 0.2266, Train Accuracy: 0.9311
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.88it/s]
Epoch 83/100, Train Loss: 0.2263, Train Accuracy: 0.9311
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.87it/s]
Epoch 84/100, Train Loss: 0.2262, Train Accuracy: 0.9311
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 85/100, Train Loss: 0.2256, Train Accuracy: 0.9313
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.65it/s]
Validation Loss: 0.4557, Validation Accuracy: 0.8851
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 86/100, Train Loss: 0.2255, Train Accuracy: 0.9313
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 87/100, Train Loss: 0.2242, Train Accuracy: 0.9318
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 88/100, Train Loss: 0.2237, Train Accuracy: 0.9320
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 89/100, Train Loss: 0.2239, Train Accuracy: 0.9319
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.88it/s]
Epoch 90/100, Train Loss: 0.2226, Train Accuracy: 0.9323
Evaluating: 100%|██████████| 107/107 [00:13<00:00,  7.67it/s]
Validation Loss: 0.4570, Validation Accuracy: 0.8849
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.90it/s]
Epoch 91/100, Train Loss: 0.2248, Train Accuracy: 0.9317
Training: 100%|██████████| 1030/1030 [02:10<00:00,  7.89it/s]
Epoch 92/100, Train Loss: 0.2220, Train Accuracy: 0.9326
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.80it/s]
Epoch 93/100, Train Loss: 0.2211, Train Accuracy: 0.9329
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.76it/s]
Epoch 94/100, Train Loss: 0.2207, Train Accuracy: 0.9330
Training: 100%|██████████| 1030/1030 [02:13<00:00,  7.73it/s]
Epoch 95/100, Train Loss: 0.2200, Train Accuracy: 0.9332
Evaluating: 100%|██████████| 107/107 [00:15<00:00,  7.10it/s]
Validation Loss: 0.4651, Validation Accuracy: 0.8841
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.75it/s]
Epoch 96/100, Train Loss: 0.2195, Train Accuracy: 0.9334
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.75it/s]
Epoch 97/100, Train Loss: 0.2921, Train Accuracy: 0.9125
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.75it/s]
Epoch 98/100, Train Loss: 0.2313, Train Accuracy: 0.9291
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.76it/s]
Epoch 99/100, Train Loss: 0.2237, Train Accuracy: 0.9319
Training: 100%|██████████| 1030/1030 [02:12<00:00,  7.77it/s]
Epoch 100/100, Train Loss: 0.2211, Train Accuracy: 0.9329
Evaluating: 100%|██████████| 107/107 [00:15<00:00,  7.09it/s]
Validation Loss: 0.4519, Validation Accuracy: 0.8852
Model saved to /kaggle/working/lstm_autoregressive_last