# Exploring the Chronos-2 Model

This notebook explores the structure of the Chronos-2 model published on Hugging Face.


## 1. Installation and imports


In [2]:
# SSL configuration for macOS
import ssl
import certifi

# Use the CA bundle shipped with certifi; certificate verification stays enabled
ssl._create_default_https_context = lambda: ssl.create_default_context(cafile=certifi.where())

print("✓ SSL configuration applied")


✓ SSL configuration applied


In [1]:
import torch
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from transformers import AutoModel, AutoConfig
from huggingface_hub import list_repo_files, model_info

print(f"PyTorch version: {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")


PyTorch version: 2.9.1
CUDA available: False


## 2. About Chronos-2

**Chronos-2** is a foundation model for time series forecasting developed by Amazon.

It supports:
- **Univariate** forecasting (a single series)
- **Multivariate** forecasting (several series predicted jointly)
- **Covariate-informed** forecasting (external variables)

All three are handled by a single, unified architecture.


In [2]:
# Use the official Chronos-2 model
model_name = "amazon/chronos-2"

print(f"Selected model: {model_name}")
print("Chronos-2 is the latest version of Amazon's time series forecasting model")


Selected model: amazon/chronos-2
Chronos-2 is the latest version of Amazon's time series forecasting model


## 3. Model information from Hugging Face


In [3]:
# Fetch the model metadata from the Hub
info = model_info(model_name)

print(f"Model name: {info.modelId}")
print(f"Author: {info.author}")
print(f"Tags: {info.tags}")
print(f"Pipeline: {info.pipeline_tag}")
print(f"Last modified: {info.lastModified}")


Model name: amazon/chronos-2
Author: amazon
Tags: ['chronos-forecasting', 'safetensors', 't5', 'time series', 'forecasting', 'foundation models', 'pretrained models', 'time-series-forecasting', 'dataset:autogluon/chronos_datasets', 'dataset:Salesforce/GiftEvalPretrain', 'arxiv:2403.07815', 'arxiv:2510.15821', 'license:apache-2.0', 'region:us']
Pipeline: time-series-forecasting
Last modified: 2025-11-05 10:32:11+00:00
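The tag list mixes plain labels with `key:value` entries (datasets, arXiv identifiers, license). As a minimal sketch, the helper below groups them by prefix so that the related papers and training datasets are easier to read. The function name `group_tags` is made up for this illustration; the tag values are copied from the output above.

```python
# Tag list copied from the model_info output above.
tags = [
    "chronos-forecasting", "safetensors", "t5", "time series", "forecasting",
    "foundation models", "pretrained models", "time-series-forecasting",
    "dataset:autogluon/chronos_datasets", "dataset:Salesforce/GiftEvalPretrain",
    "arxiv:2403.07815", "arxiv:2510.15821", "license:apache-2.0", "region:us",
]


def group_tags(tags):
    """Split 'key:value' tags into a dict of lists; plain tags go under 'plain'."""
    grouped = {}
    for tag in tags:
        key, sep, value = tag.partition(":")
        if sep:
            grouped.setdefault(key, []).append(value)
        else:
            grouped.setdefault("plain", []).append(tag)
    return grouped


grouped = group_tags(tags)
print("arXiv papers:", grouped["arxiv"])
print("Datasets:", grouped["dataset"])
print("License:", grouped["license"])
```

In the notebook, the same helper could be called directly on `info.tags`.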


In [4]:
# List the files in the repository
files = list_repo_files(model_name)
print("\nFiles in the repository:")
for file in files:
    print(f"  - {file}")



Files in the repository:
  - .gitattributes
  - README.md
  - config.json
  - model.safetensors


## 4. Loading the model configuration


In [5]:
# Load the configuration without downloading the weights
config = AutoConfig.from_pretrained(model_name)

print("Model configuration:")
print(f"  - Architecture: {config.model_type}")
print(f"  - Number of layers: {config.num_layers}")
print(f"  - Number of attention heads: {config.num_heads}")
print(f"  - Model dimension: {config.d_model}")
print(f"  - FFN dimension: {config.d_ff}")
print(f"  - Vocabulary size: {config.vocab_size}")
# n_positions is not defined on this config, so fall back to 'N/A'
print(f"  - Max length: {getattr(config, 'n_positions', 'N/A')}")


config.json: 0.00B [00:00, ?B/s]

Model configuration:
  - Architecture: t5
  - Number of layers: 12
  - Number of attention heads: 12
  - Model dimension: 768
  - FFN dimension: 3072
  - Vocabulary size: 2
  - Max length: N/A
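Beyond the T5 fields printed above, the configuration also carries a `chronos_config` section (visible in the full dump in the next cell) that describes how a series is cut into patches. The sketch below derives two rough sizes from those values. The numbers are copied from the config, but reading `context_length` as a number of time steps and the patch fields as sliding-window sizes is an assumption based on the field names, not something the config states.

```python
# Values copied from the "chronos_config" section of the model configuration.
chronos_config = {
    "context_length": 8192,
    "input_patch_size": 16,
    "input_patch_stride": 16,
    "max_output_patches": 64,
    "output_patch_size": 16,
}


def derived_sizes(cfg):
    """Back-of-the-envelope sizes; the meaning given to each field is an assumption."""
    # Number of windows of length `input_patch_size` taken every `input_patch_stride` steps
    input_patches = (cfg["context_length"] - cfg["input_patch_size"]) // cfg["input_patch_stride"] + 1
    # Longest horizon if each output patch covers `output_patch_size` steps
    max_horizon = cfg["max_output_patches"] * cfg["output_patch_size"]
    return {"input_patches": input_patches, "max_horizon_steps": max_horizon}


sizes = derived_sizes(chronos_config)
print(sizes)
```

Under that reading, a full context maps to 512 input patches and a single forward pass covers at most 1024 forecast steps.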


In [6]:
# Print the full configuration
print("\nFull configuration:")
print(config)



Full configuration:
T5Config {
  "architectures": [
    "Chronos2Model"
  ],
  "chronos_config": {
    "context_length": 8192,
    "input_patch_size": 16,
    "input_patch_stride": 16,
    "max_output_patches": 64,
    "output_patch_size": 16,
    "quantiles": [
      0.01,
      0.05,
      0.1,
      0.15,
      0.2,
      0.25,
      0.3,
      0.35,
      0.4,
      0.45,
      0.5,
      0.55,
      0.6,
      0.65,
      0.7,
      0.75,
      0.8,
      0.85,
      0.9,
      0.95,
      0.99
    ],
    "time_encoding_scale": 8192,
    "use_arcsinh": true,
    "use_reg_token": true
  },
  "chronos_pipeline_class": "Chronos2Pipeline",
  "classifier_dropout": 0.0,
  "d_ff": 3072,
  "d_kv": 64,
  "d_model": 768,
  "dense_act_fn": "relu",
  "dropout_rate": 0.1,
  "dtype": "float32",
  "eos_token_id": 1,
  "feed_forward_proj": "relu",
  "initializer_factor": 0.05,
  "is_encoder_decoder": true,
  "is_gated_act": false,
  "layer_norm_epsilon": 1e-06,
  "model_type": "t5",
  "num_d

## 5. Downloading and loading the model

**Warning**: The download can take a while depending on the model size and your internet connection.

**Note**: `config.json` declares `"model_type": "t5"`, so `AutoModel` builds a generic `T5Model` rather than the `Chronos2Model` listed under `architectures`. As the warning in the output shows, the decoder weights are missing from the checkpoint and are randomly initialized. The resulting object is useful for inspecting layers and shapes, not for forecasting; Section 11 uses `Chronos2Pipeline` for that.


In [7]:
# Load the Chronos-2 checkpoint into the generic T5 model class
print("Downloading the Chronos-2 model...")
print("Note: the download can take several minutes...")
# The repository ships no custom code (see the file list in Section 3),
# so trust_remote_code has no effect here
model = AutoModel.from_pretrained(model_name, trust_remote_code=True)
print("Chronos-2 model loaded successfully!")


Downloading the Chronos-2 model...
Note: the download can take several minutes...


model.safetensors:   0%|          | 0.00/478M [00:00<?, ?B/s]

Some weights of T5Model were not initialized from the model checkpoint at amazon/chronos-2 and are newly initialized: ['decoder.block.0.layer.0.SelfAttention.k.weight', 'decoder.block.0.layer.0.SelfAttention.o.weight', 'decoder.block.0.layer.0.SelfAttention.q.weight', 'decoder.block.0.layer.0.SelfAttention.relative_attention_bias.weight', 'decoder.block.0.layer.0.SelfAttention.v.weight', 'decoder.block.0.layer.0.layer_norm.weight', 'decoder.block.0.layer.1.EncDecAttention.k.weight', 'decoder.block.0.layer.1.EncDecAttention.o.weight', 'decoder.block.0.layer.1.EncDecAttention.q.weight', 'decoder.block.0.layer.1.EncDecAttention.v.weight', 'decoder.block.0.layer.1.layer_norm.weight', 'decoder.block.0.layer.2.DenseReluDense.wi.weight', 'decoder.block.0.layer.2.DenseReluDense.wo.weight', 'decoder.block.0.layer.2.layer_norm.weight', 'decoder.block.1.layer.0.SelfAttention.k.weight', 'decoder.block.1.layer.0.SelfAttention.o.weight', 'decoder.block.1.layer.0.SelfAttention.q.weight', 'decoder.blo

Chronos-2 model loaded successfully!
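The "newly initialized" warning is produced when the model class expects parameter names that the checkpoint does not contain. The sketch below illustrates that comparison on toy data. The decoder names are taken from the warning above, while the checkpoint key list (including `hypothetical_head.weight`) is invented for the example, since the real checkpoint keys are not shown in this notebook.

```python
def diff_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) parameter names, each sorted alphabetically."""
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    missing = sorted(model_keys - checkpoint_keys)      # expected by the model, absent from the file
    unexpected = sorted(checkpoint_keys - model_keys)   # present in the file, unknown to the model
    return missing, unexpected


# Names the T5Model class expects (a small subset, for illustration).
model_keys = [
    "encoder.block.0.layer.0.SelfAttention.q.weight",
    "decoder.block.0.layer.0.SelfAttention.q.weight",
    "decoder.block.0.layer.1.EncDecAttention.q.weight",
]
# Hypothetical checkpoint contents; these are NOT the real Chronos-2 keys.
checkpoint_keys = [
    "encoder.block.0.layer.0.SelfAttention.q.weight",
    "hypothetical_head.weight",
]

missing, unexpected = diff_keys(model_keys, checkpoint_keys)
print("Missing (randomly initialized):", missing)
print("Unexpected (ignored):", unexpected)
```

To obtain the real lists, `from_pretrained` in Transformers accepts `output_loading_info=True` and then returns the model together with a dictionary that includes `missing_keys` and `unexpected_keys`.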


## 6. Exploring the model structure


In [8]:
# Print the model architecture
print("Model architecture:")
print(model)


Model architecture:
T5Model(
  (shared): Embedding(2, 768)
  (encoder): T5Stack(
    (embed_tokens): Embedding(2, 768)
    (block): ModuleList(
      (0): T5Block(
        (layer): ModuleList(
          (0): T5LayerSelfAttention(
            (SelfAttention): T5Attention(
              (q): Linear(in_features=768, out_features=768, bias=False)
              (k): Linear(in_features=768, out_features=768, bias=False)
              (v): Linear(in_features=768, out_features=768, bias=False)
              (o): Linear(in_features=768, out_features=768, bias=False)
              (relative_attention_bias): Embedding(32, 12)
            )
            (layer_norm): T5LayerNorm()
            (dropout): Dropout(p=0.1, inplace=False)
          )
          (1): T5LayerFF(
            (DenseReluDense): T5DenseActDense(
              (wi): Linear(in_features=768, out_features=3072, bias=False)
              (wo): Linear(in_features=3072, out_features=768, bias=False)
              (dropout): Dropou

In [9]:
# Count the parameters of the loaded T5Model
# (this includes the randomly initialized decoder reported in the loading warning)
def count_parameters(model):
    total_params = sum(p.numel() for p in model.parameters())
    trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total_params, trainable_params

total, trainable = count_parameters(model)
print(f"\nTotal number of parameters: {total:,}")
print(f"Trainable parameters: {trainable:,}")
print(f"Approximate size in memory: {total * 4 / (1024**2):.2f} MiB (float32)")



Total number of parameters: 198,230,784
Trainable parameters: 198,230,784
Approximate size in memory: 756.19 MiB (float32)
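The total above can be cross-checked by hand from the configuration values. The sketch below recomputes it for a standard bias-free T5 layout. It assumes that the decoder has as many layers as the encoder (the config dump is cut off before that field), that each stack ends with a final layer norm, and that the relative attention bias of 32 buckets exists only in the first block of each stack, as in the printed architecture.

```python
def t5_param_count(d_model, d_ff, d_kv, num_heads, num_layers,
                   num_decoder_layers, vocab_size, rel_buckets=32):
    """Analytic parameter count for a T5Model without biases (assumptions in the text above)."""
    inner = num_heads * d_kv
    attn = 4 * d_model * inner           # q, k, v, o projections
    ffn = 2 * d_model * d_ff             # wi and wo
    norm = d_model                       # T5LayerNorm holds a single scale vector
    rel_bias = rel_buckets * num_heads   # first block of each stack only

    encoder = num_layers * (attn + norm + ffn + norm) + rel_bias + norm
    # Each decoder block adds a cross-attention sublayer with its own layer norm
    decoder = num_decoder_layers * (2 * (attn + norm) + ffn + norm) + rel_bias + norm
    shared = vocab_size * d_model        # embedding shared by encoder and decoder
    return {"shared": shared, "encoder": encoder, "decoder": decoder,
            "total": shared + encoder + decoder}


counts = t5_param_count(d_model=768, d_ff=3072, d_kv=64, num_heads=12,
                        num_layers=12, num_decoder_layers=12, vocab_size=2)
for name, value in counts.items():
    print(f"{name:>8}: {value:,}")
print(f"float32 size: {counts['total'] * 4 / 1024**2:.2f} MiB")
```

The result matches the 198,230,784 parameters reported by `count_parameters`. More than half of them sit in the decoder, which is the part the loading warning reports as newly initialized, so this figure describes the generic `T5Model` shell rather than the released Chronos-2 weights.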


In [10]:
# List the top-level modules of the model
print("\nMain modules:")
for name, module in model.named_children():
    print(f"  - {name}: {type(module).__name__}")



Main modules:
  - shared: Embedding
  - encoder: T5Stack
  - decoder: T5Stack


In [11]:
# List every trainable parameter with its shape
print("\nModel parameters and shapes:")
for name, param in model.named_parameters():
    if param.requires_grad:
        print(f"  - {name}: {param.shape}")



Model parameters and shapes:
  - shared.weight: torch.Size([2, 768])
  - encoder.block.0.layer.0.SelfAttention.q.weight: torch.Size([768, 768])
  - encoder.block.0.layer.0.SelfAttention.k.weight: torch.Size([768, 768])
  - encoder.block.0.layer.0.SelfAttention.v.weight: torch.Size([768, 768])
  - encoder.block.0.layer.0.SelfAttention.o.weight: torch.Size([768, 768])
  - encoder.block.0.layer.0.SelfAttention.relative_attention_bias.weight: torch.Size([32, 12])
  - encoder.block.0.layer.0.layer_norm.weight: torch.Size([768])
  - encoder.block.0.layer.1.DenseReluDense.wi.weight: torch.Size([3072, 768])
  - encoder.block.0.layer.1.DenseReluDense.wo.weight: torch.Size([768, 3072])
  - encoder.block.0.layer.1.layer_norm.weight: torch.Size([768])
  - encoder.block.1.layer.0.SelfAttention.q.weight: torch.Size([768, 768])
  - encoder.block.1.layer.0.SelfAttention.k.weight: torch.Size([768, 768])
  - encoder.block.1.layer.0.SelfAttention.v.weight: torch.Size([768, 768])
  - encoder.block
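A flat listing like this gets long quickly. As a sketch, the helper below sums parameter counts per name prefix, which gives one line per block instead of one line per tensor. The names and shapes are copied from the first entries of the listing above; `count_by_prefix` and its `depth` argument are names chosen for this illustration.

```python
from collections import defaultdict
from math import prod

# (name, shape) pairs copied from the start of the listing above.
params = [
    ("shared.weight", (2, 768)),
    ("encoder.block.0.layer.0.SelfAttention.q.weight", (768, 768)),
    ("encoder.block.0.layer.0.SelfAttention.k.weight", (768, 768)),
    ("encoder.block.0.layer.0.SelfAttention.v.weight", (768, 768)),
    ("encoder.block.0.layer.0.SelfAttention.o.weight", (768, 768)),
    ("encoder.block.0.layer.0.SelfAttention.relative_attention_bias.weight", (32, 12)),
    ("encoder.block.0.layer.0.layer_norm.weight", (768,)),
    ("encoder.block.0.layer.1.DenseReluDense.wi.weight", (3072, 768)),
    ("encoder.block.0.layer.1.DenseReluDense.wo.weight", (768, 3072)),
    ("encoder.block.0.layer.1.layer_norm.weight", (768,)),
]


def count_by_prefix(params, depth=3):
    """Sum the number of elements per name prefix made of the first `depth` dotted parts."""
    totals = defaultdict(int)
    for name, shape in params:
        prefix = ".".join(name.split(".")[:depth])
        totals[prefix] += prod(shape)
    return dict(totals)


per_block = count_by_prefix(params)
for prefix, n in per_block.items():
    print(f"{prefix}: {n:,}")
```

In the notebook, the same function could be fed with `((name, tuple(p.shape)) for name, p in model.named_parameters())` to summarize the whole model.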

## 7. Analysis of key components


In [12]:
# Inspect the encoder
if hasattr(model, 'encoder'):
    print("Encoder structure:")
    print(model.encoder)
    # model.encoder.block is the ModuleList of T5Block layers; children() would
    # count the top-level submodules (embed_tokens, block, ...) instead
    print(f"\nNumber of layers in the encoder: {len(model.encoder.block)}")


Encoder structure:
T5Stack(
  (embed_tokens): Embedding(2, 768)
  (block): ModuleList(
    (0): T5Block(
      (layer): ModuleList(
        (0): T5LayerSelfAttention(
          (SelfAttention): T5Attention(
            (q): Linear(in_features=768, out_features=768, bias=False)
            (k): Linear(in_features=768, out_features=768, bias=False)
            (v): Linear(in_features=768, out_features=768, bias=False)
            (o): Linear(in_features=768, out_features=768, bias=False)
            (relative_attention_bias): Embedding(32, 12)
          )
          (layer_norm): T5LayerNorm()
          (dropout): Dropout(p=0.1, inplace=False)
        )
        (1): T5LayerFF(
          (DenseReluDense): T5DenseActDense(
            (wi): Linear(in_features=768, out_features=3072, bias=False)
            (wo): Linear(in_features=3072, out_features=768, bias=False)
            (dropout): Dropout(p=0.1, inplace=False)
            (act): ReLU()
          )
          (layer_norm): T5Laye

In [13]:
# Inspect the decoder
if hasattr(model, 'decoder'):
    print("Decoder structure:")
    print(model.decoder)
    # Count the T5Block layers, not the top-level submodules of the stack
    print(f"\nNumber of layers in the decoder: {len(model.decoder.block)}")


Decoder structure:
T5Stack(
  (embed_tokens): Embedding(2, 768)
  (block): ModuleList(
    (0): T5Block(
      (layer): ModuleList(
        (0): T5LayerSelfAttention(
          (SelfAttention): T5Attention(
            (q): Linear(in_features=768, out_features=768, bias=False)
            (k): Linear(in_features=768, out_features=768, bias=False)
            (v): Linear(in_features=768, out_features=768, bias=False)
            (o): Linear(in_features=768, out_features=768, bias=False)
            (relative_attention_bias): Embedding(32, 12)
          )
          (layer_norm): T5LayerNorm()
          (dropout): Dropout(p=0.1, inplace=False)
        )
        (1): T5LayerCrossAttention(
          (EncDecAttention): T5Attention(
            (q): Linear(in_features=768, out_features=768, bias=False)
            (k): Linear(in_features=768, out_features=768, bias=False)
            (v): Linear(in_features=768, out_features=768, bias=False)
            (o): Linear(in_features=768, out_fe

## 8. Information about the embeddings


## 11. Example: using Chronos-2 for forecasting

The cell below shows how to use Chronos-2 to produce forecasts on your own data.


In [3]:
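import pandas as pd  # requires: pip install 'pandas[pyarrow]'
import torch
from chronos import Chronos2Pipeline

# Use Apple's Metal backend when available, otherwise fall back to the CPU
device = "mps" if torch.backends.mps.is_available() else "cpu"
pipeline = Chronos2Pipeline.from_pretrained("amazon/chronos-2", device_map=device)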

# Load historical target values and past values of covariates
context_df = pd.read_parquet("https://autogluon.s3.amazonaws.com/datasets/timeseries/electricity_price/train.parquet")

# (Optional) Load future values of covariates
test_df = pd.read_parquet("https://autogluon.s3.amazonaws.com/datasets/timeseries/electricity_price/test.parquet")
future_df = test_df.drop(columns="target")

# Generate predictions with covariates
pred_df = pipeline.predict_df(
    context_df,
    future_df=future_df,
    prediction_length=24,  # Number of steps to forecast
    quantile_levels=[0.1, 0.5, 0.9],  # Quantiles for probabilistic forecast
    id_column="id",  # Column identifying different time series
    timestamp_column="timestamp",  # Column with datetime information
    target="target",  # Column(s) with time series values to predict
)
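The `id_column`, `timestamp_column` and `target` arguments imply a long-format table with one row per series and time step. As a rough sketch of that layout on synthetic data, the helper below flattens a dictionary of series into such rows. The function `make_long_rows`, the series names and the hourly step are assumptions made for this illustration; the actual electricity price dataset may contain additional covariate columns that are not shown in this notebook.

```python
from datetime import datetime, timedelta


def make_long_rows(series, start, step=timedelta(hours=1)):
    """Flatten {series_id: [values]} into long-format rows keyed by id/timestamp/target."""
    rows = []
    for series_id, values in series.items():
        for i, value in enumerate(values):
            rows.append({
                "id": series_id,                  # matches id_column="id"
                "timestamp": start + i * step,    # matches timestamp_column="timestamp"
                "target": value,                  # matches target="target"
            })
    return rows


# Two synthetic series of different lengths (made-up values).
rows = make_long_rows({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0]},
                      start=datetime(2024, 1, 1))
for row in rows:
    print(row)
```

Passing `rows` to `pd.DataFrame(rows)` would give a frame with the three column names used in the `predict_df` call above.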


