# Screenshot Generator for Documentation

This notebook generates the screenshots for the project report:
- `screenshot-forecast-vergleich.png`: future forecasts of all 4 models
- `screenshot-backtest-vergleich.png`: historical backtest with walk-forward validation (30 days)
- `screenshot-backtest-kurzansicht.png`: short backtest view (most recent backtest points) for better readability

In [101]:
import os
import sys
from pathlib import Path

# Add the project's src directory to the import path
project_root = Path.cwd().parent
sys.path.insert(0, str(project_root / "src"))

# Create the output folder for images
images_dir = project_root / "images"
images_dir.mkdir(exist_ok=True)
print(f"Images werden gespeichert in: {images_dir}")

Images werden gespeichert in: /home/daniel/dev/ai/casml4se-stonkswagen/images


In [102]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

from gw2ml.pipelines.forecast import forecast_item
from gw2ml.data.loaders import load_gw2_series

## Configuration

In [103]:
# Mystic Coin Item ID
ITEM_ID = 19976

# Forecast configuration
CONFIG = {
    "data": {
        "days_back": 30,
        "value_column": "buy_unit_price",
    },
    "forecast": {
        "horizon": 12,  # 12 steps = 1 hour at 5-minute intervals
    },
    "metric": {
        "primary": "mape",
        "metrics": ["mape", "smape", "mae", "rmse"],
    },
    "models": [
        {"name": "ARIMA", "grid": None},
        {"name": "ExponentialSmoothing", "grid": None},
        {"name": "XGBoost", "grid": None},
        {"name": "Chronos2", "grid": None},
    ],
}

# Color palette for the models
MODEL_COLORS = {
    "ARIMA": "#ff7f0e",            # orange
    "ExponentialSmoothing": "#2ca02c",  # green
    "XGBoost": "#d62728",          # red
    "Chronos2": "#9467bd",         # purple
}
ACTUAL_COLOR = "#1f77b4"  # blue for actual values

In [104]:
# Training control: retraining and forced grid search are enabled for this run (slower).
# For the default behaviour (no retrain, no grid search), set RETRAIN = False
# and comment out the CONFIG["train"] line.
RETRAIN = True
CONFIG["train"] = {"force_grid_search": True}

# ARIMA grid suggestions (keep exactly one block active)
# Small (fast): active
for m in CONFIG["models"]:
    if m["name"] == "ARIMA":
        m["grid"] = {
            "p": [1, 2],
            "d": [0, 1],
            "q": [1, 2],
            "seasonal_order": [(0, 0, 0, 0)],
        }

# Medium (slower): uncomment to use instead of the small grid
# for m in CONFIG["models"]:
#     if m["name"] == "ARIMA":
#         m["grid"] = {
#             "p": [0, 1, 2],
#             "d": [0, 1],
#             "q": [0, 1, 2],
#             "seasonal_order": [(0, 0, 0, 0)],
#         }


## Generate Forecasts

In [105]:
print(f"Generiere Forecasts für Item {ITEM_ID} (Mystic Coin)...")
print(f"Modelle: {[m['name'] for m in CONFIG['models']]}")
print(f"Horizon: {CONFIG['forecast']['horizon']} Schritte (= {CONFIG['forecast']['horizon'] * 5} Minuten)")
print()

result = forecast_item(
    ITEM_ID,
    override_config=CONFIG,
    retrain=RETRAIN,
    include_backtest=True,
    include_history=True,  # required for the context data in the forecast plot
)

models_payload = result.get("models", [])
missing_models = result.get("missing_models", [])

print(f"\nErfolgreich: {[m['model_name'] for m in models_payload]}")
if missing_models:
    print(f"Fehlend: {missing_models}")

Generiere Forecasts für Item 19976 (Mystic Coin)...
Modelle: ['ARIMA', 'ExponentialSmoothing', 'XGBoost', 'Chronos2']
Horizon: 12 Schritte (= 60 Minuten)




Non-stationary starting autoregressive parameters found. Using zeros as starting parameters.

Detected user-defined float16-like precision. For mixed precision training, recommended options are 'bf16-mixed' and '16-mixed'.
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
TimeSeries is using a numeric type different from numpy.float32 or numpy.float64. Not all functionalities may work properly. It is recommended casting your data to floating point numbers before using TimeSeries.


Erfolgreich: ['ARIMA', 'Chronos2', 'ExponentialSmoothing', 'XGBoost']


## Screenshot 1: Forecast Comparison (Future Forecasts)

In [106]:
# Number of actual data points to show
ACTUAL_POINTS_TO_SHOW = 36

# Extract the context/history data from the result
context = result.get("context") or {}
history_ts = pd.to_datetime(context.get("timestamps", []))
history_vals = context.get("values", [])

# Show only the last N actual points
if ACTUAL_POINTS_TO_SHOW > 0 and len(history_ts) > ACTUAL_POINTS_TO_SHOW:
    history_ts = history_ts[-ACTUAL_POINTS_TO_SHOW:]
    history_vals = history_vals[-ACTUAL_POINTS_TO_SHOW:]

# Find the last valid actual point so the forecast lines connect smoothly to it
last_actual = None
if len(history_ts) and len(history_vals):
    for ts, val in zip(reversed(history_ts), reversed(history_vals)):
        if val is not None and not pd.isna(val):
            last_actual = (ts, val)
            break

print(f"Zeige {len(history_ts)} Actual-Datenpunkte")
print(f"Letzter Actual-Punkt: {last_actual}")

# Collect the future forecast data (smoothly connected to the last actual point)
future_rows = []
for m in models_payload:
    model_name = m.get("model_name")
    f_ts = m.get("future", {}).get("timestamps", [])
    f_vals = m.get("future", {}).get("values", [])
    
    # Prepend the last actual point so each forecast line starts where the actual line ends
    if last_actual is not None and len(f_ts) > 0:
        last_ts, last_val = last_actual
        if pd.to_datetime(f_ts[0]) != pd.to_datetime(last_ts):
            f_ts = [last_ts] + list(f_ts)
            f_vals = [last_val] + list(f_vals)
    
    for ts, val in zip(f_ts, f_vals):
        future_rows.append({"timestamp": ts, "value": val, "model": model_name})

df_future = pd.DataFrame(future_rows)
df_future["timestamp"] = pd.to_datetime(df_future["timestamp"])

# Create the plot (styling as in forecast_app.py)
fig_future = go.Figure()

# Add the actual line (slightly thicker than the forecast lines)
if len(history_ts) > 0:
    fig_future.add_trace(go.Scatter(
        x=history_ts,
        y=history_vals,
        mode="lines",
        name="Tatsächlich",
        line=dict(color=ACTUAL_COLOR, width=2),
    ))

# Forecast line for each model
for model_name in df_future["model"].unique():
    df_model = df_future[df_future["model"] == model_name]
    fig_future.add_trace(go.Scatter(
        x=df_model["timestamp"],
        y=df_model["value"],
        mode="lines",
        name=model_name,
        line=dict(color=MODEL_COLORS.get(model_name, "#7f7f7f"), width=1.5),
    ))

fig_future.update_layout(
    title=f"Zukunftsprognose: Mystic Coin (nächste {CONFIG['forecast']['horizon'] * 5} Minuten)",
    xaxis_title="Zeit",
    yaxis_title="Preis (Kupfer)",
    template="plotly_white",
    legend=dict(orientation="h", yanchor="top", y=-0.15, xanchor="center", x=0.5),
    font=dict(size=14),
    title_font_size=18,
    width=1200,
    height=600,
    margin=dict(b=120),
)

fig_future.show()

Zeige 36 Actual-Datenpunkte
Letzter Actual-Punkt: (Timestamp('2026-01-15 20:20:00'), 21600.0)



Could not infer format, so each element will be parsed individually, falling back to `dateutil`. To ensure parsing is consistent and as-expected, please specify a format.



In [107]:
# Save as PNG
forecast_path = images_dir / "screenshot-forecast-vergleich.png"
fig_future.write_image(str(forecast_path), scale=2)
print(f"Gespeichert: {forecast_path}")

Gespeichert: /home/daniel/dev/ai/casml4se-stonkswagen/images/screenshot-forecast-vergleich.png
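
The "smooth connection" used in the forecast plot can be illustrated in isolation: the last observed point is prepended to each forecast so the forecast line starts where the actual line ends. This is a minimal stdlib sketch, not the notebook's code; `prepend_anchor` is a hypothetical helper and the forecast values are made up for illustration (only the last actual point is taken from the recorded output).

```python
from datetime import datetime, timedelta


def prepend_anchor(last_actual, timestamps, values):
    """Prepend the last observed point unless the forecast already starts there."""
    timestamps, values = list(timestamps), list(values)
    if last_actual is None or not timestamps:
        return timestamps, values
    last_ts, last_val = last_actual
    if timestamps[0] == last_ts:
        return timestamps, values  # already connected, avoid a duplicate point
    return [last_ts] + timestamps, [last_val] + values


# Last actual point as printed by the notebook; forecast values are illustrative only.
last_actual = (datetime(2026, 1, 15, 20, 20), 21600.0)
f_ts = [last_actual[0] + timedelta(minutes=5 * i) for i in range(1, 4)]
f_vals = [21590.0, 21610.0, 21605.0]

ts, vals = prepend_anchor(last_actual, f_ts, f_vals)
# Applying the helper a second time must not add the anchor again.
ts_again, vals_again = prepend_anchor(last_actual, ts, vals)
```

The equality check on the first timestamp keeps the helper idempotent, which matters when a model's forecast already begins at the last observed timestamp.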


## Screenshot 2: Backtest Comparison (Walk-Forward Validation)

In [108]:
# Collect the backtest data
hist_rows = []
actual_rows = None

for m in models_payload:
    model_name = m.get("model_name")
    h_ts = m.get("history", {}).get("timestamps", [])
    h_fc = m.get("history", {}).get("forecast", [])
    h_act = m.get("history", {}).get("actual", [])
    
    # Collect the actual values only once (they are the same for all models)
    if h_ts and h_act and actual_rows is None:
        actual_rows = [{"timestamp": ts, "actual": val} for ts, val in zip(h_ts, h_act) if val is not None]
    
    for ts, val in zip(h_ts, h_fc):
        if val is not None:
            hist_rows.append({"timestamp": ts, "forecast": val, "model": model_name})

df_hist = pd.DataFrame(hist_rows)
df_hist["timestamp"] = pd.to_datetime(df_hist["timestamp"])

if actual_rows:
    df_actual = pd.DataFrame(actual_rows)
    df_actual["timestamp"] = pd.to_datetime(df_actual["timestamp"])

print(f"Backtest Datenpunkte: {len(df_hist)} Prognosen, {len(df_actual) if actual_rows else 0} Actual-Werte")

Backtest Datenpunkte: 360 Prognosen, 36 Actual-Werte
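
Walk-forward validation repeatedly moves a forecast origin through the series, forecasts the next `horizon` steps from data before the origin only, and compares them with what actually happened. The sketch below shows the idea with a naive last-value forecaster on a toy price list; it is an illustration under these assumptions, not the backtest that `forecast_item` runs, and `walk_forward` is a hypothetical helper.

```python
def walk_forward(series, start, horizon, stride):
    """Return (origin, forecast, actual) folds for a naive last-value forecaster."""
    folds = []
    origin = start
    while origin + horizon <= len(series):
        history = series[:origin]              # only data before the forecast origin
        forecast = [history[-1]] * horizon     # naive model: repeat the last observed value
        actual = series[origin:origin + horizon]
        folds.append((origin, forecast, actual))
        origin += stride                       # move the origin forward
    return folds


prices = [100, 102, 101, 105, 107, 106, 110, 111, 109, 112]
folds = walk_forward(prices, start=4, horizon=2, stride=2)
for origin, forecast, actual in folds:
    print(origin, forecast, actual)
```

With `stride` equal to `horizon` the forecast windows tile the evaluation period without overlap; a larger stride produces fewer folds and therefore fewer forecast points.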


In [109]:
# Create the backtest plot (styling as in forecast_app.py)
fig_backtest = go.Figure()

# Actual price history (blue, thicker than the forecast lines)
if actual_rows:
    fig_backtest.add_trace(go.Scatter(
        x=df_actual["timestamp"],
        y=df_actual["actual"],
        mode="lines",
        name="Tatsächlich",
        line=dict(color=ACTUAL_COLOR, width=3),
    ))

# Forecasts of the individual models (solid lines as in forecast_app.py)
for model_name in df_hist["model"].unique():
    df_model = df_hist[df_hist["model"] == model_name]
    fig_backtest.add_trace(go.Scatter(
        x=df_model["timestamp"],
        y=df_model["forecast"],
        mode="lines",
        name=model_name,
        line=dict(color=MODEL_COLORS.get(model_name, "#7f7f7f"), width=1.5),
    ))

fig_backtest.update_layout(
    title="Backtest: Walk-Forward Validation für Mystic Coin (30 Tage)",
    xaxis_title="Zeit",
    yaxis_title="Preis (Kupfer)",
    template="plotly_white",
    legend=dict(orientation="h", yanchor="top", y=-0.15, xanchor="center", x=0.5),
    font=dict(size=14),
    title_font_size=18,
    width=1200,
    height=600,
    margin=dict(b=120),
)

fig_backtest.show()

In [110]:
# Save as PNG
backtest_path = images_dir / "screenshot-backtest-vergleich.png"
fig_backtest.write_image(str(backtest_path), scale=2)
print(f"Gespeichert: {backtest_path}")

Gespeichert: /home/daniel/dev/ai/casml4se-stonkswagen/images/screenshot-backtest-vergleich.png


## Screenshot 3: Backtest Comparison (Short View, Most Recent Points)

In [111]:
# Short view: the last 12 backtest points.
# Note: the backtest yields only a few actual values, hence the smaller window.
# The points are not necessarily 5 minutes apart; the covered time span is printed below.
LAST_N_POINTS = 12

print(f"Vollständige Daten: {len(df_actual)} Actual-Werte, {len(df_hist)} Forecast-Werte")

# Last N actual values
df_actual_short = df_actual.tail(LAST_N_POINTS).copy()
cutoff_time = df_actual_short["timestamp"].min()
end_time = df_actual_short["timestamp"].max()

print(f"Kurzansicht Zeitraum: {cutoff_time} bis {end_time}")

# Keep only forecast data within the short-view time range
df_hist_short = df_hist[
    (df_hist["timestamp"] >= cutoff_time) & 
    (df_hist["timestamp"] <= end_time)
].copy()

print(f"Gefiltert: {len(df_actual_short)} Actual-Werte, {len(df_hist_short)} Prognosen")

# Create the plot using ONLY the filtered data
fig_backtest_short = go.Figure()

# Actual price history (thicker than the forecast lines)
fig_backtest_short.add_trace(go.Scatter(
    x=df_actual_short["timestamp"],
    y=df_actual_short["actual"],
    mode="lines",
    name="Tatsächlich",
    line=dict(color=ACTUAL_COLOR, width=3),
))

# Forecasts of the individual models
models_in_short = df_hist_short["model"].unique() if len(df_hist_short) > 0 else []
print(f"Modelle in Kurzansicht: {list(models_in_short)}")

for model_name in models_in_short:
    df_model = df_hist_short[df_hist_short["model"] == model_name]
    fig_backtest_short.add_trace(go.Scatter(
        x=df_model["timestamp"],
        y=df_model["forecast"],
        mode="lines",
        name=model_name,
        line=dict(color=MODEL_COLORS.get(model_name, "#7f7f7f"), width=1.5),
    ))

# Layout
fig_backtest_short.update_layout(
    title=f"Backtest: Walk-Forward Validation für Mystic Coin (letzte {LAST_N_POINTS} Punkte)",
    xaxis_title="Zeit",
    yaxis_title="Preis (Kupfer)",
    template="plotly_white",
    legend=dict(orientation="h", yanchor="top", y=-0.15, xanchor="center", x=0.5),
    font=dict(size=14),
    title_font_size=18,
    width=1200,
    height=600,
    margin=dict(b=120),
)

fig_backtest_short.show()

Vollständige Daten: 36 Actual-Werte, 360 Forecast-Werte
Kurzansicht Zeitraum: 2026-01-13 21:20:00 bis 2026-01-15 17:20:00
Gefiltert: 12 Actual-Werte, 114 Prognosen
Modelle in Kurzansicht: ['ARIMA', 'Chronos2', 'ExponentialSmoothing', 'XGBoost']


In [112]:
# Save as PNG
backtest_short_path = images_dir / "screenshot-backtest-kurzansicht.png"
fig_backtest_short.write_image(str(backtest_short_path), scale=2)
print(f"Gespeichert: {backtest_short_path}")

Gespeichert: /home/daniel/dev/ai/casml4se-stonkswagen/images/screenshot-backtest-kurzansicht.png


## Metrics Table

In [113]:
from darts import TimeSeries
from gw2ml.metrics.registry import get_metric

metrics_data = []
for m in models_payload:
    model_name = m.get("model_name")
    h_fc = m.get("history", {}).get("forecast", [])
    h_act = m.get("history", {}).get("actual", [])
    
    row = {"Model": model_name}
    
    if h_fc and h_act and len(h_fc) == len(h_act):
        # Drop pairs with missing values (None or NaN)
        valid_pairs = [(f, a) for f, a in zip(h_fc, h_act) if not pd.isna(f) and not pd.isna(a)]
        if valid_pairs:
            fc_vals, act_vals = zip(*valid_pairs)
            ts_actual = TimeSeries.from_values(list(act_vals))
            ts_forecast = TimeSeries.from_values(list(fc_vals))
            
            for metric_name in CONFIG["metric"]["metrics"]:
                try:
                    metric_fn = get_metric(metric_name)
                    value = metric_fn(ts_actual, ts_forecast)
                    if metric_name in ["mape", "smape"]:
                        row[metric_name.upper()] = f"{value:.4f}%"
                    else:
                        row[metric_name.upper()] = f"{value:.4f}"
                except Exception as e:
                    row[metric_name.upper()] = "Error"
        else:
            for metric_name in CONFIG["metric"]["metrics"]:
                row[metric_name.upper()] = "N/A"
    else:
        for metric_name in CONFIG["metric"]["metrics"]:
            row[metric_name.upper()] = "N/A"
    
    metrics_data.append(row)

df_metrics = pd.DataFrame(metrics_data)
print("\n=== Modell-Performance Metriken ===")
print(df_metrics.to_string(index=False))


=== Modell-Performance Metriken ===
               Model    MAPE   SMAPE      MAE     RMSE
               ARIMA 0.6259% 0.6215% 131.3537 211.8202
            Chronos2 0.6168% 0.6184% 130.8125 209.7687
ExponentialSmoothing 0.6184% 0.6137% 129.7673 219.3218
             XGBoost 0.7315% 0.7329% 155.0784 226.7933
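
For reference, the four metrics in the table can be computed by hand. The sketch below uses common definitions, with MAPE and sMAPE expressed in percent; whether the functions returned by `get_metric` use exactly these formulas is an assumption, and the input values are toy numbers rather than notebook data.

```python
import math


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mape(actual, forecast):
    """Mean absolute percentage error in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def smape(actual, forecast):
    """Symmetric MAPE in percent: 200 * mean(|a - f| / (|a| + |f|))."""
    return 200.0 * sum(
        abs(a - f) / (abs(a) + abs(f)) for a, f in zip(actual, forecast)
    ) / len(actual)


actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]

results = {
    "MAPE": mape(actual, forecast),
    "SMAPE": smape(actual, forecast),
    "MAE": mae(actual, forecast),
    "RMSE": rmse(actual, forecast),
}
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

MAE and RMSE are in the unit of the series (copper here), while MAPE and sMAPE are scale-free, which is why the two groups are formatted differently in the table.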


## Performance Benchmark

Measures the execution times of fit, predict, and backtest for each model.

In [114]:
import time
import torch
from gw2ml.data.loaders import load_gw2_series
from gw2ml.modeling.registry import get_model
from gw2ml.pipelines.config import DEFAULT_CONFIG

# Hardware support per model (from gpu-optimierungen.md)
HARDWARE_SUPPORT = {
    "ARIMA": "CPU only",
    "ExponentialSmoothing": "CPU only",
    "XGBoost": "CUDA, ROCm",
    "Chronos2": "CUDA, ROCm, MPS",
}

# Load the data for the benchmark
series_meta = load_gw2_series(
    item_id=ITEM_ID,
    days_back=CONFIG["data"]["days_back"],
    value_column=CONFIG["data"]["value_column"],
    fill_missing_dates=DEFAULT_CONFIG["data"]["fill_missing_dates"],
    resample_freq=DEFAULT_CONFIG["data"]["resample_freq"],
)
series = series_meta.series
num_points = len(series)

print(f"Benchmark mit {num_points} Datenpunkten")
print(f"GPU verfügbar: {torch.cuda.is_available()}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name(0)}")
print()

# Benchmark each model
HORIZON = CONFIG["forecast"]["horizon"]
MODEL_NAMES = ["ARIMA", "ExponentialSmoothing", "XGBoost", "Chronos2"]

benchmark_results = []

for model_name in MODEL_NAMES:
    print(f"Benchmarking {model_name}...")
    
    try:
        # Create the model
        model_class = get_model(model_name)
        model = model_class()
        
        # Determine the device (falls back to a guess based on model name and CUDA availability)
        device = "cpu"
        if hasattr(model, "_device"):
            device = model._device
        elif model_name in ["XGBoost", "Chronos2"] and torch.cuda.is_available():
            device = "cuda"
        
        # Measure fit time
        fit_start = time.perf_counter()
        if model_name == "Chronos2":
            model.fit(series, epochs=0)
        else:
            model.fit(series)
        fit_time = time.perf_counter() - fit_start
        
        # Measure predict time
        predict_start = time.perf_counter()
        _ = model.predict(n=HORIZON)
        predict_time = time.perf_counter() - predict_start
        
        # Measure backtest time (local models use a larger stride, i.e. fewer refits)
        backtest_start = time.perf_counter()
        is_local = model_name in ["ARIMA", "ExponentialSmoothing"]
        stride = HORIZON * 4 if is_local else HORIZON
        retrain = is_local  # local models must use retrain=True
        
        _ = model.historical_forecasts(
            series,
            start=0.9,
            forecast_horizon=HORIZON,
            stride=stride,
            retrain=retrain,
            verbose=False,
        )
        backtest_time = time.perf_counter() - backtest_start
        
        total_time = fit_time + predict_time + backtest_time
        
        benchmark_results.append({
            "Model": model_name,
            "Hardware": HARDWARE_SUPPORT.get(model_name, "?"),
            "Device": device,
            "Fit (s)": f"{fit_time:.3f}",
            "Predict (s)": f"{predict_time:.3f}",
            "Backtest (s)": f"{backtest_time:.3f}",
            "Total (s)": f"{total_time:.3f}",
            "_total_numeric": total_time,
        })
        
        print(f"  ✓ {model_name}: {total_time:.2f}s")
        
    except Exception as e:
        print(f"  ✗ {model_name}: Fehler - {e}")
        benchmark_results.append({
            "Model": model_name,
            "Hardware": HARDWARE_SUPPORT.get(model_name, "?"),
            "Device": "?",
            "Fit (s)": "Error",
            "Predict (s)": "Error",
            "Backtest (s)": "Error",
            "Total (s)": "Error",
            "_total_numeric": float("inf"),
        })

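# Compute the relative column (factor versus the fastest successful model)
if benchmark_results:
    min_time = min(
        (r["_total_numeric"] for r in benchmark_results if r["_total_numeric"] != float("inf")),
        default=None,  # all models failed: no division happens below
    )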
    for r in benchmark_results:
        if r["_total_numeric"] != float("inf"):
            r["Relativ"] = f"{r['_total_numeric'] / min_time:.1f}x"
        else:
            r["Relativ"] = "N/A"
        del r["_total_numeric"]

# Results as a DataFrame
df_benchmark = pd.DataFrame(benchmark_results)
print("\n" + "="*90)
print("BENCHMARK ERGEBNISSE")
print("="*90)
print(f"\nKonfiguration:")
print(f"  Datenpunkte: {num_points}")
print(f"  Forecast Horizon: {HORIZON}")
print(f"  GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'Keine'}")
print()
print(df_benchmark.to_string(index=False))
print()

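# Fastest model by numeric total time ("Total (s)" holds formatted strings,
# so it is converted first; "Error" entries become NaN and are ignored)
totals = pd.to_numeric(df_benchmark["Total (s)"], errors="coerce")
fastest = df_benchmark.loc[totals.idxmin(), "Model"] if totals.notna().any() else "N/A"
print(f"Schnellstes Modell: {fastest}")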

Benchmark mit 8640 Datenpunkten
GPU verfügbar: True
GPU: NVIDIA GeForce RTX 3060

Benchmarking ARIMA...
  ✓ ARIMA: 10.91s
Benchmarking ExponentialSmoothing...
  ✓ ExponentialSmoothing: 0.35s
Benchmarking XGBoost...
  ✓ XGBoost: 0.33s
Benchmarking Chronos2...


Detected user-defined float16-like precision. For mixed precision training, recommended options are 'bf16-mixed' and '16-mixed'.
Using 16bit Automatic Mixed Precision (AMP)
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
TimeSeries is using a numeric type different from numpy.float32 or numpy.float64. Not all functionalities may work properly. It is recommended casting your data to floating point numbers before using TimeSeries.

  ✓ Chronos2: 1.86s

BENCHMARK ERGEBNISSE

Konfiguration:
  Datenpunkte: 8640
  Forecast Horizon: 12
  GPU: NVIDIA GeForce RTX 3060

               Model        Hardware Device Fit (s) Predict (s) Backtest (s) Total (s) Relativ
               ARIMA        CPU only    cpu   0.647       0.002       10.258    10.907   32.7x
ExponentialSmoothing        CPU only    cpu   0.013       0.005        0.329     0.346    1.0x
             XGBoost      CUDA, ROCm   cuda   0.294       0.021        0.018     0.333    1.0x
            Chronos2 CUDA, ROCm, MPS   cuda   1.435       0.157        0.263     1.855    5.6x

Schnellstes Modell: XGBoost
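
The timing pattern and the relative column can be reduced to two small helpers. This is a stdlib-only sketch; `timed` and `relative_factors` are hypothetical names, and the totals are the rounded values from the table above, so the resulting factors can differ in the last digit from the ones computed on unrounded times.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def relative_factors(totals):
    """Map model -> total seconds to model -> factor versus the fastest model.

    Failed runs are represented as None and stay None.
    """
    valid = [v for v in totals.values() if v is not None]
    if not valid:
        return {name: None for name in totals}
    fastest = min(valid)
    return {name: (v / fastest if v is not None else None) for name, v in totals.items()}


result, elapsed = timed(sum, [1, 2, 3])

# Rounded totals copied from the benchmark table.
totals = {
    "ARIMA": 10.907,
    "ExponentialSmoothing": 0.346,
    "XGBoost": 0.333,
    "Chronos2": 1.855,
}
factors = relative_factors(totals)
fastest_model = min((n for n in totals if totals[n] is not None), key=totals.get)
print(fastest_model)  # XGBoost
```

Comparing the numeric totals matters: formatted strings are compared character by character, so `"10.907"` sorts before `"2.500"`.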


## Summary

In [115]:
print("\n" + "="*60)
print("GENERIERTE SCREENSHOTS")
print("="*60)

for img_file in images_dir.glob("screenshot-*.png"):
    print(f"  {img_file.name}")

print("\n" + "="*60)
print("METRIKEN FÜR DOKUMENTATION")
print("="*60)
print(df_metrics.to_markdown(index=False))

print("\n" + "="*60)
print("BENCHMARK FÜR DOKUMENTATION")
print("="*60)
print(df_benchmark.to_markdown(index=False))


GENERIERTE SCREENSHOTS
  screenshot-forecast-vergleich.png
  screenshot-backtest-vergleich.png
  screenshot-backtest-kurzansicht.png

METRIKEN FÜR DOKUMENTATION
| Model                | MAPE    | SMAPE   |     MAE |    RMSE |
|:---------------------|:--------|:--------|--------:|--------:|
| ARIMA                | 0.6259% | 0.6215% | 131.354 | 211.82  |
| Chronos2             | 0.6168% | 0.6184% | 130.812 | 209.769 |
| ExponentialSmoothing | 0.6184% | 0.6137% | 129.767 | 219.322 |
| XGBoost              | 0.7315% | 0.7329% | 155.078 | 226.793 |

BENCHMARK FÜR DOKUMENTATION
| Model                | Hardware        | Device   |   Fit (s) |   Predict (s) |   Backtest (s) |   Total (s) | Relativ   |
|:---------------------|:----------------|:---------|----------:|--------------:|---------------:|------------:|:----------|
| ARIMA                | CPU only        | cpu      |     0.647 |         0.002 |         10.258 |      10.907 | 32.7x     |
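| ExponentialSmoothing | CPU only        | cpu      |     0.013 |         0.005 |          0.329 |       0.346 | 1.0x      |
| XGBoost              | CUDA, ROCm      | cuda     |     0.294 |         0.021 |          0.018 |       0.333 | 1.0x      |
| Chronos2             | CUDA, ROCm, MPS | cuda     |     1.435 |         0.157 |          0.263 |       1.855 | 5.6x      |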

In [116]:
# Export the tables as a Markdown document
export_path = project_root / "docs" / "model-evaluation-results.md"

markdown_content = f"""# Modell-Evaluation Ergebnisse

Automatisch generiert aus `notebooks/generate_screenshots.ipynb`

## Modell-Performance Metriken

Backtest-Ergebnisse für Mystic Coin (Item ID: {ITEM_ID}) mit {CONFIG['data']['days_back']} Tagen Daten.

{df_metrics.to_markdown(index=False)}

**Legende:**
- **MAPE**: Mean Absolute Percentage Error
- **SMAPE**: Symmetric Mean Absolute Percentage Error  
- **MAE**: Mean Absolute Error
- **RMSE**: Root Mean Squared Error

## Performance-Benchmark

Ausführungszeiten für {num_points} Datenpunkte und Forecast-Horizon von {HORIZON} Schritten.

{df_benchmark.to_markdown(index=False)}

**Hardware-Konfiguration:**
- GPU: {torch.cuda.get_device_name(0) if torch.cuda.is_available() else 'Keine'}
- CUDA verfügbar: {torch.cuda.is_available()}
"""

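# Make sure the target folder exists and write UTF-8 explicitly (the text contains umlauts)
export_path.parent.mkdir(parents=True, exist_ok=True)
export_path.write_text(markdown_content, encoding="utf-8")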
print(f"Markdown-Export gespeichert: {export_path}")

Markdown-Export gespeichert: /home/daniel/dev/ai/casml4se-stonkswagen/docs/model-evaluation-results.md
