# Financial-IA — Latent Market Intelligence Demo

**End-to-end demo of the Strate IV PPO agent** operating in the latent space learned by Fin-JEPA.

This notebook:
1. Installs dependencies and clones the repo
2. Downloads pre-trained checkpoints (PPO agent + trajectory buffer)
3. Runs the agent on held-out evaluation episodes
4. Visualizes regime switching, position management, and PnL vs Buy & Hold

**No training required** — inference only (~30 seconds on CPU, ~5s on GPU).

---
> Architecture: Spherical VQ-VAE → Fin-JEPA (Mamba-2) → Stochastic Predictor → PPO Agent  
> Paper reference: LeCun (2022) *A Path Towards Autonomous Machine Intelligence* — JEPA framework

## 1. Setup

In [None]:
# Install dependencies
!pip install -q torch pytorch-lightning tslearn numpy pandas dacite pyyaml einops gymnasium stable-baselines3 matplotlib

In [None]:
import os

# Clone the repo (skip if already cloned)
if not os.path.exists('World-IA-Finance'):
    !git clone https://github.com/ElMonstroDelBrest/World-IA-Finance.git

os.chdir('World-IA-Finance')
print('Working directory:', os.getcwd())

## 2. Download Pre-trained Checkpoints

In [None]:
import urllib.request
import zipfile
from pathlib import Path

BASE_URL = "https://github.com/ElMonstroDelBrest/World-IA-Finance/releases/download/v1.0.0"

def download(url, dest):
    dest = Path(dest)
    if dest.exists():
        print(f'  {dest} already exists, skipping.')
        return
    dest.parent.mkdir(parents=True, exist_ok=True)
    print(f'  Downloading {dest.name}...')
    urllib.request.urlretrieve(url, dest)
    print(f'  Done ({dest.stat().st_size / 1e6:.1f} MB)')

# PPO best model checkpoint
print('Downloading PPO best model...')
download(
    f"{BASE_URL}/ppo_best_model.zip",
    "checkpoints/strate_iv/ppo_best_model.zip"
)

# Pre-computed trajectory buffer (JEPA latent representations)
print('Downloading trajectory buffer...')
download(
    f"{BASE_URL}/trajectory_buffer.zip",
    "/tmp/trajectory_buffer.zip"
)

# Extract trajectory buffer
buf_dir = Path('data/trajectory_buffer')
if not buf_dir.exists() or not any(buf_dir.glob('*.pt')):
    print('Extracting trajectory buffer...')
    with zipfile.ZipFile('/tmp/trajectory_buffer.zip', 'r') as z:
        z.extractall('.')
    print(f'  Extracted {len(list(buf_dir.glob("*.pt")))} episodes')
else:
    print(f'  Buffer already extracted: {len(list(buf_dir.glob("*.pt")))} episodes')

print('\nAll assets ready.')


## 3. Run Agent Demo

The PPO agent operates on **latent observations**, not raw prices. At each step, it receives:
- The JEPA context encoding of past market regimes
- A distribution of N=16 stochastic future trajectories
- Its current position and cumulative PnL

It outputs a continuous action in `[-1, 1]` (short → flat → long).
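
To make the action convention concrete, here is a minimal sketch of how a continuous action in `[-1, 1]` could be turned into per-step PnL and compared against Buy & Hold. It is an illustration only: the function name `simulate_positions`, the additive-return bookkeeping, and the absence of fees are assumptions, not the reward logic of `LatentCryptoEnv`.

```python
import numpy as np

def simulate_positions(actions, returns):
    """Hypothetical bookkeeping: each action is the position held over the next return.

    -1 = fully short, 0 = flat, +1 = fully long. No fees, no compounding.
    """
    actions = np.clip(np.asarray(actions, dtype=float), -1.0, 1.0)
    returns = np.asarray(returns, dtype=float)
    step_pnl = actions * returns  # a short position earns the negative of the return
    return step_pnl.cumsum(), returns.cumsum()

# Toy inputs, not data from the repository
actions = [1.0, 0.0, -1.0, 0.5]
returns = [0.02, -0.01, -0.03, 0.01]
agent_pnl, buy_hold = simulate_positions(actions, returns)
print("agent cumulative PnL:   ", agent_pnl.round(4).tolist())
print("buy & hold cumulative:  ", buy_hold.round(4).tolist())
```

In this toy case the agent goes flat before the first drop and short before the second, so its cumulative PnL ends above Buy & Hold.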

In [None]:
import sys
sys.path.insert(0, '.')

import matplotlib
matplotlib.use('Agg')
import numpy as np
from pathlib import Path
from IPython.display import display, Image

from stable_baselines3 import PPO
from src.strate_iv.config import load_config
from src.strate_iv.env import LatentCryptoEnv
from src.strate_iv.trajectory_buffer import TrajectoryBuffer
from scripts.demo_results import run_episode, plot_demo

# Load config and buffer
config = load_config('configs/strate_iv.yaml')
buffer = TrajectoryBuffer('data/trajectory_buffer/')
_, eval_buffer = buffer.split(val_ratio=config.buffer.val_ratio)
print(f'Eval buffer: {len(eval_buffer)} episodes')

# Load PPO best model
model_path = 'checkpoints/strate_iv/ppo_best_model.zip'
model = PPO.load(model_path)
expected_obs_dim = model.observation_space.shape[0]
print(f'PPO best model loaded — obs dim: {expected_obs_dim}')

# Create environment
env = LatentCryptoEnv(buffer=eval_buffer, config=config.env)
print(f'Environment obs dim: {env.observation_space.shape[0]}')
print(f'Window: {config.env.n_tgt} patches × {config.env.patch_len} candles = {config.env.n_tgt * config.env.patch_len}h')


In [None]:
import numpy as np

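# Observation compatibility shim: the environment now emits 425 dims,
# but the checkpoint was trained on 416.
# Current observation layout (425):
#   [0:414]   h_x_pooled + future_mean + future_std + close_stats + revin_stds + delta_mu
#   [414:415] step_progress         (added after training)
#   [415:423] realized_returns x8   (added after training)
#   [423:424] position
#   [424:425] cumulative_pnl
# Training-time layout (416) = obs[:414] + obs[423:425]
def _compat_obs(obs):
    """Drop the 9 features added after training (indices 414..422)."""
    obs = np.asarray(obs)
    # Already in the training-time layout (e.g. this cell was re-run): pass through
    if obs.shape[-1] == expected_obs_dim:
        return obs
    return np.concatenate([obs[:414], obs[423:425]])

# Wrap model.predict so every observation is trimmed before reaching the policy;
# positional and keyword arguments are forwarded unchanged.
_original_predict = model.predict
def _predict_compat(obs, *args, **kwargs):
    return _original_predict(_compat_obs(obs), *args, **kwargs)
model.predict = _predict_compat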

# Run n_demos episodes using scripts/demo_results.py logic
Path('outputs/demo').mkdir(parents=True, exist_ok=True)
n_demos = 5
results = []

print(f'Running {n_demos} episodes...\n')
for i in range(n_demos):
    traj = run_episode(model, env, seed=None)
    out_path = f'outputs/demo/demo_{i:02d}.png'
    plot_demo(traj, out_path)
    results.append((out_path, traj))
    print(f'  Episode {i+1}: actions = {traj["actions"].round(2).tolist()}')
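
The index arithmetic in the compatibility shim is easy to get wrong, so a quick standalone check on a dummy vector can help. The sketch below re-implements the slicing under the layout described in the comments above (425 current dims, 416 training-time dims); `compat_obs` here is a hypothetical standalone copy that also accepts batched input, not the function from the repository.

```python
import numpy as np

NEW_DIM, OLD_DIM = 425, 416  # dims taken from the layout comments above

def compat_obs(obs):
    """Standalone copy of the slicing: drop indices 414..422, keep the rest."""
    obs = np.asarray(obs)
    return np.concatenate([obs[..., :414], obs[..., 423:425]], axis=-1)

# Use the index itself as the value so we can see which entries survive
dummy = np.arange(NEW_DIM, dtype=np.float32)
reduced = compat_obs(dummy)

print("reduced shape:", reduced.shape)
print("last kept latent index:", int(reduced[413]))
print("position / cumulative_pnl indices:", reduced[-2:].astype(int).tolist())

# The same slicing works on a batch of observations
batch = compat_obs(np.zeros((3, NEW_DIM), dtype=np.float32))
print("batched shape:", batch.shape)
```

If the environment's observation layout changes again, the two constants and the slice bounds are the only places that need updating.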


## 4. Results

In [None]:
from IPython.display import display, Image

for i, (out_path, traj) in enumerate(results):
    print(f'\n--- Episode {i+1} | actions = {traj["actions"].round(2).tolist()} ---')
    display(Image(filename=out_path, width=900))


## 5. Summary

### What you just saw

The agent **never sees raw prices** during inference. It operates entirely on latent representations produced by Fin-JEPA:

| Component | Role |
|---|---|
| **Spherical VQ-VAE** (Strate I) | Tokenizes OHLCV patches → discrete market regime tokens |
| **Fin-JEPA + Mamba-2** (Strate II) | Self-supervised temporal model over token sequences |
| **Stochastic Predictor** (Strate III) | Samples N=16 divergent future latent trajectories |
| **PPO Agent** (Strate IV) | Plans in latent space, outputs continuous position [-1, 1] |
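
As a rough illustration of how N=16 sampled futures could be condensed into the `future_mean` and `future_std` observation features, the sketch below summarizes a set of random stand-in trajectories. The horizon, latent dimension, and pooling over the horizon axis are all assumptions made for the example; the actual shapes and pooling used by the repository are not shown in this notebook.

```python
import numpy as np

rng = np.random.default_rng(0)

N_SAMPLES = 16   # number of stochastic futures, as stated in the text
HORIZON = 4      # hypothetical number of future steps
LATENT_DIM = 8   # hypothetical latent width

# Stand-in for the Stochastic Predictor output: (samples, horizon, latent_dim)
futures = rng.normal(size=(N_SAMPLES, HORIZON, LATENT_DIM))

# Assumed pooling: average each trajectory over its horizon
pooled = futures.mean(axis=1)            # shape (16, 8)

# Summarize the distribution across the 16 samples
future_mean = pooled.mean(axis=0)        # expected future latent
future_std = pooled.std(axis=0)          # disagreement between samples

features = np.concatenate([future_mean, future_std])
print("feature vector shape:", features.shape)
```

A large `future_std` entry means the sampled futures disagree along that latent direction, which an agent could read as higher uncertainty.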

### Why this matters

Classical approaches (an LSTM or Transformer trained on raw prices) must predict **every tick**, which pushes them to memorize noise instead of learning structure.  
JEPA instead learns to predict **latent representations** of future states, ignoring unpredictable details.  
The agent then plans in this cleaner latent space, so its position changes are meant to reflect **regime switching** rather than curve-fitting.

---

**Repository:** https://github.com/ElMonstroDelBrest/World-IA-Finance  
**License:** AGPL-3.0