# Condor Game

The goal is to anticipate how asset prices will evolve by providing not a single forecasted value, but a **full probability distribution over the future price at multiple forecast horizons**.

## Probabilistic Forecasting

Probabilistic forecasting provides **a distribution of possible future values** rather than a single point estimate, allowing for uncertainty quantification. Instead of predicting only the most likely outcome, it estimates a range of potential outcomes along with their probabilities by outputting a **probability distribution**.

A probabilistic forecast models the conditional probability distribution of a future value $(Y_t)$ given past observations $(\mathcal{H}_{t-1})$. This can be expressed as:  

$$P(Y_t \mid \mathcal{H}_{t-1})$$

where $(\mathcal{H}_{t-1})$ represents the historical data up to time $(t-1)$. Instead of a single prediction $(\hat{Y}_t)$, the model estimates a full probability distribution $(f(Y_t \mid \mathcal{H}_{t-1}))$, which can take different parametric forms, such as a Gaussian:

$$Y_t \mid \mathcal{H}_{t-1} \sim \mathcal{N}(\mu_t, \sigma_t^2)$$

where $(\mu_t)$ is the predicted mean and $(\sigma_t^2)$ represents the uncertainty in the forecast.
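
As a minimal sketch of what the Gaussian form implies, the snippet below evaluates the log of the predictive density at a realized value. The helper name `gaussian_log_density` and the numbers are illustrative assumptions, not part of the `condorgame` package.

```python
import math

def gaussian_log_density(y: float, mu: float, sigma: float) -> float:
    """Log of the N(mu, sigma^2) density evaluated at y (sigma is a standard deviation)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (y - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

# Hypothetical forecast: mean 100, standard deviation 2
mu_t, sigma_t = 100.0, 2.0
print(gaussian_log_density(100.0, mu_t, sigma_t))  # realized value equals the mean
print(gaussian_log_density(106.0, mu_t, sigma_t))  # realized value three sigmas away
```

A realized value far from the predicted mean, measured in units of $\sigma_t$, receives a much lower log-density than one near the mean.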

Probabilistic forecasting can be handled through various approaches, including **variance forecasters**, **quantile forecasters**, **interval forecasters** or **distribution forecasters**, each capturing uncertainty differently.

In this notebook, we forecast the target with a Gaussian density (or a mixture of Gaussians). The model output has the following form:

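```python
[
    {
        "step": (k + 1) * step,
        "type": "mixture",
        "components": [
            {
                "density": {
                    "type": "builtin",
                    "name": "norm",
                    # scale is a standard deviation, not a variance
                    "params": {"loc": y_mean, "scale": y_std},
                },
                "weight": weight,
            },
            # ... one entry per mixture component
        ],
    }
    for k in range(0, horizon // step)
]
```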

A **mixture density**, such as the Gaussian mixture $\sum_{i=1}^{K} w_i \mathcal{N}(Y_t \mid \mu_i, \sigma_i^2)$, can capture multi-modal distributions and approximate more complex distributions.
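
The mixture formula can be sketched in a few lines of plain Python. The helper names and the two-component example are hypothetical, and normalizing the weights inside `mixture_pdf` is a choice made for this sketch, not a documented behaviour of any library used in the notebook.

```python
import math

def normal_pdf(y: float, mu: float, sigma: float) -> float:
    """Density of N(mu, sigma^2) at y."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(y: float, components) -> float:
    """Weighted sum of normal densities.

    components: list of (weight, mu, sigma) tuples; weights are normalized here.
    """
    total = sum(w for w, _, _ in components)
    return sum((w / total) * normal_pdf(y, mu, sigma) for w, mu, sigma in components)

# Hypothetical bimodal forecast: one mode near 98, another near 103
components = [(0.6, 98.0, 1.0), (0.4, 103.0, 1.5)]
for y in (98.0, 100.5, 103.0):
    print(y, round(mixture_pdf(y, components), 4))
```

The density is high near both modes and low in between, which a single Gaussian cannot represent.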

![proba_forecast_v3](https://github.com/Tarandro/image_broad/blob/main/proba_forecast_v3.png?raw=true)


**Probabilistic Forecasting** is particularly valuable in supply chain management. Below are some interesting resources for a deeper understanding:  

- [Probabilistic Forecasting](https://www.lokad.com/probabilistic-forecasting-definition/): Overview of probabilistic forecasting and its applications.  
- [Quantile Forecasting](https://www.lokad.com/quantile-regression-time-series-definition/): Explanation of quantile-based forecasting methods.  
- **Evaluation Metrics:**  
  - [Continuous Ranked Probability Score (CRPS)](https://www.lokad.com/continuous-ranked-probability-score/)  
  - [Cross-Entropy](https://www.lokad.com/cross-entropy-definition/)  
  - [Pinball Loss](https://www.lokad.com/pinball-loss-function-definition/)

In [23]:
import numpy as np
import pandas as pd
import os
from datetime import datetime, timezone, timedelta

from condorgame.price_provider import shared_pricedb
from condorgame.tracker import TrackerBase
from condorgame.tracker_evaluator import TrackerEvaluator
from condorgame.examples.utils import load_test_prices_once, load_initial_price_histories_once
from condorgame.debug.plots import plot_quarantine, plot_prices, plot_log_return_prices, plot_scores

## What You Must Predict

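Trackers must predict the **probability distribution of the future price**, cumulated from the current time: each forecast describes the price level at a given step, not the change between consecutive steps.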

For an asset with current price $P_t$, the tracker must output at each future step $k$ (e.g., +5 minutes, +10 minutes, …) a full predictive distribution over the future price $P_{t+k}$.

Example:
If you assume Gaussian increments with drift:

$$P_{t+k} \sim \mathcal{N}(P_t + k\mu, k\sigma^2)$$
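
Under this assumption the price at step $k$ is the current price plus the sum of $k$ independent $\mathcal{N}(\mu, \sigma^2)$ increments, so the mean grows linearly in $k$ and the standard deviation grows with $\sqrt{k}$. A minimal sketch with hypothetical values (`step_params` is an illustrative helper, not a `condorgame` function):

```python
import math

def step_params(current_price: float, mu: float, sigma: float, k: int):
    """Return (loc, scale) of P_{t+k} under i.i.d. Gaussian increments.

    The variance is k * sigma**2, so the scale (standard deviation) is sqrt(k) * sigma.
    """
    return current_price + k * mu, math.sqrt(k) * sigma

# Hypothetical per-step drift and volatility
for k in (1, 4, 288):
    loc, scale = step_params(100.0, 0.01, 0.5, k)
    print(k, loc, scale)
```

Note that the formula is written with a variance, while density specifications based on `loc` and `scale` expect a standard deviation.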

# Gaussian Step Tracker

A simple benchmark that predicts future prices by assuming they follow a Gaussian (normal) distribution estimated from recent historical data. It models the price change (the difference between consecutive sampled prices) over each prediction step.

### **Key Ideas**  

- Historical prices are transformed into returns (here, differences between consecutive prices).
- The tracker estimates:
    - Drift: mean of historical returns $\mu$
    - Volatility: standard deviation of historical returns $\sigma$
- For each future step $k$, it outputs a normal density:
$$P_{t+k} \sim \mathcal{N}(P_t + k\mu, k\sigma^2)$$

Each density prediction must comply with the [density_pdf](https://github.com/microprediction/densitypdf/blob/main/densitypdf/__init__.py) specification.

In [24]:
class GaussianStepTracker(TrackerBase):
    """
    A benchmark tracker that models future prices as Gaussian-distributed.

    At time t with price Pt, for each forecast step k, the tracker returns a normal
    distribution with loc = Pt + k*mu and scale = sqrt(k) * sigma, where:
        - mu    = mean of historical returns (price differences)
        - sigma = standard deviation of historical returns

    This is a cumulative forecast: each distribution describes the price at
    step k, not the change between consecutive steps.
    """
    def __init__(self):
        super().__init__()

    def predict(self, asset: str, horizon: int, step: int):

        # Retrieve past prices with sampling resolution equal to the prediction step.
        pairs = self.prices.get_prices(asset, days=5, resolution=step)
        if not pairs:
            return []

        _, past_prices = zip(*pairs)

        if len(past_prices) < 3:
            return []

        # Compute historical returns
        returns = np.diff(past_prices)

        # Estimate drift (mean returns) and volatility (std dev of returns)
        mu = float(np.mean(returns))
        sigma = float(np.std(returns))

        # Current price is last past price
        current_price = past_prices[-1]

        if sigma <= 0:
            return []

        num_segments = horizon // step

        # Produce one Gaussian for each future time step
        # The returned list must be compatible with the `density_pdf` library.
        distributions = []
        for k in range(1, num_segments + 1):
            distributions.append({
                "step": k * step,
                "type": "mixture",
                "components": [{
                    "density": {
                        "type": "builtin",             # Note: use 'builtin' instead of 'scipy' for speed
                        "name": "norm",  
                        "params": {"loc": current_price + k*mu, "scale": np.sqrt(k*sigma**2)}
                    },
                    "weight": 1
                }]
            })

        return distributions
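
The same layout extends to more than one component. The sketch below builds a single forecast entry from a list of `(weight, loc, scale)` tuples, mirroring the dictionary returned by `GaussianStepTracker`. The helper `mixture_step` is hypothetical, and the example weights sum to one as a precaution, since this notebook does not state how weights are normalized.

```python
def mixture_step(step_seconds: int, components):
    """Build one forecast entry in the layout used by GaussianStepTracker.

    components: list of (weight, loc, scale) tuples; scale is a standard deviation.
    """
    return {
        "step": step_seconds,
        "type": "mixture",
        "components": [
            {
                "density": {
                    "type": "builtin",
                    "name": "norm",
                    "params": {"loc": loc, "scale": scale},
                },
                "weight": weight,
            }
            for weight, loc, scale in components
        ],
    }

# Hypothetical fat-tailed forecast: a narrow and a wide Gaussian around the same price
entry = mixture_step(300, [(0.7, 100.0, 0.5), (0.3, 100.0, 2.0)])
print(entry["components"][1]["density"]["params"])
```

Mixing a narrow and a wide component around the same location is one simple way to put more probability in the tails than a single Gaussian does.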

## Configurations

In [25]:
##########
# For each asset and historical timestamp, compute a 24-hour density forecast 
# at 5-minute intervals and evaluate the tracker against actual outcomes.

assets = ["SOL", "BTC"]

# Prediction horizon = 24h (in seconds)
HORIZON = 86400
# Prediction step = 5 minutes (in seconds)
STEP = 300
# How often we evaluate the tracker (in seconds)
INTERVAL = 3600

# Base directory where all evaluation results will be stored
base_dir_results = "results"
os.makedirs(base_dir_results, exist_ok=True)

# End timestamp for the test data
# evaluation_end: datetime = datetime.now(timezone.utc)
evaluation_end: datetime = datetime(2025, 11, 15, 12, 00, 00, tzinfo=timezone.utc)

# Number of days of test data to load
days = 30
# Amount of warm-up history to load
days_history = 30
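
As a quick sanity check on these settings, the arithmetic below counts how many densities each forecast contains and how many forecasts the test window can hold at most. The count per asset is a rough upper bound that assumes at least one price tick per hour.

```python
HORIZON = 86400   # 24 hours, in seconds
STEP = 300        # 5 minutes, in seconds
INTERVAL = 3600   # 1 hour, in seconds
days = 30

steps_per_forecast = HORIZON // STEP          # densities returned by one predict() call
forecasts_per_day = 86400 // INTERVAL         # predict() calls per day
max_forecasts_per_asset = days * forecasts_per_day

print(steps_per_forecast)        # 288
print(forecasts_per_day)         # 24
print(max_forecasts_per_asset)   # 720
```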

## Data

In [26]:
## Load the last N days of price data (test period)
test_asset_prices = load_test_prices_once(
    assets, shared_pricedb, evaluation_end, days=days
)

## Provide the tracker with initial historical data (for the first tick):
## load prices from the last H days up to N days ago
initial_histories = load_initial_price_histories_once(
    assets, shared_pricedb, evaluation_end, days_history=days_history, days_offset=days
)

## Run live simulation on historic data

In [27]:
# Setup tracker + evaluator
tracker_evaluator = TrackerEvaluator(GaussianStepTracker())

for asset, history_price in test_asset_prices.items():

    # First tick: initialize historical data
    tracker_evaluator.tick({asset: initial_histories[asset]})

    prev_ts = 0
    predict_count = 0
    for ts, price in history_price:
        # Feed the new tick
        tracker_evaluator.tick({asset: [(ts, price)]})

        # Evaluate prediction every hour (ts is in seconds)
        if ts - prev_ts >= INTERVAL:
            prev_ts = ts
            predictions_evaluated = tracker_evaluator.predict(asset, HORIZON, STEP)

            # Periodically display results
            if predictions_evaluated and predict_count % 200 == 0:
                print(f"My average log-likelihood score {asset}: {tracker_evaluator.overall_likelihood_score_asset(asset):.4f}")
                print(f"My recent average log-likelihood score {asset}: {tracker_evaluator.recent_likelihood_score_asset(asset):.4f}")
            predict_count += 1

tracker_name = tracker_evaluator.tracker.__class__.__name__
print(f"\nTracker {tracker_name}:"
      f"\nFinal average log-likelihood score: {tracker_evaluator.overall_likelihood_score():.4f}")

current_results_dir = tracker_evaluator.to_json(horizon=HORIZON, step=STEP,
                                                interval=INTERVAL, base_dir=base_dir_results)

# Plot scoring timeline
timestamped_scores = tracker_evaluator.scores
plot_scores(timestamped_scores)

My average log-likelihood score SOL: -3.0272
My recent average log-likelihood score SOL: -2.9962
My average log-likelihood score SOL: -2.9036
My recent average log-likelihood score SOL: -3.0106
My average log-likelihood score SOL: -2.9434
My recent average log-likelihood score SOL: -2.9403
My average log-likelihood score BTC: -8.9081
My recent average log-likelihood score BTC: -8.9479
My average log-likelihood score BTC: -8.7922
My recent average log-likelihood score BTC: -8.9228
My average log-likelihood score BTC: -8.7872
My recent average log-likelihood score BTC: -8.7357

Tracker GaussianStepTracker:
Final average log-likelihood score: -5.8873
[✔] Tracker results saved to results\2025-10-17T12-00-00_to_2025-11-15T11-00-00\GaussianStepTracker_h86400_s300.json


In [28]:
## Density forecast mapped into price space (for the last asset and last prediction)
print("Log-likelihood score:", tracker_evaluator.scores[asset][-1][1])
plot_quarantine(asset, predictions_evaluated[0], tracker_evaluator.tracker.prices, mode="point")

Log-likelihood score: -8.580738584623823
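
The BTC scores above are much lower than the SOL scores. One plausible reason, assuming the score is the log of the predictive density evaluated at the realized price (the evaluator's exact scoring is not shown in this notebook), is that a density expressed in price units depends on the price scale. For a Gaussian, the log-density at its own mean is $-\log(\sigma\sqrt{2\pi})$, so a wider distribution scores lower even when it is perfectly centred. The standard deviations below are hypothetical, not estimated from the data.

```python
import math

def log_density_at_mean(sigma: float) -> float:
    """Log of the N(mu, sigma^2) density evaluated at y = mu."""
    return -math.log(sigma * math.sqrt(2.0 * math.pi))

# Hypothetical standard deviations, in price units, for a low- and a high-priced asset
for label, sigma in (("sigma = 5", 5.0), ("sigma = 2000", 2000.0)):
    print(label, round(log_density_at_mean(sigma), 3))
```

Under that assumption, scores are best compared between trackers on the same asset rather than across assets.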


# Tracker Comparison

In [29]:
from condorgame.examples.utils import load_all_results, plot_tracker_comparison

In [30]:
df_all = load_all_results(current_results_dir, horizon=HORIZON, step=STEP)
df_all

[✔] Found 3 files:
   - GaussianStepTracker_10stepshistoric_h86400_s300.json
   - GaussianStepTracker_30stepshistoric_h86400_s300.json
   - GaussianStepTracker_h86400_s300.json


| | tracker | asset | ts | score | time |
|---|---|---|---|---|---|
| 0 | GaussianStepTracker_10stepshistoric | SOL | 1760702400 | -5.245349 | 2025-10-17 12:00:00+00:00 |
| 1 | GaussianStepTracker_10stepshistoric | SOL | 1760706000 | -6.497065 | 2025-10-17 13:00:00+00:00 |
| 2 | GaussianStepTracker_10stepshistoric | SOL | 1760709600 | -3.925253 | 2025-10-17 14:00:00+00:00 |
| 3 | GaussianStepTracker_10stepshistoric | SOL | 1760713200 | -3.732070 | 2025-10-17 15:00:00+00:00 |
| 4 | GaussianStepTracker_10stepshistoric | SOL | 1760716800 | -2.906332 | 2025-10-17 16:00:00+00:00 |
| ... | ... | ... | ... | ... | ... |
| 4171 | GaussianStepTracker | BTC | 1763190000 | -8.592705 | 2025-11-15 07:00:00+00:00 |
| 4172 | GaussianStepTracker | BTC | 1763193600 | -8.523666 | 2025-11-15 08:00:00+00:00 |
| 4173 | GaussianStepTracker | BTC | 1763197200 | -8.712014 | 2025-11-15 09:00:00+00:00 |
| 4174 | GaussianStepTracker | BTC | 1763200800 | -8.554414 | 2025-11-15 10:00:00+00:00 |
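
A frame with these columns can be summarized per asset and tracker with a standard pandas `groupby`. The sketch below uses a small toy frame with made-up scores and placeholder tracker names rather than the real `df_all`.

```python
import pandas as pd

# Toy frame with the same column names as df_all; all values are placeholders.
df_toy = pd.DataFrame({
    "tracker": ["A", "A", "B", "B", "A", "B"],
    "asset":   ["SOL", "SOL", "SOL", "SOL", "BTC", "BTC"],
    "ts":      [0, 3600, 0, 3600, 0, 0],
    "score":   [-3.0, -2.0, -4.0, -3.0, -9.0, -8.0],
})

# Mean score per asset (rows) and tracker (columns)
summary = (
    df_toy.groupby(["asset", "tracker"])["score"]
    .mean()
    .unstack("tracker")
)
print(summary)
```

Keeping assets in separate rows avoids mixing scores that live on different price scales.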


In [31]:
# Tracker comparison across all assets
plot_tracker_comparison(df_all)

In [32]:
# Tracker comparison for one asset (SOL)
plot_tracker_comparison(df_all, 'SOL')

In [33]:
# Tracker comparison for one asset (BTC)
plot_tracker_comparison(df_all, 'BTC')