# MIRAGE++ Notebook 3: Application to Synthetic Data

In this notebook, we'll apply MIRAGE++ to synthetic financial data. We'll generate synthetic alpha signals and returns, fit the model, and compare its weights and in-sample prediction error to OLS, Ridge, and Lasso. Each step is explained as we go.

## 1. Generating Synthetic Alpha Signals and Returns

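Suppose we have 10 trading signals (features) and want to predict future returns. We'll draw a ground-truth weight vector from a Dirichlet distribution, so it is non-negative and sums to 1, and then add Gaussian noise (standard deviation 0.1) to the blended signals to stand in for market noise.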

In [None]:
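import numpy as np

np.random.seed(42)  # fix the seed for reproducibility
n_samples = 200
n_signals = 10
# Signals: i.i.d. standard normal features, one column per signal
X = np.random.randn(n_samples, n_signals)
# Ground-truth weights on the simplex (non-negative, sum to 1)
true_theta = np.random.dirichlet(np.ones(n_signals))
# Returns: linear blend of the signals plus Gaussian noise (std 0.1)
y = X @ true_theta + np.random.randn(n_samples) * 0.1
print('True weights:', true_theta)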
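
As a quick sanity check on the generated data, the sketch below regenerates the same arrays (same seed, same call order) and confirms that the Dirichlet draw lies on the simplex. It also estimates a variance-based signal-to-noise ratio. The names `noise`, `signal_var`, `noise_var`, and `snr` are illustrative and not part of the original notebook.

```python
import numpy as np

# Regenerate the data as in the cell above (same seed, same call order)
np.random.seed(42)
n_samples, n_signals = 200, 10
X = np.random.randn(n_samples, n_signals)
true_theta = np.random.dirichlet(np.ones(n_signals))
noise = np.random.randn(n_samples) * 0.1
y = X @ true_theta + noise

# A Dirichlet draw is non-negative and sums to 1
on_simplex = bool(np.all(true_theta >= 0) and np.isclose(true_theta.sum(), 1.0))

# Empirical variance of the noiseless signal versus the added noise
signal_var = np.var(X @ true_theta)
noise_var = np.var(noise)
snr = signal_var / noise_var

print('On simplex:', on_simplex)
print('Signal-to-noise ratio (variance):', snr)
```

A high ratio here means the regression problem is fairly easy, which is worth keeping in mind when reading the model comparison later on.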

## 2. Visualize the True Weights

Let's see what the 'real' signal blend looks like.

In [None]:
import matplotlib.pyplot as plt
plt.bar(range(n_signals), true_theta)
plt.xlabel('Signal Index')
plt.ylabel('True Weight')
plt.title('Ground Truth Signal Weights')
plt.show()

## 3. Fit MIRAGE++ to the Data

We'll use the code from the previous notebook to fit MIRAGE++ and track the loss.
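
For reference, the objective and update implemented by the code in this section can be written out as follows. This is read directly off the code, with $n$ the number of samples, $\lambda$ standing for `lam`, and $\eta$ for `eta`; the numerical clipping is omitted for readability.

$$
L(\theta) = \frac{1}{n}\lVert X\theta - y\rVert_2^2 + \lambda H(\theta), \qquad H(\theta) = -\sum_i \theta_i \log \theta_i
$$

$$
g^{(t)} = \nabla L(\theta^{(t)}) = \frac{2}{n} X^\top \left(X\theta^{(t)} - y\right) + \lambda\left(-1 - \log \theta^{(t)}\right)
$$

$$
\theta_i^{(t+1)} = \frac{\theta_i^{(t)} \exp\left(-\eta\, g_i^{(t)}\right)}{\sum_j \theta_j^{(t)} \exp\left(-\eta\, g_j^{(t)}\right)}
$$

Note that with the sign used in the code, a positive $\lambda$ adds the entropy to the objective being minimized. If the intent is to reward diversified weights, the entropy term would enter with the opposite sign; that is an assumption about intent, not something this notebook states.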

In [None]:
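def entropy(theta, epsilon=1e-8):
    """Shannon entropy of theta; clipping avoids log(0)."""
    theta = np.clip(theta, epsilon, 1.0)
    return -np.sum(theta * np.log(theta))

def loss(X, y, theta, lam):
    """Objective as defined here: MSE plus lam times the entropy of theta."""
    preds = X @ theta
    mse = np.mean((preds - y)**2)
    ent = entropy(theta)
    return mse + lam * ent

def gradient(X, y, theta, lam):
    """Gradient of loss() with respect to theta (ignoring the clipping)."""
    preds = X @ theta
    grad_mse = 2 * X.T @ (preds - y) / len(y)
    # d/dtheta of -theta * log(theta) is -1 - log(theta)
    grad_entropy = -1 - np.log(np.clip(theta, 1e-8, 1.0))
    return grad_mse + lam * grad_entropy

def mirror_descent_step(grad, theta_t, eta):
    """Multiplicative (exponentiated-gradient) update, renormalized to sum to 1."""
    theta_new = theta_t * np.exp(-eta * grad)
    # Floor tiny values so no weight becomes exactly zero
    theta_new = np.clip(theta_new, 1e-12, None)
    return theta_new / np.sum(theta_new)

def fit_mirage(X, y, lam=0.1, eta=0.2, n_iters=300):
    """Run n_iters mirror descent steps from uniform weights.

    Returns the final weights and the loss after each step.
    """
    n = X.shape[1]
    theta = np.ones(n) / n  # start from equal weights
    loss_hist = []
    for i in range(n_iters):
        grad = gradient(X, y, theta, lam)
        theta = mirror_descent_step(grad, theta, eta)
        loss_hist.append(loss(X, y, theta, lam))
    return theta, loss_hist

theta_mirage, loss_hist = fit_mirage(X, y, lam=0.1, eta=0.2, n_iters=300)
print('MIRAGE++ weights:', theta_mirage)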
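
Since the fit returns a loss history, one simple way to use it is to check whether the loss has flattened out by the final iterations. The helper below is a minimal sketch of such a check; `has_converged`, `tol`, and `window` are hypothetical names and defaults chosen for illustration, not part of MIRAGE++.

```python
def has_converged(loss_hist, tol=1e-6, window=10):
    """Return True if the loss changed by less than tol over the last `window` steps."""
    if len(loss_hist) <= window:
        return False  # not enough history to judge
    return abs(loss_hist[-1] - loss_hist[-1 - window]) < tol

# Toy loss curve that decays geometrically and then flattens out
toy_hist = [0.5 ** k for k in range(60)]
print(has_converged(toy_hist))      # late iterations barely change
print(has_converged(toy_hist[:5]))  # too short to judge
```

In this notebook it could be called as `has_converged(loss_hist)` after fitting. If it returns `False`, increasing `n_iters` or adjusting `eta` would be natural things to try.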

## 4. Visualize MIRAGE++ Weights vs. Ground Truth

Let's compare the learned weights to the true weights.

In [None]:
plt.bar(np.arange(n_signals)-0.15, true_theta, width=0.3, label='True')
plt.bar(np.arange(n_signals)+0.15, theta_mirage, width=0.3, label='MIRAGE++')
plt.xlabel('Signal Index')
plt.ylabel('Weight')
plt.title('True vs. MIRAGE++ Weights')
plt.legend()
plt.show()
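
A bar chart shows the overall shape, but a numeric summary makes the comparison easier to track across runs. The sketch below computes the L1 distance and the largest per-signal error between two weight vectors. `weight_errors` is a hypothetical helper, and the toy vectors are for illustration only, not results from this notebook.

```python
import numpy as np

def weight_errors(theta_hat, theta_true):
    """Return (L1 distance, largest absolute per-signal error) between two weight vectors."""
    diff = np.abs(np.asarray(theta_hat) - np.asarray(theta_true))
    return diff.sum(), diff.max()

# Toy vectors for illustration only
toy_true = np.array([0.5, 0.3, 0.2])
toy_hat = np.array([0.4, 0.35, 0.25])
l1, worst = weight_errors(toy_hat, toy_true)
print('L1 distance:', l1)
print('Max abs error:', worst)
```

With the notebook's variables, the equivalent call would be `weight_errors(theta_mirage, true_theta)`.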

## 5. Compare to OLS, Ridge, and Lasso

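Let's fit standard models and compare their weights here; prediction errors follow in the next section. Unlike MIRAGE++, these models place no simplex constraint on the coefficients, so their weights can be negative and need not sum to 1. With scikit-learn's defaults they also fit an intercept.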

In [None]:
from sklearn.linear_model import LinearRegression, Ridge, Lasso
ols = LinearRegression().fit(X, y)
ridge = Ridge(alpha=0.1).fit(X, y)
lasso = Lasso(alpha=0.1).fit(X, y)

plt.figure(figsize=(10,4))
plt.bar(np.arange(n_signals)-0.3, ols.coef_, width=0.2, label='OLS')
plt.bar(np.arange(n_signals)-0.1, ridge.coef_, width=0.2, label='Ridge')
plt.bar(np.arange(n_signals)+0.1, lasso.coef_, width=0.2, label='Lasso')
plt.bar(np.arange(n_signals)+0.3, theta_mirage, width=0.2, label='MIRAGE++')
plt.xlabel('Signal Index')
plt.ylabel('Weight')
plt.title('Model Weights Comparison')
plt.legend()
plt.show()

## 6. Prediction Error Comparison

Let's compare the mean squared error (MSE) of each model. Note that this is in-sample error: predictions are made on the same data used for fitting.

In [None]:
def mse(y_true, y_pred):
    return np.mean((y_true - y_pred)**2)

y_pred_ols = ols.predict(X)
y_pred_ridge = ridge.predict(X)
y_pred_lasso = lasso.predict(X)
y_pred_mirage = X @ theta_mirage

print('OLS MSE:', mse(y, y_pred_ols))
print('Ridge MSE:', mse(y, y_pred_ridge))
print('Lasso MSE:', mse(y, y_pred_lasso))
print('MIRAGE++ MSE:', mse(y, y_pred_mirage))
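
In-sample MSE tends to favor the least constrained model, so a held-out split gives a fairer picture. The sketch below is one way to set that up using only NumPy. It generates its own synthetic data in the same spirit as Section 1 and uses a plain least-squares fit as a stand-in model. The seed, the 75/25 split, and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # separate generator; seed chosen arbitrarily

# Synthetic data in the same spirit as Section 1
n_samples, n_signals = 200, 10
X = rng.standard_normal((n_samples, n_signals))
theta_true = rng.dirichlet(np.ones(n_signals))
y = X @ theta_true + 0.1 * rng.standard_normal(n_samples)

# Shuffle the row indices and hold out the last 25% as a test set
idx = rng.permutation(n_samples)
n_train = int(0.75 * n_samples)
train, test = idx[:n_train], idx[n_train:]

# Least-squares fit (no intercept) on the training rows only
theta_ols, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)

train_mse = np.mean((X[train] @ theta_ols - y[train]) ** 2)
test_mse = np.mean((X[test] @ theta_ols - y[test]) ** 2)
print('Train MSE:', train_mse)
print('Test MSE:', test_mse)
```

Any weight vector can be scored the same way, for example one obtained by calling `fit_mirage` on the training rows and evaluating `X[test] @ theta` against `y[test]`.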

## 7. Weight Diversity and Interpretability

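MIRAGE++ weights are always positive and sum to 1: each mirror descent step multiplies the current weights by a positive factor and then renormalizes. This makes them easy to interpret as probabilities or portfolio allocations.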

Let's check this property.

In [None]:
print('MIRAGE++ weights (should be positive):', theta_mirage)
print('Sum of MIRAGE++ weights:', np.sum(theta_mirage))
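
Beyond printing the weights, both properties in this section can be checked programmatically. The sketch below adds a simplex check and one common diversity summary, the exponential of the entropy, which can be read as an "effective number of signals". Both `on_simplex` and `effective_n_signals` are hypothetical helpers rather than part of MIRAGE++, and the example vectors are made up.

```python
import numpy as np

def effective_n_signals(theta, epsilon=1e-12):
    """exp(entropy) of a weight vector: n for uniform weights, close to 1 for a single signal."""
    theta = np.clip(np.asarray(theta, dtype=float), epsilon, 1.0)
    return float(np.exp(-np.sum(theta * np.log(theta))))

def on_simplex(theta, tol=1e-8):
    """Check that all weights are positive and that they sum to 1 within tol."""
    theta = np.asarray(theta, dtype=float)
    return bool(np.all(theta > 0) and abs(theta.sum() - 1.0) < tol)

# Made-up examples: fully diversified versus nearly concentrated weights
uniform = np.ones(10) / 10
concentrated = np.array([0.97, 0.01, 0.01, 0.01])
print(on_simplex(uniform), effective_n_signals(uniform))
print(on_simplex(concentrated), effective_n_signals(concentrated))
```

Applied to the notebook's result, `effective_n_signals(theta_mirage)` would give a single number between 1 and `n_signals` summarizing how spread out the learned blend is.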

## 8. Summary

- MIRAGE++ finds a diversified, interpretable blend of signals.
- It performs competitively with OLS, Ridge, and Lasso on in-sample MSE.
- Weights are always positive and sum to 1.

In the next notebook, we'll apply MIRAGE++ to real financial data!