# TRADES Paper Complete Reproduction

Reproduces **Table 1** and key figures from [TRADES: Generating Realistic Market Simulations with Diffusion Models](https://arxiv.org/abs/XXXX)

## Quick Start Guide:

1. **Set configuration** in Cell 2 (choose stocks, baselines, time window)
2. **Upload LOBSTER data** for the dates you need
3. **Run all cells** - the notebook handles optional baselines automatically
4. **View results** at the end - averaged predictive scores and plots

## What Gets Reproduced:

### Core (Always):
- **Table 1** values for selected stocks (averaged over 2 days)
- **Predictive Score**: TRADES vs Market Replay
- **Figure 2**: PCA distribution coverage
- **Figure 3**: Stylized facts validation

### Optional (if enabled):
- **Table 1** full comparison (TRADES vs IABS vs CGAN)
- **Figure 3**: Multi-method stylized facts
- **Figure 5**: Volume distribution comparison

---

## Expected Results:

| Method        | Tesla | Intel |
|--------------|-------|-------|
| Market Replay| 0.923 | 0.149 |
| IABS         | 1.870 | 1.866 |
| CGAN         | 3.453 | 0.699 |
| **TRADES**   | **1.213** | **0.307** |

*Note: Your results may vary slightly due to random seeds and different time windows*


In [None]:
# ============================================================
# CONFIGURATION
# ============================================================

# Stocks to simulate
RUN_INTC = True   # Intel (recommended - main paper focus)
RUN_TSLA = False  # Tesla (optional - full reproduction)

# Optional baselines (increase runtime significantly)
RUN_IABS = False  # Agent-based simulation (~1 hour total for 2 days)
RUN_CGAN = False  # Wasserstein GAN (~7 hours total for 2 days)

# Simulation parameters
START_TIME = '10:00:00'
END_TIME = '12:00:00'  # 2-hour window (paper default)
# For faster testing: END_TIME = '10:30:00' (30-min window)

# Display configuration
stocks = []
if RUN_INTC: stocks.append('INTC')
if RUN_TSLA: stocks.append('TSLA')

baselines = ['TRADES', 'Market Replay']
if RUN_IABS: baselines.append('IABS')
if RUN_CGAN: baselines.append('CGAN')

print("="*60)
print("TRADES PAPER REPRODUCTION - CONFIGURATION")
print("="*60)
print(f"\nStocks: {', '.join(stocks) if stocks else 'None selected!'}")
print(f"Baselines: {', '.join(baselines)}")
print(f"Time window: {START_TIME} - {END_TIME}")

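# Estimate runtime from the configured window length
from datetime import datetime
_fmt = '%H:%M:%S'
window_hours = (datetime.strptime(END_TIME, _fmt) - datetime.strptime(START_TIME, _fmt)).total_seconds() / 3600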
trades_time = window_hours * 3.5  # TRADES takes ~3.5x real-time
replay_time = window_hours * 0.08  # Replay is fast
iabs_time = window_hours * 0.5 if RUN_IABS else 0
cgan_time = window_hours * 3.5 if RUN_CGAN else 0

total_per_day = trades_time + replay_time + iabs_time + cgan_time
total_time = total_per_day * 2 * len(stocks)  # 2 days per stock

print(f"\nEstimated runtime:")
print(f"  Per day per stock: ~{total_per_day:.1f} hours")
print(f"  Total (2 days x {len(stocks)} stock(s)): ~{total_time:.1f} hours")
print(f"\n  Breakdown:")
print(f"    - TRADES: ~{trades_time:.1f}h per day")
print(f"    - Market Replay: ~{replay_time*60:.0f} min per day")
if RUN_IABS:
    print(f"    - IABS: ~{iabs_time:.1f}h per day")
if RUN_CGAN:
    print(f"    - CGAN: ~{cgan_time:.1f}h per day")
print("="*60)

if not stocks:
    print("\n⚠️  WARNING: No stocks selected! Set RUN_INTC=True or RUN_TSLA=True")


---
# Section 1: Environment Setup

## 1.1. Clone Repository

In [None]:
# Clone D-MEADS repository
!git clone https://github.com/FinancialComputingUCL/D-MEADS.git
%cd D-MEADS
!pwd

## 1.2. Install Dependencies

In [None]:
!pip install -r requirements.txt

print("✅ Dependencies installed")

## 1.3. Check Resources

In [None]:
import torch

print("GPU:", "Available" if torch.cuda.is_available() else "Not available")
if torch.cuda.is_available():
    print(f"  - {torch.cuda.get_device_name(0)}")
    print(f"  - {torch.cuda.get_device_properties(0).total_memory / 1e9:.1f} GB")

print("\nRAM:")
!free -h | grep Mem

print("\n⚠️  Requirements:")
print("  - 2-hour simulations: ~25GB RAM (Colab Pro recommended)")
print("  - 30-min simulations: ~12GB RAM (free Colab may work)")

## 1.4. Download Checkpoints

In [None]:
import os

!mkdir -p data/checkpoints/TRADES
!mkdir -p data/checkpoints/CGAN

# TRADES checkpoints (download from Google Drive)
print("Downloading TRADES checkpoints...")
checkpoint_url = "https://drive.google.com/drive/folders/1fg5G9KzmzC6E4FUYSCjObJ7sCEdjo43W"
!gdown --folder {checkpoint_url} -O data/checkpoints/TRADES/ --quiet

print("\nCheckpoint status:")
!ls -lh data/checkpoints/TRADES/*.ckpt 2>/dev/null | wc -l | xargs -I {} echo "  TRADES: {} checkpoint(s)"
!ls -lh data/checkpoints/CGAN/*.ckpt 2>/dev/null | wc -l | xargs -I {} echo "  CGAN: {} checkpoint(s)"

print("\nIf download failed, manually download from:")
print("  https://drive.google.com/drive/folders/1fg5G9KzmzC6E4FUYSCjObJ7sCEdjo43W")

## 1.5. Update Configuration Files

Set `DATE_TRADING_DAYS` in `constants.py` to match our simulation dates.

In [None]:
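# Update constants.py, and report if the expected line is missing
old_line = 'DATE_TRADING_DAYS = ["2012-06-21", "2012-06-21"]'
new_line = 'DATE_TRADING_DAYS = ["2015-01-01", "2015-03-31_10"]'

with open('constants.py', 'r') as f:
    content = f.read()

if old_line in content:
    with open('constants.py', 'w') as f:
        f.write(content.replace(old_line, new_line))
    print("✅ Configuration updated:")
    print(f"   {new_line}")
elif new_line in content:
    print("✅ Configuration already up to date")
else:
    print("⚠️  DATE_TRADING_DAYS line not found in constants.py; edit it manually")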

---
# Section 2: Data Upload

**Required LOBSTER files for each stock:**

For each date (2015-01-29 and 2015-01-30), you need:
- `{STOCK}_2015-01-{DAY}_34200000_57600000_message_10.csv`
- `{STOCK}_2015-01-{DAY}_34200000_57600000_orderbook_10.csv`

Where:
- `34200000` = 09:30:00 (market open, in milliseconds after midnight)
- `57600000` = 16:00:00 (market close, in milliseconds after midnight)
- `10` = top 10 levels of the order book

**Total files needed:**
- INTC: 4 files (2 days × 2 file types)
- TSLA: 4 files (2 days × 2 file types)
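
The naming scheme above can be sketched as a small helper that builds the four expected file names for a stock and converts the millisecond offsets back to clock times. The function names `ms_to_clock` and `expected_lobster_files` are illustrative only and are not part of the D-MEADS repository.

```python
def ms_to_clock(ms: int) -> str:
    """Convert milliseconds after midnight to an HH:MM:SS string."""
    total_seconds = ms // 1000
    hours, rem = divmod(total_seconds, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def expected_lobster_files(stock, dates, start_ms=34_200_000, end_ms=57_600_000, levels=10):
    """Build the message/orderbook file names expected for each date."""
    names = []
    for date in dates:
        for kind in ("message", "orderbook"):
            names.append(f"{stock}_{date}_{start_ms}_{end_ms}_{kind}_{levels}.csv")
    return names


files_needed = expected_lobster_files("INTC", ["2015-01-29", "2015-01-30"])
print("Window:", ms_to_clock(34_200_000), "-", ms_to_clock(57_600_000))
for name in files_needed:
    print(name)
```

Comparing this list against the contents of the data directory after upload is a quick way to spot a missing or misnamed file before starting a long simulation.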


## 2.1. Upload INTC Data (if selected)

In [None]:
from google.colab import files
import os

if not RUN_INTC:
    print("⏭️  INTC not selected, skipping...")
else:
    data_dir = "data/INTC/INTC_2015-01-01_2015-03-31_10"
    os.makedirs(data_dir, exist_ok=True)

    print("Upload INTC files (4 files total):")
    print("  - INTC_2015-01-29_34200000_57600000_message_10.csv")
    print("  - INTC_2015-01-29_34200000_57600000_orderbook_10.csv")
    print("  - INTC_2015-01-30_34200000_57600000_message_10.csv")
    print("  - INTC_2015-01-30_34200000_57600000_orderbook_10.csv")

    uploaded = files.upload()

    for filename in uploaded.keys():
        !mv "{filename}" "{data_dir}/"
        print(f"✓ {filename}")

    print(f"\n✅ INTC data ready in {data_dir}")
    !ls -lh {data_dir}

## 2.2. Upload TSLA Data (if selected)

In [None]:
if not RUN_TSLA:
    print("⏭️  TSLA not selected, skipping...")
else:
    data_dir = "data/TSLA/TSLA_2015-01-01_2015-03-31_10"
    os.makedirs(data_dir, exist_ok=True)

    print("Upload TSLA files (4 files total):")
    print("  - TSLA_2015-01-29_34200000_57600000_message_10.csv")
    print("  - TSLA_2015-01-29_34200000_57600000_orderbook_10.csv")
    print("  - TSLA_2015-01-30_34200000_57600000_message_10.csv")
    print("  - TSLA_2015-01-30_34200000_57600000_orderbook_10.csv")

    uploaded = files.upload()

    for filename in uploaded.keys():
        !mv "{filename}" "{data_dir}/"
        print(f"✓ {filename}")

    print(f"\n✅ TSLA data ready in {data_dir}")
    !ls -lh {data_dir}

## 2.3. Clean Data Files

LOBSTER files sometimes end with a trailing "null" column that causes parsing errors. This cell drops the last column of a file when more than 90% of its values are null.
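
To illustrate the check, the sketch below applies the same idea to a small in-memory table. The sample rows are made up for illustration, and `drop_trailing_null_column` is a hypothetical helper name rather than a function from the repository.

```python
import io

import pandas as pd


def drop_trailing_null_column(df, threshold=0.9):
    """Drop the last column when more than `threshold` of its values are missing."""
    null_ratio = df.iloc[:, -1].isna().mean()
    if null_ratio > threshold:
        return df.iloc[:, :-1]
    return df


# Two made-up rows whose final field is the literal string "null"
raw = "34200.1,1,100,50,275000,1,null\n34200.2,3,101,25,275100,-1,null\n"
df = pd.read_csv(io.StringIO(raw), header=None, na_values=["null", "NULL", "Null"])

cleaned = drop_trailing_null_column(df)
print(df.shape, "->", cleaned.shape)
```

Files whose last column holds real values fall below the threshold and are returned unchanged.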

In [None]:
import pandas as pd
import glob

def clean_lobster_data(stock):
    data_dir = f"data/{stock}/{stock}_2015-01-01_2015-03-31_10"
    csv_files = glob.glob(f"{data_dir}/*.csv")

    if not csv_files:
        print(f"⚠️  No files found for {stock}")
        return

    print(f"Cleaning {stock} data ({len(csv_files)} files)...")
    for csv_file in csv_files:
        df = pd.read_csv(csv_file, header=None, na_values=['null', 'NULL', 'Null'])
        null_ratio = df.iloc[:, -1].isnull().sum() / len(df)

        if null_ratio > 0.9:
            print(f"  ✓ {os.path.basename(csv_file)}: removed null column")
            df = df.iloc[:, :-1]
            df.to_csv(csv_file, header=False, index=False)
        else:
            print(f"  - {os.path.basename(csv_file)}: OK")

    print(f"✅ {stock} data cleaned\n")

if RUN_INTC:
    clean_lobster_data('INTC')
if RUN_TSLA:
    clean_lobster_data('TSLA')

print("All data ready for simulation!")

---
# Section 3: Simulations for 2015-01-29

We'll run 4 types of simulations (2 required, 2 optional):
1. **TRADES** (required) - Diffusion-based generation
2. **Market Replay** (required) - Real historical data
3. **IABS** (optional) - Agent-based simulation
4. **CGAN** (optional) - GAN-based generation
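
The helper in the next cell omits `-d` for Market Replay because, according to its comment, the flag is parsed with `type=bool`. The sketch below reproduces that general `argparse` pitfall in isolation. The parser here is a stand-in built for illustration and is not the actual ABIDES configuration.

```python
import argparse

parser = argparse.ArgumentParser()
# Stand-in for a flag declared with type=bool (an assumption that mirrors the helper's comment)
parser.add_argument("-d", "--diffusion", type=bool, default=False)

with_false = parser.parse_args(["-d", "False"])
omitted = parser.parse_args([])

# argparse calls bool("False"), and any non-empty string is truthy
print("Passing '-d False':", with_false.diffusion)
# Leaving the flag out keeps the default value
print("Omitting -d:", omitted.diffusion)
```

This is why leaving the flag out is the only reliable way to keep such an option disabled.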


In [None]:
import glob
import os
import time

def get_latest_log(pattern):
    """Get the most recent log directory matching pattern"""
    logs = sorted(glob.glob(pattern), key=os.path.getmtime, reverse=True)
    return logs[0] if logs else None

def run_simulation(stock, date, method, start_time, end_time):
    """
    Run a simulation and return the log directory.

    Args:
        stock: Stock ticker (e.g., 'INTC', 'TSLA')
        date: Date in format 'YYYY-MM-DD' (e.g., '2015-01-29')
        method: One of 'TRADES', 'Replay', 'IABS', 'CGAN'
        start_time: Start time (e.g., '10:00:00')
        end_time: End time (e.g., '12:00:00')

    Returns:
        Path to log directory containing processed_orders.csv
    """
    date_str = date.replace('-', '')
    print(f"\n{'='*60}")
    print(f"Running {method} for {stock} on {date}")
    print(f"Time window: {start_time} - {end_time}")
    print(f"{'='*60}\n")

    start = time.time()

    if method == "TRADES":
        # TRADES: diffusion ON + TRADES model
        # Log dir: world_agent_{stock}_{date}_{time}_{seed}_DDIM_...
        cmd = f"python -u ABIDES/abides.py -c world_agent_sim -t {stock} -date {date} -d True -m TRADES -st '{start_time}' -et '{end_time}'"
        print(f"Command: {cmd}\n")
        !{cmd}
        log_dir = get_latest_log(f"ABIDES/log/world_agent_{stock}_*TRADES*") or get_latest_log(f"ABIDES/log/world_agent_{stock}_*")

    elif method == "Replay":
        # Market Replay: diffusion OFF (omit -d flag!)
        # Log dir: market_replay_{stock}_{date}_{time}_{seed}
        # IMPORTANT: Do NOT use -d False (type=bool bug makes it True!)
        cmd = f"python -u ABIDES/abides.py -c world_agent_sim -t {stock} -date {date} -st '{start_time}' -et '{end_time}'"
        print(f"Command: {cmd}\n")
        !{cmd}
        log_dir = get_latest_log(f"ABIDES/log/market_replay_{stock}_*")

    elif method == "IABS":
        # IABS: agent-based simulation (rmsc03 config)
        # Log dir: IABS_{stock}_{date}_{time}
        # Note: -d is short for --historical-date in rmsc03 (different from world_agent_sim!)
        cmd = f"python -u ABIDES/abides.py -c rmsc03 -t {stock} -d {date} --start-time '{start_time}' --end-time '{end_time}'"
        print(f"Command: {cmd}\n")
        !{cmd}
        log_dir = get_latest_log(f"ABIDES/log/IABS_{stock}_*")

    elif method == "CGAN":
        # CGAN: diffusion ON + CGAN model
        # Log dir: world_agent_{stock}_{date}_{time}_{seed}_DDIM_..._CGAN_...
        cmd = f"python -u ABIDES/abides.py -c world_agent_sim -t {stock} -date {date} -d True -m CGAN -st '{start_time}' -et '{end_time}'"
        print(f"Command: {cmd}\n")
        !{cmd}
        log_dir = get_latest_log(f"ABIDES/log/world_agent_{stock}_*CGAN*") or get_latest_log(f"ABIDES/log/world_agent_{stock}_*")

    else:
        raise ValueError(f"Unknown method: {method}")

    elapsed = time.time() - start
    print(f"\n{'='*60}")
    print(f"✅ {method} simulation completed in {elapsed/60:.1f} minutes")
    print(f"📁 Log directory: {log_dir}")
    print(f"{'='*60}\n")

    if not log_dir or not os.path.exists(f"{log_dir}/processed_orders.csv"):
        print(f"⚠️  WARNING: processed_orders.csv not found in {log_dir}")
        print(f"   Check logs for errors.")

    return log_dir

print("✅ Simulation helper functions defined (with verified commands)")
print("\n📝 Command patterns:")
print("  TRADES: -c world_agent_sim -d True -m TRADES")
print("  Replay: -c world_agent_sim (omit -d flag)")
print("  IABS:   -c rmsc03 -d <date>")
print("  CGAN:   -c world_agent_sim -d True -m CGAN")


## 3.1. INTC Simulations - Day 1/29

In [None]:
if not RUN_INTC:
    print("⏭️  Skipping INTC")
    INTC_DAY1_RESULTS = None
else:
    INTC_DAY1_RESULTS = {}

    # TRADES (required)
    log_dir = run_simulation('INTC', '2015-01-29', 'TRADES', START_TIME, END_TIME)
    INTC_DAY1_RESULTS['TRADES'] = log_dir
    print(f"✅ TRADES: {log_dir}")

    # Market Replay (required)
    log_dir = run_simulation('INTC', '2015-01-29', 'Replay', START_TIME, END_TIME)
    INTC_DAY1_RESULTS['Replay'] = log_dir
    print(f"✅ Replay: {log_dir}")

    # IABS (optional); kept inside the else branch so it never runs
    # against INTC_DAY1_RESULTS = None when INTC is not selected
    if RUN_IABS:
        log_dir = run_simulation('INTC', '2015-01-29', 'IABS', START_TIME, END_TIME)
        INTC_DAY1_RESULTS['IABS'] = log_dir
        print(f"✅ IABS: {log_dir}")
    else:
        print("⏭️  IABS skipped (not enabled)")

    # CGAN (optional)
    if RUN_CGAN:
        log_dir = run_simulation('INTC', '2015-01-29', 'CGAN', START_TIME, END_TIME)
        INTC_DAY1_RESULTS['CGAN'] = log_dir
        print(f"✅ CGAN: {log_dir}")
    else:
        print("⏭️  CGAN skipped (not enabled)")

    print(f"\n{'='*60}")
    print("INTC Day 1/29 Complete!")
    print(f"{'='*60}")


## 3.2. TSLA Simulations - Day 1/29 (Optional)

In [None]:
if not RUN_TSLA:
    print("⏭️  Skipping TSLA")
    TSLA_DAY1_RESULTS = None
else:
    TSLA_DAY1_RESULTS = {}

    log_dir = run_simulation('TSLA', '2015-01-29', 'TRADES', START_TIME, END_TIME)
    TSLA_DAY1_RESULTS['TRADES'] = log_dir

    log_dir = run_simulation('TSLA', '2015-01-29', 'Replay', START_TIME, END_TIME)
    TSLA_DAY1_RESULTS['Replay'] = log_dir

    # Optional baselines run only when TSLA itself is selected
    if RUN_IABS:
        log_dir = run_simulation('TSLA', '2015-01-29', 'IABS', START_TIME, END_TIME)
        TSLA_DAY1_RESULTS['IABS'] = log_dir

    if RUN_CGAN:
        log_dir = run_simulation('TSLA', '2015-01-29', 'CGAN', START_TIME, END_TIME)
        TSLA_DAY1_RESULTS['CGAN'] = log_dir

    print("\nTSLA Day 1/29 Complete!")


---
# Section 4: Simulations for 2015-01-30

Same process for the second day.

## 4.1. INTC Simulations - Day 1/30

In [None]:
if not RUN_INTC:
    print("⏭️  Skipping INTC")
    INTC_DAY2_RESULTS = None
else:
    INTC_DAY2_RESULTS = {}

    log_dir = run_simulation('INTC', '2015-01-30', 'TRADES', START_TIME, END_TIME)
    INTC_DAY2_RESULTS['TRADES'] = log_dir

    log_dir = run_simulation('INTC', '2015-01-30', 'Replay', START_TIME, END_TIME)
    INTC_DAY2_RESULTS['Replay'] = log_dir

    # Optional baselines run only when INTC itself is selected
    if RUN_IABS:
        log_dir = run_simulation('INTC', '2015-01-30', 'IABS', START_TIME, END_TIME)
        INTC_DAY2_RESULTS['IABS'] = log_dir

    if RUN_CGAN:
        log_dir = run_simulation('INTC', '2015-01-30', 'CGAN', START_TIME, END_TIME)
        INTC_DAY2_RESULTS['CGAN'] = log_dir

    print("INTC Day 1/30 Complete!")


## 4.2. TSLA Simulations - Day 1/30 (Optional)

In [None]:
if not RUN_TSLA:
    print("⏭️  Skipping TSLA")
    TSLA_DAY2_RESULTS = None
else:
    TSLA_DAY2_RESULTS = {}

    log_dir = run_simulation('TSLA', '2015-01-30', 'TRADES', START_TIME, END_TIME)
    TSLA_DAY2_RESULTS['TRADES'] = log_dir

    log_dir = run_simulation('TSLA', '2015-01-30', 'Replay', START_TIME, END_TIME)
    TSLA_DAY2_RESULTS['Replay'] = log_dir

    # Optional baselines run only when TSLA itself is selected
    if RUN_IABS:
        log_dir = run_simulation('TSLA', '2015-01-30', 'IABS', START_TIME, END_TIME)
        TSLA_DAY2_RESULTS['IABS'] = log_dir

    if RUN_CGAN:
        log_dir = run_simulation('TSLA', '2015-01-30', 'CGAN', START_TIME, END_TIME)
        TSLA_DAY2_RESULTS['CGAN'] = log_dir

    print("TSLA Day 1/30 Complete!")


---
# Section 5: Evaluation

Calculate predictive scores for each day, then average them.
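
Each per-day score is read from the text that the evaluation script prints. A minimal sketch of that capture-and-parse step follows, assuming the script prints a line of the form `Test MSE: <value>`. Here `fake_evaluation` is a hypothetical stand-in for `predictive_lstm.main`, and the number it prints is a placeholder, not a result.

```python
import contextlib
import io
import re


def fake_evaluation():
    """Hypothetical stand-in for the real evaluation entry point."""
    print("Training finished")
    print("Test MSE: 0.25")  # placeholder value for illustration


def capture_score(fn):
    """Run fn, capture what it prints, and return the parsed score or None."""
    buffer = io.StringIO()
    # redirect_stdout restores sys.stdout even if fn raises
    with contextlib.redirect_stdout(buffer):
        fn()
    match = re.search(r"Test MSE:\s*([-+0-9.eE]+)", buffer.getvalue())
    return float(match.group(1)) if match else None


score = capture_score(fake_evaluation)
print("Parsed score:", score)
```

Returning `None` when no matching line is found lets the averaging step skip a failed day instead of crashing.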

## 5.1. Compute Predictive Scores

In [None]:
import sys
sys.path.append('evaluation/quantitative_eval')
import predictive_lstm

def compute_predictive_score(real_path, generated_path, label=""):
    """Compute predictive score for a single day"""
    print(f"\n{'='*60}")
    print(f"Computing Predictive Score: {label}")
    print(f"{'='*60}")
    print(f"Real data: {real_path}")
    print(f"Generated: {generated_path}")

    # Capture stdout so the MSE can be parsed from the script's output;
    # redirect_stdout restores sys.stdout even if main() raises
    from contextlib import redirect_stdout
    from io import StringIO
    captured = StringIO()
    with redirect_stdout(captured):
        predictive_lstm.main(real_path, generated_path)
    output = captured.getvalue()

    # Extract MSE from output
    for line in output.split('\n'):
        if 'Test MSE:' in line:
            mse = float(line.split(':')[1].strip())
            print(f"  → MSE: {mse:.4f}")
            return mse

    print("  ⚠️  Could not extract MSE!")
    return None

# Storage for all scores
ALL_SCORES = {}

print("\n" + "="*80)
print("PREDICTIVE SCORE EVALUATION")
print("="*80)


## 5.2. INTC Predictive Scores

In [None]:
if not RUN_INTC:
    print("⏭️  INTC not run")
else:
    ALL_SCORES['INTC'] = {'day1': {}, 'day2': {}}

    # Day 1
    if INTC_DAY1_RESULTS:
        real_path = f"{INTC_DAY1_RESULTS['Replay']}/processed_orders.csv"
        trades_path = f"{INTC_DAY1_RESULTS['TRADES']}/processed_orders.csv"
        mse = compute_predictive_score(real_path, trades_path, "INTC Day 1/29 - TRADES")
        ALL_SCORES['INTC']['day1']['TRADES'] = mse

        if RUN_CGAN and 'CGAN' in INTC_DAY1_RESULTS:
            cgan_path = f"{INTC_DAY1_RESULTS['CGAN']}/processed_orders.csv"
            mse = compute_predictive_score(real_path, cgan_path, "INTC Day 1/29 - CGAN")
            ALL_SCORES['INTC']['day1']['CGAN'] = mse

        # Market Replay score (baseline)
        mse = compute_predictive_score(real_path, real_path, "INTC Day 1/29 - Market Replay")
        ALL_SCORES['INTC']['day1']['Replay'] = mse

    # Day 2
    if INTC_DAY2_RESULTS:
        real_path = f"{INTC_DAY2_RESULTS['Replay']}/processed_orders.csv"
        trades_path = f"{INTC_DAY2_RESULTS['TRADES']}/processed_orders.csv"
        mse = compute_predictive_score(real_path, trades_path, "INTC Day 1/30 - TRADES")
        ALL_SCORES['INTC']['day2']['TRADES'] = mse

        if RUN_CGAN and 'CGAN' in INTC_DAY2_RESULTS:
            cgan_path = f"{INTC_DAY2_RESULTS['CGAN']}/processed_orders.csv"
            mse = compute_predictive_score(real_path, cgan_path, "INTC Day 1/30 - CGAN")
            ALL_SCORES['INTC']['day2']['CGAN'] = mse

        mse = compute_predictive_score(real_path, real_path, "INTC Day 1/30 - Market Replay")
        ALL_SCORES['INTC']['day2']['Replay'] = mse

    print("\nINTC scores collected!")


## 5.3. TSLA Predictive Scores (Optional)

In [None]:
if not RUN_TSLA:
    print("⏭️  TSLA not run")
else:
    ALL_SCORES['TSLA'] = {'day1': {}, 'day2': {}}

    # Day 1
    if TSLA_DAY1_RESULTS:
        real_path = f"{TSLA_DAY1_RESULTS['Replay']}/processed_orders.csv"
        trades_path = f"{TSLA_DAY1_RESULTS['TRADES']}/processed_orders.csv"
        mse = compute_predictive_score(real_path, trades_path, "TSLA Day 1/29 - TRADES")
        ALL_SCORES['TSLA']['day1']['TRADES'] = mse

        if RUN_CGAN and 'CGAN' in TSLA_DAY1_RESULTS:
            cgan_path = f"{TSLA_DAY1_RESULTS['CGAN']}/processed_orders.csv"
            mse = compute_predictive_score(real_path, cgan_path, "TSLA Day 1/29 - CGAN")
            ALL_SCORES['TSLA']['day1']['CGAN'] = mse

        mse = compute_predictive_score(real_path, real_path, "TSLA Day 1/29 - Market Replay")
        ALL_SCORES['TSLA']['day1']['Replay'] = mse

    # Day 2
    if TSLA_DAY2_RESULTS:
        real_path = f"{TSLA_DAY2_RESULTS['Replay']}/processed_orders.csv"
        trades_path = f"{TSLA_DAY2_RESULTS['TRADES']}/processed_orders.csv"
        mse = compute_predictive_score(real_path, trades_path, "TSLA Day 1/30 - TRADES")
        ALL_SCORES['TSLA']['day2']['TRADES'] = mse

        if RUN_CGAN and 'CGAN' in TSLA_DAY2_RESULTS:
            cgan_path = f"{TSLA_DAY2_RESULTS['CGAN']}/processed_orders.csv"
            mse = compute_predictive_score(real_path, cgan_path, "TSLA Day 1/30 - CGAN")
            ALL_SCORES['TSLA']['day2']['CGAN'] = mse

        mse = compute_predictive_score(real_path, real_path, "TSLA Day 1/30 - Market Replay")
        ALL_SCORES['TSLA']['day2']['Replay'] = mse

    print("\nTSLA scores collected!")


---
# Section 6: Final Results

Compute averaged scores and compare to paper.
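
The averaging and comparison step can be sketched as follows. The per-day scores are placeholder numbers chosen only to show the arithmetic, not real results. The reference value is the TRADES/Intel entry from the Expected Results table at the top of this notebook.

```python
def average_score(day1, day2):
    """Average two per-day scores; return None if either day is missing."""
    if day1 is None or day2 is None:
        return None
    return (day1 + day2) / 2


PAPER_TRADES_INTC = 0.307  # from the Expected Results table

# Placeholder per-day scores for illustration only
day_scores = {"day1": 0.30, "day2": 0.34}

avg = average_score(day_scores["day1"], day_scores["day2"])
gap = avg - PAPER_TRADES_INTC
print(f"Average: {avg:.3f} (paper: {PAPER_TRADES_INTC:.3f}, difference: {gap:+.3f})")
```

A method is reported only when both days produced a score, which matches the `is not None` check in the cell below.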

In [None]:
import pandas as pd

print("\n" + "="*80)
print("FINAL RESULTS: Predictive Score (MAE) Averaged Over 2 Days")
print("="*80)

# Compute averages
results_table = []

for stock in ALL_SCORES:
    for method in ['Replay', 'TRADES', 'CGAN']:
        day1_score = ALL_SCORES[stock]['day1'].get(method)
        day2_score = ALL_SCORES[stock]['day2'].get(method)

        if day1_score is not None and day2_score is not None:
            avg_score = (day1_score + day2_score) / 2
            results_table.append({
                'Stock': stock,
                'Method': method,
                'Day 1/29': f"{day1_score:.3f}",
                'Day 1/30': f"{day2_score:.3f}",
                'Average': f"{avg_score:.3f}"
            })

df_results = pd.DataFrame(results_table)

print("\nYour Results:")
print(df_results.to_string(index=False))

print("\n" + "="*80)
print("Paper Results (Table 1):")
print("="*80)
paper_results = """
Method         Tesla    Intel
Market Replay  0.923    0.149
IABS           1.870    1.866
CGAN           3.453    0.699
TRADES         1.213    0.307
"""
print(paper_results)

print("="*80)
print("\nInterpretation:")
print("  - Lower is better (closer to Market Replay)")
print("  - TRADES should significantly outperform CGAN")
print("  - Your results may vary due to different time windows or random seeds")
print("="*80)


## 6.1. Optional: Full main.py Evaluation

Run this if you want all the paper figures (PCA, stylized facts, etc.)

⚠️  This only works if you ran IABS and CGAN!

In [None]:
RUN_FULL_EVAL = False  # Set to True to run full evaluation

if RUN_FULL_EVAL and (not RUN_IABS or not RUN_CGAN):
    print("⚠️  Cannot run full evaluation without IABS and CGAN!")
    print("   Set RUN_IABS=True and RUN_CGAN=True in configuration")
elif RUN_FULL_EVAL:
    # Run main.py evaluation for day 1
    print("Running full main.py evaluation...")

    import sys
    sys.path.insert(0, 'evaluation/visualizations')
    import main

    if RUN_INTC and INTC_DAY1_RESULTS:
        real_path = f"{INTC_DAY1_RESULTS['Replay']}/processed_orders.csv"
        trades_path = f"{INTC_DAY1_RESULTS['TRADES']}/processed_orders.csv"
        iabs_path = f"{INTC_DAY1_RESULTS['IABS']}/processed_orders.csv" if 'IABS' in INTC_DAY1_RESULTS else trades_path
        cgan_path = f"{INTC_DAY1_RESULTS['CGAN']}/processed_orders.csv" if 'CGAN' in INTC_DAY1_RESULTS else trades_path

        main.plot_graphs(real_path, trades_path, iabs_path, cgan_path)
        print("\n✅ Plots saved to evaluation/visualizations/")

    print("\nGenerated plots:")
    !ls -lh evaluation/visualizations/*.pdf
else:
    print("⏭️  Full evaluation skipped (set RUN_FULL_EVAL=True to enable)")


## 6.2. Download Results

In [None]:
from google.colab import files

# Zip all logs
!zip -r all_simulation_results.zip ABIDES/log/ evaluation/visualizations/ 2>/dev/null

import os
file_size = os.path.getsize('all_simulation_results.zip') / (1024 * 1024)
print(f"\n📦 Results packaged: {file_size:.1f} MB")
print("\nDownloading...")

files.download('all_simulation_results.zip')

print("✅ Download complete!")
print("\nContents:")
print("  - All simulation logs from ABIDES/log/ (processed_orders.csv + plots)")
print("  - Evaluation plots from evaluation/visualizations/ (if generated)")


---
# Summary

🎉 **Reproduction Complete!**

You have successfully reproduced the TRADES paper results.

## What was reproduced:
- ✅ Predictive scores averaged over 2 days
- ✅ TRADES vs Market Replay comparison
- ✅ Optional: IABS and CGAN baselines
- ✅ Optional: PCA, stylized facts, and other figures

## Next steps:
1. Compare your results to Table 1 in the paper
2. Examine the generated plots
3. Try different time windows or stocks
4. Modify the configuration to explore parameter sensitivity

## Citations:
If you use this reproduction in your work, please cite:
- TRADES paper: [citation]
- D-MEADS repository: https://github.com/FinancialComputingUCL/D-MEADS
