# DQN Market Competition Simulation

This notebook implements the complete experimental design as specified in README.md.

## Experiment Overview

1. **Experiment 1 (Basic)**: 2-firm symmetric parameters with multiple learning rates
2. **Experiment 2 (Extended)**: 3-firm symmetric parameters
3. **Experiment 3 (Extended)**: 4-firm symmetric parameters
4. **Experiment 4 (Extended)**: 3-firm asymmetric parameters
5. **Experiment 5 (Extended)**: 4-firm asymmetric parameters
6. **Comparative Analysis**: Cross-scenario comparisons

## Key Metrics

- **RPDI (Relative Price Deviation Index)**: Measures observed prices relative to the Nash and monopoly benchmark prices
- **Δ (Profit Metric)**: Measures observed profits relative to the Nash and monopoly benchmark profits
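
As a rough illustration of how such indices behave, the sketch below assumes both metrics use the common normalization in which 0 corresponds to the Nash benchmark and 1 to the monopoly benchmark. This definition is an assumption and is not taken from README.md. The helper name `normalized_index` is hypothetical, and the observed averages are invented for the example. The benchmark numbers are the Experiment 1 values printed by the configuration cell.

```python
def normalized_index(value: float, nash: float, monopoly: float) -> float:
    """Position of `value` between the Nash (0) and monopoly (1) benchmarks.

    Assumed normalization; the project's own RPDI and Delta may be defined differently.
    """
    return (value - nash) / (monopoly - nash)


# Experiment 1 benchmarks (2-firm symmetric case)
NASH_PRICE, MONOPOLY_PRICE = 1.677, 2.071
NASH_PROFIT, MONOPOLY_PROFIT = 0.277, 0.335

# Hypothetical observed averages, chosen only for illustration
avg_price, avg_profit = 1.874, 0.306

rpdi = normalized_index(avg_price, NASH_PRICE, MONOPOLY_PRICE)
delta = normalized_index(avg_profit, NASH_PROFIT, MONOPOLY_PROFIT)
print(f"RPDI = {rpdi:.3f}, Delta = {delta:.3f}")
```

Under this normalization, values near 0 indicate competitive outcomes, values near 1 indicate monopoly-like outcomes, and values outside [0, 1] are possible when prices or profits fall below Nash or exceed monopoly levels.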

## Setup and Imports

In [1]:
# Import required libraries
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from typing import Dict, List
import os
import warnings
warnings.filterwarnings('ignore')

# Import custom modules
from experiment_config import (
    EXPERIMENT_2FIRM_SYMMETRIC,
    EXPERIMENT_3FIRM_SYMMETRIC,
    EXPERIMENT_4FIRM_SYMMETRIC,
    EXPERIMENT_3FIRM_ASYMMETRIC,
    EXPERIMENT_4FIRM_ASYMMETRIC,
    DQN_HYPERPARAMS,
    TRAINING_CONFIG
)
from market_simulation import MarketSimulation

# Set plotting style
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

print("✅ Setup complete!")
print(f"\nExperiment scenarios to run:")
print("1. Experiment 1: 2-Firm Symmetric (Basic)")
print("2. Experiment 2: 3-Firm Symmetric (Extended)")
print("3. Experiment 3: 4-Firm Symmetric (Extended)")
print("4. Experiment 4: 3-Firm Asymmetric (Extended)")
print("5. Experiment 5: 4-Firm Asymmetric (Extended)")

Using MPS (Metal Performance Shaders) for GPU acceleration
✅ Setup complete!

Experiment scenarios to run:
1. Experiment 1: 2-Firm Symmetric (Basic)
2. Experiment 2: 3-Firm Symmetric (Extended)
3. Experiment 3: 4-Firm Symmetric (Extended)
4. Experiment 4: 3-Firm Asymmetric (Extended)
5. Experiment 5: 4-Firm Asymmetric (Extended)


## Experiment 1: 2-Firm Symmetric (Basic)

This is the baseline experiment with 2 symmetric firms competing in a Logit Bertrand market.
We test multiple learning rates (0.01, 0.05, 0.1) and select the most collusive one for extended experiments.
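
For orientation, a common logit demand specification can be sketched as follows. This is an assumption about the market model, not the code in `market_simulation.py`: the function names `logit_shares` and `profits` are hypothetical, and the outside-good quality `a0 = 0` is assumed. With the Experiment 1 parameters (cost 1.0, quality 2.0, μ = 0.4), evaluating this form at the benchmark prices gives profits that agree with the printed Nash and monopoly profits to three decimals.

```python
import numpy as np


def logit_shares(prices, qualities, mu, a0=0.0):
    """Logit market shares with an outside option of quality a0 (assumed to be 0)."""
    prices = np.asarray(prices, dtype=float)
    qualities = np.asarray(qualities, dtype=float)
    utilities = np.exp((qualities - prices) / mu)
    return utilities / (utilities.sum() + np.exp(a0 / mu))


def profits(prices, costs, qualities, mu, a0=0.0):
    """Per-firm profit: price-cost margin times logit share."""
    prices = np.asarray(prices, dtype=float)
    costs = np.asarray(costs, dtype=float)
    return (prices - costs) * logit_shares(prices, qualities, mu, a0)


costs, qualities, mu = [1.0, 1.0], [2.0, 2.0], 0.4

# Evaluate profits at the benchmark prices printed for Experiment 1
nash_profit = profits([1.677, 1.677], costs, qualities, mu)
monopoly_profit = profits([2.071, 2.071], costs, qualities, mu)
print(np.round(nash_profit, 3), np.round(monopoly_profit, 3))
```

The gap between the two profit levels is what the learning agents can capture by sustaining prices above the Nash level.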

In [2]:
# Display experiment configuration
print("="*80)
print("EXPERIMENT 1: 2-FIRM SYMMETRIC CONFIGURATION")
print("="*80)

exp1_config = EXPERIMENT_2FIRM_SYMMETRIC

print(f"\n📊 Market Structure:")
print(f"  • Number of firms: {exp1_config['market_structure']['n_firms']}")
print(f"  • Action space size: {exp1_config['market_structure']['n_actions']} discrete prices")

print(f"\n🏭 Firm Parameters:")
print(f"  • Marginal costs: {exp1_config['firm_parameters']['marginal_costs']}")
print(f"  • Product qualities: {exp1_config['firm_parameters']['product_qualities']}")
print(f"  • Substitutability (μ): {exp1_config['market_parameters']['substitutability']}")

print(f"\n🎯 Benchmark Values:")
print(f"  • Nash price: {exp1_config['benchmarks']['nash_price']:.3f}")
print(f"  • Monopoly price: {exp1_config['benchmarks']['monopoly_price']:.3f}")
print(f"  • Nash profit: {exp1_config['benchmarks']['nash_profit']:.3f}")
print(f"  • Monopoly profit: {exp1_config['benchmarks']['monopoly_profit']:.3f}")

print(f"\n📈 Price Range:")
print(f"  • Min: {exp1_config['price_range']['min_price']:.3f}")
print(f"  • Max: {exp1_config['price_range']['max_price']:.3f}")

EXPERIMENT 1: 2-FIRM SYMMETRIC CONFIGURATION

📊 Market Structure:
  • Number of firms: 2
  • Action space size: 15 discrete prices

🏭 Firm Parameters:
  • Marginal costs: [1.0, 1.0]
  • Product qualities: [2.0, 2.0]
  • Substitutability (μ): 0.4

🎯 Benchmark Values:
  • Nash price: 1.677
  • Monopoly price: 2.071
  • Nash profit: 0.277
  • Monopoly profit: 0.335

📈 Price Range:
  • Min: 1.637
  • Max: 2.110


In [3]:
# Run Experiment 1: 2-Firm Symmetric with multiple learning rates
print("\n🚀 Starting Experiment 1: 2-Firm Symmetric Simulation...\n")

# Test different learning rates per README specification
learning_rates = [0.01, 0.05, 0.1]
exp1_results = {}

for lr in learning_rates:
    print(f"\n{'='*70}")
    print(f"Testing Learning Rate: {lr}")
    print(f"{'='*70}")

    # Create simulation
    sim = MarketSimulation(
        experiment_config=exp1_config,
        learning_rate=lr,
        # Separate directory per learning rate so runs do not overwrite each other
        save_dir=f"results/exp1_2firm_symmetric/lr_{lr}",
        verbose=True
    )

    # Train agents (uses experiment config: 2000 episodes)
    sim.train()

    # Evaluate performance (100 episodes, last 10,000 timesteps)
    eval_results = sim.evaluate()

    # Store results
    exp1_results[lr] = {
        'simulation': sim,
        'evaluation': eval_results,
        'rpdi': eval_results['overall']['rpdi'],
        'delta': eval_results['overall']['delta'],
        'interpretation': eval_results['interpretation']
    }

    # Save results
    sim.save_results()


🚀 Starting Experiment 1: 2-Firm Symmetric Simulation...


Testing Learning Rate: 0.01
Initialized simulation: 2-Firm Symmetric
Number of firms: 2
Learning rate: 0.01
Substitutability: 0.4

Starting training for 2000 episodes...


KeyboardInterrupt: 

In [None]:
# Analyze Experiment 1 Results
print("\n" + "="*80)
print("EXPERIMENT 1 RESULTS SUMMARY")
print("="*80)

# Create comparison table
comparison_data = []
for lr, results in exp1_results.items():
    comparison_data.append({
        'Learning Rate': lr,
        'RPDI': f"{results['rpdi']:.4f}",
        'Delta': f"{results['delta']:.4f}",
        'Avg Price': f"{results['evaluation']['overall']['avg_price']:.3f}",
        'Avg Profit': f"{results['evaluation']['overall']['avg_profit']:.3f}",
        'Behavior': results['interpretation'].split(':')[0].strip()
    })

df_exp1 = pd.DataFrame(comparison_data)
print("\n" + df_exp1.to_string(index=False))

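# Find most collusive learning rate (highest RPDI)
if not exp1_results:
    raise RuntimeError("No Experiment 1 results found; run the training cell to completion first.")
best_lr = max(exp1_results, key=lambda lr: exp1_results[lr]['rpdi'])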
print(f"\n🏆 Most collusive learning rate selected: {best_lr}")
print(f"   RPDI: {exp1_results[best_lr]['rpdi']:.4f}")
print(f"   Delta: {exp1_results[best_lr]['delta']:.4f}")

## Experiment 2: 3-Firm Symmetric (Extended)

Testing market dynamics with 3 firms while maintaining symmetric parameters.
Uses the most collusive learning rate from Experiment 1 and trains for 1000 episodes.

In [None]:
# Experiment 2: 3-Firm Symmetric
print("="*80)
print("EXPERIMENT 2: 3-FIRM SYMMETRIC")
print("="*80)

exp2_config = EXPERIMENT_3FIRM_SYMMETRIC

# Display configuration
print(f"\n📊 Market Structure:")
print(f"  • Number of firms: {exp2_config['market_structure']['n_firms']}")
print(f"  • Nash price: {exp2_config['benchmarks']['nash_price']:.3f}")
print(f"  • Monopoly price: {exp2_config['benchmarks']['monopoly_price']:.3f}")

print(f"\nRunning 3-Firm symmetric simulation with learning rate: {best_lr}")
print(f"Training episodes: {exp2_config['training']['episodes']}")

sim_3firm = MarketSimulation(
    experiment_config=exp2_config,
    learning_rate=best_lr,
    save_dir="results/exp2_3firm_symmetric",
    verbose=True
)

# Train and evaluate (uses experiment config: 1000 episodes)
sim_3firm.train()
eval_3firm = sim_3firm.evaluate()
sim_3firm.save_results()

print(f"\n📊 Experiment 2 Results:")
print(f"  • RPDI: {eval_3firm['overall']['rpdi']:.4f}")
print(f"  • Delta: {eval_3firm['overall']['delta']:.4f}")
print(f"  • Interpretation: {eval_3firm['interpretation']}")

## Experiment 3: 4-Firm Symmetric (Extended)

Testing market dynamics with 4 firms while maintaining symmetric parameters.
Uses the same learning rate as Experiment 2 (the most collusive one from Experiment 1) and trains for 1000 episodes.

In [None]:
# Experiment 3: 4-Firm Symmetric
print("="*80)
print("EXPERIMENT 3: 4-FIRM SYMMETRIC")
print("="*80)

exp3_config = EXPERIMENT_4FIRM_SYMMETRIC

# Display configuration
print(f"\n📊 Market Structure:")
print(f"  • Number of firms: {exp3_config['market_structure']['n_firms']}")
print(f"  • Nash price: {exp3_config['benchmarks']['nash_price']:.3f}")
print(f"  • Monopoly price: {exp3_config['benchmarks']['monopoly_price']:.3f}")

print(f"\nRunning 4-Firm symmetric simulation with learning rate: {best_lr}")
print(f"Training episodes: {exp3_config['training']['episodes']}")

sim_4firm = MarketSimulation(
    experiment_config=exp3_config,
    learning_rate=best_lr,
    save_dir="results/exp3_4firm_symmetric",
    verbose=True
)

# Train and evaluate (uses experiment config: 1000 episodes)
sim_4firm.train()
eval_4firm = sim_4firm.evaluate()
sim_4firm.save_results()

print(f"\n📊 Experiment 3 Results:")
print(f"  • RPDI: {eval_4firm['overall']['rpdi']:.4f}")
print(f"  • Delta: {eval_4firm['overall']['delta']:.4f}")
print(f"  • Interpretation: {eval_4firm['interpretation']}")

## Experiment 4: 3-Firm Asymmetric (Extended)

Testing market dynamics with 3 firms having different marginal costs and product qualities.
This tests whether algorithmic collusion can emerge even with heterogeneous firms.
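
With heterogeneous firms, the benchmark prices differ by firm. One way the Nash benchmark could be computed is by iterating on the logit first-order condition p_i = c_i + μ / (1 − s_i), where s_i is firm i's market share. The sketch below assumes logit demand with an outside good of quality 0 and uses a damped update for robustness. `nash_prices` is a hypothetical helper, not the project's API, and the asymmetric parameters are invented for illustration, not taken from `EXPERIMENT_3FIRM_ASYMMETRIC`.

```python
import numpy as np


def nash_prices(costs, qualities, mu, a0=0.0, tol=1e-10, max_iter=10_000):
    """Damped fixed-point iteration on p_i = c_i + mu / (1 - s_i) under assumed logit demand."""
    costs = np.asarray(costs, dtype=float)
    qualities = np.asarray(qualities, dtype=float)
    prices = costs + mu  # starting guess: cost plus the smallest possible logit markup
    for _ in range(max_iter):
        utilities = np.exp((qualities - prices) / mu)
        shares = utilities / (utilities.sum() + np.exp(a0 / mu))
        target = costs + mu / (1.0 - shares)  # best-response prices implied by current shares
        if np.max(np.abs(target - prices)) < tol:
            return target
        prices = 0.5 * prices + 0.5 * target  # damping to avoid oscillation
    raise RuntimeError("Nash price iteration did not converge")


# Sanity check against the symmetric Experiment 1 benchmark (Nash price 1.677)
print(np.round(nash_prices([1.0, 1.0], [2.0, 2.0], mu=0.4), 3))

# Hypothetical asymmetric 3-firm parameters, for illustration only
print(np.round(nash_prices([1.0, 1.0, 1.2], [2.0, 2.2, 2.0], mu=0.4), 3))
```

In the symmetric two-firm case the iteration should recover the Experiment 1 Nash price, which gives some confidence that the assumed demand form matches the configured benchmarks.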

In [None]:
# Experiment 4: 3-Firm Asymmetric
print("="*80)
print("EXPERIMENT 4: 3-FIRM ASYMMETRIC")
print("="*80)

exp4_config = EXPERIMENT_3FIRM_ASYMMETRIC

print(f"\n🏭 Asymmetric Firm Parameters:")
print(f"  • Marginal costs: {exp4_config['firm_parameters']['marginal_costs']}")
print(f"  • Product qualities: {exp4_config['firm_parameters']['product_qualities']}")
print(f"  • Substitutability (μ): {exp4_config['market_parameters']['substitutability']}")

print(f"\nRunning 3-Firm asymmetric simulation with learning rate: {best_lr}")
print(f"Training episodes: {exp4_config['training']['episodes']}")

sim_3firm_asym = MarketSimulation(
    experiment_config=exp4_config,
    learning_rate=best_lr,
    save_dir="results/exp4_3firm_asymmetric",
    verbose=True
)

# Train and evaluate (uses experiment config: 1000 episodes)
sim_3firm_asym.train()
eval_3firm_asym = sim_3firm_asym.evaluate()
sim_3firm_asym.save_results()

print(f"\n📊 Experiment 4 Results - Individual Firms:")
for firm_result in eval_3firm_asym['individual_firms']:
    print(f"\nFirm {firm_result['firm_id']}:")
    print(f"  • RPDI: {firm_result['rpdi']:.4f}")
    print(f"  • Delta: {firm_result['delta']:.4f}")
    print(f"  • Avg Price: {firm_result['avg_price']:.3f}")
    print(f"  • Market Share: {firm_result['avg_share']:.3f}")

print(f"\n📈 Overall Market:")
print(f"  • RPDI: {eval_3firm_asym['overall']['rpdi']:.4f}")
print(f"  • Delta: {eval_3firm_asym['overall']['delta']:.4f}")
print(f"  • Interpretation: {eval_3firm_asym['interpretation']}")

## Experiment 5: 4-Firm Asymmetric (Extended)

Testing market dynamics with 4 firms having different marginal costs and product qualities.
This is the most complex scenario with the highest number of heterogeneous competitors.

In [None]:
# Experiment 5: 4-Firm Asymmetric
print("="*80)
print("EXPERIMENT 5: 4-FIRM ASYMMETRIC")
print("="*80)

exp5_config = EXPERIMENT_4FIRM_ASYMMETRIC

print(f"\n🏭 Asymmetric Firm Parameters:")
print(f"  • Marginal costs: {exp5_config['firm_parameters']['marginal_costs']}")
print(f"  • Product qualities: {exp5_config['firm_parameters']['product_qualities']}")
print(f"  • Substitutability (μ): {exp5_config['market_parameters']['substitutability']}")

print(f"\nRunning 4-Firm asymmetric simulation with learning rate: {best_lr}")
print(f"Training episodes: {exp5_config['training']['episodes']}")

sim_4firm_asym = MarketSimulation(
    experiment_config=exp5_config,
    learning_rate=best_lr,
    save_dir="results/exp5_4firm_asymmetric",
    verbose=True
)

# Train and evaluate (uses experiment config: 1000 episodes)
sim_4firm_asym.train()
eval_4firm_asym = sim_4firm_asym.evaluate()
sim_4firm_asym.save_results()

print(f"\n📊 Experiment 5 Results - Individual Firms:")
for firm_result in eval_4firm_asym['individual_firms']:
    print(f"\nFirm {firm_result['firm_id']}:")
    print(f"  • RPDI: {firm_result['rpdi']:.4f}")
    print(f"  • Delta: {firm_result['delta']:.4f}")
    print(f"  • Avg Price: {firm_result['avg_price']:.3f}")
    print(f"  • Market Share: {firm_result['avg_share']:.3f}")

print(f"\n📈 Overall Market:")
print(f"  • RPDI: {eval_4firm_asym['overall']['rpdi']:.4f}")
print(f"  • Delta: {eval_4firm_asym['overall']['delta']:.4f}")
print(f"  • Interpretation: {eval_4firm_asym['interpretation']}")

## 6. Comparative Analysis

This section compares results across all five experimental scenarios to show how the number of firms and firm heterogeneity affect algorithmic collusion.

In [None]:
# Compile all results for comparison
print("="*80)
print("COMPARATIVE ANALYSIS: ALL 5 SCENARIOS")
print("="*80)

# Create comparison dataframe with all 5 experiments
all_results = [
    {
        'Scenario': 'Exp1: 2-Firm Symmetric',
        'RPDI': exp1_results[best_lr]['rpdi'],
        'Delta': exp1_results[best_lr]['delta'],
        'Avg Price': exp1_results[best_lr]['evaluation']['overall']['avg_price'],
        'Avg Profit': exp1_results[best_lr]['evaluation']['overall']['avg_profit'],
        'Behavior': exp1_results[best_lr]['interpretation'].split(':')[0].strip()
    },
    {
        'Scenario': 'Exp2: 3-Firm Symmetric',
        'RPDI': eval_3firm['overall']['rpdi'],
        'Delta': eval_3firm['overall']['delta'],
        'Avg Price': eval_3firm['overall']['avg_price'],
        'Avg Profit': eval_3firm['overall']['avg_profit'],
        'Behavior': eval_3firm['interpretation'].split(':')[0].strip()
    },
    {
        'Scenario': 'Exp3: 4-Firm Symmetric',
        'RPDI': eval_4firm['overall']['rpdi'],
        'Delta': eval_4firm['overall']['delta'],
        'Avg Price': eval_4firm['overall']['avg_price'],
        'Avg Profit': eval_4firm['overall']['avg_profit'],
        'Behavior': eval_4firm['interpretation'].split(':')[0].strip()
    },
    {
        'Scenario': 'Exp4: 3-Firm Asymmetric',
        'RPDI': eval_3firm_asym['overall']['rpdi'],
        'Delta': eval_3firm_asym['overall']['delta'],
        'Avg Price': eval_3firm_asym['overall']['avg_price'],
        'Avg Profit': eval_3firm_asym['overall']['avg_profit'],
        'Behavior': eval_3firm_asym['interpretation'].split(':')[0].strip()
    },
    {
        'Scenario': 'Exp5: 4-Firm Asymmetric',
        'RPDI': eval_4firm_asym['overall']['rpdi'],
        'Delta': eval_4firm_asym['overall']['delta'],
        'Avg Price': eval_4firm_asym['overall']['avg_price'],
        'Avg Profit': eval_4firm_asym['overall']['avg_profit'],
        'Behavior': eval_4firm_asym['interpretation'].split(':')[0].strip()
    }
]

df_comparison = pd.DataFrame(all_results)
df_comparison['RPDI'] = df_comparison['RPDI'].round(4)
df_comparison['Delta'] = df_comparison['Delta'].round(4)
df_comparison['Avg Price'] = df_comparison['Avg Price'].round(3)
df_comparison['Avg Profit'] = df_comparison['Avg Profit'].round(3)

print("\n" + df_comparison.to_string(index=False))

# Save summary to CSV
os.makedirs('results', exist_ok=True)
df_comparison.to_csv('results/summary_all_experiments.csv', index=False)
print("\n✅ Summary saved to: results/summary_all_experiments.csv")

In [None]:
# Create comprehensive comparative visualizations
fig, axes = plt.subplots(2, 2, figsize=(16, 12))

# Extract data for plotting
scenarios = df_comparison['Scenario'].values
rpdi_values = df_comparison['RPDI'].values
delta_values = df_comparison['Delta'].values
prices = df_comparison['Avg Price'].values
profits = df_comparison['Avg Profit'].values

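# Color-code metric values using the behavior thresholds drawn on the plots
def get_color(value):
    """Return green below 0.3 (competitive), orange below 0.7 (intermediate), red otherwise (collusive)."""
    if value < 0.3:
        return 'green'
    elif value < 0.7:
        return 'orange'
    else:
        return 'red'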

# Plot 1: RPDI Comparison
ax1 = axes[0, 0]
colors = [get_color(r) for r in rpdi_values]
bars1 = ax1.bar(range(len(scenarios)), rpdi_values, color=colors, alpha=0.7)
ax1.axhline(y=0.3, color='green', linestyle='--', alpha=0.5, label='Competitive threshold')
ax1.axhline(y=0.7, color='red', linestyle='--', alpha=0.5, label='Collusive threshold')
ax1.set_xticks(range(len(scenarios)))
ax1.set_xticklabels(scenarios, rotation=45, ha='right')
ax1.set_ylabel('RPDI', fontsize=12, fontweight='bold')
ax1.set_title('Relative Price Deviation Index Across Experiments', fontsize=14, fontweight='bold')
ax1.set_ylim([0, 1])
ax1.legend()
ax1.grid(True, alpha=0.3, axis='y')

# Plot 2: Delta Comparison
ax2 = axes[0, 1]
colors = [get_color(d) for d in delta_values]
bars2 = ax2.bar(range(len(scenarios)), delta_values, color=colors, alpha=0.7)
ax2.axhline(y=0.3, color='green', linestyle='--', alpha=0.5, label='Competitive threshold')
ax2.axhline(y=0.7, color='red', linestyle='--', alpha=0.5, label='Collusive threshold')
ax2.set_xticks(range(len(scenarios)))
ax2.set_xticklabels(scenarios, rotation=45, ha='right')
ax2.set_ylabel('Delta', fontsize=12, fontweight='bold')
ax2.set_title('Profit Metric Across Experiments', fontsize=14, fontweight='bold')
ax2.set_ylim([0, 1])
ax2.legend()
ax2.grid(True, alpha=0.3, axis='y')

# Plot 3: RPDI vs Delta Scatter
ax3 = axes[1, 0]
scatter_colors = [get_color(max(rpdi_values[i], delta_values[i])) for i in range(len(scenarios))]
ax3.scatter(rpdi_values, delta_values, s=300, alpha=0.7, c=scatter_colors)
for i, (rpdi, delta) in enumerate(zip(rpdi_values, delta_values)):
    ax3.annotate(f'E{i+1}', (rpdi, delta),
                 ha='center', va='center', fontsize=10, fontweight='bold')

# Add threshold regions
ax3.axvline(x=0.3, color='gray', linestyle='--', alpha=0.3)
ax3.axvline(x=0.7, color='gray', linestyle='--', alpha=0.3)
ax3.axhline(y=0.3, color='gray', linestyle='--', alpha=0.3)
ax3.axhline(y=0.7, color='gray', linestyle='--', alpha=0.3)

ax3.fill_between([0, 0.3], 0, 0.3, color='green', alpha=0.1, label='Competitive')
ax3.fill_between([0.7, 1], 0.7, 1, color='red', alpha=0.1, label='Collusive')

ax3.set_xlabel('RPDI', fontsize=12, fontweight='bold')
ax3.set_ylabel('Delta', fontsize=12, fontweight='bold')
ax3.set_title('Market Behavior Classification', fontsize=14, fontweight='bold')
ax3.set_xlim([0, 1])
ax3.set_ylim([0, 1])
ax3.legend()
ax3.grid(True, alpha=0.3)

# Plot 4: Grouped bar chart comparing metrics
ax4 = axes[1, 1]
x = np.arange(len(scenarios))
width = 0.35

bars_rpdi = ax4.bar(x - width/2, rpdi_values, width, label='RPDI', alpha=0.8)
bars_delta = ax4.bar(x + width/2, delta_values, width, label='Delta', alpha=0.8)

ax4.set_xlabel('Experiments', fontsize=12, fontweight='bold')
ax4.set_ylabel('Metric Value', fontsize=12, fontweight='bold')
ax4.set_title('RPDI vs Delta Comparison', fontsize=14, fontweight='bold')
ax4.set_xticks(x)
ax4.set_xticklabels([f'E{i+1}' for i in range(len(scenarios))])
ax4.legend()
ax4.grid(True, alpha=0.3, axis='y')
ax4.set_ylim([0, 1])

plt.suptitle('DQN Market Competition: Comprehensive Comparative Analysis', 
             fontsize=16, fontweight='bold', y=0.995)
plt.tight_layout()
plt.savefig('results/comparative_analysis.png', dpi=150, bbox_inches='tight')
plt.show()

print("\n📊 Comparative visualization saved to: results/comparative_analysis.png")