- As in Project 2, I implemented the functions in .py files and plot the results in this .ipynb file.
- I consider the repeated first-price auction (hereinafter "repeated FPA"), not the second-price auction.
- Players in the repeated FPA must use an algorithm whose regret converges to 0.
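
A common reading of this requirement is that the time-averaged regret against the best fixed bid in hindsight goes to 0. The sketch below shows one way such a regret could be computed for a single player. The helper names (`fpa_utility`, `hindsight_regret`), the bid grid, and the tie rule (ties split evenly) are assumptions for illustration, not the functions in the project's .py files.

```python
import numpy as np

def fpa_utility(value, my_bid, opp_bid):
    """First-price utility: the winner pays their own bid; ties split evenly (assumed)."""
    if my_bid > opp_bid:
        return value - my_bid
    if my_bid == opp_bid:
        return 0.5 * (value - my_bid)
    return 0.0

def hindsight_regret(value, my_bids, opp_bids, grid):
    """Total utility of the best fixed grid bid minus the realized total utility."""
    realized = sum(fpa_utility(value, b, o) for b, o in zip(my_bids, opp_bids))
    best_fixed = max(sum(fpa_utility(value, g, o) for o in opp_bids) for g in grid)
    return best_fixed - realized

grid = np.linspace(0.0, 1.0, 11)   # hypothetical grid of 11 bids in [0, 1]
my_bids = [0.5, 0.5, 0.5]          # the player overbids relative to the opponent
opp_bids = [0.25, 0.25, 0.25]
regret = hindsight_regret(1.0, my_bids, opp_bids, grid)
print("cumulative regret:", regret, "per round:", regret / len(my_bids))
```

Dividing the cumulative regret by the number of rounds gives the per-round quantity that a no-regret algorithm should drive toward 0.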

# Project 3: Repeated First Price Auction

## Information Setting

- Full Information: Each player knows their own value (v) which is fixed across all rounds
- Full Feedback: After each round, each player observes all bids from all players, not just their own outcome

This setting allows players to use opponent's bid history directly in their learning algorithms.

## Parameters

- n_rounds: 1000 - number of rounds per simulation
- k: 100 - number of discrete arms (discretization level)
- n_mc: 100 - number of Monte Carlo simulation runs
- h: scaling parameter (default: the player's value) - used in the Exponential Weight algorithms
- value (v): 10.0 - player's value for the item (default)
- learning_rate: sqrt(log(k) / n_rounds) - learning rate for the Exponential Weight algorithms (default for the flexible algorithm)
  - Optimal learning rate: epsilon = sqrt(log(k) / T), where T = n_rounds
  - Note: cumulative_payoffs are normalized by h, so epsilon does not need a factor of h
- observation_rounds: 5 - number of observation rounds for the exploitation algorithm (default)
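
The default learning rate and the effect of normalizing by h can be illustrated with a short sketch. The weight form `(1 + learning_rate) ** (cumulative_payoffs / h)` is the expression that appears in the warning output later in this notebook; the function names `default_learning_rate` and `ew_probabilities` are hypothetical and may differ from the actual `2_ew` module.

```python
import numpy as np

def default_learning_rate(k, n_rounds):
    """epsilon = sqrt(log(k) / T), with T = n_rounds."""
    return np.sqrt(np.log(k) / n_rounds)

def ew_probabilities(cumulative_payoffs, learning_rate, h):
    """Probabilities proportional to (1 + epsilon) ** (cumulative payoff / h)."""
    payoffs = np.asarray(cumulative_payoffs, dtype=float)
    weights = (1.0 + learning_rate) ** (payoffs / h)
    return weights / weights.sum()

eps = default_learning_rate(k=100, n_rounds=10000)
payoffs = np.array([0.0, 1.0, 2.0])

# Scaling the payoffs and h by the same factor leaves the distribution unchanged,
# which is why epsilon itself does not need to carry a factor of h.
p_unit = ew_probabilities(payoffs, eps, h=1.0)
p_scaled = ew_probabilities(10.0 * payoffs, eps, h=10.0)
print(eps, p_unit, np.allclose(p_unit, p_scaled))
```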

## Algorithms

1. 1_empirical: Empirical algorithm - always maximizes the current round's expected utility based on the opponent's past bids.
2. 2_ew: Flexible algorithm - Exponential Weight with a configurable learning_rate
3. 3_exploitation: Exploitation algorithm - waits and exploits when the opponent bids low
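
As an illustration of the first algorithm, the sketch below picks the grid bid with the highest expected utility against the empirical distribution of the opponent's past bids. It is a minimal sketch under assumptions: the function name `empirical_best_bid`, the evenly spaced grid on [0, value], the even tie split, and the fallback for an empty history are all hypothetical and may not match `1_empirical`.

```python
import numpy as np

def empirical_best_bid(value, opp_history, k):
    """Best response to the empirical distribution of past opponent bids."""
    bids = np.linspace(0.0, value, k)      # assumed grid of k bids in [0, value]
    if len(opp_history) == 0:
        return bids[0]                     # assumed fallback before any data exists
    opp = np.asarray(opp_history, dtype=float)
    # Empirical win probability of each grid bid; ties count as half a win (assumed).
    wins = (bids[:, None] > opp[None, :]).mean(axis=1)
    ties = (bids[:, None] == opp[None, :]).mean(axis=1)
    expected_utility = (value - bids) * (wins + 0.5 * ties)
    return bids[np.argmax(expected_utility)]

# The opponent mostly bids low, so shading the bid to just above 0.25 pays best.
best = empirical_best_bid(value=1.0, opp_history=[0.25, 0.25, 0.65], k=11)
print(best)
```

In this example a bid near 0.3 wins two of three past rounds with a margin of about 0.7, which beats bidding near 0.7 to win every round with a margin of about 0.3.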

In [37]:
import sys, importlib
from pathlib import Path
sys.path.insert(0, str(Path('algorithm').resolve()))

empirical, ew, exploitation = [importlib.import_module(m) for m in ['1_empirical', '2_ew', '3_exploitation']]

import repeated_FPA
from repeated_FPA import run_repeated_fpa, plot_results

# Part 1
- We simulate the game between players who use the algorithms above.

In [13]:
# Parameters
n_rounds = 10000
k = 100
n_mc = 10

In [15]:
# Example 1: Empirical vs EW
# Note: ew.flexible_algorithm defaults to the optimal learning rate
# epsilon = sqrt(log(k) / T), with T = n_rounds

v1, v2 = 1.0, 1.0
player1 = (empirical.empirical_algorithm, v1, {'k': k, 'h': v1})
player2 = (ew.flexible_algorithm, v2, {'k': k, 'h': v2, 'learning_rate': None})  # None = default (sqrt(log(k) / n)), or specify a value like 0.1
results_empirical_vs_ew = run_repeated_fpa(player1, player2, n_rounds, n_mc, k=k)
print("Completed: Empirical vs EW")
plot_results(results_empirical_vs_ew, title="Empirical vs EW")

MC iteration 10/10 completed
Completed: Empirical vs EW
Plot 1 saved to: ../figures/empirical_vs_ew_bid_evolution.png
Plot 2 saved to: ../figures/empirical_vs_ew_regret.png
Plot 4 saved to: ../figures/empirical_vs_ew_utility_distribution.png
Plot 5 saved to: ../figures/empirical_vs_ew_win_rate_distribution.png

=== Summary Statistics ===
Player 1:
  Mean Regret: 3.70 ± 1.21
  Mean Utility: 346.00 ± 21.20
  Mean Win Rate: 0.513 ± 0.011

Player 2:
  Mean Regret: 84.32 ± 5.10
  Mean Utility: 285.56 ± 28.83
  Mean Win Rate: 0.487 ± 0.011
Summary statistics saved to: ../data/empirical_vs_ew_summary.csv
Detailed results saved to: ../data/empirical_vs_ew_detailed.csv
Regret history saved to: ../data/empirical_vs_ew_regret_history.csv
Bid history saved to: ../data/empirical_vs_ew_bid_history.csv


In [16]:
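# Player 1 uses a very large learning rate (100), so EW concentrates on the
# arm with the highest cumulative payoff and approximates Follow-the-Leader (FTL)
v1, v2 = 1.0, 1.0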
player1 = (ew.flexible_algorithm, v1, {'k': k, 'h': v1, 'learning_rate': 100})
player2 = (ew.flexible_algorithm, v2, {'k': k, 'h': v2, 'learning_rate': None})
results_FTL_vs_ew = run_repeated_fpa(player1, player2, n_rounds, n_mc, k=k)
print("Completed: FTL vs EW")
plot_results(results_FTL_vs_ew, title="FTL vs EW")

  powers = (1 + learning_rate) ** (cumulative_payoffs / h)
  return ufunc.reduce(obj, axis, dtype, out, **passkwargs)


MC iteration 10/10 completed
Completed: FTL vs EW
Plot 1 saved to: ../figures/ftl_vs_ew_bid_evolution.png
Plot 2 saved to: ../figures/ftl_vs_ew_regret.png
Plot 4 saved to: ../figures/ftl_vs_ew_utility_distribution.png
Plot 5 saved to: ../figures/ftl_vs_ew_win_rate_distribution.png

=== Summary Statistics ===
Player 1:
  Mean Regret: 1428.12 ± 70.39
  Mean Utility: 985.70 ± 50.44
  Mean Win Rate: 0.483 ± 0.004

Player 2:
  Mean Regret: 31.82 ± 9.86
  Mean Utility: 1755.43 ± 81.80
  Mean Win Rate: 0.517 ± 0.004
Summary statistics saved to: ../data/ftl_vs_ew_summary.csv
Detailed results saved to: ../data/ftl_vs_ew_detailed.csv
Regret history saved to: ../data/ftl_vs_ew_regret_history.csv
Bid history saved to: ../data/ftl_vs_ew_bid_history.csv
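
The two warning lines in the output above point at `(1 + learning_rate) ** (cumulative_payoffs / h)`. With `learning_rate=100`, that power can exceed the float64 range once the normalized cumulative payoff gets large, which is a likely cause of the warning. One common remedy is to normalize the weights in log space. The sketch below assumes the same weight form; `ew_probabilities_stable` is a hypothetical name, not a function from the `2_ew` module.

```python
import numpy as np

def ew_probabilities_stable(cumulative_payoffs, learning_rate, h):
    """Same distribution as (1 + eps) ** (payoff / h), computed in log space."""
    payoffs = np.asarray(cumulative_payoffs, dtype=float)
    log_w = (payoffs / h) * np.log1p(learning_rate)
    log_w -= log_w.max()      # the largest weight becomes exp(0) = 1, so nothing overflows
    w = np.exp(log_w)
    return w / w.sum()

# 101 ** 1000 is far beyond the float64 maximum, but the log-space version stays finite.
big = ew_probabilities_stable([0.0, 500.0, 1000.0], learning_rate=100, h=1.0)

# For small inputs the result matches the direct formula.
small = ew_probabilities_stable([0.0, 1.0, 2.0], learning_rate=0.5, h=1.0)
direct = 1.5 ** np.array([0.0, 1.0, 2.0])
print(big, np.allclose(small, direct / direct.sum()))
```

Subtracting the maximum log-weight does not change the normalized probabilities, because the common factor cancels in the division by the sum.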


In [17]:
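# Player 1 uses a small fixed learning rate (0.01, about half the default of
# sqrt(log(100) / 10000) ≈ 0.021) as a stand-in for uniform guessing
v1, v2 = 1.0, 1.0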
player1 = (ew.flexible_algorithm, v1, {'k': k, 'h': v1, 'learning_rate': 0.01})
player2 = (ew.flexible_algorithm, v2, {'k': k, 'h': v2, 'learning_rate': None})
results_uniform_vs_ew = run_repeated_fpa(player1, player2, n_rounds, n_mc, k=k)
print("Completed: Uniform guessing vs EW")
plot_results(results_uniform_vs_ew, title="Uniform guessing vs EW")

MC iteration 10/10 completed
Completed: Uniform guessing vs EW
Plot 1 saved to: ../figures/uniform_guessing_vs_ew_bid_evolution.png
Plot 2 saved to: ../figures/uniform_guessing_vs_ew_regret.png
Plot 4 saved to: ../figures/uniform_guessing_vs_ew_utility_distribution.png
Plot 5 saved to: ../figures/uniform_guessing_vs_ew_win_rate_distribution.png

=== Summary Statistics ===
Player 1:
  Mean Regret: 324.31 ± 6.51
  Mean Utility: 804.58 ± 13.32
  Mean Win Rate: 0.485 ± 0.002

Player 2:
  Mean Regret: 87.56 ± 6.09
  Mean Utility: 978.65 ± 20.52
  Mean Win Rate: 0.515 ± 0.002
Summary statistics saved to: ../data/uniform_guessing_vs_ew_summary.csv
Detailed results saved to: ../data/uniform_guessing_vs_ew_detailed.csv
Regret history saved to: ../data/uniform_guessing_vs_ew_regret_history.csv
Bid history saved to: ../data/uniform_guessing_vs_ew_bid_history.csv


In [18]:
# Optimal learning rate vs 3x optimal learning rate
# Optimal learning rate: epsilon = sqrt(log(k) / n_rounds)
import numpy as np

v1, v2 = 1.0, 1.0
optimal_lr = np.sqrt(np.log(k) / n_rounds)
lr_3x = 3 * optimal_lr

player1 = (ew.flexible_algorithm, v1, {'k': k, 'h': v1, 'learning_rate': None})
player2 = (ew.flexible_algorithm, v2, {'k': k, 'h': v2, 'learning_rate': lr_3x})

results_lr_comparison = run_repeated_fpa(player1, player2, n_rounds, n_mc, k=k)
print("Completed: Optimal LR vs 3x Optimal LR")
plot_results(results_lr_comparison, title="Optimal LR vs 3x Optimal LR")

MC iteration 10/10 completed
Completed: Optimal LR vs 3x Optimal LR
Plot 1 saved to: ../figures/optimal_lr_vs_3x_optimal_lr_bid_evolution.png
Plot 2 saved to: ../figures/optimal_lr_vs_3x_optimal_lr_regret.png
Plot 4 saved to: ../figures/optimal_lr_vs_3x_optimal_lr_utility_distribution.png
Plot 5 saved to: ../figures/optimal_lr_vs_3x_optimal_lr_win_rate_distribution.png

=== Summary Statistics ===
Player 1:
  Mean Regret: 77.08 ± 6.83
  Mean Utility: 528.00 ± 12.30
  Mean Win Rate: 0.506 ± 0.004

Player 2:
  Mean Regret: 66.87 ± 5.07
  Mean Utility: 521.09 ± 14.62
  Mean Win Rate: 0.494 ± 0.004
Summary statistics saved to: ../data/optimal_lr_vs_3x_optimal_lr_summary.csv
Detailed results saved to: ../data/optimal_lr_vs_3x_optimal_lr_detailed.csv
Regret history saved to: ../data/optimal_lr_vs_3x_optimal_lr_regret_history.csv
Bid history saved to: ../data/optimal_lr_vs_3x_optimal_lr_bid_history.csv


# Part 1 - 2
What if the players have different values?

In [19]:
v1, v2 = 0.9, 0.3
player1 = (ew.flexible_algorithm, v1, {'k': k, 'h': v1, 'learning_rate': None})
player2 = (ew.flexible_algorithm, v2, {'k': k, 'h': v2, 'learning_rate': None}) 
results_ew_vs_ew_with_different_values = run_repeated_fpa(player1, player2, n_rounds, n_mc, k=k)
print("Completed: EW vs EW")
plot_results(results_ew_vs_ew_with_different_values, title="EW vs EW (with different values)")

MC iteration 10/10 completed
Completed: EW vs EW
Plot 1 saved to: ../figures/ew_vs_ew_with_different_values_bid_evolution.png
Plot 2 saved to: ../figures/ew_vs_ew_with_different_values_regret.png
Plot 4 saved to: ../figures/ew_vs_ew_with_different_values_utility_distribution.png
Plot 5 saved to: ../figures/ew_vs_ew_with_different_values_win_rate_distribution.png

=== Summary Statistics ===
Player 1:
  Mean Regret: 63.91 ± 3.94
  Mean Utility: 5908.73 ± 5.38
  Mean Win Rate: 0.991 ± 0.001

Player 2:
  Mean Regret: 5.32 ± 0.93
  Mean Utility: 0.82 ± 0.22
  Mean Win Rate: 0.009 ± 0.001
Summary statistics saved to: ../data/ew_vs_ew_with_different_values_summary.csv
Detailed results saved to: ../data/ew_vs_ew_with_different_values_detailed.csv
Regret history saved to: ../data/ew_vs_ew_with_different_values_regret_history.csv
Bid history saved to: ../data/ew_vs_ew_with_different_values_bid_history.csv


# Part 1 - 3
What if the values are drawn from distributions?

In [38]:
from value_generate import generate_value

In [39]:
# Parameters for uniform distribution experiment
n_rounds = 10000
k = 100
n_mc = 10

# Use lambda functions to generate values for each MC run
# Each MC run will get a new value from uniform distribution
player1 = (ew.flexible_algorithm, lambda: generate_value('uniform', low=0.0, high=1.0), {'k': k, 'learning_rate': None}) 
player2 = (ew.flexible_algorithm, lambda: generate_value('uniform', low=0.0, high=1.0), {'k': k, 'learning_rate': None}) 
results_ew_vs_ew_from_uniform_distribution = run_repeated_fpa(player1, player2, n_rounds, n_mc, k=k)
print("Completed: EW vs EW from uniform distribution")
plot_results(results_ew_vs_ew_from_uniform_distribution, title="EW vs EW from uniform distribution")

TypeError: '>' not supported between instances of 'function' and 'function'
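
The `TypeError` suggests that the two `value` entries are compared with `>` while they are still lambda functions, although the source of `run_repeated_fpa` is not shown here, so this is a guess. One possible workaround is to turn each sampler into a plain number before any comparison takes place. The sketch below is only an illustration: `resolve_value` is a hypothetical helper, and a NumPy generator stands in for `generate_value('uniform', low=0.0, high=1.0)` so that the block runs on its own.

```python
import numpy as np

rng = np.random.default_rng(0)

def resolve_value(value):
    """Return a float, calling `value` first if it is a zero-argument sampler."""
    return float(value()) if callable(value) else float(value)

def sampler():
    # Stand-in for generate_value('uniform', low=0.0, high=1.0) (assumption).
    return rng.uniform(0.0, 1.0)

# Draw a fresh pair of values for each Monte Carlo run, then compare plain floats.
for run in range(3):
    v1, v2 = resolve_value(sampler), resolve_value(sampler)
    print(run, round(v1, 3), round(v2, 3), v1 > v2)
```

If this diagnosis is right, the same resolution step would also be needed wherever `h` defaults to the player's value, since a function cannot be used as a scaling factor either.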

# Part 2
- Exploitation strategy vs 1_empirical (Empirical strategy)
- Exploitation strategy vs 2_ew (Exponential Weight algorithm)

In [31]:
# Part 2 Implementation

# Parameters
n_rounds = 10000
k = 10
n_mc = 10

In [32]:
v1, v2 = 0.9, 0.3
player1 = (ew.flexible_algorithm, v1, {'k': k, 'h': v1, 'learning_rate': None})  
player2 = (exploitation.exploitation_algorithm, v2, {'k': k, 'h': v2, 'observation_rounds': 5})
results_ew_vs_exploitation = run_repeated_fpa(player1, player2, n_rounds, n_mc, k=k)
print("Completed: EW vs Exploitation")
plot_results(results_ew_vs_exploitation, title="EW vs Exploitation")

KeyboardInterrupt: 

In [29]:
# Parameters
n_rounds = 1000
k = 10
n_mc = 1

In [30]:
v1, v2 = 0.9, 0.3
# Use default learning rate (sqrt(log(k) / n)) - no need to specify explicitly
player1 = (ew.flexible_algorithm, v1, {'k': k, 'h': v1, 'learning_rate': None})  
player2 = (exploitation.exploitation_algorithm, v2, {'k': k, 'h': v2, 'observation_rounds': 5})
results_ew_vs_exploitation_for_detail = run_repeated_fpa(player1, player2, n_rounds, n_mc, k=k)
print("Completed: EW vs Exploitation")
plot_results(results_ew_vs_exploitation_for_detail, title="EW vs Exploitation (for detail)")

Completed: EW vs Exploitation
Plot 1 saved to: ../figures/ew_vs_exploitation_for_detail_bid_evolution.png
Plot 2 saved to: ../figures/ew_vs_exploitation_for_detail_regret.png
Plot 4 saved to: ../figures/ew_vs_exploitation_for_detail_utility_distribution.png
Plot 5 saved to: ../figures/ew_vs_exploitation_for_detail_win_rate_distribution.png

=== Summary Statistics ===
Player 1:
  Mean Regret: 14.60 ± 0.00
  Mean Utility: 632.55 ± 0.00
  Mean Win Rate: 0.882 ± 0.000

Player 2:
  Mean Regret: 51.18 ± 0.00
  Mean Utility: 13.32 ± 0.00
  Mean Win Rate: 0.118 ± 0.000
Summary statistics saved to: ../data/ew_vs_exploitation_for_detail_summary.csv
Detailed results saved to: ../data/ew_vs_exploitation_for_detail_detailed.csv
Regret history saved to: ../data/ew_vs_exploitation_for_detail_regret_history.csv
Bid history saved to: ../data/ew_vs_exploitation_for_detail_bid_history.csv
