# Problem Set 2: Two-player Iterated Prisoner's Dilemma
Minho Kang (239742)

## Introduction
In this problem set, I simulate a two-player Iterated Prisoner's Dilemma (IPD) game using several strategies. The objective is to explore how different strategies interact in the prisoner's dilemma, focusing on payoffs, cooperation, and defection patterns. I test the following strategies:

1. Tit-for-Tat
2. Grim Trigger
3. Always Cooperate
4. Always Defect
5. Probabilistic Strategy
6. Intermediate Punishment Strategy

### The Payoff Matrix
The payoff matrix used for the Iterated Prisoner's Dilemma is as follows. Rows are Player 1's move, columns are Player 2's move, and each cell lists Player 1's payoff first, then Player 2's:

|                     | **Cooperate (C)** | **Defect (D)** |
|---------------------|-------------------|----------------|
| **Cooperate (C)**   | 6, 6              | 2, 10          |
| **Defect (D)**      | 10, 2             | 4, 4           |
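
As a quick sanity check on the matrix above, the payoffs can be tested against the two textbook Prisoner's Dilemma conditions. This is a minimal sketch; the labels `T`, `R`, `P`, `S` are the conventional names and are not used elsewhere in this notebook.

```python
# Payoffs read off the matrix above, using the conventional PD labels.
T = 10  # temptation: defect while the opponent cooperates
R = 6   # reward: mutual cooperation
P = 4   # punishment: mutual defection
S = 2   # sucker's payoff: cooperate while the opponent defects

# Core dilemma ordering: defecting is individually tempting,
# yet mutual cooperation beats mutual defection.
is_dilemma = T > R > P > S

# The stricter condition 2R > T + S says that taking turns exploiting
# each other pays less than steady mutual cooperation.
alternation_gap = 2 * R - (T + S)

print(is_dilemma)       # True
print(alternation_gap)  # 0
```

With these numbers the second condition holds only with equality: two players who alternate between (C, D) and (D, C) average 6 per round each, the same as under mutual cooperation.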

## Objective
The primary objective of this problem set is to implement and simulate a series of strategies in an iterated version of the Prisoner's Dilemma game and analyze their payoffs and behavior when played against each other.

## Methodology
### Strategies Implemented
1. **Tit-for-Tat**: This strategy cooperates on the first move and then mimics the opponent's previous move.
2. **Grim Trigger**: This strategy cooperates until the opponent defects, after which it defects permanently.
3. **Always Cooperate**: This strategy always cooperates regardless of the opponent's moves.
4. **Always Defect**: This strategy always defects regardless of the opponent's moves.
5. **Probabilistic Strategy**: This strategy cooperates with a given probability `p` and defects with the complementary probability.
6. **Intermediate Punishment**: This strategy cooperates until the opponent defects. If the opponent defects, it will not cooperate for the next `k` rounds but will return to cooperation afterward.
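
The six descriptions above can be sketched as small Python functions. This is an illustration only, not the code in `ps2.strategies`: the `sketch_` names, the `"C"`/`"D"` move encoding, and the convention that each function receives the list of the opponent's past moves are all assumptions. The Intermediate Punishment sketch uses one possible reading of the rule, namely defecting whenever the opponent defected in any of the last `k` rounds.

```python
import random


def sketch_tit_for_tat(opponent_history):
    # Cooperate first, then copy the opponent's previous move.
    return opponent_history[-1] if opponent_history else "C"


def sketch_grim_trigger(opponent_history):
    # Defect forever once the opponent has defected at least once.
    return "D" if "D" in opponent_history else "C"


def sketch_always_cooperate(opponent_history):
    return "C"


def sketch_always_defect(opponent_history):
    return "D"


def sketch_probabilistic(opponent_history, p=0.5, rng=random):
    # Cooperate with probability p, independent of the history.
    return "C" if rng.random() < p else "D"


def sketch_intermediate_punishment(opponent_history, k=3):
    # Assumed reading (requires k >= 1): defect if the opponent
    # defected in any of the last k rounds, otherwise cooperate.
    return "D" if "D" in opponent_history[-k:] else "C"


print(sketch_tit_for_tat(["C", "D"]))                        # D
print(sketch_intermediate_punishment(["D", "C", "C", "C"]))  # C
```

Under this reading, a single defection is forgiven after `k` clean rounds, whereas Grim Trigger never forgives.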

### Simulation Setup
Each game was simulated for 10 rounds. Every strategy played once against every other strategy, with no self-play. For each game, the cumulative payoffs were calculated and both players' moves were recorded for visualization.
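
The pairing scheme described above (every strategy against every other, no self-play, no repeated pairs) is what `itertools.combinations` produces. A minimal sketch, using plain name strings rather than the notebook's strategy functions:

```python
from itertools import combinations

names = [
    "Tit-for-Tat",
    "Grim Trigger",
    "Always Cooperate",
    "Always Defect",
    "Probabilistic",
    "Intermediate Punishment",
]

# All unordered pairs of distinct strategies.
pairings = list(combinations(names, 2))

print(len(pairings))  # 15
print(pairings[0])    # ('Tit-for-Tat', 'Grim Trigger')
```

Six strategies give 6 * 5 / 2 = 15 games.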

The number of rounds was set to 10 to balance observing long-term effects against computation time. The `k` parameter of the Intermediate Punishment strategy was set to 3, and the Probabilistic Strategy used `p = 0.5` with a fixed seed of 42 for reproducibility.
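
To make the setup concrete, here is a minimal sketch of what one simulated game could look like. It is not the `simulate_game` function imported from `ps2.simulate_game`; the name `sketch_simulate_game`, the `demo_` helpers, and the convention that a strategy sees only the opponent's past moves are assumptions made for illustration.

```python
def sketch_simulate_game(strategy1, strategy2, rounds, payoff_matrix):
    """Play `rounds` rounds and return both move lists and total payoffs."""
    moves1, moves2 = [], []
    total1 = total2 = 0
    for _ in range(rounds):
        # Both players move simultaneously, seeing only earlier rounds.
        move1 = strategy1(moves2)
        move2 = strategy2(moves1)
        payoff1, payoff2 = payoff_matrix[(move1, move2)]
        total1 += payoff1
        total2 += payoff2
        moves1.append(move1)
        moves2.append(move2)
    return moves1, moves2, total1, total2


# Payoffs from the matrix in the introduction.
demo_payoffs = {
    ("C", "C"): (6, 6),
    ("C", "D"): (2, 10),
    ("D", "C"): (10, 2),
    ("D", "D"): (4, 4),
}


def demo_tit_for_tat(opponent_history):
    return opponent_history[-1] if opponent_history else "C"


def demo_always_defect(opponent_history):
    return "D"


m1, m2, p1, p2 = sketch_simulate_game(
    demo_tit_for_tat, demo_always_defect, rounds=10, payoff_matrix=demo_payoffs
)
print(p1, p2)  # 38 46
```

Tit-for-Tat is exploited once (2 vs 10) and then both players defect for the remaining nine rounds (4 each), which gives 38 and 46.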


In [1]:
import os
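from ps2.strategies import (
    always_cooperate,
    always_defect,
    grim_trigger,
    intermediate_punishment,
    probabilistic_strategy,
    tit_for_tat,
)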
from ps2.save_plot import save_plot
from ps2.simulate_game import simulate_game

In [2]:
payoff_matrix = {
    ("C", "C"): (6, 6),
    ("C", "D"): (2, 10),
    ("D", "C"): (10, 2),
    ("D", "D"): (4, 4),
}

In [4]:
# set a directory to save plots
save_dir = "/Users/minhokang/hertie/agt-fall25/ps2/plots"
os.makedirs(save_dir, exist_ok=True)

## Simulate games and save plot of results

In [5]:
# list of (name, function) tuples, one per strategy
strategies = [
    ("Tit-for-Tat", tit_for_tat),
    ("Grim Trigger", grim_trigger),
    ("Always Cooperate", always_cooperate),
    ("Always Defect", always_defect),
    ("Probabilistic (p=0.5)", lambda h: probabilistic_strategy(h, p=0.5, seed=42)), # set the seed for reproducibility
    ("Intermediate Punishment", intermediate_punishment),
]

# simulate and save results for all strategy combinations
payoffs = []  # to store payoff results
cooperation_rates = {} # to store cooperation rates for each strategy pair

for i in range(len(strategies)):
    for j in range(i + 1, len(strategies)):  # j > i skips self-play and duplicate pairs
        strategy1_name, strategy1 = strategies[i]
        strategy2_name, strategy2 = strategies[j]

        # Simulate the game
        player1_moves, player2_moves, player1_payoff, player2_payoff = simulate_game(
            player1_strategy=strategy1, player2_strategy=strategy2, rounds=10, payoff_matrix=payoff_matrix
        )

        # save a plot of both players' moves and payoffs for this pairing
        save_plot(
            player1_strategy_name=strategy1_name,
            player2_strategy_name=strategy2_name,
            player1_moves=player1_moves,
            player2_moves=player2_moves,
            player1_payoff=player1_payoff,
            player2_payoff=player2_payoff,
            save_dir=save_dir,
        )

        # store the payoffs for later analysis
        payoffs.append((strategy1_name, strategy2_name, player1_payoff, player2_payoff))

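        # cooperation rate = share of rounds in which a player cooperated
        # (sum() works here only if moves are numeric, with cooperation counted as 1)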
        cooperation_rate1 = sum(player1_moves) / len(player1_moves)
        cooperation_rate2 = sum(player2_moves) / len(player2_moves)
        cooperation_rates[(strategy1_name, strategy2_name)] = (cooperation_rate1, cooperation_rate2)

# Plots for each strategy pairing are available at:
# https://github.com/minhokg/agt-fall25/tree/main/ps2/plots
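
The cooperation-rate step above sums each move list directly, which only works when moves are numeric. As a hedged alternative, the sketch below counts cooperative moves explicitly so that it also handles `"C"`/`"D"` labels. The helper name `cooperation_rate` and the set of accepted labels are assumptions, not part of the `ps2` package.

```python
def cooperation_rate(moves, cooperate_labels=("C", 1, True)):
    """Share of moves that count as cooperation; 0.0 for an empty list."""
    if not moves:
        return 0.0
    cooperative = sum(1 for move in moves if move in cooperate_labels)
    return cooperative / len(moves)


print(cooperation_rate(["C", "D", "D", "D"]))  # 0.25
print(cooperation_rate([1] + [0] * 9))         # 0.1
```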

## Summary of games (final payoff and cooperation rates for each game)

In [11]:
for name1, name2, payoff1, payoff2 in payoffs:
    print(f"{name1} vs {name2}: {payoff1} - {payoff2}")

print("\nCooperation Rates:")
for (strategy1_name, strategy2_name), (cooperation_rate1, cooperation_rate2) in cooperation_rates.items():
    print(f"{strategy1_name} vs {strategy2_name}: {cooperation_rate1:.2f} - {cooperation_rate2:.2f}")

Tit-for-Tat vs Grim Trigger: 60 - 60
Tit-for-Tat vs Always Cooperate: 60 - 60
Tit-for-Tat vs Always Defect: 38 - 46
Tit-for-Tat vs Probabilistic (p=0.5): 48 - 56
Tit-for-Tat vs Intermediate Punishment: 60 - 60
Grim Trigger vs Always Cooperate: 60 - 60
Grim Trigger vs Always Defect: 38 - 46
Grim Trigger vs Probabilistic (p=0.5): 68 - 36
Grim Trigger vs Intermediate Punishment: 60 - 60
Always Cooperate vs Always Defect: 20 - 100
Always Cooperate vs Probabilistic (p=0.5): 36 - 84
Always Cooperate vs Intermediate Punishment: 60 - 60
Always Defect vs Probabilistic (p=0.5): 70 - 30
Always Defect vs Intermediate Punishment: 46 - 38
Probabilistic (p=0.5) vs Intermediate Punishment: 44 - 68

Cooperation Rates:
Tit-for-Tat vs Grim Trigger: 1.00 - 1.00
Tit-for-Tat vs Always Cooperate: 1.00 - 1.00
Tit-for-Tat vs Always Defect: 0.10 - 0.00
Tit-for-Tat vs Probabilistic (p=0.5): 0.50 - 0.40
Tit-for-Tat vs Intermediate Punishment: 1.00 - 1.00
Grim Trigger vs Always Cooperate: 1.00 - 1.00
Grim Trigger 

## Highest