# PokerBotRL Code Walkthrough

## Prerequisites

This notebook assumes:
1.  You have a Python environment with necessary libraries installed (torch, numpy, gymnasium).
2.  You are running this notebook server from the root `PokerBotRL` directory.
3.  For the `simulate`, `analyze`, and `ui` commands, a trained model checkpoint may be required in the `checkpoints/` folder.
4.  For the `analyze` command, a `detailed_simulation_log.csv` file generated by `simulate` must already exist.

## File Descriptions

* **`main.py`**: The main entry point for the project. Uses `argparse` to handle command-line arguments and dispatch tasks like training, simulation, analysis, or running the UI.
* **`envs.py`**: Defines the Gymnasium-compliant poker environment (`BaseFullPokerEnv`, `TrainFullPokerEnv`).
* **`models.py`**: Contains the PyTorch neural network architecture (`BestPokerModel`).
* **`utils.py`**: Provides utility functions including state encoding, replay buffer, epsilon decay, and model loading helpers.
* **`card_utils.py`**: Card parsing and rendering (including ASCII art).
* **`seat_config.py`**: Manages seat configuration (e.g., AI, human, random).
* **`train.py`**: Implements the RL training loop.
* **`simulate.py`**: Runs simulation episodes using a trained model.
* **`main_ui.py`**: Launches a Tkinter-based GUI for human interaction.
* **`decision_analysis.py`**: Evaluates decisions from simulation logs.
* **`human_action_handler.py`**: Likely used to interpret human-like actions or strategies.
* **`plot.py`**: Plots rewards and metrics from logs.
* **`readme.txt`**: Project usage and overview.
* **`code_walkthrough.ipynb`**: This notebook.
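
To make the `utils.py` entry more concrete, here is a minimal sketch of what an epsilon-decay schedule and a replay buffer commonly look like in RL code. It is not taken from the project: the names `epsilon_by_episode` and `ReplayBuffer`, the linear schedule, and all default values are assumptions chosen for illustration.

```python
import random
from collections import deque


def epsilon_by_episode(episode, eps_start=1.0, eps_end=0.05, decay_episodes=10_000):
    """Linearly anneal epsilon from eps_start to eps_end (assumed schedule)."""
    fraction = min(max(episode, 0) / decay_episodes, 1.0)
    return eps_start + fraction * (eps_end - eps_start)


class ReplayBuffer:
    """Fixed-capacity FIFO store of (state, action, reward, next_state, done)."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling without replacement; returns one tuple per field.
        batch = random.sample(list(self.buffer), batch_size)
        return tuple(zip(*batch))

    def __len__(self):
        return len(self.buffer)


# Toy usage with integer "states" standing in for encoded poker states.
buf = ReplayBuffer(capacity=3)
for i in range(5):
    buf.push(i, 0, 1.0, i + 1, False)
states, actions, rewards, next_states, dones = buf.sample(2)
```

The actual helpers in `utils.py` may use a different decay curve (for example exponential) or store tensors rather than plain tuples.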

## Command-Line Usage (`main.py`)
The `main.py` script serves as the central command-line interface.
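
As a rough illustration of how an `argparse`-based entry point with subcommands can be organized, the sketch below mirrors the subcommands and some of the flags used in this notebook. It is an assumption about the structure, not a copy of `main.py`: the defaults and help strings are invented, and the `--random` and `--variable` flags of `train` are omitted because their semantics are not described here.

```python
import argparse


def build_parser():
    """Build a parser with one subcommand per task (hypothetical layout)."""
    parser = argparse.ArgumentParser(prog="main.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    train = subparsers.add_parser("train", help="Train the RL agent")
    train.add_argument("--episodes", type=int, default=2000)  # default is assumed
    train.add_argument("--resume", default=None, help="Checkpoint to resume from")

    simulate = subparsers.add_parser("simulate", help="Run simulation episodes")
    simulate.add_argument("--checkpoint", default=None)
    simulate.add_argument("--episodes", type=int, default=100)  # default is assumed
    simulate.add_argument("--opponent", default="random")
    simulate.add_argument("--seat_config", default=None)

    analyze = subparsers.add_parser("analyze", help="Analyze a simulation log")
    analyze.add_argument("--detailed_log", default="detailed_simulation_log.csv")

    subparsers.add_parser("ui", help="Launch the GUI")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(
    ["train", "--episodes", "50000", "--resume", "checkpoints/checkpoint_30500.pt"]
)
print(args.command, args.episodes, args.resume)
```

A real entry point would then branch on `args.command` and call into the corresponding module (`train.py`, `simulate.py`, and so on).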

In [None]:
!python main.py --help
!python main.py train --help

### 1. Train Command
**Purpose:** Train the RL agent model.

In [10]:
!python main.py train --episodes 50000 --resume "checkpoints/checkpoint_30500.pt"

^C


In [None]:
!python main.py train --episodes 2000

In [None]:
!python main.py train --episodes 500000 --resume "checkpoints/checkpoint_100.pt"  --random 5000-10000 --variable

### 2. Simulate Command
**Purpose:** Run simulations using a trained agent checkpoint.

In [5]:
!python main.py simulate --checkpoint "checkpoints/checkpoint_31200.pt" --episodes 500 --opponent "random"

Starting simulation process...
Closing Poker Environment.
Dynamically obtained Action List: ['fold', 'call', 'check', 'bet_small', 'bet_big', 'all_in']
Using device: cpu
Loading checkpoint from checkpoints/checkpoint_31200.pt
Successfully loaded model from checkpoints/checkpoint_31200.pt onto cpu.
Seat configuration: {0: 'agent', 1: 'random', 2: 'random', 3: 'random', 4: 'random', 5: 'random'}
Set Seat 2 policy to: random
Set Seat 3 policy to: random
Set Seat 4 policy to: random
Set Seat 5 policy to: random
Set Seat 6 policy to: random
Overwriting detailed log file: detailed_simulation_log.csv

--- Starting Simulation (500 tournaments) ---
--- Starting Simulation Tournament 1 ---

===== RESETTING TOURNAMENT =====
--- All-in Runout Detected ---
Dealing flop...
Dealing turn...
Dealing river...
--- All-in Runout Detected ---
Dealing turn...
Dealing river...
--- All-in Runout Detected ---
Dealing flop...
Dealing turn...
Dealing river...
--- Finished Simulation Tournament 1. Final Reward: -

In [None]:
!python main.py simulate --checkpoint "checkpoints/final_agent_model.pt" --episodes 100 --seat_config "agent,random,model,empty,random,model"
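
The `--seat_config` string lists one seat type per seat, in order. A minimal sketch of turning such a string into the seat-index dictionary printed in the logs might look like the following. The function name `parse_seat_config` and the set of accepted seat types are assumptions based only on the values that appear in this notebook; `seat_config.py` may work differently.

```python
# Seat types seen in this walkthrough; the real project may accept others.
VALID_SEAT_TYPES = {"agent", "random", "model", "empty", "player"}


def parse_seat_config(spec):
    """Map a comma-separated seat string to {seat_index: seat_type}."""
    seats = [part.strip().lower() for part in spec.split(",")]
    unknown = [seat for seat in seats if seat not in VALID_SEAT_TYPES]
    if unknown:
        raise ValueError(f"Unknown seat type(s): {unknown}")
    # Seats are 0-indexed, matching the 'Seat configuration' log line.
    return dict(enumerate(seats))


config = parse_seat_config("agent,random,model,empty,random,model")
print(config)
```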

### 3. Analyze Command
**Purpose:** Analyze the detailed simulation log.

In [14]:
!python main.py analyze --detailed_log Output_CSVs/detailed_simulation_log.csv

Starting analysis process...
Ensured output directory exists: Output_CSVs
Log loaded successfully from Output_CSVs/detailed_simulation_log.csv. Rows: 7362
Analysis summary saved to Output_CSVs\decision_analysis_summary.csv

--- Analysis Summary for Agent ID: 1 ---

--- Action Summary (Preflop) ---
   Action  Count  Avg_Reward
bet_small   3601  105.246598
     call   1125   55.200000
   all_in    694  720.136167
     fold    616 -619.875000
  bet_big      4    0.000000
    check      2    0.000000

--- Starting Hand Distribution (Preflop) ---
Hand_Class  Count
Q5 Offsuit     83
63 Offsuit     79
97 Offsuit     77
92 Offsuit     76
Q7 Offsuit     75
Q4 Offsuit     73
K7 Offsuit     70
J3 Offsuit     68
J7 Offsuit     67
Q3 Offsuit     66
AJ Offsuit     66
95 Offsuit     66
62 Offsuit     66
K3 Offsuit     65
T9 Offsuit     64
KT Offsuit     64
A6 Offsuit     63
K8 Offsuit     62
93 Offsuit     62
T3 Offsuit     62
K2 Offsuit     61
K6 Offsuit     61
72 Offsuit     61
A4 Offsuit     61
J6
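
The action summary above is a per-action count and mean reward. A minimal sketch of that aggregation, using only the standard library, is shown below. The column names `action` and `reward` are assumptions (the real log schema is not shown in this notebook), and the rows are toy data, not project results.

```python
import csv
import io
from collections import defaultdict


def summarize_actions(rows, action_col="action", reward_col="reward"):
    """Return (action, count, mean_reward) tuples, most frequent action first."""
    totals = defaultdict(lambda: [0, 0.0])  # action -> [count, reward_sum]
    for row in rows:
        entry = totals[row[action_col]]
        entry[0] += 1
        entry[1] += float(row[reward_col])
    summary = [(action, count, total / count) for action, (count, total) in totals.items()]
    return sorted(summary, key=lambda item: item[1], reverse=True)


# Toy log standing in for detailed_simulation_log.csv (hypothetical columns).
toy_log = io.StringIO("action,reward\nfold,-10\ncall,5\ncall,15\nall_in,100\n")
summary = summarize_actions(csv.DictReader(toy_log))
for action, count, avg in summary:
    print(f"{action:>10} {count:5d} {avg:10.2f}")
```

To run this on a real file, pass `csv.DictReader(open(path, newline=""))` and adjust the column names to match the log's header.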

### 4. GUI Command
**Purpose:** Launch the graphical user interface.

In [1]:
!python main.py ui

Launching Enhanced Poker UI...
Relative imports successful.
Using device: cpu
Setting up game...
Using Seat Config: {0: 'player', 1: 'model', 2: 'model', 3: 'model', 4: 'model', 5: 'model'}
Checkpoint Paths: {0: None, 1: None, 2: None, 3: None, 4: None, 5: None}
Looking for checkpoints in: c:\Users\Matheus Viana\Documents\School\DS 440W\PokerBotRL\checkpoints
Found default checkpoint: c:\Users\Matheus Viana\Documents\School\DS 440W\PokerBotRL\checkpoints\final_agent_model.pt
Successfully loaded model from c:\Users\Matheus Viana\Documents\School\DS 440W\PokerBotRL\checkpoints\final_agent_model.pt onto cpu.
Primary agent model loaded successfully.
Poker Environment Initialized/Reset.
Set Seat 2 policy to: model
Set Seat 3 policy to: model
Set Seat 4 policy to: model
Set Seat 5 policy to: model
Set Seat 6 policy to: model
Clearing table state visuals...
Animation speed set to: 1.0x

--- Starting New Tournament ---

===== RESETTING TOURNAMENT =====
Opponent 2 (model) turn. Scheduling actio

In [11]:
!taskkill /F /IM python.exe
