# PokerBot Training on Google Colab

This notebook trains a neural network-based poker bot using GPU acceleration.

## Setup Instructions

1. **Enable GPU**: Runtime → Change runtime type → GPU → Save
2. **Upload files**: Upload all PokerBot files to Colab (or clone from GitHub)
3. **Run all cells** in order

## Time Estimates
- Small scale (10K samples): ~10-20 minutes with GPU
- Medium scale (100K samples): ~1-2 hours with GPU

## Step 1: Setup and Installation

In [None]:
# Install dependencies (uncomment if they are not already available in the runtime)
#!pip install tensorflow[and-cuda] numpy termcolor tqdm -q

# Verify installation
import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()
import numpy as np
import json
import os
import subprocess
from google.colab import drive

print("✓ Dependencies installed")

Instructions for updating:
non-resource variables are not supported in the long term


✓ Dependencies installed


In [None]:
# Check GPU availability
gpu_devices = tf.config.list_physical_devices('GPU')
if len(gpu_devices) > 0:
    print(f"✓ GPU Detected: {len(gpu_devices)} device(s)")
    for i, gpu in enumerate(gpu_devices):
        print(f"  GPU {i}: {gpu.name}")
else:
    print("⚠ No GPU detected!")
    print("Please enable GPU: Runtime → Change runtime type → GPU")

# Test GPU computation
with tf.Session() as sess:
    with tf.device('/GPU:0' if len(gpu_devices) > 0 else '/CPU:0'):
        a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
        b = tf.constant([[1.0, 1.0], [0.0, 1.0]])
        c = tf.matmul(a, b)
        result = sess.run(c)
    print(f"✓ Test computation successful (shape: {result.shape})")

✓ GPU Detected: 1 device(s)
  GPU 0: /physical_device:GPU:0
✓ Test computation successful (shape: (2, 2))


## Step 2: Upload PokerBot Files

Choose one method:

### Method 1: Upload from Local Computer

In [None]:
from google.colab import files
import zipfile

print("Upload PokerBot.zip (containing all .py files)")
uploaded = files.upload()

# Extract
for filename in uploaded.keys():
    if filename.endswith('.zip'):
        with zipfile.ZipFile(filename, 'r') as zip_ref:
            zip_ref.extractall('.')
        print(f"✓ Extracted {filename}")

Upload PokerBot.zip (containing all .py files)


Saving poker.zip to poker.zip
✓ Extracted poker.zip


In [None]:
# Verify required files are present
required_files = ['pokerbot_FFdumb.py', 'pokerbot_qlearn.py', 'tableQ.py']
missing = []

for file in required_files:
    if os.path.exists(file):
        print(f"✓ {file}")
    else:
        print(f"✗ {file} - MISSING")
        missing.append(file)

if missing:
    print(f"\n⚠ Please upload: {', '.join(missing)}")
else:
    print("\n✓ All required files present!")

✓ pokerbot_FFdumb.py
✓ pokerbot_qlearn.py
✓ tableQ.py

✓ All required files present!
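
If the archive unpacks into a subfolder, the check above reports files as missing even though they were uploaded. The following is a minimal sketch, using only the standard library, that searches the directory tree for each required file. `find_required` is a hypothetical helper and not part of PokerBot.

```python
import os

def find_required(required, root='.'):
    """Map each required filename to the first path found under root, or None."""
    found = {name: None for name in required}
    for dirpath, _dirs, filenames in os.walk(root):
        for name in filenames:
            # Keep only the first match for each required file
            if name in found and found[name] is None:
                found[name] = os.path.join(dirpath, name)
    return found
```

Calling `find_required(required_files)` returns a dict in which a `None` value means the file was not found anywhere under the working directory. Files found in a subfolder still need to be moved next to the notebook (or the folder added to `sys.path`) before `from tableQ import ...` can work.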


## Step 3: Generate Training Data

In [None]:
# Create directories
os.makedirs('training', exist_ok=True)
os.makedirs('ffresults', exist_ok=True)
print("✓ Directories created")

✓ Directories created


In [None]:
# Import after files are uploaded
from tableQ import generateTrainingData
from tqdm import tqdm
import time

# Configuration
NUM_SAMPLES = 100000  # Samples per round; use 10000 for a quick test run
ROUND_NAMES = ['pre-flop', 'flop', 'turn', 'river']

# Progress tracking
start_time = time.time()

# Generate training data for all rounds
for round_num in tqdm(range(4), desc="Overall Progress", position=0):
    print(f"\n{'='*60}")
    print(f"Generating {NUM_SAMPLES} games for {ROUND_NAMES[round_num]} (round {round_num})...")
    print(f"{'='*60}")

    generateTrainingData(
        numSamples=NUM_SAMPLES,
        fname=f'training/round{round_num}.json',
        iterPrint=max(1, NUM_SAMPLES // 10),
        numPlayers=2,
        targetRound=round_num
    )

    elapsed = time.time() - start_time
    print(f"✓ Round {round_num} complete | Elapsed: {elapsed:.1f}s")

total_time = time.time() - start_time
print("\n" + "="*60)
print("✓ All training data generated!")
print(f"Total samples: {NUM_SAMPLES * 4:,}")
print(f"Total time: {total_time:.1f}s ({total_time/60:.1f} min)")
print("="*60)

Overall Progress:   0%|          | 0/4 [00:00<?, ?it/s]


Generating 50000 games for pre-flop (round 0)...
0
5000
10000
15000
20000
25000
30000
35000
40000
45000
dumping


Overall Progress:  25%|██▌       | 1/4 [00:09<00:29,  9.75s/it]

finished dump
negCount: 47949
✓ Round 0 complete | Elapsed: 9.8s

Generating 50000 games for flop (round 1)...
0
5000
10000
15000
20000
25000
30000
35000
40000
45000
dumping


Overall Progress:  50%|█████     | 2/4 [00:19<00:20, 10.02s/it]

finished dump
negCount: 47875
✓ Round 1 complete | Elapsed: 20.0s

Generating 50000 games for turn (round 2)...
0
5000
10000
15000
20000
25000
30000
35000
40000
45000
dumping


Overall Progress:  75%|███████▌  | 3/4 [00:30<00:10, 10.39s/it]

finished dump
negCount: 47943
✓ Round 2 complete | Elapsed: 30.8s

Generating 50000 games for river (round 3)...
0
5000
10000
15000
20000
25000
30000
35000
40000
45000
dumping


Overall Progress: 100%|██████████| 4/4 [00:40<00:00, 10.03s/it]

finished dump
negCount: 47962
✓ Round 3 complete | Elapsed: 40.1s

✓ All training data generated!
Total samples: 200,000
Total time: 40.1s (0.7 min)
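
Before training, it can be worth a quick look at what was written to disk. The sketch below assumes each `training/roundN.json` file is a JSON list of dicts that each carry a `result` key, like the record the training script prints when it loads the data in Step 4. That layout is an assumption and is not confirmed by `tableQ.py`. `summarize_results` is a hypothetical helper.

```python
import json
from collections import Counter

def summarize_results(path):
    """Return (record_count, Counter of 'result' values) for one training file.

    Assumes the file holds a JSON list of dicts with a 'result' key;
    adjust if generateTrainingData writes a different layout.
    """
    with open(path) as f:
        records = json.load(f)
    return len(records), Counter(rec.get('result') for rec in records)
```

For example, `total, counts = summarize_results('training/round0.json')` gives the number of records and how many carry each `result` value, which makes a heavily skewed dataset easy to spot.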





## Step 4: Train Feed-Forward Models

In [None]:
# Training configuration
EPOCHS = 100
LAYERS = 4
NEURONS = 128
DROPOUT = 0.5
LEARNING_RATE = 0.0001

# Train models for each round
for round_num in range(4):
    print(f"\n{'='*60}")
    print(f"Training model for round {round_num} ({ROUND_NAMES[round_num]})...")
    print(f"{'='*60}")

    cmd = [
        'python', 'pokerbot_FFdumb.py',
        f'training/round{round_num}.json',
        str(NUM_SAMPLES),      # limit
        str(EPOCHS),           # epochs
        str(LAYERS),           # layers
        str(NEURONS),          # neurons
        str(DROPOUT),          # dropout
        str(LEARNING_RATE),    # learning rate
        '1',                   # doSave
        '1',                   # trainOrLoad (1=train)
        f'ffresults/round{round_num}_model.ckpt'
    ]

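    result = subprocess.run(cmd, capture_output=True, text=True)
    print(result.stdout)
    if result.stderr:
        # TensorFlow also writes warnings to stderr, so output here is not always an error
        print("stderr:", result.stderr)

    # Report success only if the training script exited cleanly
    if result.returncode == 0:
        print(f"✓ Round {round_num} training complete")
    else:
        print(f"✗ Round {round_num} training failed (exit code {result.returncode})")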

print("\n" + "="*60)
print("✓ All feed-forward models trained!")
print("="*60)


Training model for round 0 (pre-flop)...
Save is ON
d len: 100000
{'dealerIndex': 0, 'playerIndex': 1, 'handCards': [33, 12], 'tableCards': [], 'pot': 200, 'bets': [100, 100], 'remPlayers': [True, True], 'rd_num': 0, 'action': [1, 0], 'result': 1}
Negcount: 23972
[1 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 1]
Finished loading and parsing data...
Creating NN structure...
✓ GPU detected and enabled
  GPU devices: ['/physical_device:GPU:0']
Starting training...
step 0, training accuracy 0.5604
step 1, training accuracy 0.565057
step 2, training accuracy 0.5676
step 3, training accuracy 0.569457
step 4, training accuracy 0.571971
step 5, training accuracy 0.572543
step 6, training accuracy 0.574543
step 7, training accuracy 0.5756
step 8, training accuracy 0.577514
step 9, training accuracy 0.580171
step 10, training accuracy 0.5824
step 11, training accuracy 0.585
step 12, training accuracy 0.5856
step 13, training accuracy 0.587514
step 14, training accuracy 0.588229
step 15, training accuracy
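
Because the training script's output is captured as text, the accuracy curve can be recovered from `result.stdout`. This is a minimal sketch that assumes the log lines keep the `step N, training accuracy X` shape shown above. `parse_accuracy` is a hypothetical helper, and lines without a value (such as a truncated last line) are skipped.

```python
import re

# Matches lines such as "step 3, training accuracy 0.569457"
STEP_RE = re.compile(r"^step (\d+), training accuracy (\d+(?:\.\d+)?)$")

def parse_accuracy(stdout_text):
    """Return a list of (step, accuracy) pairs found in the captured output."""
    pairs = []
    for line in stdout_text.splitlines():
        match = STEP_RE.match(line.strip())
        if match:
            pairs.append((int(match.group(1)), float(match.group(2))))
    return pairs

sample = """step 0, training accuracy 0.5604
step 1, training accuracy 0.565057
step 15, training accuracy"""
curve = parse_accuracy(sample)
print(curve)
```

In the notebook, `parse_accuracy(result.stdout)` inside the training loop would give one curve per round, which can then be plotted or compared across rounds.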

## Step 5: Train Q-Learning Agent

In [None]:
print(f"\n{'='*60}")
print("Training Q-learning agent...")
print(f"{'='*60}")

cmd = [
    'python', 'pokerbot_qlearn.py',
    'training/round0.json',              # Q-learning data
    '0.1',                               # random exploration rate
    '0.2',                               # Q-learning discount factor
    str(LAYERS),                         # layers
    str(NEURONS),                        # neurons
    str(DROPOUT),                        # dropout
    str(LEARNING_RATE),                  # learning rate
    '1',                                 # doSave
    '1',                                 # next round type (1=dumb, 0=q, -1=none)
    'ffresults/round0_model.ckpt',      # current round model
    'ffresults/round1_model.ckpt',      # next round model
    'ffresults/qlearn_model.ckpt',      # output Q-learning model
    'training/round0.json',              # current micro data
    'training/round1.json'               # next micro data
]

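result = subprocess.run(cmd, capture_output=True, text=True)
print(result.stdout)
if result.stderr:
    # TensorFlow also writes warnings to stderr, so output here is not always an error
    print("stderr:", result.stderr)

print("\n" + "="*60)
# Report success only if the training script exited cleanly
if result.returncode == 0:
    print("✓ Q-learning training complete!")
else:
    print(f"✗ Q-learning training failed (exit code {result.returncode})")
print("="*60)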


Training Q-learning agent...
Save is ON
d len: 20000
Converting data format from actDicts to [featDict, tableDict]...
Converted 20000 samples
numSamples: 20000
{'dealerIndex': 0, 'playerIndex': 1, 'handCards': [45, 5], 'tableCards': [], 'pot': 200, 'bets': [100, 100], 'remPlayers': [True, True], 'rd_num': 0, 'action': [1, 0], 'result': 1}
[1, 0, 0, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 100, 100, 200, 1, 1, 1]
d len: 20000
numSamples: 20000
{'dealerIndex': 0, 'playerIndex': 1, 'handCards': [6, 10], 'tableCards': [], 'pot': 200, 'bets': [100, 100], 'remPlayers': [True, True], 'rd_num': 0, 'action': [1, 0], 'result': -1}
[2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0]
Finished loading and parsing data...
Creating NN structure...
d len: 100000
numSamples: 100000
{'dealerIndex': 0, 'playerIndex': 0, 'handCards': [21, 47], 'tableCards': [10, 20, 17], 'pot': 200, 'bets': [100, 100], 'remPlayers': [True, True], 'rd_num': 1, 'handScore': 5.7839721254355405, 'handRank': 8, 'action': [1,
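
The two numeric arguments passed to `pokerbot_qlearn.py` above are an exploration rate of 0.1 and a discount factor of 0.2. To show what these parameters usually control, here is a sketch of textbook epsilon-greedy action selection and the one-step Q-learning target. It is not the code in `pokerbot_qlearn.py`, whose internals are not shown here; `choose_action` and `q_target` are hypothetical names.

```python
import random

EPSILON = 0.1  # exploration rate, as passed on the command line above
GAMMA = 0.2    # discount factor, as passed on the command line above

def choose_action(q_values, epsilon=EPSILON, rng=random):
    """Epsilon-greedy: a random action with probability epsilon, else the best one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

def q_target(reward, next_q_values, gamma=GAMMA, terminal=False):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a')."""
    if terminal or not next_q_values:
        return reward  # no future value once the hand is over
    return reward + gamma * max(next_q_values)
```

With a discount factor of 0.2, the estimated value of the next round contributes only 20% of its size to the target, so the immediate reward dominates.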

## Step 6: Download Results

In [None]:
# Create zip of results
import zipfile
from google.colab import files

zip_filename = 'pokerbot_results.zip'

with zipfile.ZipFile(zip_filename, 'w') as zipf:
    # Add model checkpoints
    if os.path.exists('ffresults'):
        for root, dirs, files_list in os.walk('ffresults'):
            for file in files_list:
                if any(file.endswith(ext) for ext in ['.ckpt.index', '.ckpt.meta', '.ckpt.data-00000-of-00001']):
                    filepath = os.path.join(root, file)
                    zipf.write(filepath, os.path.relpath(filepath))
                    print(f"Added: {filepath}")

print(f"\n✓ Created {zip_filename}")

# Download
files.download(zip_filename)
print("✓ Download started!")
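
Before relying on the download, it can help to confirm what actually went into the archive. This is a minimal sketch using the standard `zipfile` module; `list_archive` is a hypothetical helper.

```python
import zipfile

def list_archive(zip_path):
    """Return (name, size_in_bytes) for every entry in the archive."""
    with zipfile.ZipFile(zip_path) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]
```

An empty list from `list_archive('pokerbot_results.zip')` means no file matched the extension filter. One possible cause is a checkpoint written in more than one data shard, since such shard names do not end in `-of-00001`.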