xptea/Tetris

🎮 Tetris AI Agent with Deep Q-Learning

A production-grade Tetris-playing AI trained with Deep Q-Network (DQN) reinforcement learning. Dependencies are managed with UV.

🎯 Features

  • Game Environment: Full Tetris game logic (no external game libraries required for simulation)
  • DQN Agent: Deep Q-Network with experience replay and target network
  • Epsilon-Greedy Exploration: Balances exploration and exploitation
  • Fast Training: Optimized NumPy-based game logic
  • Monitoring: Real-time training visualization and metrics
  • Checkpointing: Save/load model weights for reproducibility
  • UV Package Management: Ultra-fast Python dependency management

📋 Project Structure

tetris_agent/
├── pyproject.toml         # UV project config
├── src/
│   ├── main.py            # Training & evaluation loop
│   ├── tetris.py          # Game environment
│   ├── agent.py           # Not in this version (see model.py)
│   ├── model.py           # DQN architecture & agent
│   ├── utils.py           # Replay buffer & monitoring
│   └── config.py          # Hyperparameters
├── models/                # Saved checkpoints
├── README.md
└── .gitignore

🚀 Quick Start

Installation

  1. Install UV (if not already installed):

    pip install uv

  2. Change into the project directory (the folder containing pyproject.toml) and install dependencies:

    cd Tetris
    uv sync

Training

Start training the agent:

uv run python src/main.py --mode train

Training will:

  • Run for 500 episodes (configurable in config.py)
  • Save checkpoints every 500 episodes to models/
  • Display training statistics every 50 episodes
  • Generate training progress plot at models/training_progress.png

Evaluation

Evaluate a trained model:

uv run python src/main.py --mode eval --model models/tetris_agent_final.pt --eval-episodes 10
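
The flags in the commands above could be parsed along the following lines. This is a minimal sketch, not the repository's actual main.py: only `--mode`, `--model`, and `--eval-episodes` come from the commands shown, while the `build_parser` name, the help strings, and the default values are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical CLI mirroring the flags shown in the commands above."""
    parser = argparse.ArgumentParser(description="Train or evaluate the Tetris DQN agent")
    parser.add_argument("--mode", choices=["train", "eval"], default="train",
                        help="train a new agent or evaluate a saved one")
    parser.add_argument("--model", default=None,
                        help="path to a saved checkpoint (used in eval mode)")
    parser.add_argument("--eval-episodes", type=int, default=10,
                        help="number of evaluation episodes (default is an assumption)")
    return parser


# Parse the same arguments as the evaluation command above.
args = build_parser().parse_args(
    ["--mode", "eval", "--model", "models/tetris_agent_final.pt", "--eval-episodes", "10"]
)
print(args.mode, args.model, args.eval_episodes)
```

Note that argparse exposes `--eval-episodes` as the attribute `eval_episodes`.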

🧠 Architecture

Game Environment (tetris.py)

  • State Representation: 213-dimensional feature vector

    • Board state: 200 dimensions (20×10 flattened)
    • Current piece type: 7 dimensions (one-hot)
    • Piece rotation: 4 dimensions (one-hot)
    • Piece position: 2 dimensions (normalized)
  • Actions (4 total):

    • 0: Move left
    • 1: Move right
    • 2: Rotate clockwise
    • 3: Drop piece
  • Reward Function:

    • +10 per line cleared
    • -1 per time step
    • -10 for game over
    • -0.5 per piece locked
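
The state layout and reward terms listed above can be sketched as follows. This is an illustration under stated assumptions rather than the code in tetris.py: the function names are hypothetical, plain lists stand in for NumPy arrays to keep the example dependency-free, the position is assumed to be normalized by the board size, and the reward terms are assumed to add up within a single step.

```python
from typing import List

BOARD_ROWS, BOARD_COLS = 20, 10
NUM_PIECES, NUM_ROTATIONS = 7, 4


def one_hot(index: int, size: int) -> List[float]:
    """Return a list of zeros with a single 1.0 at `index`."""
    vec = [0.0] * size
    vec[index] = 1.0
    return vec


def encode_state(board, piece_type: int, rotation: int, row: int, col: int) -> List[float]:
    """Board (200) + piece one-hot (7) + rotation one-hot (4) + position (2) = 213."""
    flat_board = [float(cell) for board_row in board for cell in board_row]
    # Assumption: the position is scaled to [0, 1] using the board dimensions.
    position = [row / (BOARD_ROWS - 1), col / (BOARD_COLS - 1)]
    return (flat_board + one_hot(piece_type, NUM_PIECES)
            + one_hot(rotation, NUM_ROTATIONS) + position)


def step_reward(lines_cleared: int, piece_locked: bool, game_over: bool) -> float:
    """Sum the four reward terms for one time step (additive combination is assumed)."""
    reward = -1.0                     # per time step
    reward += 10.0 * lines_cleared    # per line cleared
    if piece_locked:
        reward -= 0.5                 # per piece locked
    if game_over:
        reward -= 10.0                # game over
    return reward


empty_board = [[0] * BOARD_COLS for _ in range(BOARD_ROWS)]
state = encode_state(empty_board, piece_type=2, rotation=1, row=0, col=4)
print(len(state))                    # 213
print(step_reward(2, True, False))   # 18.5
```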

Neural Network (model.py)

Input (213) → FC(256) → ReLU → FC(256) → ReLU → FC(4 actions)

  • Architecture: 2 hidden layers with 256 units
  • Activation: ReLU
  • Output: Q-values for each action
  • Loss: Mean Squared Error (MSE)
  • Optimizer: Adam (lr=1e-4)
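
As a quick sanity check on the layer sizes listed above, the number of trainable parameters can be worked out by hand. The sketch below assumes every fully connected layer has a bias term and that weights are stored as 4-byte floats; neither detail is stated in this README.

```python
LAYER_SIZES = [213, 256, 256, 4]  # input, two hidden layers, one Q-value per action


def count_parameters(sizes):
    """Weights (n_in * n_out) plus biases (n_out) for each fully connected layer."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


total = count_parameters(LAYER_SIZES)
print(total)             # 121604
print(total * 4 / 1024)  # approximate size in KiB at 4 bytes per parameter
```

Under these assumptions the network is small (roughly half a megabyte of weights), so training time is more likely to be dominated by environment simulation than by the forward and backward passes.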

Training Algorithm

  • Algorithm: Deep Q-Learning with Experience Replay
  • Batch Size: 32
  • Replay Buffer: 100,000 transitions
  • Gamma (discount): 0.99
  • Epsilon Decay: 0.995 per episode
  • Target Update: Every 1000 training steps
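
The training target implied by these settings can be sketched without any deep-learning library. The helper names below are hypothetical and the transitions are made up; in the real agent the `next_q_values` would come from the target network and the computation would be done on tensors.

```python
from typing import List, Sequence

GAMMA = 0.99


def td_targets(rewards: Sequence[float],
               next_q_values: Sequence[Sequence[float]],
               dones: Sequence[bool],
               gamma: float = GAMMA) -> List[float]:
    """y = r + gamma * max_a' Q_target(s', a'), with no bootstrap on terminal states."""
    targets = []
    for reward, next_q, done in zip(rewards, next_q_values, dones):
        bootstrap = 0.0 if done else gamma * max(next_q)
        targets.append(reward + bootstrap)
    return targets


def mse_loss(predicted: Sequence[float], targets: Sequence[float]) -> float:
    """Mean squared error between Q(s, a) for the taken actions and the targets."""
    return sum((p - t) ** 2 for p, t in zip(predicted, targets)) / len(targets)


# Two made-up transitions: one ordinary step and one terminal step.
rewards = [-1.0, -11.0]
next_q = [[0.5, 2.0, 1.0, 0.0], [3.0, 3.0, 3.0, 3.0]]  # target-network outputs for s'
dones = [False, True]

targets = td_targets(rewards, next_q, dones)
print(targets)                         # roughly [0.98, -11.0]
print(mse_loss([1.0, -10.0], targets))
```

The terminal transition ignores its next-state Q-values entirely, which is why the `done` flag has to be stored with every transition.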

📊 Hyperparameters

Edit src/config.py to customize:

# Training
LEARNING_RATE = 1e-4
BATCH_SIZE = 32
GAMMA = 0.99
EPSILON_START = 1.0
EPSILON_END = 0.01
EPSILON_DECAY = 0.995

# Network
HIDDEN_DIM = 256

# Episodes
NUM_EPISODES = 500
MAX_STEPS_PER_EPISODE = 10000
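
It can help to check how the exploration settings interact with the episode count. The sketch below assumes epsilon is multiplied by EPSILON_DECAY once per episode and clamped at EPSILON_END; the `epsilon_at` helper is hypothetical.

```python
import math

EPSILON_START = 1.0
EPSILON_END = 0.01
EPSILON_DECAY = 0.995
NUM_EPISODES = 500


def epsilon_at(episode: int) -> float:
    """Epsilon after `episode` decay steps, clamped at EPSILON_END (clamping is assumed)."""
    return max(EPSILON_END, EPSILON_START * EPSILON_DECAY ** episode)


# Episodes needed to reach the floor: solve START * DECAY**n <= END for n.
episodes_to_floor = math.ceil(
    math.log(EPSILON_END / EPSILON_START) / math.log(EPSILON_DECAY)
)

print(round(epsilon_at(NUM_EPISODES), 4))  # 0.0816
print(episodes_to_floor)                   # 919
```

Under these assumptions, a 500-episode run ends with epsilon near 0.08 and never reaches EPSILON_END. Reaching the floor would take about 919 episodes or a smaller EPSILON_DECAY.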

📈 Monitoring

Training generates:

  1. Console Output: Real-time metrics every 50 episodes

    Episode 50 | Avg Reward: 45.32 | Avg Lines: 4.21 | Avg Loss: 0.0234
    
  2. Plot: models/training_progress.png with 4 subplots:

    • Episode rewards + moving average
    • Lines cleared per episode
    • Episode length (steps)
    • Training loss
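
The moving average in the first subplot could be computed with a trailing window, as in the sketch below. The window length of 50 is an assumption chosen to match the reporting interval, and `moving_average` is a hypothetical helper rather than a function from utils.py.

```python
from collections import deque
from typing import Iterable, List


def moving_average(values: Iterable[float], window: int = 50) -> List[float]:
    """Trailing mean over the last `window` values (fewer at the start of the run)."""
    recent = deque(maxlen=window)
    averages = []
    for value in values:
        recent.append(value)
        averages.append(sum(recent) / len(recent))
    return averages


rewards = [10.0, 20.0, 30.0, 40.0]
print(moving_average(rewards, window=2))  # [10.0, 15.0, 25.0, 35.0]
```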

💾 Model Checkpoints

Models are saved during training:

  • models/tetris_agent_ep500.pt - Checkpoint at episode 500
  • models/tetris_agent_final.pt - Final trained model
  • models/tetris_agent_emergency.pt - Saved if training interrupted
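
One way to get the emergency checkpoint behaviour is to catch `KeyboardInterrupt` around the training loop. The sketch below is an assumption about how this could work, not the code in main.py; `train_with_emergency_save` and its callbacks are hypothetical stand-ins for the real training step and save routine.

```python
from typing import Callable


def train_with_emergency_save(num_episodes: int,
                              run_episode: Callable[[int], None],
                              save: Callable[[str], None]) -> str:
    """Run episodes; on Ctrl+C save an emergency checkpoint, otherwise the final model."""
    try:
        for episode in range(1, num_episodes + 1):
            run_episode(episode)
    except KeyboardInterrupt:
        path = "models/tetris_agent_emergency.pt"
    else:
        path = "models/tetris_agent_final.pt"
    save(path)
    return path


# Simulate an interruption during episode 3 with stand-in callbacks.
saved = []


def fake_episode(episode: int) -> None:
    if episode == 3:
        raise KeyboardInterrupt


result = train_with_emergency_save(5, fake_episode, saved.append)
print(result)  # models/tetris_agent_emergency.pt
```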

Load a model:

from model import DQNAgent
agent = DQNAgent(state_dim=213, device="cuda")
agent.load("models/tetris_agent_final.pt")

🔧 Troubleshooting

CUDA Issues

If you don't have a GPU, edit config.py:

DEVICE = "cpu"  # Change from "cuda"

Out of Memory

Reduce batch size in config.py:

BATCH_SIZE = 16  # Reduce from 32

Training Too Slow

  • Reduce NUM_EPISODES for quick testing
  • Use DEVICE = "cuda" if available
  • Increase UPDATE_FREQ to train less frequently

📚 Key Concepts

Experience Replay

Stores transitions in a buffer and samples random batches for training, breaking correlations between consecutive samples.
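
A minimal replay buffer along these lines might look like the sketch below. It is not the implementation in utils.py: the class layout and method names are assumptions, and only the 100,000 capacity and batch size of 32 come from this README.

```python
import random
from collections import deque
from typing import Deque, List, Tuple

# (state, action, reward, next_state, done)
Transition = Tuple[list, int, float, list, bool]


class ReplayBuffer:
    """Fixed-capacity buffer; the oldest transitions are dropped once it is full."""

    def __init__(self, capacity: int = 100_000) -> None:
        self.buffer: Deque[Transition] = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done) -> None:
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size: int = 32) -> List[Transition]:
        # Uniform sampling without replacement breaks up runs of consecutive steps.
        return random.sample(self.buffer, batch_size)

    def __len__(self) -> int:
        return len(self.buffer)


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.push([step], 0, -1.0, [step + 1], False)

print(len(buffer))        # 3: the two oldest transitions were evicted
batch = buffer.sample(2)
print(len(batch))         # 2
```

Training would normally wait until the buffer holds at least one full batch before sampling, since `random.sample` raises `ValueError` when asked for more items than are stored.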

Target Network

A separate copy of the network that is updated less frequently than the online network. It provides stable Q-value targets and reduces training instability.
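
The periodic sync can be sketched with plain Python objects. The `TargetSync` class below is hypothetical and uses a dictionary as a stand-in for network weights. In PyTorch the refresh step typically corresponds to `target_net.load_state_dict(online_net.state_dict())`, though how model.py does it is not shown in this README.

```python
import copy

TARGET_UPDATE = 1000  # training steps between syncs, as listed under Training Algorithm


class TargetSync:
    """Keeps a frozen copy of the online parameters and refreshes it periodically."""

    def __init__(self, online_params: dict, update_every: int = TARGET_UPDATE) -> None:
        self.online = online_params
        self.target = copy.deepcopy(online_params)
        self.update_every = update_every
        self.steps = 0

    def after_training_step(self) -> bool:
        """Call once per gradient step; returns True when the target was refreshed."""
        self.steps += 1
        if self.steps % self.update_every == 0:
            self.target = copy.deepcopy(self.online)
            return True
        return False


params = {"w": [0.0]}
sync = TargetSync(params, update_every=3)
history = []
for step in range(1, 4):
    params["w"][0] += 1.0  # stand-in for an optimizer update on the online network
    refreshed = sync.after_training_step()
    history.append((step, sync.target["w"][0], refreshed))
    print(step, sync.target["w"], refreshed)
```

The target stays at its old value for the first two steps and only picks up the online weights on the third, which is what keeps the regression targets from shifting on every update.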

Epsilon-Greedy

Exploration strategy that selects a random action with probability ε and the highest-valued action otherwise, enabling the agent to discover new strategies.
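
In code, the selection rule can be as small as the sketch below. The `select_action` name and the explicit `rng` argument are assumptions made so the example is reproducible, and the Q-values are made up.

```python
import random
from typing import Sequence


def select_action(q_values: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """Random action with probability epsilon, otherwise the highest-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda action: q_values[action])


rng = random.Random(0)
q = [0.1, 0.7, 0.3, 0.2]  # one made-up Q-value per action: left, right, rotate, drop

print(select_action(q, epsilon=0.0, rng=rng))  # 1: always greedy when epsilon is 0
print(select_action(q, epsilon=1.0, rng=rng))  # some action in 0..3, chosen at random
```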

Q-Learning

Learns the optimal action-value function Q(s,a), which estimates the expected discounted return from taking action a in state s and acting optimally afterwards.
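
The "return" that Q(s,a) estimates is the discounted sum of future rewards. The sketch below computes it for one made-up reward sequence; the `discounted_return` helper is hypothetical, and a discount of 0.5 is used instead of the configured 0.99 only to keep the arithmetic easy to follow.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float = 0.99) -> float:
    """G = r_0 + gamma * r_1 + gamma^2 * r_2 + ..., accumulated from the last reward backwards."""
    total = 0.0
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Three steps at -1 each, then a made-up +10 on the fourth step.
print(discounted_return([-1.0, -1.0, -1.0, 10.0], gamma=0.5))  # -0.5
```

With a discount this low, the late +10 is worth only 1.25 at the first step, so the return is negative. With gamma close to 1, as in config.py, delayed line clears count for much more.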

🎓 Learning Resources

๐Ÿ“ Future Enhancements

  • Add Double DQN for reduced overestimation
  • Implement Dueling DQN architecture
  • Add prioritized experience replay
  • Frame stacking for temporal awareness
  • Pygame visualization during training
  • Policy gradient methods (A3C, PPO)
  • Rust backend for game physics (pyo3)

📄 License

MIT License - Feel free to use for learning and research.

๐Ÿค Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Submit a pull request

Built with ❤️ for AI + Tetris enthusiasts
