Monopoly is a Python implementation of Monopoly with three major layers built around the same game engine:
- A rules-driven game model in `src/monopoly`.
- A pygame GUI with local play, online lobby support, save/load, replay, and debug tooling.
- An RL/agent stack that can train, evaluate, benchmark, and tournament AI checkpoints against each other or against scripted opponents.
The same engine powers human play, GUI interactions, saved games, AI turn execution, and offline training. That keeps the runtime behavior consistent across the project instead of maintaining separate code paths for gameplay and AI.
For contributor-oriented workflows and quick debug commands, see DEVELOPMENT.md.
- Full Monopoly rules engine with properties, rent, jail, auctions, mortgages, buildings, trades, cards, saving, and restore.
- pygame frontend with lobby setup, AI slot assignment, online session discovery, and a debug editor.
- Structured file logging for frontend and backend processes.
- RL agent pipeline with action masking, observation encoding, reward shaping, self-play training, checkpoints, evaluation, benchmarks, and tournaments.
- Scripted AI profiles for baseline opponents and league-style self-play.
- Automated tests across engine, GUI support layers, networking, and agent utilities.
```
.
|-- main.py
|-- train_agent.py
|-- evaluate_agent.py
|-- tournament_checkpoints.py
|-- run_tests.py
|-- requirements.txt
|-- src/monopoly/
|   |-- game.py
|   |-- board.py
|   |-- rules.py
|   |-- cards.py
|   |-- spaces.py
|   |-- player.py
|   |-- trading.py
|   |-- api.py
|   |-- logging_utils.py
|   |-- gui/
|   |   |-- launcher.py
|   |   |-- backend_process.py
|   |   |-- transport.py
|   |   |-- rendezvous.py
|   |   `-- pygame_frontend/
|   `-- agent/
|       |-- action_space.py
|       |-- features.py
|       |-- heuristics.py
|       |-- reward.py
|       |-- model.py
|       |-- controller.py
|       |-- environment.py
|       |-- trainer.py
|       |-- evaluation.py
|       |-- league.py
|       |-- scripted.py
|       `-- worker_pool.py
`-- tests/
```
- Python 3.11 or newer
- Windows, macOS, or Linux
- A desktop environment for the pygame GUI
- Optional CUDA-capable PyTorch install if you want GPU training
The repository currently targets editable local installs and uses setuptools via pyproject.toml.
Windows PowerShell:

```
python -m venv .venv
.\.venv\Scripts\Activate.ps1
```

macOS or Linux:

```
python -m venv .venv
source .venv/bin/activate
```

If you prefer conda, create an environment with Python 3.11+ and activate it before installing dependencies.
PyTorch should be installed before the rest of the dependencies so you can choose the correct build for your machine.
CPU-only example:

```
python -m pip install torch
```

CUDA example:

```
python -m pip install torch --index-url https://download.pytorch.org/whl/cu124
```

Choose the CUDA wheel that matches your driver and toolkit support. If you are not training on a GPU, install the CPU build.
```
python -m pip install --upgrade pip
python -m pip install -r requirements.txt
```

This installs the remaining runtime dependencies and the local package in editable mode (`-e .`) without overwriting the PyTorch build you selected earlier.
```
python main.py
```

What this does:

- Starts the rendezvous service used for online lobby discovery.
- Starts the pygame frontend process.
- Lets the frontend spin up and communicate with a backend game process.
By default, `main.py` launches the GUI with `DEBUG_MODE = False`. If you want the debug editor, set `DEBUG_MODE = True` in `main.py` before starting the app.
Use the setup screen to:
- choose player count
- assign each seat as human or AI
- select AI checkpoints or scripted AI profiles for AI seats
- configure AI action cooldown speed per AI seat
- start a new local game
The frontend sends commands to the backend process, which owns the authoritative Game object and returns serialized frontend state for rendering.
The GUI supports host-authoritative online sessions.
Runtime flow:
- The host creates an online lobby.
- The backend registers the lobby with the rendezvous process.
- Joining clients resolve the session code through the rendezvous service.
- Clients connect to the host backend through the socket transport layer.
- The host backend remains authoritative for game state and broadcasts updates.
Relevant components:
- `src/monopoly/gui/rendezvous.py`: session-code registration and resolution
- `src/monopoly/gui/transport.py`: framed socket request/response and event transport
- `src/monopoly/gui/backend_process.py`: authoritative backend runtime, lobby state, AI seat handling, save/load, debug, and action execution
- `src/monopoly/gui/pygame_frontend/app.py`: GUI flow, prompts, board rendering, and debug UI
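To give a feel for the framed-transport idea, here is a minimal length-prefixed JSON framing sketch. This is an illustration of the technique, not the project's actual `transport.py` API; the function names are hypothetical:

```python
import json
import struct

def encode_frame(message: dict) -> bytes:
    """Serialize a message dict and prefix it with a 4-byte big-endian length."""
    payload = json.dumps(message).encode("utf-8")
    return struct.pack(">I", len(payload)) + payload

def decode_frames(buffer: bytes):
    """Extract complete messages from a byte buffer; return (messages, leftover)."""
    messages = []
    while len(buffer) >= 4:
        (length,) = struct.unpack(">I", buffer[:4])
        if len(buffer) < 4 + length:
            break  # partial frame: wait for more data from the socket
        messages.append(json.loads(buffer[4:4 + length].decode("utf-8")))
        buffer = buffer[4 + length:]
    return messages, buffer
```

Length prefixing lets the receiver reassemble whole messages even when TCP delivers them in arbitrary chunks, which is why framed transports are the usual choice for request/response plus event streams.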
The backend exposes save/load support through the GUI. Game state is serialized by the engine and can be restored later, including interactive state needed to continue an unfinished turn.
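As a sketch of what "including interactive state" means in practice, a snapshot has to capture not just players and positions but any in-progress decision such as a pending auction. The schema below is purely illustrative; the real engine defines its own serialization:

```python
import json

# Hypothetical snapshot shape; the engine's actual schema differs.
state = {
    "players": [{"name": "A", "cash": 1380, "position": 12}],
    "turn": {"current_player": 0, "phase": "auction"},
    "pending_auction": {"property": "St. James Place", "high_bid": 160},
}

blob = json.dumps(state)      # save
restored = json.loads(blob)   # load

# Round-tripping preserves the interactive mid-turn state as well.
assert restored == state
```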
The project uses rotating file logs with a shared format:
```
YYYY-MM-DD HH:MM:SS | LEVEL | logger.name | message
```
Default log directory:
logs/
Typical log files:
- `logs/frontend.log`
- `logs/backend.log`
Relevant environment variables:
- `MONOPOLY_LOG_DIR`: override the log folder
- `MONOPOLY_LOG_LEVEL`: set log verbosity such as `DEBUG`, `INFO`, or `WARNING`
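A rotating file logger producing the format above can be sketched with the standard library; `logging_utils.py` is the project's real implementation, so treat the function name and handler parameters here as illustrative:

```python
import logging
import logging.handlers
import os

def make_logger(name: str) -> logging.Logger:
    """Build a rotating file logger in the documented format, honoring the env vars."""
    log_dir = os.environ.get("MONOPOLY_LOG_DIR", "logs")
    os.makedirs(log_dir, exist_ok=True)
    handler = logging.handlers.RotatingFileHandler(
        os.path.join(log_dir, f"{name}.log"), maxBytes=1_000_000, backupCount=3
    )
    # Matches: YYYY-MM-DD HH:MM:SS | LEVEL | logger.name | message
    handler.setFormatter(logging.Formatter(
        "%(asctime)s | %(levelname)s | %(name)s | %(message)s",
        datefmt="%Y-%m-%d %H:%M:%S",
    ))
    logger = logging.getLogger(name)
    logger.addHandler(handler)
    logger.setLevel(os.environ.get("MONOPOLY_LOG_LEVEL", "INFO"))
    return logger
```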
Run the full test suite with coverage:

```
python run_tests.py
```

You can also run pytest directly for focused suites:

```
python -m pytest
python -m pytest tests/test_game.py
python -m pytest tests/test_agent.py -k action_space
```

The engine lives in `src/monopoly/game.py` and coordinates the full turn state machine.
Key engine responsibilities:
- create and own players, board, dice, and turn state
- resolve movement and landing effects
- handle auctions, jail decisions, property purchase decisions, and trades
- validate legal actions and expose them as serialized turn plans
- support save/load and full-state restoration
Supporting engine modules:
- `board.py`: builds the standard board and card decks
- `rules.py`: rent, mortgage, and building validation logic
- `spaces.py`: typed board-space models
- `cards.py`: Chance and Community Chest deck definitions and effects
- `player.py`: mutable player state
- `trading.py`: trade validation and execution
- `api.py`: serialized views used by the frontend, online runtime, and agents
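As a flavor of the kind of logic the rules module centralizes, here is a sketch of one standard Monopoly rent rule: unimproved rent doubles when the owner holds the entire color group. The function and parameter names are illustrative, not the engine's API:

```python
def unimproved_rent(base_rent: int, group_size: int, owned_in_group: int) -> int:
    """Standard Monopoly rule: base rent doubles when the owner holds
    every property in the color group and the property has no houses."""
    if owned_in_group == group_size:
        return base_rent * 2
    return base_rent
```

Centralizing rules like this in one module is what lets the GUI, online runtime, and agents all see identical rent outcomes.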
The GUI is process-based rather than embedding everything in one loop.
Main responsibilities by component:
- `main.py`: application entry point
- `src/monopoly/gui/launcher.py`: starts the rendezvous and frontend processes
- `src/monopoly/gui/pygame_frontend/app.py`: top-level GUI application, screens, prompts, and debug tools
- `src/monopoly/gui/pygame_frontend/controller.py`: client-side orchestration between GUI and backend
- `src/monopoly/gui/pygame_frontend/board.py`: board rendering
- `src/monopoly/gui/backend_process.py`: authoritative game runtime and online session management
The frontend does not implement game rules itself. It renders serialized state, offers user actions, and sends actions back to the backend.
```mermaid
flowchart LR
    User[Player] --> GUI[pygame Frontend]
    GUI --> Controller[Frontend Controller]
    Controller --> Backend[Backend Runtime]
    Backend --> Game[Authoritative Game Engine]
    Game --> API[Serialized Frontend State]
    API --> Controller
    Controller --> GUI
    Host[Host Lobby] --> Rendezvous[Rendezvous Service]
    Client[Joining Client] --> Rendezvous
    Client --> Transport[Socket Transport]
    Transport --> Backend
    Backend --> AIHost[AI Host or Scripted Controller]
    AIHost --> AgentController[Agent Policy Controller]
    AgentController --> Encoder[Observation Encoder]
    AgentController --> ActionSpace[Action Space]
    AgentController --> Policy[Policy Model]
    Policy --> AgentController
    AgentController --> Backend
    Trainer[Parallel Trainer] --> Environment[Self-Play Environment]
    Environment --> Game
    Environment --> Reward[Reward Function]
    Environment --> Examples[Training Examples]
    Examples --> Trainer
    Trainer --> Checkpoints[Checkpoint Files]
    Checkpoints --> Evaluator[Evaluation or Tournament]
```
Interpretation:
- The GUI only renders state and submits commands.
- The backend owns the authoritative game state and applies all legal actions.
- Online discovery happens through the rendezvous service, but actual gameplay traffic goes through the host backend.
- AI decisions use the same legal-action interface as human commands.
- Training reuses the same engine through the self-play environment, then writes checkpoints that can later be evaluated or entered into tournaments.
The RL code lives in `src/monopoly/agent/`.
Pipeline overview:
- The engine exposes legal actions and frontend state.
- `features.py` encodes state into a fixed-size observation vector.
- `action_space.py` expands legal engine actions into a discrete policy action space.
- `model.py` scores those actions and predicts values.
- `controller.py` turns model outputs into legal gameplay choices.
- `environment.py` runs self-play episodes and produces training examples.
- `reward.py` computes reward shaping between states.
- `trainer.py` coordinates rollout collection, PPO updates, checkpoints, and optional benchmark runs.
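The key idea in the action-space step is masking: illegal actions are excluded before the policy distribution is normalized, so the agent can only ever sample legal moves. A minimal, library-free sketch of a masked softmax (the project's actual implementation lives in `action_space.py` and `model.py` and is more involved):

```python
import math

def masked_softmax(logits, legal_mask):
    """Assign probability 0 to illegal actions, then normalize the rest.
    Assumes at least one action is legal."""
    masked = [l if ok else float("-inf") for l, ok in zip(logits, legal_mask)]
    peak = max(m for m in masked if m != float("-inf"))  # for numerical stability
    exps = [math.exp(m - peak) if m != float("-inf") else 0.0 for m in masked]
    total = sum(exps)
    return [e / total for e in exps]
```

Masking at the distribution level (rather than penalizing illegal moves in the reward) keeps the policy gradient focused on choosing among moves the engine will actually accept.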
Additional agent support modules:
- `board_analysis.py`: strategic board metrics shared across encoding and evaluation
- `heuristics.py`: optional heuristic priors for action scoring
- `scripted.py`: deterministic or semi-random scripted AI opponents
- `league.py`: self-play snapshot management and league source mixing
- `worker_pool.py`: persistent rollout workers for faster self-play collection
- `evaluation.py`: checkpoint evaluation, benchmarks, Elo summaries, and tournaments
- `checkpoints.py`: checkpoint path resolution and controller loading
- `config.py`: training, policy, heuristic, and reward configuration dataclasses
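Since `evaluation.py` reports Elo summaries, the standard Elo update is worth recalling. This sketch uses the textbook formula; the constants and exact bookkeeping the project uses may differ:

```python
def elo_update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0):
    """Standard Elo: expected score from the rating gap, then a K-scaled update.
    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss."""
    expected_a = 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))
    delta = k * (score_a - expected_a)
    return rating_a + delta, rating_b - delta
```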
Fresh training run:

```
python train_agent.py --iterations 50
```

Before running GPU training, make sure you installed a CUDA-enabled PyTorch build in the environment during setup.

For long-running jobs under nohup, tmux, CI, or redirected logs, use plain log-style output instead of the interactive progress bars:

```
python train_agent.py --iterations 50 --plain_output
```

Resume from a checkpoint:

```
python train_agent.py --resume .checkpoints/latest.pt --iterations 20
```

Useful options:

- `--threads`: rollout worker count
- `--episodes-per-thread`: episodes per worker per iteration
- `--max-steps`: environment step cap per episode
- `--max-actions`: raw action cap per episode
- `--players`: players per self-play game
- `--model-type {mlp,transformer}`
- `--device cpu` or `--device cuda`
- `--checkpoint-dir .checkpoints`
- `--checkpoint-interval 5`
- `--plain_output`: disable tqdm progress bars and emit log-style status lines
- `--heuristic-bias` or `--no-heuristic-bias`
- `--league-self-play` or `--no-league-self-play`
- `--benchmark-interval 10`
What training writes:
- periodic checkpoints such as `.checkpoints/iteration_0005.pt`
- a final `.checkpoints/latest.pt`
- console summaries for rollout, update, and benchmark phases
Plain-output mode writes line-oriented status updates that are easier to capture in files, for example under Ubuntu:
```
nohup python train_agent.py --iterations 200 --plain_output > train.log 2>&1 &
tail -f train.log
```

Evaluate one checkpoint:

```
python evaluate_agent.py .checkpoints/latest.pt --games 8 --players 2 --device cpu
```

Run a benchmark suite against other checkpoints:

```
python evaluate_agent.py .checkpoints/latest.pt --benchmark-opponents .checkpoints/iteration_0050.pt .checkpoints/iteration_0100.pt --games 8 --players 2
```

The evaluation script prints per-run summary metrics, including wins, draws, average steps, assets, rent trend, monopoly-denial events, board-strength trend, and auction-bid quality.

Run a fixed-seed tournament across multiple checkpoints:

```
python tournament_checkpoints.py .checkpoints/iteration_0050.pt .checkpoints/iteration_0100.pt .checkpoints/latest.pt --games 6 --players 2 --device cpu
```

This produces aggregate tournament summaries plus per-seed matchup results.
The project creates several local artifacts during normal use:
- `.checkpoints/`: training checkpoints
- `logs/`: frontend and backend logs
- `.coverage`: coverage data file
- `.pytest_cache/`: pytest cache
- `__pycache__/`: Python bytecode caches
These are local development artifacts and should not be committed.
Run the game GUI:

```
python main.py
```

Run all tests:

```
python run_tests.py
```

Train from scratch on CPU:

```
python train_agent.py --iterations 20 --device cpu --threads 2 --plain_output
```

Resume training on GPU:

```
python train_agent.py --resume .checkpoints/latest.pt --iterations 20 --device cuda --plain_output
```

Evaluate the latest checkpoint:

```
python evaluate_agent.py .checkpoints/latest.pt --games 8 --players 2
```

Tournament three checkpoints:

```
python tournament_checkpoints.py .checkpoints/iteration_0020.pt .checkpoints/iteration_0040.pt .checkpoints/latest.pt
```

Notes:

- The default GUI entry point is `main.py`.
- Training starts from scratch by default; use `--resume` only when you actually want checkpoint resume behavior.
- `--plain_output` is recommended for non-interactive terminals, redirected output, and nohup runs.
- For first-time training on a machine without CUDA, pass `--device cpu` explicitly to avoid any ambiguity.
- Logs and checkpoints are intentionally local artifacts and are ignored by `.gitignore`.