Browser-based Pac-Man + in-browser Q-learning training lab. No backend, no build-time dependencies beyond Node.
```bash
npm install
npm run dev
```

Open the local Vite dev URL (usually http://localhost:5173).
- Select Human in the Mode dropdown.
- Use arrow keys to move Pac-Man.
- Click Reset at any time to restart the current episode with the selected seed.
- 🟡 Pellets — eat all pellets to win the level and earn a win bonus.
- ⭐ Power pellets — larger, pulsing orange orbs in the maze corners. Eating one makes all ghosts edible (they turn blue) for a limited time.
- 👻 Ghosts — each ghost has a distinct color (red, pink, blue, orange, purple, green). Contact with a non-edible ghost kills Pac-Man and ends the game.
- 😋 Eating ghosts — while ghosts are edible, Pac-Man can eat them for bonus points. A combo multiplier rewards eating multiple ghosts per power pellet (1x, 2x, 3x, 4x).
- 📊 Scoring — points come from pellets (+5), power pellets (+20), eating ghosts (+30 x combo), and clearing all pellets (win bonus +200). All values are configurable.
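The scoring rules above can be sketched in a few lines. This is an illustration of the arithmetic, not the repo's actual code; the constants use the defaults listed in the reward table later in this README.

```typescript
// Default scoring values from this README's reward tables.
const PELLET = 5;
const POWER_PELLET = 20;
const GHOST_BASE = 30;
const WIN_BONUS = 200;

// Points for the nth ghost eaten on one power pellet: combo climbs 1x, 2x, 3x, 4x.
function ghostEatPoints(ghostsAlreadyEaten: number): number {
  const combo = Math.min(ghostsAlreadyEaten + 1, 4);
  return GHOST_BASE * combo;
}

// e.g. eating a power pellet and then three ghosts:
let score = POWER_PELLET;
for (let i = 0; i < 3; i++) score += ghostEatPoints(i);
// 20 + 30*1 + 30*2 + 30*3 = 200
```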
| Symbol | Meaning |
|---|---|
| Colored outlines | Walls (color varies per maze) |
| 🟡 Small yellow dots | Regular pellets |
| ⭐ Pulsing orange orbs | Power pellets |
| 👻 Colored ghost shapes with eyes | Ghosts (flash white near timer expiry) |
| 💛 Yellow wedge | Pac-Man |
Score, pellets remaining, and current step count are displayed below the canvas.
Three hand-designed mazes of increasing size:
| Maze | Size | Wall color |
|---|---|---|
| Classic | 19×15 | 🔵 Blue |
| Arena | 21×17 | 🟣 Purple |
| Corridors | 17×13 | 🟢 Green |
Five procedurally generated mazes are available out of the box (Procedural #100 through #104). Each is built with a recursive-backtracker algorithm with extra loop-opening passes to create Pac-Man-friendly layouts, plus a central ghost house. Wall colors are randomly assigned per seed.
The generator can produce unlimited unique mazes — see `generateMaze(seed)` in `src/mazes/mazes.ts`.
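The recipe described above can be sketched as follows. This is a minimal, self-contained illustration of a seeded recursive backtracker with a loop-opening pass, not the repo's `generateMaze` implementation (which also adds a ghost house and wall colors); the PRNG and the 0.15 loop-opening probability are assumptions.

```typescript
// Small seeded PRNG (mulberry32-style) so the same seed yields the same maze.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function generateMazeSketch(seed: number, width = 19, height = 15): string[] {
  const rand = mulberry32(seed);
  // 1 = wall, 0 = open; start fully walled.
  const g: number[][] = Array.from({ length: height }, () => Array(width).fill(1));
  const carve = (x: number, y: number) => { g[y][x] = 0; };

  // Recursive backtracker over odd-coordinate cells.
  const stack: [number, number][] = [[1, 1]];
  carve(1, 1);
  while (stack.length) {
    const [x, y] = stack[stack.length - 1];
    const dirs = [[2, 0], [-2, 0], [0, 2], [0, -2]].filter(([dx, dy]) => {
      const nx = x + dx, ny = y + dy;
      return nx > 0 && ny > 0 && nx < width - 1 && ny < height - 1 && g[ny][nx] === 1;
    });
    if (dirs.length === 0) { stack.pop(); continue; }
    const [dx, dy] = dirs[Math.floor(rand() * dirs.length)];
    carve(x + dx / 2, y + dy / 2); // knock out the wall between the two cells
    carve(x + dx, y + dy);
    stack.push([x + dx, y + dy]);
  }

  // Loop-opening pass: remove some interior walls so the maze has cycles
  // (a perfect maze has dead ends everywhere, which is hostile to Pac-Man).
  for (let y = 1; y < height - 1; y++)
    for (let x = 1; x < width - 1; x++)
      if (g[y][x] === 1 && rand() < 0.15) {
        const connects = (g[y][x - 1] === 0 && g[y][x + 1] === 0) ||
                         (g[y - 1][x] === 0 && g[y + 1][x] === 0);
        if (connects) g[y][x] = 0;
      }
  return g.map(row => row.join(""));
}
```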
- Select AI controlled in the Mode dropdown.
- The current Q-table policy runs at ~120 ms/step.
- If no policy has been trained or loaded yet, the agent acts randomly.
Switching to AI mode automatically stops any running training loop.
- ⚙️ Configure environment parameters in the right-hand panel (maze, ghost count, speeds, rewards, etc.).
- 🌱 Set seed — determines pellet layout and ghost/pac start positions for each episode.
- Click ▶️ Start training — launches a `requestAnimationFrame` training loop.
- Adjust steps/frame and turbo at any time; the loop picks up changes immediately.
- Adjust `renderEveryNSteps` to control how often the canvas refreshes during training (higher = faster throughput).
- The green ● TRAINING — episode N badge in the header shows training is active.
- Click ⏸️ Pause to stop the loop without resetting the Q-table or stats.
- Click ⏭️ Single step to advance exactly one environment step (useful for debugging).
- Click 📈 Evaluate to run 20 greedy-policy episodes and display avg score / length / win rate.
- 💾 Save policy — downloads the Q-table as JSON (`policy-<timestamp>.json`).
- 📂 Load policy — restores a previously saved JSON file.
- 🗑️ Reset Q clears the Q-table and stats.
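At the core of what the trainer updates is the standard tabular Q-learning rule. The sketch below is self-contained and illustrative; the repo's `QLearningAgent` wraps the same rule with epsilon-greedy action selection, and the `QTable` shape here is an assumption.

```typescript
// State key -> one Q-value per action (4 movement directions).
type QTable = Map<string, number[]>;

// Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
function qUpdate(
  q: QTable, state: string, action: number, reward: number,
  nextState: string, done: boolean, alpha = 0.1, gamma = 0.99,
): void {
  const get = (s: string) => q.get(s) ?? q.set(s, [0, 0, 0, 0]).get(s)!;
  const qs = get(state);
  const maxNext = done ? 0 : Math.max(...get(nextState)); // terminal states bootstrap nothing
  qs[action] += alpha * (reward + gamma * maxNext - qs[action]);
}
```

Saving a policy is then just serializing this table's entries to JSON, and loading is the reverse.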
- Start with 1 ghost, Classic maze, default rewards.
- Set steps/frame to 50–200 and enable turbo for ~10× throughput.
- Watch the Moving avg score chart; it should trend upward after a few hundred episodes.
- Decay epsilon toward 0 via `epsilonDecay` ≈ 0.999 (default), or lower for faster exploitation.
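The epsilon schedule in the last tip amounts to one multiply per episode, usually with a small floor so some exploration always remains. A minimal sketch (the 0.01 floor is an illustrative assumption, not a repo setting):

```typescript
// Multiply epsilon by epsilonDecay each episode, clamped to a minimum.
function decayEpsilon(epsilon: number, epsilonDecay = 0.999, epsilonMin = 0.01): number {
  return Math.max(epsilonMin, epsilon * epsilonDecay);
}

// With the 0.999 default, epsilon falls from 1.0 to roughly 0.1 after ~2300 episodes.
let eps = 1.0;
for (let i = 0; i < 2302; i++) eps = decayEpsilon(eps);
```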
| Parameter | Default | Description |
|---|---|---|
| `numGhosts` | 1 | 👻 Number of ghosts (1–6) |
| `numPacmen` | 1 | 💛 Number of Pac-Man clones (extra clones move randomly) |
| `ghostSpeed` | 0.95 | ⚡ Fractional tiles/step. 0.5 = moves every other step; 2 = 2 tiles/step |
| `pacmanSpeed` | 1.0 | ⚡ Same scale as `ghostSpeed` |
| `pelletDensity` | 1.0 | 🟡 Fraction of open cells that spawn a pellet |
| `enablePowerPellets` | true | ⭐ Spawn power pellets at maze-defined corner positions |
| `powerPelletDuration` | 20 | ⏱️ Steps ghosts remain edible after a power pellet |
| `captureRules` | tile | 🎯 tile = same cell; touch = Manhattan distance ≤ 1 |
| `maxEpisodeSteps` | 400 | ⏰ Hard episode timeout |
| `illegalMoveMode` | stay | 🚫 stay = ignore illegal key; noop = take random legal move |
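Collected as a typed config object with the defaults above, the table looks like this. The `EnvConfig` name and exact field types are illustrative assumptions; the keys and defaults come straight from the table.

```typescript
// Environment parameters with their README defaults.
interface EnvConfig {
  numGhosts: number;
  numPacmen: number;
  ghostSpeed: number;          // fractional tiles/step
  pacmanSpeed: number;
  pelletDensity: number;       // fraction of open cells with a pellet
  enablePowerPellets: boolean;
  powerPelletDuration: number; // steps ghosts stay edible
  captureRules: "tile" | "touch";
  maxEpisodeSteps: number;
  illegalMoveMode: "stay" | "noop";
}

const defaultEnvConfig: EnvConfig = {
  numGhosts: 1,
  numPacmen: 1,
  ghostSpeed: 0.95,
  pacmanSpeed: 1.0,
  pelletDensity: 1.0,
  enablePowerPellets: true,
  powerPelletDuration: 20,
  captureRules: "tile",
  maxEpisodeSteps: 400,
  illegalMoveMode: "stay",
};
```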
| Key | Default | Notes |
|---|---|---|
| `pelletReward` | 5 | 🟡 Per pellet eaten |
| `powerPelletReward` | 20 | ⭐ Per power pellet eaten |
| `deathPenalty` | -100 | 💀 Captured by a non-edible ghost |
| `stepPenalty` | -0.1 | ⏱️ Per-step cost to discourage idling |
| `survivalReward` | 0.02 | 💚 Per-step bonus while alive |
| `ghostEatReward` | 30 | 😋 Base reward for eating an edible ghost (multiplied by combo) |
| `winBonus` | 200 | 🏆 Bonus for clearing all pellets |
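How these keys might combine into one per-step reward is sketched below. This is an illustration with the defaults above, not the environment's actual code, and for simplicity the combo here is counted within a single step.

```typescript
// Reward keys with their README defaults.
const R = { pellet: 5, powerPellet: 20, death: -100, step: -0.1, survival: 0.02, ghostEat: 30, win: 200 };

interface StepEvents {
  atePellet: boolean;
  atePowerPellet: boolean;
  ghostsEaten: number; // edible ghosts eaten this step
  died: boolean;
  won: boolean;
}

function stepReward(e: StepEvents): number {
  let r = R.step;               // per-step cost always applies
  if (!e.died) r += R.survival; // small bonus while alive
  if (e.atePellet) r += R.pellet;
  if (e.atePowerPellet) r += R.powerPellet;
  for (let i = 0; i < e.ghostsEaten; i++)
    r += R.ghostEat * Math.min(i + 1, 4); // 1x..4x combo multiplier
  if (e.died) r += R.death;
  if (e.won) r += R.win;
  return r;
}
```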
```bash
npm run build
npm run preview   # local preview of the built dist
```

Output lands in `dist/`. Host on any static server (GitHub Pages, Netlify, Cloudflare Pages, etc.).

```bash
npm test
```

Three test suites: maze collision, observation determinism, Q-value update.
```
src/
  engine/   Core types (Direction, Vec2) and seeded PRNG
  env/      PacmanEnvironment (reset/step/observe) + observation encoding
  ghosts/   Ghost AI strategies: classic, heatmap, hybrid
  rl/       QLearningAgent + TrainingController
  render/   CanvasRenderer (walls, pellets, ghosts, Pac-Man, heatmap overlay)
  ui/       LineChart component
  mazes/    Static maze definitions + procedural maze generator
```
- Add a new literal to `GhostAIType` in `src/ghosts/ghostAi.ts`.
- Add a branch in `chooseGhostMove`.
- The UI dropdown will pick it up automatically.
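The pattern the steps above describe is a union type plus a switch. The sketch below is self-contained with simplified stand-in types (the repo's real `chooseGhostMove` signature may differ); `"shy"` is a hypothetical new strategy added for illustration.

```typescript
type Vec2 = { x: number; y: number };
type GhostAIType = "classic" | "heatmap" | "hybrid" | "shy"; // "shy" is the new literal

function chooseGhostMove(
  ai: GhostAIType, ghost: Vec2, pacman: Vec2, legalMoves: Vec2[],
): Vec2 {
  // Manhattan distance from a candidate move to Pac-Man.
  const dist = (m: Vec2) => Math.abs(m.x - pacman.x) + Math.abs(m.y - pacman.y);
  switch (ai) {
    case "classic": // chase: pick the move that closes the distance
      return legalMoves.reduce((a, b) => (dist(a) <= dist(b) ? a : b));
    case "shy":     // new branch: flee instead
      return legalMoves.reduce((a, b) => (dist(a) >= dist(b) ? a : b));
    default:        // heatmap / hybrid elided in this sketch
      return legalMoves[0];
  }
}
```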
- Define a string grid in `src/mazes/mazes.ts` (1 = wall, 0 = open).
- Call `parse(id, name, rows, wallColor)` and add it to `STATIC_MAZES`.
- Or use `generateMaze(seed, width, height, wallColor)` for procedural mazes.
- Select it via the Maze dropdown.
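What parsing a string grid entails can be sketched as below. The return shape is a simplified stand-in for what the repo's `parse()` helper produces (which also carries id, name, and wall color).

```typescript
// Turn rows of "1"/"0" characters into a boolean wall matrix.
function parseGrid(rows: string[]): { width: number; height: number; walls: boolean[][] } {
  const width = rows[0].length;
  if (rows.some(r => r.length !== width)) throw new Error("ragged maze rows");
  const walls = rows.map(r => [...r].map(c => c === "1"));
  return { width, height: rows.length, walls };
}
```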
- 🧠 Q-table observation is compact but lossy (5×5 wall mask + nearest pellet direction + clamped ghost offsets). A neural DQN would generalise better.
- 👥 Extra Pac-Man clones (numPacmen > 1) move randomly and do not collect pellets — a cooperative multi-agent extension is scaffold-ready.
- ⏱️ Ghost edibility timer does not reset between episodes if Pause is used mid-episode (it resets on the next `env.reset()` call).