vavalm/pacmanDqn

PacmanDQN

Deep Reinforcement Learning in Pac-man

1. Prerequisites

2. Pacman rules

Context

The player navigates Pac-Man through a maze with no dead ends. The maze is filled with Pac-Dots, and includes four roving multi-colored ghosts: Blinky, Pinky, Inky, and Clyde.

Objective

The objective of the game is to accumulate as many points as possible by eating dots and blue ghosts. When all of the dots in a stage are eaten, that stage is completed. The four ghosts roam the maze and chase Pac-Man. If any of the ghosts touches Pac-Man, the game is over.

Points

Each dot Pac-Man eats earns 10 points. Near the corners of the maze are four flashing energizers that allow Pac-Man to eat the ghosts and earn bonus points. The enemies turn deep blue, reverse direction and move away from Pac-Man, and usually move more slowly. When an enemy is eaten, its eyes return to the center ghost box where the ghost is regenerated in its normal color. The bonus score earned for eating a blue ghost doubles for each consecutive ghost eaten while a single energizer is active:

  • 200 points for eating one ghost,
  • 400 for a second,
  • 800 for a third,
  • 1600 for the fourth.

This cycle restarts from 200 points when Pac-Man eats the next energizer. Blue enemies flash white to signal that they are about to return to their normal color and become dangerous again; the length of time the enemies remain vulnerable varies from one stage to the next, generally becoming shorter as the game progresses. In later stages, the enemies begin flashing immediately after an energizer is consumed, without a solid-blue phase; starting at stage nineteen, the ghosts do not become edible at all, but still reverse direction.
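The doubling bonus described above works out to 200 · 2^(n−1) points for the n-th consecutive ghost eaten under one energizer. A minimal sketch (the function name is ours, not part of the repository):

```python
def ghost_bonus(n_eaten):
    """Bonus for the n-th consecutive ghost (1-based) eaten on one energizer.

    The score doubles each time: 200, 400, 800, 1600 points.
    """
    if not 1 <= n_eaten <= 4:
        raise ValueError("only four ghosts can be eaten per energizer")
    return 200 * 2 ** (n_eaten - 1)

# Eating all four ghosts on a single energizer:
total = sum(ghost_bonus(n) for n in range(1, 5))  # 200 + 400 + 800 + 1600 = 3000
```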

Demo


3. How to execute

$ xhost local:root
$ sudo docker-compose up

Note: to use the GPU for training, you need nvidia-docker (tutorial here).

4. Parameters

Edit the variables in the .env file to set the main parameters:

  • TRAINING_GAMES_NB: the number of games used to train the model
  • VAL_GAMES_NB: the number of games used for validation
  • LAYOUT: the map on which Pac-Man will be played

After the training, the games will be displayed on your screen.

Example: run a model on the smallGrid layout for 6000 training episodes. After training, 10 episodes will be displayed and used for validation. A previously trained model is loaded from the file model-mediumClassic-50000-50000_1248620_6448:

TRAINING_GAMES_NB=6000 
VAL_GAMES_NB=10
LAYOUT=smallGrid
SAVE_FILE=./saves/mediumClassic/model-mediumClassic-50000-50000_1248620_6448
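For illustration, the entrypoint could read these variables roughly as follows. Only the variable names come from the .env example above; the reading code and the defaults are assumptions, not the repository's actual code:

```python
import os

# Read the main parameters from the environment, as set by the .env file.
# The defaults here are illustrative, not the repository's actual defaults.
training_games = int(os.environ.get("TRAINING_GAMES_NB", "6000"))
val_games = int(os.environ.get("VAL_GAMES_NB", "10"))
layout = os.environ.get("LAYOUT", "smallGrid")
save_file = os.environ.get("SAVE_FILE", "")

# Validation episodes are played (and displayed) after the training episodes.
total_games = training_games + val_games
```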

4.1 Layout

Different layouts (= maps) can be found and created in the layouts directory. For more information, take a look at the README file in the layouts directory. Set the variable LAYOUT to the name of the layout you want.

4.2 TOTAL_GAMES_NB

Represents the total number of games that will be played. The number of training games equals TOTAL_GAMES_NB - VALIDATION_GAMES_NB.

4.3 VALIDATION_GAMES_NB

Represents the number of games played for validation. These games will be displayed on your screen.

4.4 OTHER Parameters

Parameters can be found in the params dictionary in pacmanDQN_Agents.py.

Models are saved as "checkpoint" files in the /saves directory.
Load and save filenames can be set using the load_file and save_file parameters.

  • train_start: number of episodes before training starts
  • batch_size: size of the replay memory batch
  • mem_size: number of experience tuples in the replay memory
  • discount: discount rate (gamma)
  • lr: learning rate

Exploration/Exploitation (ε-greedy):

  • eps: initial epsilon value
  • eps_final: final epsilon value
  • eps_step: number of steps over which epsilon is linearly annealed from eps to eps_final
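The linear ε schedule above can be sketched as follows. The parameter names match the list above, but the function itself and the default values are illustrative assumptions, not the repository's code:

```python
def epsilon_at(step, eps=1.0, eps_final=0.1, eps_step=10000):
    """Linearly anneal epsilon from `eps` down to `eps_final` over `eps_step` steps."""
    if step >= eps_step:
        return eps_final
    return eps + (eps_final - eps) * step / eps_step

# At step 0 the agent explores with probability `eps`; halfway through the
# schedule, epsilon sits midway between `eps` and `eps_final`.
```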

Citation

@article{van2016deep,
  title={Deep Reinforcement Learning in Pac-man},
  subtitle={Bachelor Thesis},
  author={van der Ouderaa, Tycho},
  year={2016},
  school={University of Amsterdam},
  type={Bachelor Thesis},
  pdf={https://esc.fnwi.uva.nl/thesis/centraal/files/f323981448.pdf},
}

Acknowledgements

Article explaining deep reinforcement learning for Pac-Man, by Towards Data Science

Wikipedia page on Pac-Man

DQN framework by (made for ATARI / Arcade Learning Environment)

Pac-Man implementation by UC Berkeley

Pac-Man implementation by tychovodo
