
Pac-man Bot

👾 Reinforcement Learning (Fall 2021)

This repository contains the implementation of a simple reinforcement learning task, so it is not exactly the same as the original Pac-Man game: here the episode ends when the agent collects all the stars (unlike the original Pac-Man game, where a level ends when all the coins are removed).


Algorithms

The algorithms in this repository focus on table-based classical reinforcement learning algorithms that do not use deep neural networks. Currently, the value-approximation version does not converge well, so it needs to be modified. A minimal sketch of the kind of table update these methods share appears after the list.

  1. Monte-Carlo Method
  2. SARSA
  3. Q-learning
  4. Double Q-learning
  5. Value Approximation
  6. REINFORCE
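
To make the table-based idea concrete, here is a minimal sketch of the Q-learning update (algorithm 3 above). The variable names and hyperparameter values are illustrative and are not taken from this repository's code.

import numpy as np

n_states, n_actions = 20, 4          # e.g. the SmallGridEnv sizes below
Q = np.zeros((n_states, n_actions))  # the action-value table
alpha, gamma, eps = 0.1, 0.99, 0.1   # step size, discount, exploration rate

def choose_action(s):
    # epsilon-greedy over the current table
    if np.random.rand() < eps:
        return np.random.randint(n_actions)
    return int(np.argmax(Q[s]))

def q_update(s, a, r, s_next, done):
    # Q-learning bootstraps from the greedy action in the next state;
    # SARSA would instead use the action actually taken in s_next
    target = r if done else r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (target - Q[s, a])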

How To Run

You can execute the code via the scripts in the scripts directory. Each script file is named algorithm-environment.sh.

cd scripts/
sh ${runfile_name} 	# e.g. sh mc-small.sh

The results of a run can be found in the results directory or in the terminal window.

[Figure: example run output]

Requirements

You can check the virtual environment and required modules in environment.yaml and requirements.txt.

Environments

SmallGridEnv

In this environment, the ghost does not move, and since there is only one star, the episode ends when the agent reaches it. The figure below is a visualization of the environment through visualize_matrix(); visualization is also possible through the env.render() function. A minimal interaction sketch appears after the list below.

  • Observation space: 5 x 5 grid world - wall positions
    • observation_space.n = (25 - 5) = 20
  • Action space: { up: 0, down: 1, left: 2, right: 3 }
  • Reward: { ghost: -100, wall: -5, others: -1 }
    • Encountering a wall does not end the episode; the agent simply stays in place.
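
Assuming the environments expose the usual gym-style reset()/step() interface (the import path and the exact step() return signature below are assumptions, not confirmed by this README), a training episode would look roughly like this, reusing choose_action() and q_update() from the sketch above:

# Hypothetical usage sketch; import path and step() signature are assumptions.
from env import SmallGridEnv

env = SmallGridEnv()
state = env.reset()
done = False
while not done:
    action = choose_action(state)
    next_state, reward, done = env.step(action)   # return signature assumed
    q_update(state, action, reward, next_state, done)
    state = next_state
env.render()   # mentioned above as an alternative to visualize_matrix()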

[Figure: SmallGridEnv visualized with visualize_matrix()]

BigGridEnv

In this environment, the ghost moves randomly left and right, and since there are multiple stars, the episode ends when all of them have been collected. The visualization method is the same as for SmallGridEnv. A sketch of one possible state encoding appears after the list below.

  • Observation space: (11 x 11 grid world - wall positions) x (star state) x (ghost position state)
    • observation_space.n = (121 - 40) * (2^4) * (3 * 7) = 27216
  • Action space: { up: 0, down: 1, left: 2, right: 3 }
  • Reward: { star: 100, ghost: -100, wall: -5, others: -1 }
    • Encountering a wall does not end the episode; the agent simply stays in place.
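
One way to read the 27216 above: each observation is a triple of agent cell, star bitmask, and ghost configuration, which can be flattened into a single index in the usual mixed-radix way. The sketch below only illustrates that arithmetic; the encoding actually used in this repository may differ.

# Mixed-radix flattening of (agent cell, star bitmask, ghost config).
# Sizes from this README: 121 - 40 = 81 reachable cells,
# 2**4 = 16 star states, 3 * 7 = 21 ghost position combinations.
N_CELLS, N_STAR_STATES, N_GHOST_STATES = 81, 2**4, 21

def encode_state(cell, stars, ghosts):
    # 0 <= cell < 81, 0 <= stars < 16, 0 <= ghosts < 21
    return (cell * N_STAR_STATES + stars) * N_GHOST_STATES + ghosts

# The largest index plus one equals the advertised observation_space.n:
assert encode_state(80, 15, 20) + 1 == 81 * 16 * 21 == 27216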

[Figure: BigGridEnv visualization]

UnistEnv

There is no big difference from BigGridEnv; only the wall and ghost positions have changed, and the walls are laid out to spell UNIST 😙

[Figure: UnistEnv visualization]
