burchellmax554-afk/PacManAI

This project implements core Markov Decision Process (MDP) and reinforcement learning techniques in the UC Berkeley Pac-Man environment. With value iteration, the agent computes an optimal policy offline: it repeatedly updates each state's utility and picks, for each state, the action with the highest expected long-term reward. The implementation performs full updates of state values and Q-values from transition probabilities, rewards, and the discount factor, so the agent solves the Gridworld MDP and acts optimally after enough iterations for the values to converge.
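
To make the value-iteration update concrete, here is a minimal sketch on a toy MDP. It is not the repository's code: the dictionary-based MDP format, the state and action names, and the functions `q_value`, `value_iteration`, and `greedy_policy` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical MDP format (an assumption, not the repository's API):
# mdp[state][action] is a list of (probability, next_state, reward) triples.
# A state with no actions is treated as terminal.
MDP = Dict[str, Dict[str, List[Tuple[float, str, float]]]]

TOY_MDP: MDP = {
    "far": {"go": [(1.0, "start", 0.0)]},
    "start": {
        "safe": [(1.0, "goal", 1.0)],
        "risky": [(0.5, "goal", 10.0), (0.5, "pit", -10.0)],
    },
    "goal": {},
    "pit": {},
}


def q_value(mdp: MDP, values: Dict[str, float], state: str, action: str,
            discount: float) -> float:
    """Expected reward plus discounted value of the successor state."""
    return sum(p * (r + discount * values[s2]) for p, s2, r in mdp[state][action])


def value_iteration(mdp: MDP, discount: float = 0.9,
                    iterations: int = 100) -> Dict[str, float]:
    """Batch value iteration: each sweep reads only the previous sweep's values."""
    values = {s: 0.0 for s in mdp}
    for _ in range(iterations):
        new_values = {}
        for s, actions in mdp.items():
            if not actions:  # terminal state keeps value 0
                new_values[s] = 0.0
            else:
                new_values[s] = max(q_value(mdp, values, s, a, discount)
                                    for a in actions)
        values = new_values
    return values


def greedy_policy(mdp: MDP, values: Dict[str, float],
                  discount: float = 0.9) -> Dict[str, str]:
    """Pick the action with the highest Q-value in every non-terminal state."""
    return {
        s: max(actions, key=lambda a: q_value(mdp, values, s, a, discount))
        for s, actions in mdp.items() if actions
    }


values = value_iteration(TOY_MDP)
policy = greedy_policy(TOY_MDP, values)
```

In this toy problem the "risky" action has an expected return of 0 while "safe" returns 1, so the greedy policy chooses "safe" in `start`, and `far` is worth the discounted value of `start`.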

Beyond planning, the project covers model-free reinforcement learning through Q-learning. The Q-learning agent updates its action-value estimates directly from experience, governed by exploration, a learning rate, and a discount on future rewards. The same implementation then carries over to Pac-Man, where the agent learns effective strategies over thousands of training episodes. Together, the two components contrast MDP planning, which works from known transition probabilities and rewards, with learning from experience, which does not need them. Both are foundational techniques for optimal decision-making and autonomous game playing.
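
The Q-learning update can be sketched as a small tabular agent. This is an illustrative sketch, not the repository's implementation: the `QLearner` class, its method names, and the default parameter values are hypothetical, and epsilon-greedy is assumed as the exploration strategy.

```python
import random
from collections import defaultdict
from typing import Hashable, Sequence


class QLearner:
    """Hypothetical tabular Q-learning agent (names and defaults are assumptions)."""

    def __init__(self, actions: Sequence[str], alpha: float = 0.5,
                 gamma: float = 0.9, epsilon: float = 0.1, seed: int = 0):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount on future rewards
        self.epsilon = epsilon  # probability of taking a random action
        self.q = defaultdict(float)  # unseen (state, action) pairs start at 0
        self.rng = random.Random(seed)

    def choose_action(self, state: Hashable) -> str:
        """Epsilon-greedy: explore with probability epsilon, else act greedily."""
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state: Hashable, action: str, reward: float,
               next_state: Hashable, done: bool = False) -> None:
        """Move Q(s, a) toward reward + gamma * max_a' Q(s', a')."""
        future = 0.0 if done else max(self.q[(next_state, a)]
                                      for a in self.actions)
        target = reward + self.gamma * future
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])


agent = QLearner(["left", "right"], alpha=0.5, gamma=0.9, epsilon=0.0)
agent.update("s", "right", 1.0, "t", done=True)  # Q(s, right) becomes 0.5
agent.update("s", "right", 1.0, "t", done=True)  # Q(s, right) becomes 0.75
agent.update("r", "right", 0.0, "s")             # bootstraps from Q(s, right)
```

No transition model appears anywhere in the agent: each update uses only one observed `(state, action, reward, next_state)` sample, which is what makes the method model-free.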

About

A reinforcement-learning project implementing value iteration and Q-learning agents for both Gridworld and Pac-Man environments.
