An agent learns to play a simple game using Q-Learning implemented in NumPy
- Pillow (PIL) >= 6.2.0
- opencv-python >= 4.1.1
- numpy >= 1.7.3
- matplotlib >= 3.1.1
- tqdm
python q_learning.py
Note: to train the agent from scratch (initialize the q-table randomly), set q_table to None in q_learning.py. Otherwise, set q_table to the path of a previously saved q-table.
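The choice between the two modes can be sketched as below. This is only an illustration, not the code in q_learning.py: the helper name `load_or_init_q_table`, the table shape `(n_states, n_actions)`, the value range of the random initialization, and the use of pickle as the on-disk format are all assumptions.

```python
import pickle

import numpy as np


def load_or_init_q_table(q_table_path=None, n_states=100, n_actions=4, seed=0):
    """Return a q-table: random if no path is given, otherwise loaded from disk.

    Hypothetical helper; the shape, value range and pickle format are assumptions.
    """
    if q_table_path is None:
        # Train from scratch: every (state, action) entry starts at a random value.
        rng = np.random.RandomState(seed)
        return rng.uniform(low=-5.0, high=0.0, size=(n_states, n_actions))
    # Resume: read a table that an earlier run saved with pickle.dump.
    with open(q_table_path, "rb") as f:
        return pickle.load(f)
```

Calling `load_or_init_q_table(None)` gives a fresh random table, while passing a file path returns whatever table was stored there.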
The following plot shows the moving average of the rewards. Its upward trend shows that the agent becomes smarter with more episodes of training.
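A moving average like the one plotted can be computed in a few lines of NumPy. This is a minimal sketch: the function name `moving_average` and the window size of 100 episodes are assumptions, not values taken from q_learning.py.

```python
import numpy as np


def moving_average(rewards, window=100):
    """Average each run of `window` consecutive episode rewards.

    Hypothetical helper; the default window size is an assumption.
    """
    rewards = np.asarray(rewards, dtype=float)
    # A uniform kernel turns the convolution into a sliding mean.
    kernel = np.ones(window) / window
    # "valid" keeps only positions where the window fits entirely.
    return np.convolve(rewards, kernel, mode="valid")
```

The returned array has `len(rewards) - window + 1` entries and can be passed directly to matplotlib's `plt.plot` to draw the curve.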
And here are some GIFs that show how the agent gets smarter with every episode of training.
Here is the thirsty agent looking for the bottle of beer with a randomly initialized q-table. It means that the agent has no clue about the environment yet.
After some training, the agent does a better job of making sequential decisions. He is not very fast yet, but he eventually finds the beer.
Finally, after thousands of episodes of training, the agent gets really good at making sequential decisions and finds the beer in no time! :D
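For reference, the improvement comes from repeatedly applying the standard tabular Q-learning update. The sketch below shows that textbook rule, not necessarily the exact code in q_learning.py: the function name, the integer state and action indices, and the default `alpha` and `gamma` values are assumptions.

```python
import numpy as np


def q_learning_update(q_table, state, action, reward, next_state,
                      alpha=0.1, gamma=0.95):
    """Apply one tabular Q-learning update in place and return the new value.

    Hypothetical helper; alpha (learning rate) and gamma (discount) are
    illustrative defaults.
    """
    # Best value the agent could get from the next state.
    best_next = np.max(q_table[next_state])
    # Target: immediate reward plus discounted future value.
    td_target = reward + gamma * best_next
    # Move the current estimate a fraction alpha toward the target.
    q_table[state, action] += alpha * (td_target - q_table[state, action])
    return q_table[state, action]
```

Each call nudges one entry of the table, so over thousands of episodes the values along good paths to the beer grow relative to the alternatives.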