TicTacToe with Q-learning, in which the agent is given only the basic rules of the game. It learns to play solely by playing thousands of matches against itself, without any human interaction. The program saves what it has learned in files in the 'policy' directory and uses them later to play against a human. It plays quite well and is not easy to beat: most matches end in a tie or a win for the computer. I trained it by letting it play 500 000 matches against itself.
During training, while playing against itself, the agent stores every board state it encounters in a dictionary and assigns it a value that tells how good that state is, updating the value as it plays. The picture below shows the piece of code that does this.
self.lr
- Learning Rate: how quickly the agent learns new information and how long it remembers old information. I keep this value at 0.1.
self.decay_gamma
- Decay Rate: this variable decides whether early or late moves are more important for the result. It usually has a value of 0.8 or 0.9.
self.exp_rate
- Exploratory Rate: the probability that the agent will choose a completely random move. A bit of spontaneity is never a bad thing. However, too high a value brings too much randomness, which leads to worse results. I tried many values, and the best are between 0.1 and 0.3.
self.states_value
The dictionary contains all encountered states and the values assigned to those states.
Later, when the agent decides which move to make, it checks all possible moves and chooses the one with the highest value according to the dictionary.
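The value update described above could look roughly like the sketch below. It is not the code from game.py: the function name `feed_reward`, the `visited_states` list, and the reward of 1.0 are assumptions made for illustration, while `lr`, `decay_gamma` and `states_value` mirror the attributes listed above.

```python
def feed_reward(states_value, visited_states, reward, lr=0.1, decay_gamma=0.9):
    """Propagate the final reward backwards through the states of one game.

    states_value   -- dict mapping a board state (any hashable key) to its value
    visited_states -- states the agent went through, in the order they occurred
    reward         -- number given at the end of the game (assumed, e.g. 1.0 for a win)
    """
    for state in reversed(visited_states):
        old_value = states_value.get(state, 0.0)  # unseen states start at 0
        # Move the stored value a fraction `lr` towards the discounted target.
        new_value = old_value + lr * (decay_gamma * reward - old_value)
        states_value[state] = new_value
        # The updated value becomes the target for the preceding state.
        reward = new_value
    return states_value


# Hypothetical two-move game that ended in a win.
values = feed_reward({}, ["state_a", "state_b"], reward=1.0)
```

Because the loop runs from the last state to the first, states close to the end of the game receive a larger share of the reward, and `decay_gamma` controls how quickly that share shrinks for earlier moves.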
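A minimal sketch of this move selection, combined with the exploratory rate, might look as follows. The board encoding (a flat list of nine integers, 0 for an empty cell), the tuple used as dictionary key, and the name `choose_action` are assumptions for illustration and may differ from the actual code.

```python
import random


def choose_action(positions, board, symbol, states_value, exp_rate=0.3):
    """Pick a move: random with probability exp_rate, otherwise the best known one.

    positions    -- list of free cell indices (0..8)
    board        -- flat list of 9 ints, 0 = empty (assumed encoding)
    symbol       -- the agent's mark, e.g. 1 or -1 (assumed encoding)
    states_value -- dict mapping a board tuple to its learned value
    """
    if random.random() < exp_rate:
        # Exploration: ignore the dictionary and try a random move.
        return random.choice(positions)

    best_position = None
    best_value = float("-inf")
    for position in positions:
        next_board = list(board)
        next_board[position] = symbol
        # States that were never seen during training count as 0.
        value = states_value.get(tuple(next_board), 0.0)
        if value > best_value:
            best_value = value
            best_position = position
    return best_position
```

When playing against a human, calling it with `exp_rate=0` makes the agent always pick the move with the highest stored value.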
As a reward it takes:
For the user interface I used PyQt, because I had already used this library a few times. I designed the buttons and other elements in Figma and exported them as PNG files.
To separate the logic from the design-related code, I used the MVP (Model-View-Presenter) pattern.
view.py
- Contains the code that displays the user interface.
presenter.py
- Contains the code that handles all business logic and connects it with the visual elements from view.py.
game.py
- Contains the code strictly related to the TicTacToe game and machine learning.
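The division of responsibilities between the three files can be illustrated with a small, GUI-free sketch. All class and method names here are hypothetical, and the view simply records what it is asked to draw instead of using PyQt widgets.

```python
class Game:
    """Model: plays the role of game.py (board rules only, no UI code)."""

    def __init__(self):
        self.board = [" "] * 9

    def play(self, index, symbol):
        # Reject moves on occupied cells.
        if self.board[index] != " ":
            return False
        self.board[index] = symbol
        return True


class RecordingView:
    """View: stands in for view.py; a real one would update PyQt widgets."""

    def __init__(self):
        self.rendered = []

    def show_board(self, board):
        self.rendered.append("".join(board))


class Presenter:
    """Presenter: plays the role of presenter.py, linking the view and the model."""

    def __init__(self, game, view):
        self.game = game
        self.view = view

    def on_cell_clicked(self, index):
        # The view forwards user input here; the presenter decides what happens.
        if self.game.play(index, "X"):
            self.view.show_board(self.game.board)


view = RecordingView()
presenter = Presenter(Game(), view)
presenter.on_cell_clicked(4)  # the player marks the centre cell
```

Because the model never imports the view, the game logic and the learning code can be run and tested without starting the interface.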
Here is a link to my -> Google Drive <- with the exe file of this program.