Deep Q-Learning for 2048

Description

This is my second attempt at implementing Q-Learning on 2048, following a previous failed project (https://github.com/Bloodaxe90/2048-Q-Learning) where I used a tabular approach (more details in that repository). Thankfully, this attempt was much more successful!

This project's UI was built using PySide6 with Qt Designer, and TensorBoard is used to log and review training and inference results.

Usage:

  1. Activate a virtual environment.
  2. Run `pip install -r requirements.txt` to install the dependencies.
  3. Either:
    • Run `main.py` to train a model, or
    • Run `application.py` to watch a trained model play 2048 visually, or to play 2048 yourself.
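At its core, training in `main.py` performs Deep Q-Learning: the network is nudged toward the Bellman target r + GAMMA * max Q_target(s', a'). The following is a minimal sketch of that target computation (a hypothetical helper for illustration, not the repository's exact code):

```python
def dql_target(reward: float, next_q_values: list[float],
               gamma: float, done: bool) -> float:
    """Bellman target for Deep Q-Learning.

    For a terminal state the target is just the reward; otherwise it is
    reward + gamma * (best Q-value the target network predicts for the
    next state).
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


# Example: reward 1.0, best next-state Q-value 2.0, discount 0.5
target = dql_target(1.0, [0.5, 2.0, -1.0, 0.0], gamma=0.5, done=False)
print(target)  # -> 2.0
```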

Hyperparameters:

Hyperparameters found in main.py:

  • EPISODES (int): The number of episodes to train across.
  • HIDDEN_NEURONS (tuple[int]): Defines the number of hidden neurons in each hidden layer. The number of hidden layers is len(HIDDEN_NEURONS) - 1. For example, (128, 64, 32) results in two hidden layers: the first with 128 input and 64 output neurons, and the second with 64 input and 32 output neurons.
  • REPLAY_CAPACITY (int): The capacity of the replay buffer.
  • BATCH_SIZE (int): The number of experiences used in each training step.
  • ALPHA (float): The learning rate.
  • GAMMA (float): The discount factor.
  • TRIAL_NAME (str): The name of the current experiment, used as part of the filename for the TensorBoard logs.
  • MAIN_UPDATE_COUNT (int): The number of training steps performed on the main network when an update condition is met.
  • MAIN_UPDATE_FREQ (int): The frequency (in episodes) at which the main network is updated.
  • TARGET_UPDATE_FREQ (int): The frequency (in episodes) at which the target network is updated from the main network.
  • MODEL_SAVE_NAME (str): The name to save the trained model under. Leave as an empty string if the model should not be saved.
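To illustrate the `HIDDEN_NEURONS` format described above, here is a small hypothetical helper (not part of the repository) that derives each hidden layer's (input, output) sizes from the tuple:

```python
def hidden_layer_shapes(hidden_neurons: tuple[int, ...]) -> list[tuple[int, int]]:
    """Pair consecutive tuple entries into (in, out) layer shapes.

    A tuple of n entries defines n - 1 hidden layers.
    """
    return list(zip(hidden_neurons, hidden_neurons[1:]))


print(hidden_layer_shapes((128, 64, 32)))  # -> [(128, 64), (64, 32)]
```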

Hyperparameters found in application.py:

  • MODEL_LOAD_NAME (str): The name of the model to load and use for playing 2048.
  • MODEL_LOAD_HIDDEN_NEURONS (tuple[int]): The hidden layer structure of the model being loaded. Follows the same format as HIDDEN_NEURONS described above.
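When a loaded model plays, the agent typically acts greedily over the network's four Q-values, one per move direction. A minimal sketch of greedy action selection, assuming one Q-value per direction (the actual action ordering in the repository may differ):

```python
def greedy_action(q_values: list[float]) -> int:
    """Return the index of the highest Q-value (ties go to the first maximum)."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# e.g. Q-values ordered as [up, down, left, right] (ordering is an assumption)
move = greedy_action([0.1, 0.7, 0.3, 0.2])  # -> 1
```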

Controls:

  • Default Radio Button: (Disabled while the agent is autoplaying)
    Allows you to play 2048 manually.
    • Arrow Keys: Move the number tiles in the corresponding direction.
  • Q-AI Radio Button:
    Enables the agent to play automatically.
    • S Key: Starts or stops the agent autoplaying 2048.
  • Space Bar: Resets the game (Disabled while the agent is autoplaying).

Results

Baseline

[Image: baseline results]

These baseline results are when the agent played using a random policy. This image can originally be found in the Experiment Notebook.

Final Results

[Image: final results]

After a lot of testing, I trained my model for 30,000 episodes, which took about 4 days. These results show a dramatic improvement over the baseline, with the agent occasionally even reaching the 2048 tile. I am confident that with more training an agent would be able to reach 2048 consistently. The original image of the results can be found in the Inference Notebook.

Screenshot of the final UI:

[Image: screenshot of the final UI]
