Introduction

The following is a university project of a DQN agent playing the attacker in the game battleship. It plays on a 10x10 grid using a CNN to approximate the value function. A full description and analysis can be seen in the report. The below diagram shows the Q-Value estimation network used.

Installation

Install Python 3.9
pip install -r requirements.txt - installs basic python dependencies
pip install git+https://github.com/NiklasZ/gym-battleship - installs battleship environment

Implementation

This implementation makes heavy usage of tensorflow and tf-agents. The main files of interest are:

agents/train_agent.py - trains a DQN agent.
agents/tune_agent.py - randomly tunes many DQN agents.
training/tf_dqn_agent.py - contains tuning specifications.
models/cnn.py - generates a CNN architecture according to configuration.
training/best_agent_tester.ipynb - training of specific CNN configurations.

Data & Models

All training-related data is in training/data/1_tuning. For example, in training/data/1_tuning/2022-03-14T14-30-38-10x10-promising-cnns we have:

logs where each folder inside contains training metrics. Viewable via: tensorboard --logdir training/data/1_tuning/2022-03-14T14-30-38-10x10-promising-cnns
top_parameters.json - not an actual JSON but a dump of hyper-parameter dictionaries that can be used to re-run a particular model configuration.
policies contains agent policies of the best models. They can be loaded and tested via:

saved_policy = tf.compat.v2.saved_model.load(f"training/data/1_tuning/2022-03-14T14-30-38-10x10-promising-cnns/1")
policy_state = saved_policy.get_initial_state(batch_size=1)

time_step = eval_env.reset()
counter = 0
while not time_step.is_last():
    action_step = saved_policy.action(time_step)
    time_step = eval_env.step(action_step.action)
    eval_py_env.render()
    counter +=1

eval_py_env.render_board_generated()
print(f'Completed in {counter} steps')

Name		Name	Last commit message	Last commit date
Latest commit History 12 Commits
agents		agents
example_projects		example_projects
helpers		helpers
models		models
training		training
.gitignore		.gitignore
README.md		README.md
Using_DQN_Agents_to_play_Battleship.pdf		Using_DQN_Agents_to_play_Battleship.pdf
dqn_mask_average_steps.png		dqn_mask_average_steps.png
hidden_board.png		hidden_board.png
my_net.png		my_net.png
requirements.txt		requirements.txt
visible_board.png		visible_board.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

agents

agents

example_projects

example_projects

helpers

helpers

models

models

training

training

.gitignore

.gitignore

README.md

README.md

Using_DQN_Agents_to_play_Battleship.pdf

Using_DQN_Agents_to_play_Battleship.pdf

dqn_mask_average_steps.png

dqn_mask_average_steps.png

hidden_board.png

hidden_board.png

my_net.png

my_net.png

requirements.txt

requirements.txt

visible_board.png

visible_board.png

Repository files navigation

Introduction

Installation

Implementation

Data & Models

About

Releases

Packages

Languages

NiklasZ/deep-battleships

Folders and files

Latest commit

History

Repository files navigation

Introduction

Installation

Implementation

Data & Models

About

Resources

Stars

Watchers

Forks

Languages