<a href="https://colab.research.google.com/github/KaranVyas7/KVBattleShip-Bot/blob/main/KVBattleship_Bot.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Battleship Bot

This notebook is dedicated to building a Battleship bot! The goal of the project is to build a game where a bot plays against human players at Battleship.

__Project Details:__
1. We will leverage Deep Q-Learning.
2. We will first focus on building a bot that can attack well. Our bot will be trained to guess where your ships are, but it won't be specialized in defense, i.e. where to place its own ships.
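
As a rough illustration of what Deep Q-Learning optimizes, the network is regressed toward a target of the form reward + gamma * max Q(s', a'), or just the reward when the game has ended. The sketch below is a minimal example under stated assumptions: the helper name `q_target` and the sample numbers are hypothetical, and `gamma=0.99` simply mirrors the `GAMMA` value used later in this notebook.

```python
import numpy as np

def q_target(reward, next_q_values, done, gamma=0.99):
    """Return the Q-learning regression target for one transition.

    reward        - reward received for the action just taken
    next_q_values - predicted Q-values for every action in the next state
    done          - True if the game ended after this action
    gamma         - discount factor (assumed 0.99, as in this notebook)
    """
    if done:
        # No future reward is possible once the game is over
        return float(reward)
    return float(reward + gamma * np.max(next_q_values))

# Hypothetical Q-values for three possible next actions
next_q = np.array([10.0, 40.0, -5.0])
print(q_target(150, next_q, done=False))  # 150 + 0.99 * 40
print(q_target(300, next_q, done=True))   # terminal move: the reward only
```

Only the entry of the Q-vector that belongs to the chosen action is moved toward this target; the other entries keep their predicted values.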

In [1]:
# Importing libraries
import numpy as np
import tensorflow as tf
from tensorflow import keras
import random
import math

# Setting up the board parameters
BOARD_HEIGHT = 5
BOARD_WIDTH = 5
ship_dict = {'A':1,'B':1}

## Creating some classes

Before we can play our game, we need to create some classes that will help us play our game!

In [2]:
# Creating the Game Class
class Game():
  # Constructor to set up the Game
  def __init__(self,board_height,board_width,ship_dict,model=None):
    """

    board_height - height of the board
    board_width - width of the board (same as the height)
    ship_dict - a dictionary that maps the ship name to the ship size
    model - the model to use

    """
    # Setting up variables
    self.board_height = board_height
    self.board_width = board_width
    self.ship_dict = ship_dict

    # Creating the boards - your board and the model's board
    self.player_board = [[0] * self.board_height for row in range(self.board_width)]
    self.AI_board = [[0] * self.board_height for row in range(self.board_width)]
    self.player_hit_board = [[0] * self.board_height for row in range(self.board_width)]
    self.AI_hit_board = [[0] * self.board_height for row in range(self.board_width)]

    # Checking how many ships are sunk
    self.player_ship_sunk = 0
    self.AI_ship_sunk = 0

    # Defining the model
    self.model = model

  # Creating a function to place your ships
  def place_player_ships(self):
    # Printing the board
    self.print_board(self.player_board)
    # Creating a dictionary to check if the ship was placed
    ships_placed = {}
    for ship in self.ship_dict.keys():
      ships_placed[ship] = 'N'


    # Iterating through all the ships to place them
    for ship in ships_placed.keys():
      print(f'Where would you like {ship} (length {self.ship_dict[ship]}) to start from?')
      row = int(input('Enter row: '))
      col = int(input('Enter column: '))

      # Checking to make sure a ship isn't already placed there
      while self.player_board[row][col] == 'S':
        print('A ship is already in that spot! Pick another one!')
        row = int(input('Enter row: '))
        col = int(input('Enter column: '))

      self.player_board[row][col] = 'S'
      ships_placed[ship] = 'Y'

      # Printing the board
      self.print_board(self.player_board)

  # A function for your turn
  def player_turn(self):
    # Printing the hit board
    self.print_board(self.player_hit_board)
    # Defining the row and column you want to hit
    row = int(input('Pick a row to hit: '))
    col = int(input('Pick a column to hit: '))

    # Making sure they are valid selection
    while row < 0 or row > self.board_width - 1:
      print('Row selection is not valid!!! Choose a new row!')
      row = int(input('Pick a row to hit: '))

    # Making sure they are valid selection
    while col < 0 or col > self.board_height - 1:
      print('Column selection is not valid!!! Choose a new column!')
      col = int(input('Pick a column to hit: '))

    while self.player_hit_board[row][col] == 1 or self.player_hit_board[row][col] == 2:
      print('Selected spot has already been targeted! Choose a new one!')
      row = int(input('Pick a row to hit: '))
      col = int(input('Pick a column to hit: '))

      # Making sure they are valid selection
      while row < 0 or row > self.board_width - 1:
        print('Row selection is not valid!!! Choose a new row!')
        row = int(input('Pick a row to hit: '))

      # Making sure they are valid selection
      while col < 0 or col > self.board_height - 1:
        print('Column selection is not valid!!! Choose a new column!')
        col = int(input('Pick a column to hit: '))

    # Checking if the spot is a hit or miss
    if self.AI_board[row][col] == 'S':
      print('HIT!!!!')
      self.player_hit_board[row][col] = 2
      self.AI_ship_sunk += 1
    else:
      print('MISS!!!!')
      self.player_hit_board[row][col] = 1

    self.print_board(self.player_hit_board)

  # Function for the AI to set ships (Remember this is done randomly)
  def AI_set_ships(self):
    # Creating a dictionary to check if the ship was placed
    ships_placed = {}
    for ship in self.ship_dict.keys():
      ships_placed[ship] = 'N'


    # Iterating through all the ships to place them
    for ship in ships_placed.keys():
      # np.random.randint excludes the upper bound, so we pass the full board size
      row = np.random.randint(0,self.board_width)
      col = np.random.randint(0,self.board_height)

      # Checking to make sure a ship isn't already placed there
      while self.AI_board[row][col] == 'S':
        row = np.random.randint(0,self.board_width)
        col = np.random.randint(0,self.board_height)

      self.AI_board[row][col] = 'S'
      ships_placed[ship] = 'Y'

  # Having the AI randomly selecting spots to hit
  # We can use this as a baseline for checking how
  # our model does!
  def random_AI_turn(self):
    # Defining the row and column you want to hit
    # (np.random.randint excludes the upper bound, so the draw is always on the board)
    row = np.random.randint(0,self.board_width)
    col = np.random.randint(0,self.board_height)

    # Redrawing until we find a spot that has not been targeted yet
    while self.AI_hit_board[row][col] == 1 or self.AI_hit_board[row][col] == 2:
      row = np.random.randint(0,self.board_width)
      col = np.random.randint(0,self.board_height)

    # Checking if the spot is a hit or miss
    if self.player_board[row][col] == 'S':
      print('HIT!!!!')
      self.AI_hit_board[row][col] = 2
      self.player_ship_sunk += 1
    else:
      print('MISS!!!!')
      self.AI_hit_board[row][col] = 1

    self.print_board(self.AI_hit_board)

  # Function for model predicting the spot to hit
  def model_AI_predict(self):
    preds = self.model.model.predict(np.array(self.AI_hit_board).reshape(-1,1).T).reshape(self.board_height,self.board_width)
    row, col = np.unravel_index(np.argmax(preds,axis=None),preds.shape)

    # Checking to make sure the row and column are not taken
    while self.AI_hit_board[row][col] == 1 or self.AI_hit_board[row][col] == 2:
      row = np.random.randint(0,self.board_width)
      col = np.random.randint(0,self.board_height)

    return row,col

  # Leveraging the AI to play the game
  def AI_Turn_bot(self):
    row, col = self.model_AI_predict()

    # Using the move
    # Checking if the spot is a hit or miss
    if self.player_board[row][col] == 'S':
      print('HIT!!!!')
      self.AI_hit_board[row][col] = 2
      self.player_ship_sunk += 1
    else:
      print('MISS!!!!')
      self.AI_hit_board[row][col] = 1

    self.print_board(self.AI_hit_board)

  # Function for the Player to randomly set ships (Remember this is done randomly)
  def player_set_ships_random(self):
    # Creating a dictionary to check if the ship was placed
    ships_placed = {}
    for ship in self.ship_dict.keys():
      ships_placed[ship] = 'N'


    # Iterating through all the ships to place them
    for ship in ships_placed.keys():
      # np.random.randint excludes the upper bound, so we pass the full board size
      row = np.random.randint(0,self.board_width)
      col = np.random.randint(0,self.board_height)

      # Checking to make sure a ship isn't already placed there
      while self.player_board[row][col] == 'S':
        row = np.random.randint(0,self.board_width)
        col = np.random.randint(0,self.board_height)

      self.player_board[row][col] = 'S'
      ships_placed[ship] = 'Y'

  # Player playing randomly
  def random_player_turn(self):
    # Defining the row and column you want to hit
    # (np.random.randint excludes the upper bound, so the draw is always on the board)
    row = np.random.randint(0,self.board_width)
    col = np.random.randint(0,self.board_height)

    # Redrawing until we find a spot that has not been targeted yet
    while self.player_hit_board[row][col] == 1 or self.player_hit_board[row][col] == 2:
      row = np.random.randint(0,self.board_width)
      col = np.random.randint(0,self.board_height)

    # Checking if the spot is a hit or miss
    if self.AI_board[row][col] == 'S':
      self.player_hit_board[row][col] = 2
      self.AI_ship_sunk += 1
    else:
      self.player_hit_board[row][col] = 1

  # Printing any board
  def print_board(self,board):
    for row in range(0,len(board)):
      print(board[row])

  # A function to reset the game
  def reset_game(self):
    self.player_board = [[0] * self.board_height for row in range(self.board_width)]
    self.AI_board = [[0] * self.board_height for row in range(self.board_width)]
    self.player_hit_board = [[0] * self.board_height for row in range(self.board_width)]
    self.AI_hit_board = [[0] * self.board_height for row in range(self.board_width)]
    # Resetting the sunk-ship counters so the win checks work in the next game
    self.player_ship_sunk = 0
    self.AI_ship_sunk = 0
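
`model_AI_predict` falls back to a random spot whenever the highest-scoring cell has already been targeted. One possible alternative, sketched here as an assumption rather than as part of the notebook's design, is to mask out targeted cells and take the best remaining one. The helper name `best_untargeted_cell` and the tiny 2x2 board are hypothetical; the sketch relies on the hit-board encoding used above (0 = untouched, 1 = miss, 2 = hit).

```python
import numpy as np

def best_untargeted_cell(q_values, hit_board):
    """Pick the highest-scoring cell that has not been targeted yet.

    q_values  - flat sequence of predicted scores, one per board cell
    hit_board - 2D list where 0 = untouched, 1 = miss, 2 = hit
    """
    q = np.asarray(q_values, dtype=float).reshape(np.shape(hit_board))
    taken = np.asarray(hit_board) != 0
    # Already-targeted cells get -inf so argmax can never select them
    masked = np.where(taken, -np.inf, q)
    row, col = np.unravel_index(np.argmax(masked), masked.shape)
    return int(row), int(col)

# Hypothetical 2x2 example: the two best-scoring cells are already taken
hit_board = [[1, 0],
             [0, 2]]
q_values = [9.0, 1.0, 3.0, 8.0]
print(best_untargeted_cell(q_values, hit_board))  # (1, 0)
```

With this approach the bot always follows the model's ranking, instead of switching to a random guess when its first choice is unavailable.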

In [3]:
# Defining our model in a model class
class Model:
    # Constructor
    def __init__(self, num_states, num_actions, batch_size):
        # Defining some variables
        self.num_states = num_states # size of the flattened board state fed to the network
        self.num_actions = num_actions # number of actions you can take (one per board cell)
        self.batch_size = batch_size # batch size for training the model

        # Defining the model
        self.model = None
        self.define_model()

    # Function to define your model
    def define_model(self):
        # Input layers
        self.model = keras.Sequential()
        self.model.add(keras.layers.Flatten(input_shape=(self.num_states,))) # input layer takes the flattened board state (num_states values)
        """
        Build your model here!

        """
        self.model.add(keras.layers.Dense(100,activation='sigmoid'))
        self.model.add(keras.layers.Dense(500,activation='relu'))
        self.model.add(keras.layers.Dense(300,activation='sigmoid'))
        self.model.add(keras.layers.Dense(100,activation='relu'))

        # Output layer
        self.model.add(keras.layers.Dense(self.num_actions, activation=None))
        self.model.compile(optimizer="Adam", loss="MSE")
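
Because the output layer has one unit per board cell, every (row, col) shot needs its own flat index into that output. The sketch below shows one way to do the mapping, assuming row-major order (the order produced by the `reshape` and `np.unravel_index` calls in `model_AI_predict`). The helper names `to_action_index` and `to_cell` are hypothetical.

```python
import numpy as np

def to_action_index(row, col, height=5, width=5):
    """Map a (row, col) cell to its flat, row-major action index."""
    return int(np.ravel_multi_index((row, col), (height, width)))

def to_cell(index, height=5, width=5):
    """Map a flat action index back to its (row, col) cell."""
    row, col = np.unravel_index(index, (height, width))
    return int(row), int(col)

print(to_action_index(2, 3))  # 2 * 5 + 3 = 13
print(to_cell(13))            # (2, 3)

# Every cell of a 5x5 board gets a distinct index between 0 and 24
indices = {to_action_index(r, c) for r in range(5) for c in range(5)}
print(len(indices))           # 25
```

Note that a product such as `row * col` would not work as an index: it maps every cell in row 0 or column 0 to 0, and cells such as (1, 4) and (2, 2) collide.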

In [4]:
# Creating a class to create our "dataset" for model training
class Memory:
    # Constructor
    def __init__(self, max_memory):
        self.max_memory = max_memory # maximum amount of samples to remember
        self.samples = [] # the samples

    # Adding a sample to memory
    def add_sample(self, sample):
        self.samples.append(sample)
        if len(self.samples) > self.max_memory: # Removing the earliest sample if we reach maximum memory
            self.samples.pop(0)

    # Sampling the samples we have in memory
    def sample(self, no_samples):
        if no_samples > len(self.samples):
            return random.sample(self.samples, len(self.samples))
        else:
            return random.sample(self.samples, no_samples)
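
The `Memory` class above evicts the oldest sample with `list.pop(0)`, which shifts every remaining element. As a possible variation (an assumption, not something the notebook requires), the same behaviour can be sketched with `collections.deque`, which drops the oldest entry automatically once `maxlen` is reached. The class name `DequeMemory` is hypothetical.

```python
import random
from collections import deque

class DequeMemory:
    """Replay memory with the same interface as Memory, backed by a deque."""

    def __init__(self, max_memory):
        # A deque with maxlen discards the oldest item when a new one is appended
        self.samples = deque(maxlen=max_memory)

    def add_sample(self, sample):
        self.samples.append(sample)

    def sample(self, no_samples):
        # Never ask for more samples than we currently hold
        k = min(no_samples, len(self.samples))
        return random.sample(list(self.samples), k)

# Usage: a memory of size 3 keeps only the 3 most recent samples
demo_memory = DequeMemory(3)
for i in range(5):
    demo_memory.add_sample(i)
print(sorted(demo_memory.sample(10)))  # [2, 3, 4]
```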

In [None]:
# Training the model
# Creating some variables that we will need
MAX_EPSILON = 1
MIN_EPSILON = 0.01
LAMBDA = 0.0001
GAMMA = 0.99

# Creating some of the objects that we need
memory = Memory(50000)
model = Model(BOARD_HEIGHT*BOARD_WIDTH,BOARD_WIDTH*BOARD_HEIGHT,32)
game_board = Game(BOARD_HEIGHT,BOARD_WIDTH,ship_dict,model)

# This function performs the main training step of the model
def training():
  batch = memory.sample(model.batch_size)
  states = np.array([val[0] for val in batch])
  next_states = np.array([(np.zeros(model.num_states)
                            if val[3] is None else val[3]) for val in batch])
  # predict Q(s,a) given the batch of states
  q_s_a = model.model.predict(np.array(states).reshape(-1,BOARD_HEIGHT*BOARD_WIDTH))
  # predict Q(s',a') - so that we can do gamma * max(Q(s'a')) below
  q_s_a_d = model.model.predict(np.array(next_states).reshape(-1,BOARD_HEIGHT*BOARD_WIDTH))
  # setup training arrays
  x = np.zeros((len(batch), model.num_states))
  y = np.zeros((len(batch), model.num_actions))
  for i, b in enumerate(batch):
      state, action, reward, next_state = b[0], b[1], b[2], b[3]
      # get the current q values for all actions in state
      current_q = q_s_a[i]
      # update the q value for action
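      if next_state is None:
          # in this case, the game completed after action, so there is no max Q(s',a')
          # prediction possible
          current_q[action[0] * BOARD_WIDTH + action[1]] = reward
      else:
          # actions are indexed row-major (row * BOARD_WIDTH + col), matching the reshape in model_AI_predict
          current_q[action[0] * BOARD_WIDTH + action[1]] = reward + GAMMA * np.amax(q_s_a_d[i])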
      x[i] = np.array(state).reshape(-1,BOARD_HEIGHT*BOARD_WIDTH)
      y[i] = current_q
  model.model.fit(x,y)

# This function will run through games and train the model as it plays
def run(epoch):

  # Setting up some variables
  total_reward = 0 # looking at the reward of the model during the current game
  eps = MAX_EPSILON
  done = False

  # Resetting the game for a fresh game
  game_board.reset_game()

  # Having the player and AI setup ships randomly
  game_board.AI_set_ships()
  game_board.player_set_ships_random()

  # Running through the game
  while True:
    # Calculating the reward for each move
    reward = 0

    # Player goes first. We will have the model play against a random chooser (no strategy)
    # This may be a spot you choose to improve in the future!
    game_board.random_player_turn()

    # Check to see if the player won
    if game_board.AI_ship_sunk == len(ship_dict):
      reward = -300
      total_reward += reward
      return total_reward

    # Model's turn if the game didn't end
    row,col = choose_action(game_board.AI_hit_board,eps)
    # Storing a flattened copy of the board; a plain reference would change when the board is updated
    prev_board = np.array(game_board.AI_hit_board).flatten()

    # Adding action to the hit board
    if game_board.player_board[row][col] == 'S':
      game_board.AI_hit_board[row][col] = 2
      game_board.player_ship_sunk += 1
      reward += 150
    else:
      game_board.AI_hit_board[row][col] = 1
      reward -= 100

    # Checking if the A.I won
    if game_board.player_ship_sunk == len(ship_dict):
      reward += 300
      done = True

    # Add the move to the sample (the next state is None once the game is over)
    next_board = None if done else np.array(game_board.AI_hit_board).flatten()
    memory.add_sample((prev_board,(row,col),reward,next_board))

    # Training the model
    training()

    # Adding the reward up
    total_reward += reward

    # Adjusting the eps
    eps = MIN_EPSILON + (MAX_EPSILON- MIN_EPSILON) * math.exp(-LAMBDA * epoch)

    # Breaking if game finished
    if done:
      return total_reward

# A function to choose the action based on epsilon greedy strategy
def choose_action(state,eps):
  if random.random() < eps:
    # np.random.randint excludes the upper bound, so the draw is always on the board
    row = np.random.randint(0,BOARD_WIDTH)
    col = np.random.randint(0,BOARD_HEIGHT)

    # Redrawing until we find a spot that has not been targeted yet
    while game_board.AI_hit_board[row][col] == 1 or game_board.AI_hit_board[row][col] == 2:
      row = np.random.randint(0,BOARD_WIDTH)
      col = np.random.randint(0,BOARD_HEIGHT)
    return (row,col)
  else:
    # model_AI_predict is a method of Game, not of the Keras model
    return game_board.model_AI_predict()
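
To get a feel for the exploration schedule used in `run`, the decay formula can be evaluated on its own. This is a minimal sketch: the function name `epsilon_at` is hypothetical, and its defaults copy the `MAX_EPSILON`, `MIN_EPSILON` and `LAMBDA` values defined above.

```python
import math

def epsilon_at(epoch, max_eps=1.0, min_eps=0.01, decay=0.0001):
    """Exploration rate after a given number of epochs (exponential decay)."""
    return min_eps + (max_eps - min_eps) * math.exp(-decay * epoch)

# Printing the exploration rate at a few hypothetical epoch counts
for epoch in (0, 5, 10_000, 50_000):
    print(epoch, round(epsilon_at(epoch), 4))
```

With a decay rate of 0.0001, epsilon stays above 0.999 for the first handful of epochs, so a short run explores almost entirely at random. It takes tens of thousands of epochs before the model's own predictions drive most moves.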


In [6]:
# Training the model
epochs = 5

for epoch in range(0,epochs):
  print(epoch)
  total_reward = run(epoch)
  print(f'Total Reward: {total_reward}')

0
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 857ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 2s 2s/step - loss: 399.4713
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 542ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 16ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 805ms/step - loss: 388.9229
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 233ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 1s/step - loss: 382.6372
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 309ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 27ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 1s/step - loss: 380.2919
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 233ms

## Time to Train the Model and Play the Game!!!

In this section, we will train the model and play the game!

In [None]:
# Creating the game with the model trained above
# (building a new Model here would discard the trained weights)
game_runner = Game(BOARD_HEIGHT,BOARD_WIDTH,ship_dict,model)

# Having the player set their ships
game_runner.place_player_ships()

# Having the AI set its ships
game_runner.AI_set_ships()

print('Time to start the Battle!')
print()
# Running the game until someone wins
while game_runner.AI_ship_sunk != len(ship_dict.keys()) and game_runner.player_ship_sunk != len(ship_dict.keys()):
  # Alternating turns
  print('Player Turn')
  game_runner.player_turn()
  print()

  # Checking if the player has sunk all of the AI's ships
  if game_runner.AI_ship_sunk == len(ship_dict.keys()):
    print('Player Wins!!!')
    break

  # Going on to the AI
  print('AI Turn')
  game_runner.AI_Turn_bot()
  print()

if game_runner.player_ship_sunk == len(ship_dict.keys()):
  print('AI Wins!!!')

[0, 0, 0, 0, 0]
[0, 0, 0, 0, 0]
[0, 0, 0, 0, 0]
[0, 0, 0, 0, 0]
[0, 0, 0, 0, 0]
Where would you like A (length 1 to start from?
