Important note for readers who want to run the code below: after installing ViZDoom (see github.com/mwydmuch/ViZDoom/blob/master/doc/Building.md) and cloning the repository https://github.com/mwydmuch/ViZDoom, execute (on Linux):

> cd ../ViZDoom/examples

> jupyter notebook

then start this notebook...

In this capstone project we investigate various Q-learning techniques for training an agent to play the game Doom autonomously in the ViZDoom environment.

We start by discovering the game:
    - we take for granted that vizdoom can be imported (import vizdoom; to install on Linux: >pip3 install vizdoom)
    - what parameters are necessary to load the game
    - what routines are necessary to init and run an episode
    - what inputs the player (agent) can send to the game (the actions)
    - what inputs the player (agent) can receive from the game (the visual inputs & the reward function)
    - no learning takes place here: the player (agent) only takes random actions, so that we can focus on the interactions
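
To make the action format concrete before running the game, here is a minimal sketch in plain Python (no ViZDoom required). It assumes, as in the cell further down, that three buttons are declared in the order MOVE_LEFT, MOVE_RIGHT, ATTACK and that an action is a list of booleans, one per button. The names `BUTTONS`, `all_actions`, `one_hot_actions` and `describe` are illustrative only and are not part of the ViZDoom API.

```python
from itertools import product
from random import choice

# Assumed button order: must match the order of the add_available_button calls.
BUTTONS = ["MOVE_LEFT", "MOVE_RIGHT", "ATTACK"]

# Every on/off combination of the three buttons: 2**3 = 8 action vectors.
all_actions = [list(combo) for combo in product([False, True], repeat=len(BUTTONS))]

# The subset in which exactly one button is pressed.
one_hot_actions = [a for a in all_actions if sum(a) == 1]


def describe(action):
    """Return the names of the buttons pressed in an action vector (illustrative helper)."""
    return [name for name, pressed in zip(BUTTONS, action) if pressed]


print(len(all_actions), "combinations,", len(one_hot_actions), "with a single button")
sample = choice(one_hot_actions)
print(sample, "->", describe(sample))
```

A random agent only needs `choice` over such a list; restricting it to the single-button actions keeps its behaviour easy to follow in the game window.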

The following code has been adapted from github.com/mwydmuch/ViZDoom/tree/master/examples/python, specifically the file /ViZDoom/examples/python/basic.py, whose credit goes to the developers of the ViZDoom platform, an AI research platform for reinforcement learning from raw visual information: Michal Kempka, Grzegorz Runc, Jakub Toczek & Marek Wydmuch.

# 1. DISCOVERING THE VIZDOOM ENVIRONMENT
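
The cell in this section pauses after every action so that the game is watchable. As a quick sanity check of the timing values it uses, the sketch below reproduces the arithmetic in plain Python. It assumes that `DEFAULT_TICRATE` equals 35 tics per second (Doom's native rate); `TICRATE` and `EPISODE_TIMEOUT_TICS` are illustrative names, and the figures are wall-clock estimates that ignore rendering time.

```python
# Assumption: vizdoom.DEFAULT_TICRATE is 35 tics per second.
TICRATE = 35

# Value passed to set_episode_timeout in this notebook.
EPISODE_TIMEOUT_TICS = 200

# Pause inserted after each action so the game runs at roughly real-time speed.
sleep_time = 1.0 / TICRATE

# Approximate duration of an episode that runs until the timeout.
timeout_seconds = EPISODE_TIMEOUT_TICS * sleep_time

print("pause per action: %.4f s" % sleep_time)
print("200 tics at real-time speed: %.2f s" % timeout_seconds)
```

Setting the pause to 0 would let the engine run as fast as it can, which is what one would typically want during training rather than while watching.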

In [1]:
from __future__ import print_function
from vizdoom import *

from random import choice
from time import sleep

# Create DoomGame instance. It will run the game and communicate with the agent.
game = DoomGame()

# CONFIGURATION: we will use game.load_config("../../scenarios/basic.cfg") in the future. 
# For now we check the parameters one by one

# Sets path to the wad file = the game scenario
game.set_doom_scenario_path("../../scenarios/basic.wad")

# Sets map to start (scenario .wad files can contain many maps).
game.set_doom_map("map01")

# Sets resolution. Default is 320X240
game.set_screen_resolution(ScreenResolution.RES_640X480)

# Sets the screen buffer format
game.set_screen_format(ScreenFormat.RGB24)

# Enables depth buffer
game.set_depth_buffer_enabled(True)

# Enables the labels buffer (labeling of in-game objects)
game.set_labels_buffer_enabled(True)

# Enables buffer with top down map of the current episode/level
game.set_automap_buffer_enabled(True)

# Sets other rendering options (all of these options except crosshair are enabled by default)
game.set_render_hud(False)
game.set_render_minimal_hud(False)  # If hud is enabled
game.set_render_crosshair(False)
game.set_render_weapon(True)
game.set_render_decals(False)  # Bullet holes and blood on the walls
game.set_render_particles(False)
game.set_render_effects_sprites(False)  # Smoke and blood
game.set_render_messages(False)  # In-game messages
game.set_render_corpses(False)
game.set_render_screen_flashes(True)  # Effect upon taking damage or picking up items

# Adds buttons that will be allowed for the agent
game.add_available_button(Button.MOVE_LEFT)
game.add_available_button(Button.MOVE_RIGHT)
game.add_available_button(Button.ATTACK)

# Adds game variables that will be included in state
game.add_available_game_variable(GameVariable.AMMO2)

# Causes episodes to finish after 200 tics (actions)
game.set_episode_timeout(200)

# Makes episodes start after 10 tics (~after raising the weapon)
game.set_episode_start_time(10)

# Makes the window appear
game.set_window_visible(True)

# Turns on the sound
game.set_sound_enabled(True)

# Sets the living reward (for each move) to -1
game.set_living_reward(-1)

# Sets ViZDoom mode (PLAYER, ASYNC_PLAYER, SPECTATOR, ASYNC_SPECTATOR, PLAYER mode is default)
game.set_mode(Mode.PLAYER)

# Enables engine output to console.
# game.set_console_enabled(True)

# Initializes the game
game.init()

# Define some actions. Each list entry corresponds to declared buttons:
# MOVE_LEFT, MOVE_RIGHT, ATTACK
# 5 more combinations are possible (2**3 = 8 in total), but only these 3 are used so the agent's behaviour is easy to follow on screen
actions = [[True, False, False], [False, True, False], [False, False, True]]

# Run this many episodes
episodes = 1

# Sets time that will pause the engine after each action (in seconds)
# Without this everything would go too fast
sleep_time = 1.0 / DEFAULT_TICRATE  # ~0.029 s (DEFAULT_TICRATE is 35 tics per second)

for i in range(episodes):
    print("Episode #" + str(i + 1))

    # Starts a new episode
    game.new_episode()

    while not game.is_episode_finished():

        # Gets the state
        state = game.get_state()

        # Which consists of:
        n = state.number
        vars = state.game_variables
        screen_buf = state.screen_buffer
        depth_buf = state.depth_buffer
        labels_buf = state.labels_buffer
        automap_buf = state.automap_buffer
        labels = state.labels
        
        # Takes a random action and gets the reward
        r = game.make_action(choice(actions))

        # Makes a "prolonged" action and skip frames:
        # skiprate = 4
        # r = game.make_action(choice(actions), skiprate)

        # The same could be achieved with:
        # game.set_action(choice(actions))
        # game.advance_action(skiprate)
        # r = game.get_last_reward()

        # Prints state's game variables and reward.
        print("State #" + str(n))
        print("Game variables:", vars)
        print("Reward:", r)
        print("=====================")

        if sleep_time > 0:
            sleep(sleep_time)

    # Check the results
    print("Episode finished.")
    print("Total reward:", game.get_total_reward())
    print("************************")

# It will be done automatically anyway but sometimes you need to do it in the middle of the program...
game.close()


Episode #1
State #1
Game variables: [ 50.]
Reward: -1.0
State #2
Game variables: [ 50.]
Reward: -1.0
State #3
Game variables: [ 50.]
Reward: -1.0
State #4
Game variables: [ 50.]
Reward: -1.0
State #5
Game variables: [ 50.]
Reward: -1.0
State #6
Game variables: [ 50.]
Reward: -1.0
State #7
Game variables: [ 50.]
Reward: -1.0
State #8
Game variables: [ 50.]
Reward: -1.0
State #9
Game variables: [ 50.]
Reward: -6.0
State #10
Game variables: [ 49.]
Reward: -1.0
State #11
Game variables: [ 49.]
Reward: -1.0
State #12
Game variables: [ 49.]
Reward: -1.0
State #13
Game variables: [ 49.]
Reward: -1.0
State #14
Game variables: [ 49.]
Reward: -1.0
State #15
Game variables: [ 49.]
Reward: -1.0
State #16
Game variables: [ 49.]
Reward: -1.0
State #17
Game variables: [ 49.]
Reward: -1.0
State #18
Game variables: [ 49.]
Reward: -1.0
State #19
Game variables: [ 49.]
Reward: -1.0
State #20
Game variables: [ 49.]
Reward: -1.0
State #21
Game variables: [ 49.]
Reward: -1.0
State #22
Game variables: [ 49.]