<a href="https://colab.research.google.com/github/Sahilamin219/AI-Agents/blob/main/Doom_AI.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

RNN computation. So how do these things work? At the core, RNNs have a deceptively simple API: they accept an input vector x and give you an output vector y. Crucially, however, this output vector’s contents are influenced not only by the input you just fed in, but also by the entire history of inputs you’ve fed in before. Written as a class, the RNN’s API consists of a single step function:

In [8]:
# http://karpathy.github.io/2015/05/21/rnn-effectiveness/
rnn = RNN()
y = rnn.step(x) # x is an input vector, y is the RNN's output vector

The RNN class has some internal state that it gets to update every time `step` is called. In the simplest case this state consists of a single hidden vector h. Here is an implementation of the step function in a vanilla RNN:

In [None]:
class RNN:
  # ...
  def step(self, x):
    # update the hidden state
    self.h = np.tanh(np.dot(self.W_hh, self.h) + np.dot(self.W_xh, x))
    # compute the output vector
    y = np.dot(self.W_hy, self.h)
    return y


The above specifies the forward pass of a vanilla RNN. This RNN’s parameters are the three matrices `W_hh`, `W_xh`, `W_hy`. The hidden state `self.h` is initialized with the zero vector. The `np.tanh` function implements a non-linearity that squashes the activations to the range `[-1, 1]`. Notice briefly how this works: there are two terms inside the tanh, one based on the previous hidden state and one based on the current input. In numpy, `np.dot` is matrix multiplication. The two intermediates interact with addition, and then get squashed by the tanh into the new state vector. If you’re more comfortable with math notation, we can also write the hidden state update as $h_t = \tanh(W_{hh} h_{t-1} + W_{xh} x_t)$, where tanh is applied elementwise.

We initialize the matrices of the RNN with random numbers, and the bulk of the work during training goes into finding the matrices that give rise to desirable behavior, as measured with some loss function that expresses your preference for the kinds of outputs y you’d like to see in response to your input sequences x.
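
To make the step function above concrete, here is a minimal runnable sketch of the same update. The class name `VanillaRNN`, the layer sizes, the seed, and the 0.1 initialization scale are illustrative assumptions, not values taken from the original post. Feeding the same input twice produces two different outputs, because the hidden state carries the history forward.

```python
import numpy as np

class VanillaRNN:
    """Illustrative vanilla RNN; sizes and init scale are assumptions."""

    def __init__(self, input_size, hidden_size, output_size, seed=0):
        rng = np.random.default_rng(seed)
        # The three parameter matrices, initialized with small random numbers
        self.W_xh = rng.standard_normal((hidden_size, input_size)) * 0.1
        self.W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.1
        self.W_hy = rng.standard_normal((output_size, hidden_size)) * 0.1
        # The hidden state starts as the zero vector
        self.h = np.zeros(hidden_size)

    def step(self, x):
        # h_t = tanh(W_hh h_{t-1} + W_xh x_t)
        self.h = np.tanh(np.dot(self.W_hh, self.h) + np.dot(self.W_xh, x))
        # y_t = W_hy h_t
        return np.dot(self.W_hy, self.h)

rnn = VanillaRNN(input_size=4, hidden_size=8, output_size=3)
x = np.ones(4)
y_first = rnn.step(x)   # hidden state was zero before this call
y_second = rnn.step(x)  # same input, but the hidden state has changed
print(y_first)
print(y_second)
```

No training happens here; the sketch only shows the forward pass and how the output depends on earlier inputs through `self.h`.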

Going deep. RNNs are neural networks and everything works monotonically better (if done right) if you put on your deep learning hat and start stacking models up like pancakes. For instance, we can form a 2-layer recurrent network as follows:



```
y1 = rnn1.step(x)
y = rnn2.step(y1)
```
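
As a rough sketch of the stacking idea, the snippet below runs a short random sequence through two layers, feeding the first layer's output into the second at every time step. The `RNNLayer` class, the sizes, and the random input sequence are hypothetical and chosen only for illustration.

```python
import numpy as np

class RNNLayer:
    """Illustrative vanilla RNN layer; all sizes are assumptions."""

    def __init__(self, input_size, hidden_size, output_size, rng):
        self.W_xh = rng.standard_normal((hidden_size, input_size)) * 0.1
        self.W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.1
        self.W_hy = rng.standard_normal((output_size, hidden_size)) * 0.1
        self.h = np.zeros(hidden_size)

    def step(self, x):
        self.h = np.tanh(np.dot(self.W_hh, self.h) + np.dot(self.W_xh, x))
        return np.dot(self.W_hy, self.h)

rng = np.random.default_rng(0)
rnn1 = RNNLayer(input_size=4, hidden_size=8, output_size=6, rng=rng)
# The second layer's input size must match the first layer's output size
rnn2 = RNNLayer(input_size=6, hidden_size=8, output_size=3, rng=rng)

sequence = rng.standard_normal((5, 4))  # 5 time steps, 4 features each
outputs = []
for x in sequence:
    y1 = rnn1.step(x)   # first layer sees the raw input
    y = rnn2.step(y1)   # second layer sees the first layer's output
    outputs.append(y)

outputs = np.stack(outputs)
print(outputs.shape)  # (5, 3)
```

Each layer keeps its own hidden state, so the second layer builds its history from the first layer's outputs, not from the raw inputs.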

In [None]:
from IPython.display import HTML
HTML('<iframe width="560" height="315" src="https://www.youtube.com/embed/gCJyVX98KJ4?showinfo=0" frameborder="0" allow="autoplay; encrypted-media" allowfullscreen></iframe>')

In [9]:
%%bash
# Install deps from
# https://github.com/mwydmuch/ViZDoom/blob/master/doc/Building.md#-linux
# -y answers the confirmation prompt, which a non-interactive cell cannot do

apt-get install -y build-essential zlib1g-dev libsdl2-dev libjpeg-dev \
nasm tar libbz2-dev libgtk2.0-dev cmake git libfluidsynth-dev libgme-dev \
libopenal-dev timidity libwildmidi-dev unzip

# Boost libraries
apt-get install -y libboost-all-dev

# Lua binding dependencies
apt-get install -y liblua5.1-dev

Reading package lists...
Building dependency tree...
Reading state information...
build-essential is already the newest version (12.4ubuntu1).
libjpeg-dev is already the newest version (8c-2ubuntu8).
libjpeg-dev set to manually installed.
zlib1g-dev is already the newest version (1:1.2.11.dfsg-0ubuntu2).
zlib1g-dev set to manually installed.
cmake is already the newest version (3.10.2-1ubuntu2.18.04.1).
git is already the newest version (1:2.17.1-1ubuntu0.8).
libbz2-dev is already the newest version (1.0.6-8.1ubuntu0.2).
libbz2-dev set to manually installed.
unzip is already the newest version (6.0-21ubuntu1.1).
The following additional packages will be installed:
  autoconf automake autopoint autotools-dev debhelper dh-autoreconf
  dh-strip-nondeterminism file freepats gettext gettext-base gir1.2-atk-1.0
  gir1.2-freedesktop gir1.2-gdkpixbuf-2.0 gir1.2-gtk-2.0 gir1.2-ibus-1.0
  gir1.2-pango-1.0 intltool-debian libarchive-cpio-perl libarchive-zip-perl
  libatk1.0-dev libaudio2 libcairo

In [10]:
!pip install vizdoom

Collecting vizdoom
  Downloading https://files.pythonhosted.org/packages/41/0e/e7299dc536baab77ca61e7459883f353e4607f24d2e6266cd2a5ceb754d6/vizdoom-1.1.8.tar.gz (21.9MB)
     |████████████████████████████████| 21.9MB 49.8MB/s
Building wheels for collected packages: vizdoom
  Building wheel for vizdoom (setup.py) ... done
  Created wheel for vizdoom: filename=vizdoom-1.1.8-cp37-none-any.whl size=14461418 sha256=4b8d46df97a8ee78f3f36455f2760dc1d0e5735929af2b8de9badfe3ad2413d1
  Stored in directory: /root/.cache/pip/wheels/7d/04/dd/fafbaf68bb30e82ca4e336b9e13813d667d81aecb4648227a3
Successfully built vizdoom
Installing collected packages: vizdoom
Successfully installed vizdoom-1.1.8


In [11]:
import tensorflow as tf          # Deep learning library
import numpy as np               # Matrix handling
from vizdoom import *            # Doom environment

import random                    # Random number generation
import time                      # Timing and delays
from skimage import transform    # Frame preprocessing

from collections import deque    # Double-ended queue
import matplotlib.pyplot as plt  # Plotting

import warnings
# Silence the warnings that skimage normally prints during training
warnings.filterwarnings('ignore')

In [27]:
# Create the environment
def create_enviroment():
  game = DoomGame()
  # Load the configuration for the basic scenario
  game.load_config('/usr/local/lib/python3.7/dist-packages/vizdoom/scenarios/basic.cfg')
  # Load the scenario file (in our case, the basic scenario)
  game.set_doom_scenario_path("/usr/local/lib/python3.7/dist-packages/vizdoom/scenarios/basic.wad")
  game.set_window_visible(False)
  game.init()
  # The possible actions, one-hot encoded
  left = [1, 0, 0]
  right = [0, 1, 0]
  shoot = [0, 0, 1]
  possible_actions = [left, right, shoot]

  return game, possible_actions

# Perform random actions to test the environment
def test_enviroment():
  game = DoomGame()
  game.load_config('/usr/local/lib/python3.7/dist-packages/vizdoom/scenarios/basic.cfg')
  game.set_doom_scenario_path("/usr/local/lib/python3.7/dist-packages/vizdoom/scenarios/basic.wad")
  game.set_window_visible(False)
  game.init()
  shoot = [0, 0, 1]
  go_left = [1, 0, 0]
  go_right = [0, 1, 0]
  actions = [shoot, go_left, go_right]
  episodes = 1
  for i in range(episodes):
    game.new_episode()
    while not game.is_episode_finished():
      state = game.get_state()
      img = state.screen_buffer     # current frame (unused in this test)
      misc = state.game_variables   # game variables (unused in this test)
      action = random.choice(actions)
      print(action)
      reward = game.make_action(action)
      print("\treward:", reward)
      time.sleep(0.02)
      # Running total reward for the episode so far
      print("Result:", game.get_total_reward())
    # Pause between episodes rather than after every step
    time.sleep(2)
  game.close()

game, possible_actions = create_enviroment()
test_enviroment()


[0, 1, 0]
	reward: -1.0
Result: -68.0
[0, 0, 1]
	reward: -1.0
Result: -69.0
[1, 0, 0]
	reward: -6.0
Result: -75.0
[0, 1, 0]
	reward: -1.0
Result: -76.0
[0, 0, 1]
	reward: -1.0
Result: -77.0
[0, 1, 0]
	reward: -1.0
Result: -78.0
[0, 0, 1]
	reward: -1.0
Result: -79.0
[0, 1, 0]
	reward: -1.0
Result: -80.0


KeyboardInterrupt: ignored
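
The three action lists used above are one-hot vectors, one position per button. As a small sketch that does not need ViZDoom, the same list can be built with `np.identity` and sampled at random; the `action_names` labels here are an illustrative assumption based on the comments in the code above.

```python
import random
import numpy as np

# Hypothetical labels, in the same order as the one-hot positions
action_names = ["left", "right", "shoot"]

# One row per action: [1, 0, 0], [0, 1, 0], [0, 0, 1]
possible_actions = np.identity(len(action_names), dtype=int).tolist()

rng = random.Random(0)  # seeded only to make the sketch repeatable
action = rng.choice(possible_actions)
name = action_names[action.index(1)]
print(action, name)
```

Building the list this way keeps the encoding in step with the number of buttons if more are added later.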

In [None]:
# Model Hyperparameters