## Neural Networks - Reinforcement Learning

Notes from https://www.youtube.com/watch?v=MotG3XI2qSs


TensorFlow uses a *Dataflow Graph*

- Has both a Python and a C++ API, but the Python API is more complete
- Faster compile times
- Supports CPUs, GPUs and Distributed Processing

**DataFlow Graph**:  
**Node** - represents a mathematical operation  
**Edge** - multidimensional array (Tensor)

Standard usage: build a graph, create a session, then execute the graph with the *run* and *eval* methods.
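
To make the build-then-run idea concrete, here is a minimal pure-Python sketch of a dataflow graph. The `Node` class and the `constant`, `add`, `multiply` and `run` helpers are hypothetical names invented for this illustration. They are not TensorFlow's API and only mimic the idea that building a graph computes nothing until it is explicitly executed.

```python
import operator


class Node:
    """A graph node: an operation plus the upstream nodes that feed it."""

    def __init__(self, op, inputs=(), value=None):
        self.op = op          # callable applied to the input values (None for a constant)
        self.inputs = inputs  # edges: the nodes whose outputs flow into this one
        self.value = value    # only used by constant nodes


def constant(value):
    return Node(op=None, value=value)


def add(a, b):
    return Node(op=operator.add, inputs=(a, b))


def multiply(a, b):
    return Node(op=operator.mul, inputs=(a, b))


def run(node):
    """Evaluate a node by first evaluating everything upstream of it."""
    if node.op is None:
        return node.value
    return node.op(*(run(n) for n in node.inputs))


# Building the graph performs no arithmetic yet.
graph = multiply(add(constant(2), constant(3)), constant(4))

# Only the explicit run step produces a number.
result = run(graph)
print(result)  # (2 + 3) * 4 = 20
```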

There's an option for interactive sessions that run on demand. 

DSWB - Data Scientist's WorkBench


DQN (Deep Q Network), patented by Google:
- Experience Replay
- Dueling Network Architecture
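
Experience replay stores past transitions and learns from random samples of them, rather than only from the most recent step. The following is a minimal sketch under assumptions: the `ReplayBuffer` class, its fixed capacity, and the `(state, action, reward, next_state, done)` layout are illustrative choices, not details taken from the video.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of past transitions (hypothetical helper)."""

    def __init__(self, capacity):
        # A deque with maxlen silently drops the oldest item when full.
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform random sample without replacement.
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


buffer = ReplayBuffer(capacity=3)
for step in range(5):
    buffer.add(state=step, action=0, reward=1.0, next_state=step + 1, done=False)

print(len(buffer))  # 3: the two oldest transitions were dropped
batch = buffer.sample(2)
```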

----

Reinforcement learning isn't just another way to say "supervised learning". *Supervised Learning* is all about making sense of the environment based on historical samples. 

That is not always the best approach: it is like "driving while only looking in the rearview mirror".

*Reinforcement Learning* is all about *Reward*. You gain points for helpful actions and lose points for unhelpful ones. In the driving analogy, the goal is to collect the maximum number of points by reaching your destination while responding to the state of the traffic around you.

*Reinforcement Learning* recognizes that an *action* results in a change of the *state*. 

Exploration vs Exploitation:

Most organizations operate in the realm of conventional wisdom, which is about exploiting what is known to achieve finite rewards with known odds. Some groups venture into the unknown and explore new territory with the prospect of outsized rewards at long odds. Many of these organizations fail, but some succeed and end up changing the world.

With *Reinforcement Learning* a group can explore the trade-off between *Exploration and Exploitation* and choose the path to the maximum expected reward. 
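
One common way to make this trade-off concrete is an epsilon-greedy rule: with probability epsilon pick a random action (explore), otherwise pick the action with the best estimated reward so far (exploit). These notes do not name this technique; it is swapped in here purely as an illustration, and the payout probabilities below are made up.

```python
import random


def epsilon_greedy(estimates, epsilon, rng):
    """Return an action index: random with probability epsilon, else the best estimate."""
    if rng.random() < epsilon:
        return rng.randrange(len(estimates))                       # explore
    return max(range(len(estimates)), key=lambda a: estimates[a])  # exploit


rng = random.Random(0)
true_payout = [0.2, 0.5, 0.8]  # hypothetical chance of a reward for each action
estimates = [0.0, 0.0, 0.0]    # running average reward observed per action
counts = [0, 0, 0]             # how often each action was tried

for _ in range(2000):
    action = epsilon_greedy(estimates, epsilon=0.1, rng=rng)
    reward = 1.0 if rng.random() < true_payout[action] else 0.0
    counts[action] += 1
    # Incremental update of the average reward for the chosen action.
    estimates[action] += (reward - estimates[action]) / counts[action]

print(counts)
print([round(e, 2) for e in estimates])
```

Raising `epsilon` spends more steps exploring; setting it to 0 always exploits the current best estimate.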

AI encompasses Deep Learning and Reinforcement Learning (imagine a Venn diagram with overlap).

Reinforcement Learning involves goal setting, planning and perception, so it can form a bridge between AI and the engineering disciplines.



## Neural Networks Theory

Stepping stone to approaching concepts like Deep Learning.

Neural networks are modeled after *Biological Neural Networks* and attempt to allow computers to learn in a manner similar to humans: by adjusting in response to feedback, which is also the idea behind *Reinforcement Learning*.

Use cases:
- Pattern Recognition
- Time Series Predictions
- Signal Processing
- Anomaly Detection
- Control

The human brain has interconnected *neurons* with *dendrites* that receive inputs and then, based on those inputs, produce an electrical signal output through the *axon*.

**Artificial Neural Networks (ANN)**

Why bother?  
There are problems that are easy for computers but hard for humans (calculating large arithmetic problems)  
There are problems that are hard for computers but easy for humans (recognizing a picture of a person from the side)

*Neural Networks* attempt to solve problems that would be easy for humans but hard for computers. 

**The Perceptron** 
- simplest Neural Network

Consists of:
- one or more inputs
- a processor
- a single output

A *perceptron* follows the *Feed Forward* model, meaning inputs:
1. are sent to the neuron
2. are processed
3. result in an output

The process follows **5 main steps**
1. **Receive** inputs
2. **Weight** inputs (multiply each input by some value, often a number between -1 and 1)
3. **Sum** inputs
4. **Activation Function** - this tells the neuron whether to 'fire' or not
5. **Generate output**
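
The five steps above can be sketched in a few lines of plain Python. This is a minimal illustration, assuming a step activation that fires when the weighted sum is positive; the input and weight values are arbitrary.

```python
def step(total):
    """Step activation: fire (1) if the weighted sum is positive, else 0."""
    return 1 if total > 0 else 0


def perceptron(inputs, weights):
    weighted = [x * w for x, w in zip(inputs, weights)]  # 2. weight the inputs
    total = sum(weighted)                                # 3. sum the inputs
    return step(total)                                   # 4. activate, 5. generate output


inputs = [1.0, 0.5]    # 1. receive inputs
weights = [0.6, -0.4]  # arbitrary example weights
output = perceptron(inputs, weights)
print(output)  # 0.6*1.0 + (-0.4)*0.5 = 0.4 > 0, so the perceptron outputs 1
```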

Common *Activation Functions* include Logistic, Trigonometric, Sinusoidal and Step functions.
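
As a rough illustration, three of these can be written with the standard `math` module. The function names are chosen here for readability, and the firing threshold of 0 for the step function is an assumption.

```python
import math


def logistic(x):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def sinusoidal(x):
    """A trigonometric activation: output oscillates between -1 and 1."""
    return math.sin(x)


def step(x):
    """Fires (1.0) only when the input is positive."""
    return 1.0 if x > 0 else 0.0


for name, fn in [("logistic", logistic), ("sinusoidal", sinusoidal), ("step", step)]:
    print(name, [round(fn(x), 3) for x in (-2.0, 0.0, 2.0)])
```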

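Also consider a **Bias** term, an extra constant added to the weighted sum. If every input is zero, the weighted sum is zero no matter what the weights are, so adjusting the weights could never change the output for that case. The bias overcomes this *Zero Issue*.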
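
A small sketch of why the bias matters, using a hypothetical `weighted_sum` helper and arbitrary weights:

```python
def weighted_sum(inputs, weights, bias=0.0):
    """Sum of input*weight products, plus an optional bias term."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias


zero_inputs = [0.0, 0.0]

# Without a bias, all-zero inputs give zero whatever the weights are.
print(weighted_sum(zero_inputs, [0.9, -0.3]))  # 0.0
print(weighted_sum(zero_inputs, [0.1, 0.7]))   # 0.0

# With a bias, the result can be shifted away from zero.
print(weighted_sum(zero_inputs, [0.9, -0.3], bias=0.5))  # 0.5
```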

*Training the Perceptron*
1. Provide the perceptron inputs for which there is a *known answer*
2. Ask the perceptron to guess the answer
3. Compute the error (how far off from the correct answer?)
4. Adjust all the weights according to the error 
5. Return to step 1 and repeat.

We repeat the process until we reach an error that we're satisfied with. 
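
The training loop can be sketched as follows. The notes do not say how the weights are adjusted, so this example assumes the classic perceptron learning rule (add `learning_rate * error * input` to each weight) and a step activation. The logical AND function serves as the data with known answers, and the learning rate and epoch limit are arbitrary choices.

```python
def predict(inputs, weights, bias):
    """Step-activated perceptron: 1 if the weighted sum plus bias is positive."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1 if total > 0 else 0


# 1. Inputs with known answers: the logical AND of two bits.
training_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]

weights = [0.0, 0.0]
bias = 0.0
learning_rate = 0.1  # assumed value

for epoch in range(100):
    total_error = 0
    for inputs, target in training_data:
        guess = predict(inputs, weights, bias)  # 2. guess the answer
        error = target - guess                  # 3. compute the error
        # 4. adjust every weight (and the bias) in proportion to the error
        weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
        bias += learning_rate * error
        total_error += abs(error)
    if total_error == 0:                        # stop once every guess is correct
        break                                   # 5. otherwise repeat

predictions = [predict(inputs, weights, bias) for inputs, _ in training_data]
print(predictions)  # [0, 0, 0, 1]
```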

Eventually, link a bunch of perceptrons together in *Layers*; this creates a Neural Net.

There will be an *Input Layer*, some number of layers in between, and an *Output Layer*.
The layers in between are called *Hidden Layers* because you don't directly see their inputs or outputs.
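
As an illustration of layering, here is a tiny network with two inputs, one hidden layer of two neurons, and one output neuron. The weights are hand-picked rather than trained, chosen so that the network computes XOR, a function that a single perceptron cannot represent. The `neuron` and `network` helpers are names made up for this sketch.

```python
def step(total):
    return 1 if total > 0 else 0


def neuron(inputs, weights, bias):
    """One perceptron: weighted sum plus bias, passed through a step activation."""
    return step(sum(x * w for x, w in zip(inputs, weights)) + bias)


def network(x1, x2):
    # Hidden layer: one neuron behaves like OR, the other like AND.
    h_or = neuron([x1, x2], [1, 1], -0.5)
    h_and = neuron([x1, x2], [1, 1], -1.5)
    # Output layer: fires when OR is on but AND is off, which is XOR.
    return neuron([h_or, h_and], [1, -1], -0.5)


for a in (0, 1):
    for b in (0, 1):
        print(a, b, network(a, b))
```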

**Deep Learning** just means *a Neural Network with many Hidden Layers*. Microsoft's Vision Recognition network (circa 2016) has 152 layers.






### TensorFlow

Brief background on TensorFlow

An open source library developed by Google.  
Runs on either CPU or GPU.  
One of the most popular libraries in the field.  

Typically, Deep Learning networks run much faster on a GPU.

The basic idea is to be able to create a *Data Flow Graph*  
- the graphs have *Nodes* and *Edges* 
- an array (data) passed along *From* one layer of nodes *To* another is called a **Tensor**
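
To get a feel for what "multidimensional array" means, the sketch below uses NumPy arrays as stand-ins for tensors of increasing rank (number of dimensions). NumPy is an assumption made for illustration; TensorFlow has its own tensor type.

```python
import numpy as np

scalar = np.array(5.0)                       # rank 0: a single number, shape ()
vector = np.array([1.0, 2.0, 3.0])           # rank 1: shape (3,)
matrix = np.array([[1.0, 2.0], [3.0, 4.0]])  # rank 2: shape (2, 2)
batch = np.zeros((4, 2, 2))                  # rank 3: e.g. four 2x2 matrices

for name, tensor in [("scalar", scalar), ("vector", vector),
                     ("matrix", matrix), ("batch", batch)]:
    print(name, "rank:", tensor.ndim, "shape:", tensor.shape)
```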

Two ways to use TensorFlow:
- Customizable Graph Session (typical)
- SciKit-Learn type interface with *Contrib.Learn* (not as customizable, but easier to use)

The *Session* mode can be overwhelming if you don't have a background in math.


----

The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training various image processing systems.
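
Each MNIST image is a 28x28 grayscale grid with pixel values from 0 to 255. A common preparation step, sketched below on a synthetic stand-in image rather than a real digit, is to flatten the grid into a vector of 784 values and scale it to the range 0 to 1. The variable names and the crude "stroke" are invented for this example.

```python
import numpy as np

# Synthetic stand-in for one MNIST image (not real data).
image = np.zeros((28, 28), dtype=np.uint8)
image[10:18, 13:15] = 255  # a crude vertical stroke

# Flatten 28x28 into a single vector and scale pixel values to [0, 1].
flat = image.reshape(-1).astype(np.float32) / 255.0

print(image.shape)  # (28, 28)
print(flat.shape)   # (784,)
```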

----

### TensorFlow Basics

Create a Simple Constant using TensorFlow

**Tensor Objects**


In [1]:
import tensorflow as tf

**TensorFlow Objects**

In [2]:
# Constants will be stored as a Tensor Object
tf.constant('Hello World')

<tf.Tensor 'Const:0' shape=() dtype=string>

In [3]:
hello = tf.constant('Hello World')

In [4]:
type(hello)

tensorflow.python.framework.ops.Tensor

In [5]:
x = tf.constant(100)

In [6]:
x

<tf.Tensor 'Const_2:0' shape=() dtype=int32>

In [7]:
type(x)

tensorflow.python.framework.ops.Tensor

**Creating a *TensorFlow Session***

The *Session Object* encapsulates the *Environment* in which *Operation Objects* are executed.

*Tensor Objects* are *Evaluated* in that same environment.

**TensorFlow Sessions**

In [9]:
# Instantiating the Session Object
sess = tf.Session()

In [10]:
# The "b" prefix in the output marks a bytes object, not a unicode string
sess.run(hello)

b'Hello World'

In [11]:
sess.run(x)

100

In [12]:
type(sess.run(x))

numpy.int32

In [13]:
type(sess.run(hello))

bytes
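
Because the string comes back as `bytes`, it needs decoding before it can be used as ordinary Python text. A minimal sketch in plain Python, assuming the bytes are UTF-8 encoded:

```python
raw = b'Hello World'        # the same value the session returned above
text = raw.decode('utf-8')  # convert bytes to a regular str

print(type(raw).__name__, type(text).__name__)  # bytes str
print(text)                                     # Hello World
```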

**TensorFlow Operations**

You can line up multiple TensorFlow Operations to be run during a session.

In [14]:
# e.g.
x = tf.constant(2)

In [15]:
y = tf.constant(3)

In [21]:
# The with-block closes the session automatically, similar to opening a file
with tf.Session() as sess:
    print('Operations With Constants')
    print('Addition :', sess.run(x + y))
    print('Subtraction :', sess.run(x - y))
    print('Multiplication :', sess.run(x * y))
    print('Division :', sess.run(x / y))
    

Operations With Constants
Addition : 5
Subtraction : -1
Multiplication : 6
Division : 0.6666666666666666
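
Note that the division result above is a float even though `x` and `y` are integer constants: under Python 3 the `/` operator means true division. For comparison, here is the same arithmetic with NumPy int32 values, along with floor division via `//`. NumPy is used here only as a stand-in to show the distinction.

```python
import numpy as np

x = np.int32(2)
y = np.int32(3)

true_div = x / y    # true division: the result is a float
floor_div = x // y  # floor division: the result stays an integer

print(true_div)   # 0.6666666666666666
print(floor_div)  # 0
```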


**Using Placeholders**

Used when the values are not known until the graph is run.

In [22]:
# Must pass in the data type (dtype). TensorFlow has a number of types available
# see also float16, float64, etc.
x = tf.placeholder(tf.int32)

In [23]:
y = tf.placeholder(tf.int32)

In [24]:
# Notice the "shape=<unknown>"
x

<tf.Tensor 'Placeholder:0' shape=<unknown> dtype=int32>

Depending on the machine type (64-bit or 32-bit, CPU or GPU), there may be errors with types.

Now that we have Placeholders we can define *Operations* that have *Variable Input*, compared to the earlier examples where the operations had *Constant Inputs*.

These are somewhat analogous to Python functions.

In [25]:
# TensorFlow has a number of built-in operations for these; this just shows what's possible
add = tf.add(x, y)
sub = tf.subtract(x, y)
mul = tf.multiply(x, y)

In [32]:
# Build a feed dictionary that maps each placeholder to a value
d = {x: 20, y: 30}

In [33]:
# Pass in a 'Feed Dictionary' (feed_dict): a dictionary of the values that the Operation expects
# Note that the feed_dict keys are the placeholder objects x and y themselves, not strings
with tf.Session() as sess:
    print('Operations with placeholders')
    print('addition: ', sess.run(add, feed_dict={x: 20, y: 30}))
    print('subtraction: ', sess.run(sub, feed_dict=d))
    print('multiplication: ', sess.run(mul, feed_dict=d))

Operations with placeholders
addition:  50
subtraction:  -10
multiplication:  600
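
The analogy to Python functions can be made explicit. In the plain-Python sketch below, the function parameters play the role of placeholders and the dictionary plays the role of `feed_dict`. One difference is that the keys here are strings, whereas `feed_dict` uses the placeholder objects themselves as keys.

```python
def add(x, y):
    return x + y


def sub(x, y):
    return x - y


def mul(x, y):
    return x * y


# The values are supplied only at call time, like a feed dictionary.
d = {'x': 20, 'y': 30}

print('addition: ', add(**d))        # 50
print('subtraction: ', sub(**d))     # -10
print('multiplication: ', mul(**d))  # 600
```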


**A more complex example** using *Matrix Multiplication*

In [34]:
import numpy as np

In [39]:
# Notice that a has shape 1x2 and b has shape 2x1
a = np.array([[5.0, 5.0]])
b = np.array([[2.0], [2.0]])

In [40]:
a.shape

(1, 2)

In [41]:
b.shape

(2, 1)

**Now convert the matrices into tensor objects**

In [44]:
mat1 = tf.constant(a)
mat2 = tf.constant(b)

Matrix Multiplication Operation

In [45]:
matrix_multi = tf.matmul(mat1,mat2)

Now run the session to perform the operation

In [47]:
# Here we're not using a feed dictionary, since the inputs are already constants
with tf.Session() as sess:
    result = sess.run(matrix_multi)
    print(result)

[[20.]]
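
The result can be cross-checked with NumPy alone. This sketch also shows the shape rule for matrix multiplication: the inner dimensions must match, and the outer dimensions give the shape of the result, so swapping the operands changes the output shape.

```python
import numpy as np

a = np.array([[5.0, 5.0]])    # shape (1, 2)
b = np.array([[2.0], [2.0]])  # shape (2, 1)

product = a @ b               # (1, 2) @ (2, 1) -> (1, 1)
print(product)                # [[20.]]

reverse = b @ a               # (2, 1) @ (1, 2) -> (2, 2)
print(reverse.shape)          # (2, 2)
```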
