# Deep Learning Frameworks

It is not practical to implement everything from scratch. Fortunately, there are many software frameworks that can help us implement algorithms more efficiently. This is analogous to how we do not write our own matrix multiplication routine, but simply call a linear algebra library to do it for us.

## Choosing Deep Learning Frameworks

A few criteria:

1. Ease of programming in development and deployment (to production)
2. Running speed
3. Truly open (open source with good governance)

# TensorFlow

Let's look at the basic structure of a TensorFlow program. As a simple motivating example, suppose that we have a cost function $J(w) = w^2 - 10w + 25$. Since $J(w) = (w - 5)^2$, the cost is minimized at $w = 5$. Let's see how we can find this value of the parameter $w$ in TensorFlow.

In [1]:
import numpy as np
import tensorflow as tf


In [11]:
# Parameter we want to optimize
w = tf.Variable(0, dtype=tf.float32)

# Cost function - just forward propagation
cost = tf.add(tf.add(w**2, tf.multiply(-10., w)), 25)
# cost = w**2 - 10*w + 25 # This works too
train = tf.train.GradientDescentOptimizer(0.01).minimize(cost)

init = tf.global_variables_initializer()
session = tf.Session()
session.run(init)

# Evaluates w to be zero - we haven't run any training step yet
print(session.run(w))

0.0


In [6]:
session.run(train)

# Evaluate w after one step of gradient descent
print(session.run(w))

0.1


In [8]:
# Run a thousand iterations of gradient descent
for _ in range(1000):
    session.run(train)
print(session.run(w))

4.99999
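
To make explicit what each `session.run(train)` call is doing, here is a minimal plain-Python sketch of the same update rule, with the derivative $J'(w) = 2w - 10$ worked out by hand. The helper names `cost_gradient` and `gradient_descent` are illustrative assumptions, not TensorFlow functions, and the sketch uses Python's double precision rather than `tf.float32`, so the printed digits can differ slightly from the outputs above.

```python
def cost_gradient(w):
    """Hand-derived derivative of J(w) = w**2 - 10*w + 25."""
    return 2.0 * w - 10.0


def gradient_descent(w0=0.0, learning_rate=0.01, steps=1):
    """Apply the update w <- w - learning_rate * J'(w) `steps` times."""
    w = w0
    for _ in range(steps):
        w -= learning_rate * cost_gradient(w)
    return w


# One step from w = 0, then 1001 steps in total (1 + 1000, as in the cells above)
print(gradient_descent(steps=1))
print(gradient_descent(steps=1001))
```

One step from $w = 0$ moves to roughly $0.1$ because $0 - 0.01 \cdot (-10) = 0.1$, and repeated steps approach the minimizer $w = 5$.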


## Feeding Training Data into a TensorFlow Program

Suppose now that the cost depends on input data `x` whose values we only supply at run time. We declare `x` with `tf.placeholder`, which reserves a slot in the graph that is filled in later.

In [12]:
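# x is a placeholder for a (3, 1) matrix whose values are supplied at run time
x = tf.placeholder(tf.float32, [3, 1])

# Cost is now a function of x
cost = x[0][0] * w**2 + x[1][0] * w + x[2][0]

# Rebuild the training op so that it minimizes the new cost
train = tf.train.GradientDescentOptimizer(0.01).minimize(cost)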

Note that our cost function is now a **computation graph** that TensorFlow builds. Because we have specified the forward propagation, TensorFlow can derive the corresponding backward propagation (the gradients) automatically.

That is why we do not have to explicitly implement backpropagation.
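
To illustrate how gradients can be deduced from a recorded forward computation, here is a minimal sketch of reverse-mode differentiation on a scalar graph in plain Python. The `Node` class is a hypothetical helper written for this note. It is not a TensorFlow class and does not describe how TensorFlow is implemented internally; it only shows the idea that every forward operation can record the local derivatives needed for the backward pass.

```python
class Node:
    """A scalar value in a tiny computation graph (hypothetical helper)."""

    def __init__(self, value, parents=()):
        self.value = value
        # Each entry is (parent_node, d(self)/d(parent)), recorded during the forward pass
        self.parents = parents
        self.grad = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Node) else Node(other)
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))

    # Addition and multiplication are commutative, so the reflected
    # versions (e.g. 25.0 + node) can reuse the same code
    __radd__ = __add__
    __rmul__ = __mul__

    def backward(self):
        """Propagate d(self)/d(node) to every node that self depends on."""
        order, seen = [], set()

        def visit(node):
            # Depth-first post-order: parents are listed before their children
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule
        for node in reversed(order):
            for parent, local_grad in node.parents:
                parent.grad += node.grad * local_grad


# Forward pass: only the cost is written down
w = Node(3.0)
cost = w * w + (-10.0) * w + 25.0

# Backward pass: the gradient follows from the recorded graph
cost.backward()
print(cost.value, w.grad)
```

At $w = 3$ the cost is $(3 - 5)^2 = 4$ and the derivative is $2 \cdot 3 - 10 = -4$, which is what the sketch recovers without any hand-written derivative.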

We can now use `feed_dict` to feed training examples into our cost function.

In [15]:
# Training data
coefficients = np.array([[1.], [-10.], [25.]])

In [17]:
# Reset w to 0 so that training starts from scratch
session.run(init)
session.run(train, feed_dict={x: coefficients})

# Evaluate w after one step of gradient descent
print(session.run(w))

0.1


In [18]:
# Run a thousand iterations of gradient descent
for _ in range(1000):
    session.run(train, feed_dict={x: coefficients})
print(session.run(w))

4.99999
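
The benefit of a placeholder is that the same graph can be trained on different data simply by feeding different values. As a rough plain-Python analogy (not TensorFlow code), the sketch below passes the coefficients as an argument instead of hard-coding them. The function names and the second set of coefficients are assumptions made up for illustration.

```python
def cost_gradient(w, x):
    """Derivative of x[0]*w**2 + x[1]*w + x[2] with respect to w."""
    return 2.0 * x[0] * w + x[1]


def minimize(x, learning_rate=0.01, steps=1000):
    """Gradient descent from w = 0 for the cost defined by coefficients x."""
    w = 0.0
    for _ in range(steps):
        w -= learning_rate * cost_gradient(w, x)
    return w


# Same coefficients as above: the cost is (w - 5)**2
first = minimize([1.0, -10.0, 25.0])

# Hypothetical second data set: the cost is (w - 10)**2
second = minimize([1.0, -20.0, 100.0])

print(first, second)
```

The optimization code is unchanged between the two calls; only the data fed in differs, and the result moves from about 5 to about 10 accordingly.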


# Alternative `tf.Session` initializer

The lines of code that initialize a session are fairly idiomatic, so some programmers prefer to write them as follows.

In [19]:
with tf.Session() as session:
    session.run(init)
    for _ in range(1000):
        session.run(train, feed_dict={x: coefficients})
    print(session.run(w))

4.99999


The `with` statement in Python ensures that the session is closed when the block exits, even if something inside the block fails or throws an error. This form is cleaner and, in some sense, safer for creating our `tf.Session`.
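
The cleanup behavior of `with` can be seen in a small sketch that does not need TensorFlow at all. `FakeSession` below is a hypothetical stand-in that only records whether it was closed; it is an assumption for illustration and not part of any library.

```python
class FakeSession:
    """Hypothetical stand-in for a session; only tracks whether it was closed."""

    def __init__(self):
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # Runs when the block exits, whether normally or because of an exception
        self.closed = True
        return False  # do not swallow the exception


session = FakeSession()
try:
    with session:
        raise RuntimeError("simulated failure inside the block")
except RuntimeError:
    pass

print(session.closed)  # True: cleanup ran even though the block raised
```

Without the `with` block, the same failure would skip any cleanup call placed after the failing line unless it were wrapped in an explicit `try`/`finally`.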

# Simple Steps

1. Create tensors (e.g. `tf.Variable`) that are not yet executed/evaluated
2. Write operations between tensors
3. Initialize the variables (`tf.global_variables_initializer`)
4. Create a session (`tf.Session`)
5. Run the operations in the session (`session.run`)