In [1]:
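import pandas as pd
import numpy as np
import tensorflow as tf
# This notebook uses the TensorFlow 1.x graph API (placeholders, sessions).
# On TensorFlow 2.x, use: import tensorflow.compat.v1 as tf; tf.disable_v2_behavior()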

# Intro to TensorFlow

https://www.tensorflow.org/get_started/get_started

### Goals
- Gain a basic understanding of the what, how, and why of TensorFlow
- Implement a simple multi-layer perceptron

## TensorFlow Basics

TensorFlow, like other deep learning libraries, is really good at gradient descent: you describe a computation, and it works out the gradients for you.

There are three types of objects:
- Placeholders, which are filled with real data at run time.
- Variables, which are the model parameters and can be updated using gradient descent.
- Constants, which are fixed values that never change.

Use these objects to construct a loss function. Then use gradient descent to find the best parameters, given the data.
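
To make these roles concrete, here is a minimal sketch of the same recipe in plain Python rather than TensorFlow. The data play the role of placeholders, `w` and `b` play the role of variables, and the learning rate is a constant. The function name `fit_line`, the data points, and the step size are illustrative assumptions; TensorFlow would derive the gradients automatically, whereas here they are written out by hand.

```python
def fit_line(xs, ys, lr=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0  # "variables": parameters updated at every step
    n = len(xs)
    for _ in range(steps):
        # Residuals of the current model on the data (the "placeholders")
        residuals = [w * x + b - y for x, y in zip(xs, ys)]
        # Hand-derived gradients of mean(residual ** 2) with respect to w and b
        grad_w = 2.0 / n * sum(r * x for r, x in zip(residuals, xs))
        grad_b = 2.0 / n * sum(residuals)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Points generated from y = 3x - 3, so the fit should approach w = 3, b = -3
xs = [0.0, 1.0, 2.0, 3.0]
ys = [3.0 * x - 3.0 for x in xs]
w, b = fit_line(xs, ys)
print(w, b)
```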

### Constants

In [4]:
node1 = tf.constant(3.0, tf.float32)
node2 = tf.constant(4.0)

In [6]:
sess = tf.InteractiveSession()

In [8]:
sess.run([node1 + node2, node1, 2 * node1])

[7.0, 3.0, 6.0]

### Placeholders
Placeholders are the objects that will be filled with real data at run time.

In [10]:
a = tf.placeholder(tf.float32)
b = tf.placeholder(tf.float32)
adder_node = a + b

In [11]:
adder_node

<tf.Tensor 'add_3:0' shape=<unknown> dtype=float32>

In [17]:
sess.run(adder_node, feed_dict={a: [3, .19], b: [8, 0]})

array([ 11.  ,   0.19], dtype=float32)

### Variables

Consider the linear equation below. Its slope and intercept are the parameters, so in TensorFlow they become Variables.
$$
y = 3 x - 3
$$

In [18]:
# Data
x = tf.placeholder(tf.float32)

# Define the model
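# Pass the dtype by keyword: the second positional argument of tf.Variable is `trainable`
W = tf.Variable([3.], dtype=tf.float32)
b = tf.Variable([-3.], dtype=tf.float32)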
linear_model = W * x + b


Variables need to be initialized before they can be used:

In [19]:
sess.run(tf.global_variables_initializer())

In [20]:
sess.run(linear_model, {x: [1, 2, 3]})

array([ 0.,  3.,  6.], dtype=float32)

Or we could define some y values and see how well the model fits them:

In [21]:
y = tf.placeholder(tf.float32)
error = tf.square(linear_model - y)

sess.run(error, {x: [1, 2, 3], y: [0, 3, 4]})

array([ 0.,  0.,  4.], dtype=float32)

# Linear Regression

## Crime Data

In [22]:
from sklearn.model_selection import train_test_split

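# Load some crime data
# The column name is the second whitespace-separated token on each line of comm_names.txt
# (.squeeze('columns') replaces squeeze=True, which recent pandas versions no longer accept)
headers = pd.read_csv('comm_names.txt').squeeze('columns')
headers = headers.apply(lambda s: s.split()[1])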
crime = (pd.read_csv('http://archive.ics.uci.edu/ml/machine-learning-databases/communities/communities.data', 
                    header=None, na_values=['?'], names=headers)
         .iloc[:, 5:]
         .dropna()
         )

# Set target and predictors
target = 'ViolentCrimesPerPop'
predictors = [c for c in crime.columns if c != target]

# Train/test split
X = crime[predictors]
y = crime[[target]]
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=2)

In [24]:
y_train.head()

|    | ViolentCrimesPerPop |
|---:|--------------------:|
| 54 | 0.56 |
| 41 | 0.09 |
| 13 | 0.25 |
| 1  | 0.35 |
| 19 | 0.63 |


### Define the model

In [25]:
# Parameters
dim_input = X_train.shape[1]
dim_output = 1

# Input
x = tf.placeholder(tf.float32, [None, dim_input])  # to match X_train

# Output
y_ = tf.placeholder(tf.float32, [None, dim_output])  # to match y_train

# Variables
W = tf.Variable(tf.random_normal([dim_input, dim_output]))
b = tf.Variable(tf.random_normal([dim_output]))

# Model
y = tf.matmul(x, W) + b  # linear regression model

# Loss
mse = tf.reduce_mean(tf.square(y - y_))

# Optimizer
optimizer = tf.train.AdamOptimizer(0.01)  # Adam adapts the step size per parameter; 0.01 is the base learning rate
train_step = optimizer.minimize(mse)

Initialize

In [26]:
sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())

In [29]:
X_train.shape

(239, 121)

In [33]:
sess.run(W)[:5]

array([[ 0.49057177],
       [ 1.46746576],
       [ 1.92892373],
       [ 0.07348144],
       [ 0.67604882]], dtype=float32)

In [34]:
sess.run(b)

array([ 0.10284797], dtype=float32)

View loss

In [32]:
sess.run(mse, {x: X_train, y_: y_train})

13.324575

Execute a training step

In [139]:
sess.run(train_step, {x: X_train, y_: y_train})

View error on training data

In [140]:
sess.run(mse, {x: X_train, y_: y_train})

1.4396427

In [143]:
sess.run(mse, {x: X_test, y_: y_test})

2.0132751

In [141]:
sess.run(W)[:5]

array([[ 0.46455124],
       [ 1.54276574],
       [ 2.06411052],
       [ 0.31604427],
       [ 0.35056594]], dtype=float32)

In [142]:
sess.run(b)

array([ 0.18959984], dtype=float32)

Parameters: both W and b have moved away from their random initial values.

### Exercise

1: Run 10000 gradient descent steps of the model above. Every 500 iterations, note the train error and the test error.

In [None]:
for i in range(10000):
    pass  # TODO: run train_step; every 500 iterations, record the train and test MSE
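
As a hint for the structure of such a loop, here is a sketch in NumPy rather than TensorFlow. It runs 10000 gradient descent steps on a linear model and records the train and test error every 500 iterations. The synthetic data, the names `X_tr`, `X_te`, `history`, and the learning rate are all assumptions made for illustration; in the exercise, each update would be a `sess.run(train_step, ...)` call on the crime data instead.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the crime data (assumption): 200 rows, 5 predictors
X = rng.normal(size=(200, 5))
true_w = np.array([[1.0], [-2.0], [0.5], [0.0], [3.0]])
y = X @ true_w + 0.1 * rng.normal(size=(200, 1))

# Simple split: first 150 rows for training, the remaining 50 for testing
X_tr, X_te = X[:150], X[150:]
y_tr, y_te = y[:150], y[150:]


def mse(X, y, W, b):
    """Mean squared error of the linear model X @ W + b."""
    return float(np.mean((X @ W + b - y) ** 2))


W = np.zeros((5, 1))
b = np.zeros(1)
lr = 0.01
history = []  # tuples of (iteration, train MSE, test MSE)

for i in range(10000):
    residual = X_tr @ W + b - y_tr
    # Gradients of the training MSE with respect to W and b
    W -= lr * (2.0 / len(X_tr)) * (X_tr.T @ residual)
    b -= lr * 2.0 * residual.mean(axis=0)
    if (i + 1) % 500 == 0:
        history.append((i + 1, mse(X_tr, y_tr, W, b), mse(X_te, y_te, W, b)))

for step, train_err, test_err in history:
    print(step, round(train_err, 4), round(test_err, 4))
```

Keeping the results in a list makes it easy to plot both curves afterwards and see where the test error stops improving.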


2: Compare your results above to LinearRegression in scikit-learn.

3: In Week 5, we found that the best ridge regularization parameter for this data was alpha=11.8. Try to add the same amount of regularization to the TensorFlow model above, then compare with ridge regression in scikit-learn.
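
One detail worth checking when matching the regularization: scikit-learn's `Ridge` minimizes the sum of squared errors plus `alpha` times the squared norm of the weights, while the loss above is a mean. Dividing the whole ridge objective by the number of rows `n` leaves the minimizer unchanged, so the matching penalty on a mean squared error is `alpha / n` times the squared norm. The NumPy sketch below checks this on synthetic data. The data, the learning rate, and the omission of an intercept (which `Ridge` does not penalize) are simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, alpha = 200, 5, 11.8

# Synthetic data (assumption); no intercept term, to keep the algebra short
X = rng.normal(size=(n, d))
y = X @ rng.normal(size=(d, 1)) + 0.1 * rng.normal(size=(n, 1))

# Closed-form minimizer of ||X w - y||^2 + alpha * ||w||^2
w_ridge = np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

# Gradient descent on mean((X w - y)^2) + (alpha / n) * ||w||^2,
# which is the objective above divided by n and so has the same minimizer
w = np.zeros((d, 1))
lr = 0.1
for _ in range(2000):
    grad = (2.0 / n) * (X.T @ (X @ w - y) + alpha * w)
    w -= lr * grad

# Largest absolute difference between the two solutions
print(np.abs(w - w_ridge).max())
```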

# Multi-layer Perceptron (MLP)

![](mlp.png)

### Exercise

Build a multi-layer perceptron to predict crime rates.

Start with two hidden units. You should be able to define one matrix that transforms the inputs to the hidden layer, and a second matrix that transforms the hidden layer to the output.

Don't forget to add a bias at each step and to apply a nonlinear transformation to the hidden layer (e.g. tf.nn.sigmoid()).
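
Before writing the TensorFlow version, it can help to see the shapes involved. The sketch below is a NumPy forward pass only (no training) for a network with one hidden layer. The toy sizes, the random inputs, and the names `W1`, `b1`, `W2`, `b2` are assumptions for illustration, not part of the exercise solution.

```python
import numpy as np


def sigmoid(z):
    """Elementwise logistic function, the nonlinearity for the hidden layer."""
    return 1.0 / (1.0 + np.exp(-z))


rng = np.random.default_rng(0)
n_rows, dim_input, dim_hidden, dim_output = 4, 3, 2, 1  # toy sizes (assumption)

X = rng.normal(size=(n_rows, dim_input))

# Input-to-hidden weights and bias
W1 = rng.normal(size=(dim_input, dim_hidden))
b1 = np.zeros(dim_hidden)

# Hidden-to-output weights and bias
W2 = rng.normal(size=(dim_hidden, dim_output))
b2 = np.zeros(dim_output)

hidden = sigmoid(X @ W1 + b1)  # shape (n_rows, dim_hidden)
y_hat = hidden @ W2 + b2       # shape (n_rows, dim_output); linear output for regression

print(hidden.shape, y_hat.shape)  # (4, 2) (4, 1)
```

The output layer is left linear here because the target is a real-valued crime rate.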

In [None]:
dim_hidden = 2

# input

# output

# Input to hidden


# Hidden to output


# Model


# Loss


# Optimizer


Once you have something working, it is time to tune your network to find the right number of hidden units and amount of regularization.

1. Use your code block from above that performs gradient descent steps and records intermediate results.
2. You might want to force the optimizer to be stochastic. That is, feed it 100 random training examples at each step instead of the whole training dataset.
3. Start with two hidden units and try to get the regularization right. Then slowly increase the number of hidden units and continue tuning the regularization.
4. If the training error is high, you have too much bias. If the training and testing errors are very different, you have too much variance. If the training or testing errors are jumping all over the place, your step size is too high.
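
The random sampling in step 2 can be sketched as follows. This is a NumPy illustration with stand-in arrays; the helper name `sample_batch` and the array sizes are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in arrays (assumption); in the notebook these would be the training data
X_all = rng.normal(size=(239, 6))
y_all = rng.normal(size=(239, 1))


def sample_batch(X, y, batch_size, rng):
    """Return batch_size matching rows of X and y, drawn without replacement."""
    idx = rng.choice(len(X), size=batch_size, replace=False)
    return X[idx], y[idx]


X_batch, y_batch = sample_batch(X_all, y_all, 100, rng)
print(X_batch.shape, y_batch.shape)  # (100, 6) (100, 1)
```

With the pandas objects used in this notebook, the rows would be selected with `.iloc[idx]` instead, and the resulting batch passed to `sess.run` in place of the full training set.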

In [None]:
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

# Bonus: Add _another_ hidden layer.

Can you decrease the MSE on the test set even further?

In [None]:
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()