In [9]:
%matplotlib inline

In [10]:
import matplotlib.pyplot as plt
import numpy as np
import scipy.optimize

## Preparation (Based on Linear Regression)

Prepare the training data: prepend a bias column of ones, shuffle the rows, and use the first 400 rows as the training set.

In [20]:
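# Load the housing data; the last column is the target value.
data_original = np.loadtxt('data/housing.data')
# Prepend a column of ones so that theta[0] acts as the intercept term.
data = np.insert(data_original, 0, 1, axis=1)
# Shuffle the rows in place, then take the first 400 as the training set.
np.random.shuffle(data)
train_X = data[:400, :-1]
train_y = data[:400, -1]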

m, n = train_X.shape
theta = np.random.rand(n)
theta.shape


(14,)


Define some necessary functions.

In [12]:
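def cost_function(theta, X, y):
    # Half the sum of squared residuals.
    squared_errors = (X.dot(theta) - y) ** 2
    J = 0.5 * squared_errors.sum()
    return J

def gradient(theta, X, y):
    # Analytic gradient of cost_function with respect to theta: X^T (X theta - y).
    errors = X.dot(theta) - y
    return errors.dot(X)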
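
For reference, the two functions above correspond to the expressions below. This is a sketch in notation that is assumed here rather than taken from the notebook: $x^{(j)}$ is the j-th row of `X`, $y^{(j)}$ the j-th target, and $m$ the number of rows.

```latex
J(\theta) = \frac{1}{2} \sum_{j=1}^{m} \left( x^{(j)\top} \theta - y^{(j)} \right)^2,
\qquad
\nabla_\theta J(\theta) = X^\top \left( X\theta - y \right)
```

Note that neither expression is divided by $m$, matching the code as written.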

## Gradient Checking

Define the step size `epsilon`. Don't set it too low: with a very small step, floating-point round-off dominates the difference quotient.

In [13]:
epsilon = 1e-4
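
To illustrate why a very small step hurts, here is a minimal sketch using a scalar function rather than the regression cost. The helper `central_difference` and the choice of `math.exp` as the test function are assumptions made for this illustration only.

```python
import math

def central_difference(f, x, eps):
    """Approximate f'(x) with a symmetric (central) difference quotient."""
    return (f(x + eps) - f(x - eps)) / (2 * eps)

# Hypothetical test function: exp is its own derivative, so the exact
# derivative at x = 1 is simply math.exp(1.0).
exact = math.exp(1.0)

errors = {}
for eps in (1e-2, 1e-5, 1e-13):
    approx = central_difference(math.exp, 1.0, eps)
    errors[eps] = abs(approx - exact)
    print(f"eps={eps:.0e}  abs error={errors[eps]:.3e}")
```

The error shrinks as the step goes from 1e-2 to 1e-5, then grows again at 1e-13, where the two function values are so close that round-off swamps their difference.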

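Prepare the perturbed theta vectors (making use of NumPy broadcasting): row i of `theta_plus` is `theta` with `epsilon` added to component i only, and `theta_minus` is built the same way with `epsilon` subtracted.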

In [23]:
theta_plus = theta + epsilon * np.identity(len(theta))
theta_minus = theta - epsilon * np.identity(len(theta))
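
The broadcasting step is easier to see on a tiny example. The sketch below uses a hypothetical 3-element vector and an exaggerated step size (`theta_demo` and `eps_demo` are illustrative names, not part of the notebook).

```python
import numpy as np

theta_demo = np.array([1.0, 2.0, 3.0])   # hypothetical 3-parameter vector
eps_demo = 0.5                           # exaggerated step for readability

# A (3,) vector plus a (3, 3) matrix broadcasts the vector across every row
# of the scaled identity, so row i equals theta_demo with eps_demo added to
# component i only.
theta_plus_demo = theta_demo + eps_demo * np.identity(len(theta_demo))
print(theta_plus_demo)
```

Each row is one perturbed copy of the parameter vector, so indexing with `[i]` yields the vector needed for the i-th partial derivative.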


Compute the absolute differences between the numerical gradient (the central difference quotient evaluated for each component of theta) and the analytic gradient returned by our `gradient` function above.
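
Written out, the difference quotient evaluated in the next cell is the following, where $e_i$ denotes the i-th unit vector (a symbol introduced here for illustration):

```latex
\frac{\partial J}{\partial \theta_i}(\theta)
\approx
\frac{J(\theta + \epsilon e_i) - J(\theta - \epsilon e_i)}{2\epsilon}
```

Here $\theta + \epsilon e_i$ is row i of `theta_plus`, and $\theta - \epsilon e_i$ is row i of `theta_minus`.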

In [17]:
diffs = np.empty_like(theta)
# The analytic gradient does not depend on i, so compute it once.
gradient_lin_reg = gradient(theta, train_X, train_y)
for i in range(len(theta)):
    # Central difference approximation of the i-th partial derivative.
    gradient_def = (
        (cost_function(theta_plus[i], train_X, train_y) - cost_function(theta_minus[i], train_X, train_y)) /
        (2 * epsilon)
        )
    diffs[i] = np.absolute(gradient_def - gradient_lin_reg[i])

In [18]:
diffs

array([  7.67482561e-06,   2.36367341e-05,   3.26307490e-05,
         4.27542254e-05,   2.29503712e-06,   3.41320992e-05,
         3.94347589e-05,   2.87033617e-06,   1.32125570e-05,
         3.99977434e-05,   1.50203705e-05,   2.42795795e-06,
         6.56396151e-05,   1.25207007e-05])

**Lookin' good!** The smaller the values, the better.<br>
(Any value significantly larger than 1e-4 indicates a problem.)

In [19]:
assert np.all(diffs < 1e-4)

Quality check: passed with distinction.
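
The whole check can also be wrapped in a reusable helper. The sketch below is self-contained and makes several assumptions: it uses synthetic data in place of the housing file, and it reports a relative error (one common choice, not what this notebook uses) so that the threshold does not depend on the scale of the gradient. The names `numerical_gradient` and `relative_error` are hypothetical.

```python
import numpy as np

def cost_function(theta, X, y):
    return 0.5 * ((X.dot(theta) - y) ** 2).sum()

def gradient(theta, X, y):
    return (X.dot(theta) - y).dot(X)

def numerical_gradient(f, theta, epsilon=1e-4):
    """Central-difference estimate of the gradient of f at theta."""
    steps = epsilon * np.identity(len(theta))
    return np.array([
        (f(theta + step) - f(theta - step)) / (2 * epsilon)
        for step in steps
    ])

def relative_error(a, b):
    """Scale-independent discrepancy between two gradient vectors."""
    return np.linalg.norm(a - b) / (np.linalg.norm(a) + np.linalg.norm(b))

# Synthetic stand-in for the housing data (assumption: 50 rows, 4 columns,
# the first column being the bias term).
rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=(50, 3))])
y = rng.normal(size=50)
theta = rng.random(4)

numeric = numerical_gradient(lambda t: cost_function(t, X, y), theta)
good = relative_error(numeric, gradient(theta, X, y))

# A deliberately broken gradient (wrong sign) should fail the check.
bad = relative_error(numeric, -gradient(theta, X, y))

print(f"correct gradient: {good:.2e}")
print(f"broken gradient:  {bad:.2e}")
```

With a correct gradient the relative error should be tiny, while the sign-flipped gradient gives a value close to 1, which makes the pass/fail decision easy to read off.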