# Educational neural networks in Python

This code is loosely inspired to [Andrej Karpathy](https://cs.stanford.edu/people/karpathy/)'s excellent [Hacker's guide to Neural Networks](http://karpathy.github.io/neuralnets/).

This implementation is not a one-to-one transliteration of the original javascript code into Python, but [there](https://github.com/urwithajit9/HG_NeuralNetwork) [are](https://github.com/johnashu/hackers_guide_to_neural_networks) [many](https://github.com/saiashirwad/Hackers-Guide-To-Neural-Networks-Python) [repositories](https://github.com/pannous/karpathy_neuralnets_python) [on](https://github.com/techniquark/Hacker-s-Guide-to-Neural-Networks-in-Python) [Github](https://github.com/Mutinix/hacker-nn/) that closely match it line-by-line. Use those to follow along the blog post.

The main purpose of this version is to simplify network definition and automate the computation of forward and backward passes. Both these tasks are exploded and manual (for clarity's sake!) in Karpathy's code.

# Base case: single gate in the circuit

Shows a single gate implementing f(x,y) = xy.

In [2]:
from iogates import Constant
from opgates import MulGate

a = Constant(3)
b = Constant(-1)
ab = MulGate(a, b)

print 'a * b = ', ab.compute()

a * b =  -3


The code above is equivalent to:

In [3]:
from sugar import *

a = const(3)
b = const(-1)
ab = a * b

print 'a * b = ', ab.compute()


a * b =  -3


## Strategy #1: Random Local Search

Random search, perturb inputs and accept them if they improve the output.

In [4]:
from random import random

best_in = (-2, 3)
best_out = best_in[0] * best_in[1]

print 'Initial output: {} * {} = {}'.format(best_in[0], best_in[1], best_out)

tweak_amount = 0.01
for _ in range(100):
    
    best_plus_noise = tuple(x + tweak_amount * (random() * 2 - 1) for x in best_in)
    out = best_plus_noise[0] * best_plus_noise[1]
    if out > best_out:
        best_in = best_plus_noise
        best_out = out
        
print 'Final output: {:.3} * {:.3} = {:.3}'.format(best_in[0], best_in[1], best_out)

Initial output: -2 * 3 = -6
Final output: -1.82 * 2.83 = -5.17


## Strategy #2: Numerical Gradient

Perform one step of numerical gradient descent.

In [5]:
a, b = -2, 3  # initial inputs
eps = 0.0001  # tweak amount
out = a * b

da = ((a + eps) * b - out) / eps  # 3.0
db = (a * (b + eps) - out) / eps  # -2.0

step_size = 0.01
a, b = a + step_size * da, b + step_size * db
print 'Initial output: {}\nFinal output: {:.3}'.format(out, a * b)

Initial output: -6
Final output: -5.87


## Strategy #3: Analytic Gradient

Perform one step of gradient descent using analytical derivatives.

In [6]:
a = param(-2)
b = param(3)
ab = a * b

print 'Initial output: {:}'.format(ab.compute())
ab.backprop(lr=0.01)
print 'Final output: {:.3}'.format(ab.compute())

Initial output: -6
Final output: -5.87


# Recursive Case: Circuits with Multiple Gates

Here's an example with multiple gates that depend on each other:

In [16]:
x, y, z = param((-2, 5, -4))
xpyz = (x + y) * z

print 'Initial output: {:}'.format(xpyz.compute())  # -12
xpyz.backprop(0.01)
print 'Final output: {:.4}'.format(xpyz.compute())  # -11.59

Initial output: -12
Final output: -11.59


Let's compare analytical and numerical gradients:

In [17]:
from utils import check_gradients
assert(check_gradients(xpyz, verbose=True))

Param value: -2.04, Analytical grad: -3.97, Numerical grad: -3.97, Diff: 1.2167e-06
Param value: 4.96, Analytical grad: -3.97, Numerical grad: -3.97, Diff: 1.2167e-06
Param value: -3.97, Analytical grad: 2.92, Numerical grad: 2.92, Diff: -2.416e-07


# Example: single neuron

A 2-dimensional neuron tcomputes the following function f(x,y,a,b,c) = σ(ax + by + c) where σ is the sigmoid function.

In [18]:
a, b, c = param((1.0, 2.0, -3.0))
x, y = const((-1.0, 3.0))
s = sigmoid(a * x + b * y + c)

assert(check_gradients(s, verbose=True))
print '---'
print 'Initial output: {:}'.format(s.compute())  # 0.880797077978
s.backprop(0.01)
print 'Final output: {:.5}'.format(s.compute())  # 0.882


Param value: 1.0, Analytical grad: -0.10499, Numerical grad: -0.10499, Diff: -2.3805e-07
Param value: 2.0, Analytical grad: 0.31498, Numerical grad: 0.31498, Diff: -6.2994e-08
Param value: -3.0, Analytical grad: 0.10499, Numerical grad: 0.10499, Diff: -9.5013e-08
---
Initial output: 0.880797077978
Final output: 0.882


A single neuron can also be defined as a single gate with five inputs:

In [20]:
a, b, c = param((1.0, 2.0, -3.0))
x, y = const((-1.0, 3.0))
n = neuron(a, b, c, x, y)

assert(check_gradients(n, verbose=True))
print '---'
print 'Initial output: {:}'.format(n.compute())  # 0.880797077978
n.backprop(0.01)
print 'Final output: {:.5}'.format(n.compute())  # 0.882


Param value: 1.0, Analytical grad: -0.10499, Numerical grad: -0.10499, Diff: -2.3805e-07
Param value: 2.0, Analytical grad: 0.31498, Numerical grad: 0.31498, Diff: -6.2994e-08
Param value: -3.0, Analytical grad: 0.10499, Numerical grad: 0.10499, Diff: -9.5013e-08
---
Initial output: 0.880797077978
Final output: 0.882


# Binary classification

In [29]:
from random import choice

dataset = (((1.2, 0.7), +1), ((-0.3, 0.5), -1), ((-3, -1), +1),
           ((0.1, 1.0), -1), ((3.0, 1.1), -1), ((2.1, -3), +1))

a, b, c = param((1.0, 2.0, -3.0))
x, y = const(dataset[0][0])  # constant in the sense that it is not updated by incoming gradients
f = a * x + b * y + c


def evaluate_training_accuracy():
    total_correct = 0
    for training_example in dataset:
        (x.val, y.val), label = training_example
        output = 1 if f.compute() > 0 else -1
        correct = 1 if output == label else 0
        total_correct += correct
    return total_correct / len(dataset)


for iteration in range(400):
    
    (x.val, y.val), label = choice(dataset)
    output = f.compute()
    
    f.grad = 0
    if label == 1 and output < 1:
        f.grad = 1
    if label == -1 and output > -1:
        f.grad = -1
    f.backprop()

    for p in f.parameters():
        p.grad += p.val
    f.update_parameters(0.01)

    if (iteration + 1) % 25 == 0:
        print 'Accuracy at iteration {}: {}'.format(iteration, evaluate_training_accuracy())

Accuracy at iteration 24: 0
Accuracy at iteration 49: 0
Accuracy at iteration 74: 0
Accuracy at iteration 99: 0
Accuracy at iteration 124: 0
Accuracy at iteration 149: 0
Accuracy at iteration 174: 0
Accuracy at iteration 199: 0
Accuracy at iteration 224: 0
Accuracy at iteration 249: 0
Accuracy at iteration 274: 0
Accuracy at iteration 299: 0
Accuracy at iteration 324: 0
Accuracy at iteration 349: 0
Accuracy at iteration 374: 0
Accuracy at iteration 399: 0
