# Educational neural networks in Python

This code is loosely inspired by [Andrej Karpathy](https://cs.stanford.edu/people/karpathy/)'s excellent but discontinued [Hacker's guide to Neural Networks](http://karpathy.github.io/neuralnets/).

This implementation is not a one-to-one translation of the original JavaScript code into Python, but [there](https://github.com/urwithajit9/HG_NeuralNetwork) [are](https://github.com/johnashu/hackers_guide_to_neural_networks) [many](https://github.com/saiashirwad/Hackers-Guide-To-Neural-Networks-Python) [repositories](https://github.com/pannous/karpathy_neuralnets_python) [on](https://github.com/techniquark/Hacker-s-Guide-to-Neural-Networks-in-Python) [GitHub](https://github.com/Mutinix/hacker-nn/) that closely match it line by line. Use those to follow along with the blog post.

The purpose of this version is to simplify network definition and to automate the computation of the forward and backward passes. In Karpathy's code, both tasks are written out by hand, step by step, for clarity's sake.

# Single gate circuit

In the example below, we define a network implementing the function $f(a, b) = ab$ and evaluate it at $a = 3$, $b = -1$.

The module `utils.sugar` contains syntactic sugar that keeps boilerplate code to a minimum.

In [24]:
from utils.sugar import *

a, b = const(3, -1)
ab = a * b

print 'a * b = ', ab.compute()

a * b =  -3.0


The gradients flowing back from the output can be computed as follows:

In [25]:
ab.backprop(grad=1)

print 'a = {} (gradient {})'.format(a.val, a.grad)
print 'b = {} (gradient {})'.format(b.val, b.grad)

a = 3.0 (gradient -1.0)
b = -1.0 (gradient 3.0)
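
These gradients agree with the analytic partial derivatives of $f(a, b) = ab$, namely $\partial f / \partial a = b = -1$ and $\partial f / \partial b = a = 3$. As a sanity check that does not depend on this repository, the sketch below estimates the same values with central finite differences. The helper `numerical_gradient` is a hypothetical name introduced for illustration and is not part of `gates` or `utils.sugar`.

```python
def f(a, b):
    """The single-gate circuit from above: f(a, b) = a * b."""
    return a * b


def numerical_gradient(func, args, index, h=1e-6):
    """Central-difference estimate of d func / d args[index].

    Hypothetical helper for illustration only; not part of this repository.
    """
    hi = list(args)
    lo = list(args)
    hi[index] += h
    lo[index] -= h
    return (func(*hi) - func(*lo)) / (2 * h)


a, b = 3.0, -1.0
grad_a = numerical_gradient(f, (a, b), 0)  # should be close to b
grad_b = numerical_gradient(f, (a, b), 1)  # should be close to a

print('df/da ~ {:.4f}, df/db ~ {:.4f}'.format(grad_a, grad_b))
```

A numerical check of this kind is a common way to validate a hand-written backward pass before trusting it on larger circuits.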


Without using `utils.sugar`, the code looks as follows:

In [26]:
from gates import *

a = Constant(3)
b = Constant(-1)
ab = MulGate(a, b)

print 'a * b = ', ab.compute()

a * b =  -3.0


Each operation and parameter in a formula is represented by a gate, which abstracts a differentiable network unit.

Each gate has an associated value and gradient, which can be accessed through the `val` and `grad` members.
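
To make the idea concrete, here is a minimal sketch of what such gates might look like. It is an assumption-laden illustration, not the code in `gates`: the class names `ToyConstant` and `ToyMulGate` are hypothetical, and only the `val` and `grad` attributes and the `compute`/`backprop` method names are taken from the examples above.

```python
class ToyConstant(object):
    """Leaf gate holding a fixed value (hypothetical, for illustration)."""

    def __init__(self, val):
        self.val = float(val)
        self.grad = 0.0

    def compute(self):
        # Forward pass: a leaf simply returns its stored value.
        return self.val

    def backprop(self, grad):
        # Backward pass: accumulate the gradient arriving from the output.
        self.grad += grad


class ToyMulGate(object):
    """Gate computing left * right (hypothetical, for illustration)."""

    def __init__(self, left, right):
        self.left = left
        self.right = right
        self.val = None
        self.grad = 0.0

    def compute(self):
        # Forward pass: evaluate both inputs, then multiply.
        self.val = self.left.compute() * self.right.compute()
        return self.val

    def backprop(self, grad):
        # Chain rule: d(xy)/dx = y and d(xy)/dy = x.
        self.grad += grad
        self.left.backprop(grad * self.right.val)
        self.right.backprop(grad * self.left.val)


a = ToyConstant(3)
b = ToyConstant(-1)
ab = ToyMulGate(a, b)

result = ab.compute()   # forward pass must run before backprop
ab.backprop(1.0)

print('a * b = {}'.format(result))
print('a.grad = {}, b.grad = {}'.format(a.grad, b.grad))
```

The forward pass caches each gate's value in `val`, so the backward pass can apply the chain rule locally without recomputing anything.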

# Minima of a polynomial

As an example, let us find the minimum of the function $f(x) = 3x^2 - 2x + 1$ by using gradient descent.

In [27]:
x = param(1)  # starting solution
f = 3 * x**2 - 2 * x + 1

for _ in range(16):
    
    print 'f({:.5}) = {:.5}'.format(x.val, f.compute())
    f.backprop(grad=-1, lr=0.1)

f(1.0) = 2.0
f(0.6) = 0.88
f(0.44) = 0.7008
f(0.376) = 0.67213
f(0.3504) = 0.66754
f(0.34016) = 0.66681
f(0.33606) = 0.66669
f(0.33443) = 0.66667
f(0.33377) = 0.66667
f(0.33351) = 0.66667
f(0.3334) = 0.66667
f(0.33336) = 0.66667
f(0.33334) = 0.66667
f(0.33334) = 0.66667
f(0.33334) = 0.66667
f(0.33333) = 0.66667


The iteration quickly converges to the single global minimum, which is attained at $x = \frac{1}{3}$ with value $f(\frac{1}{3}) = \frac{2}{3}$.
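
The printed iterates are consistent with the plain update $x \leftarrow x - 0.1 \, f'(x)$, where $f'(x) = 6x - 2$. This simplifies to $x \leftarrow 0.4x + 0.2$, whose fixed point is $x = \frac{1}{3}$. The sketch below reproduces the run without the gate machinery, assuming that `backprop(grad=-1, lr=0.1)` moves each parameter by `lr * grad` times its derivative. That reading is inferred from the output above, not from the library's source.

```python
def f(x):
    """The polynomial being minimised."""
    return 3 * x**2 - 2 * x + 1


def df(x):
    """Its analytic derivative, f'(x) = 6x - 2."""
    return 6 * x - 2


x = 1.0    # starting solution, as in the example above
lr = 0.1   # learning rate, as in the example above
history = []

for _ in range(16):
    history.append((x, f(x)))
    # Step against the gradient (assumed equivalent to grad=-1, lr=0.1).
    x -= lr * df(x)

for xi, fi in history[:3]:
    print('f({:.5}) = {:.5}'.format(xi, fi))
```

Because the distance to $\frac{1}{3}$ shrinks by a factor of 0.4 at every step, the error after 15 updates is below $10^{-6}$, which is why the last printed lines stop changing at five significant digits.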