added intro tutorial on neural networks
bfortuner committed Apr 21, 2017
1 parent 51a5f08 commit 9f542de
Showing 18 changed files with 349 additions and 57 deletions.
16 changes: 16 additions & 0 deletions code/activation_functions.py
@@ -0,0 +1,16 @@
import math
import numpy as np

def relu(z):
    return max(0, z)

def relu_prime(z):
    if z > 0:
        return 1
    return 0

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    return sigmoid(z) * (1 - sigmoid(z))
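
Note: relu() above relies on Python's built-in max(), so it only accepts scalar inputs. A minimal vectorized sketch (an editorial aside, not part of this commit; the name relu_vectorized is invented) that also handles numpy arrays:

    import numpy as np

    def relu_vectorized(z):
        # np.maximum compares element-wise, so z may be a scalar or an array
        return np.maximum(0, z)

    relu_vectorized(np.array([-2.0, 0.0, 3.0]))  # array([0., 0., 3.])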
24 changes: 17 additions & 7 deletions code/nn.py
@@ -1,9 +1,19 @@
-def myfunction(var1):
-    return var1 + 1
+import math
+import numpy as np

-class MyClass():
-    def __init__(self, var2):
-        self.var2 = var2
+def relu(z):
+    return max(0, z)

-    def do_something(self):
-        return "Hey"
+def feed_forward(x, Wh, Wo):
+    # Hidden layer
+    Zh = x * Wh
+    H = relu(Zh)
+
+    # Output layer
+    Zo = H * Wo
+    output = relu(Zo)
+    return output
+
+# Zh - Hidden layer weighted input
+# H  - Hidden layer activation
+# Zo - Output layer weighted input
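
For illustration, a hypothetical call to feed_forward with invented scalar values:

    # Hypothetical usage (values invented for illustration)
    x, Wh, Wo = 0.5, 0.2, 0.9
    feed_forward(x, Wh, Wo)
    # Zh = 0.5 * 0.2 = 0.1, H = relu(0.1) = 0.1
    # Zo = 0.1 * 0.9 = 0.09, output = relu(0.09) = 0.09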
41 changes: 41 additions & 0 deletions code/nn_simple.py
@@ -0,0 +1,41 @@
import math
import numpy as np

def relu(z):
    return max(0, z)

def feed_forward(x, Wh, Wo):
    # Hidden layer
    Zh = x * Wh
    H = relu(Zh)

    # Output layer
    Zo = H * Wo
    output = relu(Zo)
    return output

def relu_prime(z):
    # ReLU derivative: 1 for positive inputs, 0 otherwise
    if z > 0:
        return 1
    return 0

def cost(yHat, y):
    return 0.5 * (yHat - y)**2

def cost_prime(yHat, y):
    return yHat - y

def backprop(x, y, Wh, Wo, lr):
    # Forward pass, keeping the intermediate values
    # that the backward pass needs
    Zh = x * Wh
    H = relu(Zh)
    Zo = H * Wo
    yHat = relu(Zo)

    # Layer error
    Eo = (yHat - y) * relu_prime(Zo)
    Eh = Eo * Wo * relu_prime(Zh)

    # Cost derivative for weights
    dWo = Eo * H
    dWh = Eh * x

    # Update weights
    Wh -= lr * dWh
    Wo -= lr * dWo
    return Wh, Wo
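
A hypothetical training loop built on the functions above (data, learning rate, and iteration count are invented; it assumes the corrected backprop, which returns the updated weights):

    # Hypothetical training loop (illustration only)
    x, y = 0.5, 0.25      # single training example
    Wh, Wo = 0.1, 0.4     # initial weights
    lr = 0.01             # learning rate

    for epoch in range(1000):
        Wh, Wo = backprop(x, y, Wh, Wo, lr)

    cost(feed_forward(x, Wh, Wo), y)  # should shrink toward zero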
49 changes: 26 additions & 23 deletions docs/activation_functions.rst
@@ -19,67 +19,70 @@ LeakyReLU

Be the first to contribute!

.. _relu:

ReLU
====

.. image:: images/relu.png
    :align: center

A recent invention which stands for Rectified Linear Unit. The formula is deceptively simple: :math:`max(0,z)`. Despite its name and appearance, it's not linear, and it provides the same benefits as Sigmoid but with better computational performance.

.. math::
-  R(z) & = max(0,z) \\
+  R(z) = \begin{Bmatrix}
+  z & z > 0 \\
+  0 & otherwise \\
+  \end{Bmatrix}\\

-::
-
-  def relu(z):
-    if z > 0:
-      return z
-    return 0
+.. literalinclude:: ../code/activation_functions.py
+   :language: python
+   :pyobject: relu

**Derivative**

The derivative of ReLU is piecewise constant: 1 when z is positive, 0 otherwise.

.. math::
-  R'(z) & = \begin{Bmatrix}
+  R'(z) = \begin{Bmatrix}
   1 & z>0 \\
   0 & z<0 \\
   \end{Bmatrix}

-::
-
-  def relu_prime(z):
-    if z > 0:
-      return 1
-    return 0
+.. literalinclude:: ../code/activation_functions.py
+   :language: python
+   :pyobject: relu_prime

.. _sigmoid:

Sigmoid
=======

-There are many types of activation functions to choose from, but one of the most popular among textbook-writers is the logistic sigmoid function. Sigmoid takes in a real value and outputs another value between 0 and 1. It’s easy to work with and has all the nice properties above: it’s non-linear, continuously differentiable, monotonic, and has a fixed output range.
+.. image:: images/sigmoid.png
+    :align: center
+
+Sigmoid takes a real value as input and outputs another value between 0 and 1. It’s easy to work with and has all the nice properties of activation functions: it’s non-linear, continuously differentiable, monotonic, and has a fixed output range.

.. math::
  S(z) = \frac{1} {1 + e^{-z}}

-::
-
-  def sigmoid(z):
-    return 1.0 / (1.0 + np.exp(-z))
+.. literalinclude:: ../code/activation_functions.py
+   :language: python
+   :pyobject: sigmoid

**Derivative**

.. math::
  S'(z) = S(z) * (1 - S(z))

-::
-
-  def sigmoid_prime(z):
-    return sigmoid(z) * (1 - sigmoid(z))
+.. literalinclude:: ../code/activation_functions.py
+   :language: python
+   :pyobject: sigmoid_prime
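
As a quick numeric sanity check (a hypothetical snippet, not part of the commit), sigmoid is 0.5 at z = 0, where its derivative peaks at 0.25:

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    sigmoid(0)                     # 0.5
    sigmoid(0) * (1 - sigmoid(0))  # 0.25, the derivative's maximum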


Softmax
2 changes: 1 addition & 1 deletion docs/glossary.rst
@@ -30,7 +30,7 @@ Bias Metric

- **High bias** (with low variance) suggests your model may be underfitting and you're using the wrong architecture for the job.

-.. _bias_term:
+.. _ bias_term:

Bias Term
Allows models to represent patterns that do not pass through the origin. For example, if all of my features were 0, would my output also be zero? Is it possible there is some base value upon which my features have an effect? Bias terms typically accompany weights and are attached to neurons or filters.
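
A toy sketch (hypothetical code, not part of this commit) of what the bias term buys:

    # Without a bias term, the model is pinned to the origin: f(0) = 0
    def predict_no_bias(x, w):
        return w * x

    # With a bias term, the model can learn a base value: f(0) = b
    def predict_with_bias(x, w, b):
        return w * x + b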
Binary file added docs/images/backprop_3_equations.png
Binary file added docs/images/backprop_ff_equations.png
Binary file added docs/images/backprop_visually.png
Binary file added docs/images/memoization.png
Binary file added docs/images/neural_network_complex.png
Binary file added docs/images/neural_network_simple.png
Binary file added docs/images/relu.png
Binary file added docs/images/sigmoid.png
Binary file added docs/images/simple_nn_diagram_zo_zh_defined.png
6 changes: 1 addition & 5 deletions docs/index.rst
@@ -1,17 +1,13 @@
Machine Learning Cheatsheet
===========================

-.. toctree::
-  :maxdepth: 2
-  :titlesonly:
-
A glossary of short visual explanations of machine learning concepts with diagrams, code examples and links to resources for learning more.

Topics
======

.. toctree::
-  :maxdepth: 2
+  :maxdepth: 3

  glossary
  basics
2 changes: 2 additions & 0 deletions docs/loss_functions.rst
@@ -4,6 +4,8 @@
Loss functions
==============

+Metrics used to quantify how "good" or "bad" our model is at making predictions. The smaller the loss, the better our model (unless we overfit).
+

Cross-Entropy Loss
==================
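
For a concrete sense of scale, a hypothetical snippet using the squared-error cost defined in code/nn_simple.py (values invented):

    def cost(yHat, y):
        return 0.5 * (yHat - y)**2

    cost(0.9, 1.0)  # ~0.005, close prediction, small loss
    cost(0.1, 1.0)  # ~0.405, far-off prediction, large loss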
11 changes: 8 additions & 3 deletions docs/nn.rst
@@ -4,12 +4,17 @@
Neural networks
===============

-Neural networks are a class of machine learning algorithms used to model complex patterns in datasets using multiple hidden layers and non-linear activation functions. A neural network takes an input, passes it through multiple layers of hidden neurons (mini-functions with unique coefficients that must be learned), and outputs a prediction representing the combined input of all the neurons. Neural networks are trained iteratively using optimization techniques like gradient descent. After each cycle of training, an error metric is calculated based on the difference between prediction and target. The derivatives of this error metric are calculated and propagated back through the network using a technique called backpropagation. Each neuron's coefficients (weights) are then adjusted relative to how much they contributed to the total error. This process is repeated iteratively until the network error drops below an acceptable threshold.
+Neural networks are a class of machine learning algorithms used to model complex patterns in datasets using multiple hidden layers and non-linear activation functions. A neural network takes an input, passes it through multiple layers of hidden neurons (mini-functions with unique coefficients that must be learned), and outputs a prediction representing the combined input of all the neurons.

-** Topics **
+.. image:: images/neural_network_complex.png
+    :align: center
+
+Neural networks are trained iteratively using optimization techniques like gradient descent. After each training cycle, an error metric is calculated from the difference between prediction and target. The derivatives of this error metric are propagated back through the network using a technique called backpropagation. Each neuron's coefficients (weights) are then adjusted relative to how much they contributed to the total error. This process repeats until the network error drops below an acceptable threshold.
+
+**Topics**

.. toctree::
-    :maxdepth: 1
+    :maxdepth: 2
    :titlesonly:

    nn_concepts
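
To make the forward-pass description concrete, a minimal matrix-based sketch (an editorial illustration; layer sizes, weights, and data are invented, mirroring the scalar code in code/nn_simple.py):

    import numpy as np

    def relu(Z):
        return np.maximum(0, Z)

    # Invented sizes: 3 input features, 4 hidden neurons, 1 output
    rng = np.random.default_rng(0)
    Wh = rng.normal(size=(3, 4))   # input -> hidden weights
    Wo = rng.normal(size=(4, 1))   # hidden -> output weights

    def feed_forward(X, Wh, Wo):
        Zh = X @ Wh      # hidden layer weighted input
        H = relu(Zh)     # hidden layer activation
        Zo = H @ Wo      # output layer weighted input
        return relu(Zo)  # prediction

    X = rng.normal(size=(5, 3))  # 5 examples, 3 features each
    feed_forward(X, Wh, Wo)      # predictions with shape (5, 1)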
