# Chapter 1: Foundations

"The aim of this chapter is to explain some foundational mental models that are essential for understanding how neural networks work. Specifically, we'll cover *nested mathematical functions and their derivatives*."

We'll introduce each foundational concept from three perspectives:
1. Math, in the form of equations
2. Code, with as little extra syntax as possible
3. A diagram explaining what is going on

"one of the challenges of understanding neural networks is that it requires multiple mental models"

## Dependencies

In [28]:
import numpy as np
from numpy import ndarray
from typing import Callable

In [22]:


print("Python list operations")
a = [1,2,3]
b = [4,5,6]
print("a+b", a+b)

Python list operations
a+b [1, 2, 3, 4, 5, 6]


In [3]:
try:
    print(a*b)
except TypeError:
    print("a*b has no meaning for Python lists")

a*b has no meaning for Python lists


In [6]:
print("numpy array operations")
a = np.array([1,2,3])
b = np.array([4,5,6])
print("a + b =", a+b)
print("a * b =", a*b)

numpy array operations
a + b = [5 7 9]
a * b = [ 4 10 18]


In [18]:
a = np.array([[1,2,3],
              [4,5,6]]) 
print(a)


[[1 2 3]
 [4 5 6]]


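Each dimension of the array has an associated axis, which makes it possible to do intuitive numerical calculations along the different axes. For a 2D array, `axis = 0` runs down the rows and `axis = 1` runs across the columns. Summing along `axis = 0` therefore collapses the rows and gives one total per column, while summing along `axis = 1` gives one total per row.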

In [19]:
print('a:')
print(a)
print('a.sum(axis = 0):', a.sum(axis = 0))
print('a.sum(axis = 1):', a.sum(axis = 1))

a:
[[1 2 3]
 [4 5 6]]
a.sum(axis = 0): [5 7 9]
a.sum(axis = 1): [ 6 15]


In [20]:
b = np.array([10, 20, 30])
print("a + b:\n", a + b)

a + b:
 [[11 22 33]
 [14 25 36]]
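
The cell above relies on NumPy broadcasting: `b` has shape `(3,)`, so it is stretched across both rows of the `(2, 3)` array `a`. Here is a minimal sketch of the same result written out explicitly, assuming the same `a` and `b` as above. The names `broadcast` and `explicit` are illustrative only.

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])
b = np.array([10, 20, 30])

# Broadcasting: b (shape (3,)) is added to every row of a (shape (2, 3)).
broadcast = a + b

# The same result, built by repeating b once per row of a before adding.
explicit = a + np.tile(b, (a.shape[0], 1))

print(np.array_equal(broadcast, explicit))  # prints True
```

Broadcasting gives the same values without materializing the repeated copy of `b`.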


Some basic functions written with `numpy`:

In [23]:
def square(x: ndarray) -> ndarray:
    '''
    Square each element in the input ndarray.
    '''
    return np.power(x, 2)


def leaky_relu(x: ndarray) -> ndarray:
    '''
    Apply "Leaky ReLU" function to each element in ndarray.
    '''
    return np.maximum(0.2 * x, x)

In [25]:
square(np.array([1,2,3,4,5,6]))

array([ 1,  4,  9, 16, 25, 36])

In [27]:
leaky_relu(np.array([1,2,-3,4,-5,6]))

array([ 1. ,  2. , -0.6,  4. , -1. ,  6. ])

## Derivatives
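
The `derivative` function in the next cell approximates the derivative numerically with a central difference. Written as an equation, with $\delta$ standing for the `delta` argument and $f$ for `func`:

$$
\frac{df}{dx}(x) \approx \frac{f(x + \delta) - f(x - \delta)}{2\delta}
$$

The default $\delta = 0.001$ is small enough for the smooth functions used here. The result is an approximation, not an exact derivative.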

In [29]:
def derivative(func: Callable[[ndarray], ndarray],
               input_: ndarray,
               delta: float = 0.001) -> ndarray:
    '''
    Evaluates the derivative of a function "func" at every element in
    the "input_" array.
    '''
    return (func(input_ + delta) - func(input_ - delta)) / (2 * delta)

In [31]:
derivative(square, np.array([1,2,4,8,20]) )

array([ 2.,  4.,  8., 16., 40.])
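
As a sanity check, the numerical result can be compared with the derivatives known from calculus: $2x$ for `square`, and a slope of $1$ for positive inputs or $0.2$ for negative inputs for `leaky_relu`. The sketch below redefines the functions from the cells above so that it runs on its own. The test points and the `np.allclose` comparison are choices made for illustration.

```python
import numpy as np
from numpy import ndarray
from typing import Callable


def square(x: ndarray) -> ndarray:
    return np.power(x, 2)


def leaky_relu(x: ndarray) -> ndarray:
    return np.maximum(0.2 * x, x)


def derivative(func: Callable[[ndarray], ndarray],
               input_: ndarray,
               delta: float = 0.001) -> ndarray:
    # Central difference, as defined in the cell above.
    return (func(input_ + delta) - func(input_ - delta)) / (2 * delta)


x = np.array([1.0, 2.0, -3.0, 4.0, -5.0])

# square: the analytic derivative is 2x.
numeric_square = derivative(square, x)
print(np.allclose(numeric_square, 2 * x))

# leaky_relu: slope 1 where x > 0, slope 0.2 where x < 0.
# The test points stay away from 0, where the function has a kink.
numeric_leaky = derivative(leaky_relu, x)
analytic_leaky = np.where(x > 0, 1.0, 0.2)
print(np.allclose(numeric_leaky, analytic_leaky))
```

Both comparisons should agree to within floating-point tolerance. Near `x = 0` the central difference for `leaky_relu` would straddle the kink and return a value between the two slopes.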