# Chapter 1: Foundations
This chapter explains *nested mathematical functions and their derivatives*. These mental models are essential for understanding how neural networks work.

## Functions
A function describes mathematically how an input is transformed into an output. For example, $f_1(x) = x^2$ transforms any $x$ into $x^2$.

Another way to think of functions is as black boxes with internal logic. The `Square` box takes $n$ as input and turns it into $n^2$. The `ReLU` box takes $n$ as input and turns it into $\max(n, 0)$.

### Code
#### NumPy
> The data we deal with in neural networks will always be held in a multidimensional array that is almost always either one-, two-, three-, or four-dimensional, but especially two- and three-dimensional.

NumPy's `ndarray` class allows us to operate on these arrays in quick and intuitive ways.

In [24]:
import numpy as np

print("Python list operations:")
a = [1, 2, 3]
b = [4, 5, 6]
print("a+b:", a+b)
try:
    print(a*b)
except TypeError:
    print("a*b has no meaning for Python lists.")
print()
print("numpy array operations:")
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print("a+b:", a+b)
print("a*b:", a*b) 

Python list operations:
a+b: [1, 2, 3, 4, 5, 6]
a*b has no meaning for Python lists.

numpy array operations:
a+b: [5 7 9]
a*b: [ 4 10 18]


An `ndarray` is a zero-indexed `n`-dimensional array. 2D arrays are common in deep learning, and it helps to picture a 2D `ndarray` with `axis = 0` as the rows and `axis = 1` as the columns. Summing along `axis=0` therefore adds the rows together, giving one total per column. `ndarray`s make it easy to apply functions along these axes.

In [10]:
a = np.array([[1, 2], [3, 4]])
print('a:', a)
print('a.sum(axis=0):', a.sum(axis=0))
print('a.sum(axis=1):', a.sum(axis=1))

a: [[1 2]
 [3 4]]
a.sum(axis=0): [4 6]
a.sum(axis=1): [3 7]
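
One way to remember which axis is which: summing along an axis removes that axis from the shape. The sketch below uses a non-square array so the two results have different lengths; the names `m`, `col_sums`, and `row_sums` are illustrative and not from the text above.

```python
import numpy as np

# A 2x3 array: 2 rows (axis 0), 3 columns (axis 1).
m = np.array([[1, 2, 3],
              [4, 5, 6]])

col_sums = m.sum(axis=0)  # collapses the rows: one sum per column
row_sums = m.sum(axis=1)  # collapses the columns: one sum per row

print("m.shape:", m.shape)
print("col_sums:", col_sums, "shape:", col_sums.shape)
print("row_sums:", row_sums, "shape:", row_sums.shape)
```

Here `col_sums` has one entry per column (length 3) and `row_sums` has one entry per row (length 2).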


`ndarray`s also support adding a 1D array along the last axis of a higher-dimensional array. For a 2D array `a` with `R` rows and `C` columns, we can add a 1D array `b` of length `C`, and NumPy adds `b` elementwise to each row of `a`.

In [23]:
a = np.array([[1,2,3], [4,5,6]])
b = np.array([10, 20, 30])
print("a+b:\n", a+b)

a+b:
 [[11 22 33]
 [14 25 36]]
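
NumPy calls this behavior broadcasting. As a rough sketch, the sum above matches what we would get by explicitly stacking `b` once per row, and a 1D array whose length does not match the number of columns is rejected. The names `b_stacked`, `bad`, `same`, and `raised` are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3)
b = np.array([10, 20, 30])             # shape (3,)

# Explicitly repeat b once per row of a, giving shape (2, 3).
b_stacked = np.tile(b, (a.shape[0], 1))
same = np.array_equal(a + b, a + b_stacked)
print("broadcast matches explicit stacking:", same)

# A length-2 array cannot be broadcast against the 3 columns of a.
bad = np.array([10, 20])
try:
    a + bad
    raised = False
except ValueError:
    raised = True
print("length mismatch raises ValueError:", raised)
```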


#### Type-checked functions
Informative type signatures help convey what a function actually does. Compare the transparency of the type signature
```
def __init__(self,
            layers: List[Layer],
            loss: Loss,
            learning_rate: float = 0.01) -> None:
```

to the more succinct but less informative signature `def operation(x1, x2)`. Rewriting it as `def operation(x1: ndarray, x2: ndarray) -> ndarray:` makes it clear that the function takes two `ndarray`s and returns an `ndarray`. This book uses type-checked functions like these to increase clarity.
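
To make the comparison concrete, here is a minimal sketch of an annotated function. The body (an elementwise product) is a placeholder chosen for illustration, since the text does not say what `operation` computes. Note that Python stores these annotations but does not enforce them at runtime; an external tool such as mypy is needed to actually check them.

```python
import numpy as np
from numpy import ndarray

def operation(x1: ndarray, x2: ndarray) -> ndarray:
    '''Placeholder body (an assumption): elementwise product of two arrays.'''
    return x1 * x2

result = operation(np.array([1, 2, 3]), np.array([4, 5, 6]))
print("result:", result)

# The hints are stored on the function for readers and tools to inspect...
print(operation.__annotations__)

# ...but Python itself does not check them: plain ints are accepted too.
unchecked = operation(2, 3)
print("unchecked:", unchecked)
```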

### Basic Functions in NumPy
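Revisiting the `square` function discussed earlier, along with `leaky_relu`, a variant of ReLU that scales negative inputs by 0.2 instead of setting them to zero...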

In [29]:
from numpy import ndarray
def square(x: ndarray) -> ndarray:
    '''Square each element in the input ndarray.'''
    return np.power(x,2)

def leaky_relu(x:ndarray) -> ndarray:
    '''Apply "Leaky ReLU" function to each element in ndarray.'''
    return np.maximum(0.2*x, x)
    
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print("square(a): ", square(a))
print("leaky_relu(b): ", leaky_relu(b))

square(a):  [1 4 9]
leaky_relu(b):  [4. 5. 6.]
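
The inputs above are all positive, so Leaky ReLU returns them unchanged (as floats, because of the multiplication by `0.2`). The sketch below adds a plain `relu`, following the `ReLU` box described earlier in the chapter, and runs both functions on an array with negative entries. The function `relu` and the input `c` are not part of the original listing.

```python
import numpy as np
from numpy import ndarray

def relu(x: ndarray) -> ndarray:
    '''Apply ReLU, max(x, 0), to each element in the ndarray.'''
    return np.maximum(x, 0)

def leaky_relu(x: ndarray) -> ndarray:
    '''Apply "Leaky ReLU" to each element, as defined above.'''
    return np.maximum(0.2 * x, x)

c = np.array([-2.0, -0.5, 0.0, 3.0])
print("relu(c):      ", relu(c))
print("leaky_relu(c):", leaky_relu(c))
```

Plain ReLU maps every negative entry to zero, while Leaky ReLU keeps a scaled-down copy of it (for example, `-2.0` becomes `-0.4`).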


## Derivatives
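The derivative of a function at a point is the rate of change of its output with respect to its input at that point. Geometrically, it is the slope of the line tangent to the function's graph there. The code below approximates a derivative numerically.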

In [35]:
import numpy as np
from typing import Callable
import scipy.special

def deriv(func: Callable[[ndarray], ndarray],
          input_: ndarray,
          delta: float = 0.001) -> ndarray:
    '''Evaluates the derivative of a function "func" at every element in the "input_" array.'''
    return (func(input_ + delta) - func(input_ - delta)) / (2 * delta)

def f(input_: ndarray) -> ndarray:
    a = np.power(input_, 3)
    return a

deriv(f,3)

27.000000999995777
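
The code above uses the central-difference approximation $\frac{df}{dx}(a) \approx \frac{f(a + \Delta) - f(a - \Delta)}{2\Delta}$. As a sanity check, the sketch below compares it with the exact derivative of $x^3$, which is $3x^2$, at several points at once. The helper `cube` and the test points `xs` are illustrative.

```python
import numpy as np
from numpy import ndarray
from typing import Callable

def deriv(func: Callable[[ndarray], ndarray],
          input_: ndarray,
          delta: float = 0.001) -> ndarray:
    '''Central-difference approximation of the derivative of "func".'''
    return (func(input_ + delta) - func(input_ - delta)) / (2 * delta)

def cube(x: ndarray) -> ndarray:
    '''Cube each element in the input ndarray.'''
    return np.power(x, 3)

xs = np.array([-2.0, 0.0, 1.0, 3.0])
approx = deriv(cube, xs)          # numerical estimate
exact = 3 * np.power(xs, 2)       # analytical derivative 3x^2

print("approx:", approx)
print("exact: ", exact)
print("max abs error:", np.max(np.abs(approx - exact)))
```

For a cubic, expanding $(x + \Delta)^3 - (x - \Delta)^3$ shows that the central difference equals $3x^2 + \Delta^2$, so with $\Delta = 0.001$ the error is about $10^{-6}$. This accounts for the trailing digits in the result printed above.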

## Nested Functions
Given functions $f_1, f_2, \ldots, f_n$, a nested function is one in which the output of one function becomes the input of the next. For example, $y = f_2(f_1(x))$.

In [None]:
from typing import Callable, List
from numpy import ndarray

# A function takes in an ndarray as an argument and produces an ndarray
Array_Function = Callable[[ndarray], ndarray]

# A Chain is a list of functions
Chain = List[Array_Function]

def chain_length_2(chain: Chain,
                   a: ndarray) -> ndarray:
    '''Evaluates two functions in a row, in a "Chain".'''
    assert len(chain) == 2, \
        "Length of input 'chain' should be 2"

    f1 = chain[0]
    f2 = chain[1]
    # Apply f1 to the input first, then f2 to the result.
    return f2(f1(a))
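
A short usage sketch follows. It restates the helper functions compactly so that it runs on its own, and it passes the input through the first function in the chain before the second. The input values and the result names `sq_then_relu` and `relu_then_sq` are illustrative.

```python
import numpy as np
from numpy import ndarray
from typing import Callable, List

Array_Function = Callable[[ndarray], ndarray]
Chain = List[Array_Function]

def square(x: ndarray) -> ndarray:
    return np.power(x, 2)

def leaky_relu(x: ndarray) -> ndarray:
    return np.maximum(0.2 * x, x)

def chain_length_2(chain: Chain, a: ndarray) -> ndarray:
    '''Evaluates two functions in a row, in a "Chain".'''
    assert len(chain) == 2, "Length of input 'chain' should be 2"
    f1, f2 = chain
    return f2(f1(a))

x = np.array([-2.0, 1.0, 3.0])

# leaky_relu(square(x)): squaring first makes every entry non-negative.
sq_then_relu = chain_length_2([square, leaky_relu], x)
# square(leaky_relu(x)): the negative entry is scaled by 0.2 before squaring.
relu_then_sq = chain_length_2([leaky_relu, square], x)

print("square then leaky_relu:", sq_then_relu)
print("leaky_relu then square:", relu_then_sq)
```

The two results differ in the first entry (4 versus 0.16), which shows that the order of functions in a chain matters.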