# Multivariate Calculus

Author: Olatomiwa Bifarin <br>

_This is a Draft Copy_

## Notebook Outline

1. [Introduction](#1)
2. [Derivatives](#2)

_Resources_

-  Multivariable calculus course on Coursera
-  __[Mathematical Python](https://www.math.ubc.ca/~pwalls/math-python/)__
-  https://mml-book.github.io/book/mml-book_printed.pdf
-  https://gwthomas.github.io/docs/math4ml.pdf

## 1. Introduction
<a id="1"></a>

In machine learning, the goal is to find (and learn) patterns in the input data in order to make predictions. __Cost functions__ are an integral part of this process, as they measure how poorly the model is predicting. Calculus provides the tools to minimize this cost, which in turn improves the predictions. Examples can be found in regression problems (curve fitting) and neural networks. The mathematical tools for these tasks are the focus of this notebook.

In summary, the gradient measures how the output of a function changes as its input parameters change, and it points in the direction of steepest ascent. To minimize a cost function, we therefore move the parameters in the opposite direction, that of steepest descent.
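
As a rough illustration of minimizing a cost by following its gradient, here is a minimal sketch. The one-parameter cost `(w - 3)^2`, the learning rate, and the number of steps are all assumptions chosen for illustration; they do not come from any particular model.

```python
def cost(w):
    # Hypothetical one-parameter cost with its minimum at w = 3.
    return (w - 3.0) ** 2


def cost_gradient(w):
    # Derivative of (w - 3)^2 with respect to w.
    return 2.0 * (w - 3.0)


def gradient_descent(w0, learning_rate=0.1, steps=100):
    # Repeatedly step against the gradient, i.e. downhill on the cost.
    w = w0
    for _ in range(steps):
        w -= learning_rate * cost_gradient(w)
    return w


w_min = gradient_descent(w0=0.0)
print(w_min, cost(w_min))
```

Each update moves `w` opposite to the sign of the gradient, so the cost shrinks at every step for this choice of learning rate.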

## 2. Derivatives
<a id="2"></a>

The derivative (gradient) of $ f $ at $ x $:

$$
f'(x) = \lim_{\Delta x \to 0} \frac{f(x+\Delta x) - f(x)}{\Delta x}
$$

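The derivative gives the gradient (slope) of the function at $ x $. Its sign indicates the direction in which $ f $ increases; in higher dimensions, the gradient vector points in the direction of steepest ascent.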
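
The limit definition can be explored numerically. The sketch below evaluates the difference quotient for shrinking values of $ \Delta x $, using $ f(x) = x^2 $ at $ x = 3 $ as an assumed example (the helper name `difference_quotient` is made up for this illustration). The printed values should approach the exact derivative, $ 2x = 6 $.

```python
def f(x):
    # Example function chosen for illustration.
    return x ** 2


def difference_quotient(func, x, dx):
    # (f(x + dx) - f(x)) / dx, the expression inside the limit.
    return (func(x + dx) - func(x)) / dx


x = 3.0
for dx in (1.0, 0.1, 0.01, 0.001):
    print(dx, difference_quotient(f, x, dx))
```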

___Derivatives Function___ <br>
Show in graphs

### a. Differentiation Rules

___Sum Rule___ 

$$
\frac{d ({f(x)+g(x)})}{d {x}} =
\frac{d {f(x)}}{d x} +
\frac{d {g(x)}}{d x}
$$

___Power Rule___ 

$$
f(x) = ax^b
$$

$$
f'(x) = abx^{b-1}
$$
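
As a short worked example (the function is chosen only for illustration), the sum rule and the power rule can be combined term by term:

$$
f(x) = 3x^4 + 2x^{-1}
$$

$$
f'(x) = 3 \cdot 4 \, x^{3} + 2 \cdot (-1) \, x^{-2} = 12x^3 - 2x^{-2}
$$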

___Product Rule___ 

$$
A(x) = f(x)g(x)
$$

$$
A'(x) = f(x)g'(x) + g(x)f'(x)
$$
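
One way to sanity-check the product rule is to compare it against a numerical derivative. The sketch below assumes $ f(x) = x^2 $ and $ g(x) = \sin(x) $ as example functions; `central_difference` is a helper written for this illustration, not a library function.

```python
import math


def central_difference(func, x, h=1e-5):
    # Symmetric finite-difference approximation of func'(x).
    return (func(x + h) - func(x - h)) / (2 * h)


def f(x):
    return x ** 2


def f_prime(x):
    return 2 * x


def g(x):
    return math.sin(x)


def g_prime(x):
    return math.cos(x)


def A(x):
    return f(x) * g(x)


def A_prime(x):
    # Product rule: A'(x) = f(x) g'(x) + g(x) f'(x).
    return f(x) * g_prime(x) + g(x) * f_prime(x)


x = 1.2
print(A_prime(x), central_difference(A, x))
```

The two printed numbers should agree to several decimal places.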

___Chain Rule___ 

If $ h=h(p)$ and $ p=p(m) $:

$$
\frac{d h}{d m} = \frac{d h}{d p} \times \frac{d p}{d m}
$$
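
For instance, taking $ h(p) = \sin(p) $ and $ p(m) = m^2 $ (functions chosen only for illustration), each factor is differentiated separately and then multiplied:

$$
\frac{d h}{d p} = \cos(p), \qquad \frac{d p}{d m} = 2m
$$

$$
\frac{d h}{d m} = \cos(p) \times 2m = 2m \cos(m^2)
$$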


___Quotient Rule___

$$
\left[\frac{f(x)}{g(x)}\right]' = \frac{g(x)f'(x) - f(x)g'(x)}{[g(x)]^2}
$$


___Examples and Special cases___

$$ f(x) = \frac{1}{x} $$
$$ f'(x) = \frac{-1}{x^2} $$
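
This result follows from the quotient rule with numerator $ 1 $ and denominator $ x $, whose derivatives are $ 0 $ and $ 1 $ respectively:

$$
\left[\frac{1}{x}\right]' = \frac{x \cdot 0 - 1 \cdot 1}{x^2} = \frac{-1}{x^2}
$$

The power rule with $ a = 1 $ and $ b = -1 $ gives the same answer, $ -x^{-2} $.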


$$ f(x) = e^x $$
$$ f'(x) = e^x $$
$$ f''(x) = e^x $$
$$ ... $$

$$ f(x) = \sin(x) $$
$$ f'(x) = \cos(x) $$
$$ f''(x) = -\sin(x) $$
$$ f'''(x) = -\cos(x) $$
$$ f''''(x) = \sin(x) $$
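
The four-step cycle of derivatives of the sine function can be checked numerically. In this sketch, each function in the list is differentiated with a finite-difference helper (`central_difference`, written for this illustration) and compared with the next function in the cycle; the evaluation point `x = 0.7` is arbitrary.

```python
import math


def central_difference(func, x, h=1e-5):
    # Symmetric finite-difference approximation of func'(x).
    return (func(x + h) - func(x - h)) / (2 * h)


def neg_sin(x):
    return -math.sin(x)


def neg_cos(x):
    return -math.cos(x)


# sin -> cos -> -sin -> -cos -> sin -> ...
cycle = [math.sin, math.cos, neg_sin, neg_cos]

x = 0.7
for i, func in enumerate(cycle):
    next_func = cycle[(i + 1) % len(cycle)]
    # The numerical derivative of func should match next_func at x.
    print(func.__name__, central_difference(func, x), next_func(x))
```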

### b. Derivatives: Pythonic Examples

### c. Derivatives: Partial and Total Derivatives

## 2. Jacobians and Hessians

### a. Vectors of Derivatives

### b. The Sand Pit: Intuition for the Jacobian and Gradient Descent

### c. The Hessians

## 3. Multivariate Chain Rule and its Application

### a. Neural Network

_Training neural network (change values and annotations)_

In [None]:
import numpy as np

# First we set the state of the network:
# the activation function, one weight, and one bias.
σ = np.tanh
w1 = 1.3
b1 = -0.1

# Then we define the neuron activation.
def a1(a0):
  z = w1 * a0 + b1  # weighted input
  return σ(z)

# Experiment with different values of x below.
x = 0
a1(x)
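
To connect this single neuron to the chain rule, the sketch below differentiates a squared-error cost with respect to the weight and bias. The cost $ C = (a_1 - y)^2 $, the input `a0 = 0.5`, and the target `y = 0.2` are assumptions made for illustration, and the function names are hypothetical; `math.tanh` stands in for `np.tanh` so the example runs on its own.

```python
import math

# Same weight and bias as in the cell above.
w1 = 1.3
b1 = -0.1


def activation(a0, w, b):
    # a1 = tanh(z), with z = w * a0 + b.
    return math.tanh(w * a0 + b)


def cost(a0, y, w, b):
    # Assumed squared-error cost for a single training example.
    return (activation(a0, w, b) - y) ** 2


def cost_gradient(a0, y, w, b):
    # Chain rule: dC/dw = dC/da1 * da1/dz * dz/dw (and likewise for b).
    a1 = activation(a0, w, b)
    dC_da1 = 2.0 * (a1 - y)
    da1_dz = 1.0 - a1 ** 2  # derivative of tanh(z) is 1 - tanh(z)^2
    dz_dw, dz_db = a0, 1.0
    return dC_da1 * da1_dz * dz_dw, dC_da1 * da1_dz * dz_db


a0, y = 0.5, 0.2  # hypothetical input and target
dC_dw, dC_db = cost_gradient(a0, y, w1, b1)
print(dC_dw, dC_db)
```

The only difference between the two derivatives is the last factor, $ \partial z / \partial w = a_0 $ versus $ \partial z / \partial b = 1 $.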


In [None]:
# Define the activation function.
sigma = np.tanh

# Let's use a random initial weight matrix and bias vector
# (2 output neurons, 3 input neurons).
W = np.array([[-0.94529712, -0.2667356 , -0.91219181],
              [ 2.05529992,  1.21797092,  0.22914497]])
b = np.array([ 0.61273249,  1.6422662 ])

# Define our feed-forward function.
def a1(a0):
  # Notice the next line is almost the same as previously,
  # except we are using matrix multiplication rather than scalar multiplication,
  # hence the '@' operator, and not the '*' operator.
  z = W @ a0 + b
  # Everything else is the same though.
  return sigma(z)

# Next, if a training example is,
x = np.array([0.1, 0.5, 0.6])
y = np.array([0.25, 0.75])

# Then the cost function is,
d = a1(x) - y # Vector difference between observed and expected activation
C = d @ d # Squared magnitude (sum of squared components) of the difference.
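
The derivatives of this cost with respect to the weights and biases follow from the chain rule. The sketch below is one possible way to write them out; it uses plain Python lists instead of NumPy arrays so that each index in the chain rule is explicit. The function names (`feed_forward`, `cost_jacobians`) are hypothetical, and the numbers are copied from the cell above.

```python
import math

# Weights, bias, and training example copied from the cell above.
W = [[-0.94529712, -0.2667356, -0.91219181],
     [2.05529992, 1.21797092, 0.22914497]]
b = [0.61273249, 1.6422662]
x = [0.1, 0.5, 0.6]
y = [0.25, 0.75]


def feed_forward(W, b, x):
    # a_i = tanh(z_i), with z_i = sum_j W_ij * x_j + b_i.
    return [math.tanh(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
            for row, b_i in zip(W, b)]


def cost(W, b, x, y):
    # C = sum_i (a_i - y_i)^2, the same quantity as d @ d above.
    return sum((a_i - y_i) ** 2 for a_i, y_i in zip(feed_forward(W, b, x), y))


def cost_jacobians(W, b, x, y):
    a = feed_forward(W, b, x)
    # delta_i = dC/da_i * da_i/dz_i = 2 (a_i - y_i) * (1 - a_i^2).
    delta = [2.0 * (a_i - y_i) * (1.0 - a_i ** 2) for a_i, y_i in zip(a, y)]
    dC_dW = [[d_i * x_j for x_j in x] for d_i in delta]  # dz_i/dW_ij = x_j
    dC_db = list(delta)                                  # dz_i/db_i = 1
    return dC_dW, dC_db


dC_dW, dC_db = cost_jacobians(W, b, x, y)
print(dC_dW)
print(dC_db)
```

The Jacobian with respect to `W` has the same shape as `W` (2 rows, 3 columns), and the Jacobian with respect to `b` has one entry per output neuron.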


### b. Back Propagation

"In this assignment, you will train a neural network to draw a curve by implementing backpropagation by the chain rule to calculate Jacobians of the cost function.

The neural network will then be trained using a (pre-implemented) stochastic steepest descent method, and will draw a series of curves to show the progress of the training.

You will have to copy and edit pre-written python code in this assessment, but will not need to write new code. It should be tractable even to learners with little coding experience, so long as you don't panic about the code and work through each line diligently.

Best of luck!!"


## 4. Taylor Series and Linearisation

https://www.youtube.com/watch?v=3d6DsjIBzJ4&t=165s

### a. Maclaurin Series

### b. Taylor Series

### c. Multivariable Taylor Series

## 5. Optimization

### a. Newton-Raphson

### b. Gradient Descent
