# Introduction to Math

## Math as a Tool

Over the course of your career analyzing data, you will come across many tools to help you complete your assigned tasks. Almost all of them require some math, whether that be an understanding of set theory, calculus, linear algebra or statistics.

The goal of this workshop is to familiarize you with the fundamental mathematics behind the more advanced analysis techniques you'll come across.

## Calculus

To put it in very simple terms, calculus is the mathematics that describes infinitesimal change. If you drive, you'll see an example of calculus in your car's very own speedometer: **a measure of your speed as a change in distance over a change in time.**

In calculus, you will see two major techniques: differentiation and integration. Differential calculus tells us how an output responds to an infinitesimal change in a variable. Integral calculus tells us the total change in an output as a variable moves across a range, found by adding up all of those infinitesimal contributions.

### Differential Calculus 

Differential Calculus is the mathematics behind how a change in one variable changes an output. Going back to the example of the car speedometer, speed is a measure of a change in distance over a change in time:

<center>$V = \frac{ds}{dt}$</center>

**NOTE:** The `d` in the formula above denotes an infinitesimally small change in a quantity. It is the limiting version of Δ ("delta"), the symbol often used in math and physics to describe change.

Another way of phrasing this is to say that speed is the **derivative** of distance with respect to time. Derivatives are a powerful tool: they not only tell you how a function changes, but can also help you find where a function reaches a maximum or minimum value, because the derivative is equal to 0 at those points. The slope of a line is another example of a derivative, as it is the change in y over the change in x.


Formally, the derivative of a function is defined as:

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}$$

Here is an example of a visualization to determine the maximum/minimum of a function.

In [None]:
# Don't worry about the package imports for now.
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt

# Here, I define a function in python where f(x) = 9 - (x-3)^2
def f(x):
  return 9 - (x-3)**2

# Here, I define the derivative function as f' or fprime.
def fprime(x):
  h = 1e-5
  return (f(x+h) - f(x))/h

x = np.linspace(-3,9,19)

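# Create two stacked plots. The figure size is passed here because calling
# plt.figure() afterwards would open a second, empty figure instead.
fig, axs = plt.subplots(2, figsize=(10, 8))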
fig.suptitle('Plots of the function f(x) and its derivative, fprime(x)')
axs[0].plot(x, f(x))
axs[1].plot(x, fprime(x))

At x = 3, our function f(x) reaches its maximum value of 9, and its derivative crosses 0 at that same point. We will use derivatives frequently when we start talking about reducing error in our models.
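The same result can be worked out by hand from the limit definition above. This is a sketch of the algebra for the example function f(x) = 9 - (x-3)^2 used in the code cell:

$$\begin{aligned} f(x+h) - f(x) &= \left[9 - (x+h-3)^2\right] - \left[9 - (x-3)^2\right] \\ &= -2h(x-3) - h^2 \end{aligned}$$

$$f'(x) = \lim_{h \to 0} \frac{-2h(x-3) - h^2}{h} = \lim_{h \to 0} \left[-2(x-3) - h\right] = -2(x-3)$$

Setting f'(x) = 0 gives x = 3, and f(3) = 9, which matches the plot. Note that `fprime` in the code uses a small but finite `h`, so it returns approximately -2(x-3) - h: very close to the true derivative, but not exactly 0 at x = 3.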

### Integral Calculus

Integration is the inverse operation of differentiation. Where the derivative gives the slope of a curve, the integral gives the area underneath it (counted as negative where the curve dips below zero). You can think of an integral as a measure of accumulated change, i.e. the net result of a large number of small changes, positive or negative, to some quantity of interest that have built up over time (or along some dimension more generally).

Going back to the speedometer example, if we were to integrate the function for speed over some interval of time, we would get the **total** distance the car covered in that amount of time.

While there are many ways to calculate an integral, I will use a [Riemann Sum](https://en.wikipedia.org/wiki/Riemann_sum#:~:text=In%20mathematics%2C%20a%20Riemann%20sum,of%20curves%20and%20other%20approximations) to represent an integral.

$$\int_a^b f(x)\,dx = \lim_{n \to \infty} \sum_{i=1}^n f(x_i)\,\Delta x$$

$$\Delta x = \frac{b-a}{n}$$

$$x_i = a + i \Delta x$$

Many curves don't come with a simple formula for the area underneath them. The Riemann sum approximates that area by fitting n rectangles under the curve and adding up the areas of the individual rectangles. For any finite n the sum is only an approximation, and the more rectangles used, the more accurate it becomes. Here is a picture showing how the approximation improves as the number of rectangles goes up.

![Riemann sum rectangles](../assets/riemann-sum-rectangles.png)
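The formulas above can be tried out directly in code. This is a minimal sketch, not a production integrator: `riemann_sum` is a helper name made up for this example, and it reuses f(x) = 9 - (x-3)^2 from the derivative section on the interval from 0 to 6, where the exact area works out to 36.

```python
import numpy as np

def f(x):
    # Same example function as in the derivative section
    return 9 - (x - 3) ** 2

def riemann_sum(func, a, b, n):
    # Hypothetical helper: right-endpoint Riemann sum with n rectangles
    dx = (b - a) / n                    # width of each rectangle
    x_i = a + dx * np.arange(1, n + 1)  # x_i = a + i * dx for i = 1..n
    return np.sum(func(x_i) * dx)       # add up the rectangle areas

# The approximation should approach the exact area (36) as n grows
for n in (10, 100, 1000):
    print(n, riemann_sum(f, 0, 6, n))
```

Increasing `n` further brings the printed value closer to 36, which is the limit in the formula above.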

Integrals will be key in understanding probability and probability functions.

## Linear Algebra

Linear algebra is a branch of math that deals with linear equations of the following form, which it represents and solves using matrices and vectors.

$$a_1x_1 + a_2x_2+a_3x_3+...+a_nx_n = b$$

### Example

Solve the following system of equations for values of a, b and c.

```
2a + b + 2c = 20
a + 5b + 3c = 38
4a + b + 2c = 26
```

We could solve it with the substitution method, but that would take too long. Instead, let's represent this system of linear equations as a matrix multiplied by a vector:

$$\begin{bmatrix} 2 & 1 & 2 \\ 1 & 5 & 3 \\ 4 & 1 & 2 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} 20 \\ 38 \\ 26 \end{bmatrix}$$

⚠️ If you're unsure of how the above equation is equal to the system of equations we had before, this is an example of matrix multiplication. [This video by PatrickJMT](https://www.youtube.com/watch?v=YtMYfvypgM4&ab_channel=patrickJMT) covers it visually.

Essentially, you multiply each row of the left matrix by the columns of the matrix on the right. The matrix on the right with only 1 column can also be referred to as a vector. Try doing the matrix multiplication by hand after watching the video.
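To make the row-by-column rule concrete, here is a small sketch that multiplies the matrix from the example by a vector in two ways: row by row "by hand", and with NumPy's `@` operator. The vector `v = (1, 2, 3)` is an arbitrary choice for illustration and is not the solution of the system.

```python
import numpy as np

A = np.array([[2, 1, 2],
              [1, 5, 3],
              [4, 1, 2]])
v = np.array([1, 2, 3])  # arbitrary illustration vector (an assumption)

# "By hand": multiply each row of A element-wise by v, then add up the products
by_hand = np.array([np.sum(row * v) for row in A])

# The same thing using NumPy's matrix multiplication operator
product = A @ v

print(by_hand)
print(product)
```

Both print statements should show the same three numbers, one per row of the matrix.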

To solve the above equation, there are a number of possible procedures. The most basic of them is [Gauss-Jordan elimination](https://en.wikipedia.org/wiki/Gaussian_elimination), where we convert the equation to the form:

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} A \\ B \\ C \end{bmatrix}$$

And in this case, we have `a = A`, `b = B` and `c = C`. 

[This video](https://www.youtube.com/watch?v=eYSASx8_nyg&ab_channel=TheOrganicChemistryTutor) demonstrates how to solve this system of equations by hand.  
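For the curious, the elimination steps can also be written out as code. This is a rough teaching sketch under simple assumptions (a square system with exactly one solution), and `gauss_jordan` is a name made up for this example rather than a NumPy function.

```python
import numpy as np

def gauss_jordan(A, b):
    # Hypothetical helper: solve A x = b by Gauss-Jordan elimination.
    # Build the augmented matrix [A | b] using floats.
    M = np.hstack([np.asarray(A, dtype=float),
                   np.asarray(b, dtype=float).reshape(-1, 1)])
    n = M.shape[0]
    for col in range(n):
        # Pick the row (at or below the diagonal) with the largest entry in this column
        pivot = col + np.argmax(np.abs(M[col:, col]))
        if np.isclose(M[pivot, col], 0.0):
            raise ValueError("The matrix is singular; no unique solution.")
        M[[col, pivot]] = M[[pivot, col]]  # swap the pivot row into place
        M[col] = M[col] / M[col, col]      # scale so the diagonal entry becomes 1
        for row in range(n):
            if row != col:
                # Subtract a multiple of the pivot row to zero out this column
                M[row] = M[row] - M[row, col] * M[col]
    # The left block is now the identity matrix; the last column holds the solution
    return M[:, -1]

A = [[2, 1, 2], [1, 5, 3], [4, 1, 2]]
b = [20, 38, 26]
print(gauss_jordan(A, b))
```

The printed values are the `A`, `B` and `C` from the identity-matrix form above.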

Luckily we have Python! 🎉 We won't have to solve these by hand! 🎉 We'll get the computer to solve it instead.

This is how it is done in Python:

In [None]:
# We use numpy to solve the matrix version of our system of equations
import numpy as np

Matrix = np.array([[2,1,2], [1,5,3], [4,1,2]])
Solutions = np.array([20, 38, 26])
x = np.linalg.solve(Matrix, Solutions)
x

Run the code block above. You will see that we obtain `a = 3`, `b = 4` & `c = 5` as expected.
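A good habit is to check a solution by plugging it back into the original equations. The sketch below does that by multiplying the matrix by the solution vector and comparing against the right-hand side. The names `A` and `b` are chosen for this example only.

```python
import numpy as np

A = np.array([[2, 1, 2], [1, 5, 3], [4, 1, 2]])
b = np.array([20, 38, 26])

x = np.linalg.solve(A, b)

# A @ x should reproduce the right-hand side (up to floating point rounding)
print(A @ x)
print(np.allclose(A @ x, b))
```

`np.allclose` is used instead of `==` because floating point results can differ from the exact values by a tiny rounding error.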

### Note on Matrices

Some matrices also have an interesting property: they are *diagonalizable*, meaning that with a suitable change of basis they can be rewritten in the form of:

$$\begin{bmatrix} x & 0 & 0 \\ 0 & y & 0 \\ 0 & 0 & z \end{bmatrix}$$

where every entry off the diagonal is zero and the diagonal entries x, y and z are the [Eigenvalues](https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors#:~:text=In%20linear%20algebra%2C%20an%20eigenvector,which%20the%20eigenvector%20is%20scaled) of the original matrix.

If you would like to know how diagonalization works by hand, feel free to watch [the following video](https://www.youtube.com/watch?v=WTLl03D4TNA&ab_channel=ProfessorDaveExplains).
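NumPy can also do this for us. The sketch below uses a small 2x2 matrix picked for illustration (it is not from the examples above): `np.linalg.eig` returns the eigenvalues together with a matrix `P` whose columns are the eigenvectors, and changing basis with `P` yields the diagonal form.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])  # small illustrative matrix (an assumption)

# Eigenvalues, and a matrix P whose columns are the matching eigenvectors
eigenvalues, P = np.linalg.eig(A)

# Changing basis with P turns A into a diagonal matrix of its eigenvalues
D = np.linalg.inv(P) @ A @ P

print(eigenvalues)
print(np.round(D, 10))  # rounded to hide tiny floating point leftovers
```

The diagonal of `D` should hold the same numbers as `eigenvalues`, with zeros everywhere else.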

Linear algebra will be a big part of machine learning and A.I. in general. It will be crucial for us to understand how matrix multiplication works, how to convert systems of equations to matrices and how to solve them.

## Statistics

Statistics uses a wide variety of math in its application. You may have already seen the most basic calculations involved in statistics such as Mean, Median, Mode, Variance, Standard Deviation and Interquartile Range.

### Mean

The mean of a set of numbers is also referred to as the average value. In Statistics, it can also be referred to as an **Expectation Value**.

The formula for the mean can be written as:

$$\mu = \frac{\sum_{i=1}^n x_i}{n}$$

where the x_i are the numbers in the set and n is how many numbers the set contains.

Or as:

$$\mu = \sum_{i=1}^n p_ix_i$$

where p_i is the probability of getting the result x_i, and the sum now runs over the distinct possible results. We will dive more into this in the probability lecture.

#### Example

Given a set of numbers, (1, 0, 4, 5, 6, 8, 10, 6) we can calculate the average as:

$$\mu = \frac {1+0+4+5+6+8+10+6}{8} = 5$$
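Both formulas can be checked on this example. In the sketch below, the probability of each distinct value is taken to be its relative frequency in the set, which is an assumption made for illustration.

```python
import numpy as np

data = np.array([1, 0, 4, 5, 6, 8, 10, 6])

# First formula: add everything up and divide by how many numbers there are
mean_direct = data.sum() / len(data)

# Second formula: weight each distinct value by its probability,
# here assumed to be how often it appears in the set
values, counts = np.unique(data, return_counts=True)
probabilities = counts / len(data)
mean_weighted = np.sum(probabilities * values)

print(mean_direct, mean_weighted, data.mean())
```

All three printed numbers should agree with the hand calculation above.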

### Median

The median of an ***ordered*** set of numbers refers to the number that is in the middle of the set. _An ordered set is a set of numbers arranged in ascending order, from the lowest to the greatest._ If the set has an even number of entries, the median is the average of the two middle numbers.

#### Example

To find the median of this set `(1,4,3,5,8)`, we must first reorder it in ascending order: `(1,3,4,5,8)` and then we can see that the middle value, the median, is 4. 

Another way to represent the median is by saying that it is the number at which 50% of the total numbers in the set are equal to or less than it.
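The same steps in code look like this. It is a minimal sketch that picks the middle element directly, which only works for a set with an odd number of entries, and then compares the result against NumPy's `np.median`.

```python
import numpy as np

data = [1, 4, 3, 5, 8]

# Step 1: put the numbers in ascending order
ordered = sorted(data)

# Step 2: pick the middle element (valid here because the count is odd)
middle = ordered[len(ordered) // 2]

print(ordered)
print(middle, np.median(data))
```

`np.median` sorts the data internally, so it can be called on the unordered set.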

### Variance & Standard Deviation

[Variance](https://en.wikipedia.org/wiki/Variance) and [Standard Deviation](https://en.wikipedia.org/wiki/Standard_deviation) measure how far the numbers in a set are spread out around the mean.

The variance can be calculated as:

$$var = \sigma^2 = \frac {\sum_{i=1}^n (x_i - \mu)^2}{n}$$

The Standard Deviation is the square root of the variance. _The mean and standard deviation are very susceptible to outliers in your data._
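As a quick check of the formula, the sketch below computes the variance step by step for the set of numbers used in the mean example, then compares it with NumPy. By default `np.var` and `np.std` divide by n, just like the formula above.

```python
import numpy as np

data = np.array([1, 0, 4, 5, 6, 8, 10, 6])

mu = data.mean()

# Variance: the average squared distance from the mean
variance = np.sum((data - mu) ** 2) / len(data)

# Standard deviation: the square root of the variance
std_dev = np.sqrt(variance)

print(variance, np.var(data))
print(std_dev, np.std(data))
```

Each line should print the same value twice, once from the manual calculation and once from NumPy.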

### Interquartile Range

The Interquartile Range is a measure of the spread of the middle 50% of the data and is more resistant to outliers than the variance or standard deviation. The median being the number at which 50% of the total numbers are equal to or less than it, we can also refer to the median as the _Second Quartile (Q2)_. The interquartile range is obtained by subtracting the first quartile (Q1) from the third quartile (Q3), i.e. IQR = Q3 - Q1. The first quartile is the number at which 25% of the total numbers in a set are equal to or less than it, while the third quartile is the number at which this is true for 75% of the numbers.

#### Example

Given an ordered set of data, (2,4,4,5,6,7,8) the number 4 is the first quartile, 5 is the second quartile and 7 is the third quartile. Here is a visual representation:

![Interquartile range example](../assets/interquartile-range-1.png)
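The quartiles in this example can be reproduced in code. This sketch assumes the "median of each half" method, where the median itself is left out of both halves when the count is odd. Library functions such as `np.percentile` interpolate between data points by default, so they may report slightly different quartiles for the same data.

```python
import numpy as np

data = np.array([2, 4, 4, 5, 6, 7, 8])

ordered = np.sort(data)
n = len(ordered)

# Split into lower and upper halves, leaving out the middle value when n is odd
lower_half = ordered[: n // 2]
upper_half = ordered[(n + 1) // 2 :]

q1 = np.median(lower_half)  # first quartile
q2 = np.median(ordered)     # second quartile (the median)
q3 = np.median(upper_half)  # third quartile

iqr = q3 - q1               # interquartile range
print(q1, q2, q3, iqr)
```

The interquartile range here is the distance between the first and third quartiles found in the example above.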

The median and interquartile range are represented in boxplots:

![Boxplot showing the median and interquartile range](../assets/interquartile-range-2.png)

## Exercises

### Exercise 1

Solve the following system of equations for a, b and c using the matrix method we saw above. _You can use the same code, and adjust the numbers that are in the matrix._

```
2a + 3b + 4c = 8
3a + 4b + 3c = 11
1a + 1b + 2c = 3
```

Use the code block below.

In [None]:
# exercise 1


### Exercise 2

Compute the mean and variance of the following set of numbers and identify the 2nd quartile.

**Hints** 

- Uncomment the `print` command to see the set of numbers.
- Use the numpy methods for mean and variance to compute them, `Numbers.mean()` and `Numbers.var()`.

In [None]:
# exercise 2
Numbers = np.linspace(0,10,11)
# print(Numbers)
# continue here...


### Exercise 3

Using its derivative, find the value of x for which the function f(x) = x^2 - 4x + 5 is at a minimum.

In [None]:
# exercise 3


We will go over all of the material in this notebook in class.