## An Introduction to Mathematical Modeling and Python

In this class, we will do our coding in Python, a programming language with several advantages. First, it has a gentle learning curve: if you've coded before, you should catch on quickly, and if not, you will find the language intuitive. Second, it is *open-source*, which means the entire language is freely available to anyone and you can use it on any machine at no cost. It also means that people can develop code for it and share that code with others, and Python makes it extremely easy to *import* code written by others. Third, because code is constantly being developed and shared, it is easy to start writing powerful code by taking advantage of the wonderful programs others have written!

The document you are looking at now is called a Jupyter Notebook. This is a fourth great advantage of Python. We can mix text and code in this document, explaining steps as we go along. At any point, you can edit the code you see and press Shift + Enter to evaluate it and see how your new code changes the result. 

To get started, click on the cell (the rectangular box) below and press Shift + Enter to evaluate its content (run the code).

Try playing around with different numbers below.

In [None]:
3 + 5*2

Try changing the code in the box above to calculate other simple things. Remember to press Shift + Enter to evaluate your new code! We can easily use Python as a calculator in this way.
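As a quick illustration of calculator-style use, the sketch below shows that Python follows the usual order of operations. The variable names are made up for this example.

```python
# Multiplication happens before addition, just like on paper.
without_parentheses = 3 + 5*2

# Parentheses change the order: the sum is computed first.
with_parentheses = (3 + 5)*2

print(without_parentheses, with_parentheses)
```

If a result surprises you, adding parentheses is usually the easiest way to make the intended order explicit.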

Remember how we said we wanted to use code written by others? One of the most popular sets of code is called numpy (numerical Python). This is a package that includes a *ton* of useful math functions. Getting access to this great code is as simple as requesting Python import it! Press Shift + Enter on the box below:

In [None]:
import numpy as np

Importing 'as' np means that we can use the pieces of code in this big library of code simply by typing 'np' first. 
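To see what the nickname does, the short sketch below imports the library under both its full name and the shorter `np`, then calls the same function through each. The variable names `long_form` and `short_form` are only for illustration.

```python
import numpy
import numpy as np

# Both names refer to the same library; 'np' is just a shorter nickname.
long_form = numpy.sqrt(16)
short_form = np.sqrt(16)

print(long_form, short_form)
```

In practice only the second import is needed; the first is shown here just for comparison.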

Now we can make use of numpy functions. Every call to a numpy function is written as np.y, where y is the name of the function you want to use. For example, press Shift + Enter on the block below. Discuss with your group what this code is doing.

In [None]:
np.exp(1)

We are going to use Python and numpy to obtain the results from the math worksheet on a computer. You will hopefully start to get a feel for how useful it is to outsource math to a machine!

We can break down the first problem in the math worksheet, which uses the Pythagorean Theorem, into a series of four calculator evaluations: $a^2$, $b^2$, the sum $a^2 + b^2$, and the square root of that sum ($a$ and $b$ are just numbers). Complete each one in the cells below. Remember to press Shift + Enter to evaluate them. If a function doesn't seem to work, try putting "np." in front of it: many of the mathematical functions you expect are included in numpy but not in basic Python. Finally, exponentiation in Python is written with `**`.
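The four steps might look like the sketch below. The side lengths 3 and 4 are hypothetical values chosen for illustration, not the numbers from the worksheet, and the variable names are made up as well.

```python
import numpy as np

# Hypothetical side lengths, chosen only for illustration.
a = 3
b = 4

a_squared = a**2                         # step 1: square a
b_squared = b**2                         # step 2: square b
sum_of_squares = a_squared + b_squared   # step 3: add the squares
c = np.sqrt(sum_of_squares)              # step 4: take the square root

print(c)
```

Swap in the worksheet's values for `a` and `b` to check your answer by hand.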

Mathematical models take the form of *functions*. Functions can be created in Python in several different ways. One way is to use the *def* keyword.

The cell below uses *def* to create the function "Area_Of_Square", which takes one input, length, and returns one output, the total area of a square.

In Machine Learning, the input is called *input data* and the output is called the *target variable*.

In [None]:
def Area_Of_Square(length):
    return length*length

So, we are defining in code the function $\text{area of square} = \text{length} \times \text{length}$, where `Area_Of_Square` is the name of our function, the value it returns is our output, and `length` is our input variable. Note that when we run this code, there isn't an output, and that's okay! We have *defined* a function, but we haven't tried to *evaluate* it yet; that's when we expect an output.

To evaluate the function, we will call its name and give it an input. The function will return the output.

We can now use the function "Area_Of_Square" to find the area of a square (the target variable) from the length of one side (the input data).

In [None]:
Area_Of_Square(3)

Now it's your turn. Fill in the following function, which determines the area of a rectangle (the target variable) from two pieces of input data: the length (l) and width (w) of its sides.

In [None]:
def Area_Of_Rectangle(l,w):
    

Using this format, fill in the cells below to define the three functions from problem two.

In [None]:
def f(x):
    

In [None]:
def g(x):
    

In [None]:
def h(x):
    

What if we want to automate filling in the table from problem two? We could just plug each value of x into each of the three functions and record the answer, but that would be a little tedious. Instead, we are going to use an *array*. An array is an ordered collection of numbers; for example, A = [1,2,3,4].

In order to work with arrays, we'll also need to use numpy. We can define an array called `my_array` as follows:

In [None]:
my_array = np.array([1,2,3,4])

We can now use this array in our functions! For example, we can find the area of the squares with length 1, 2, 3, and 4, all together.

In [None]:
Area_Of_Square(my_array)
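The reason this works is that numpy multiplies arrays entry by entry. The sketch below repeats the definitions from above so it can run on its own, and contrasts the array with a plain Python list, which does not support this kind of multiplication. The names `areas`, `my_list`, and `list_worked` are made up for this illustration.

```python
import numpy as np

def Area_Of_Square(length):
    return length*length

my_array = np.array([1, 2, 3, 4])
areas = Area_Of_Square(my_array)   # multiplies entry by entry
print(areas)

# A plain Python list cannot be multiplied by another list.
my_list = [1, 2, 3, 4]
try:
    Area_Of_Square(my_list)
    list_worked = True
except TypeError:
    list_worked = False
print(list_worked)
```

This is why we wrap our numbers in `np.array(...)` before passing them to a function.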

Using this, create an array, called my_array2, that consists of all the values of x you want to plug in to our three functions.

Now plug this new array into your functions to get the same results you had before! 

Note that when we define functions, the name of the variable in parentheses is just a placeholder – in other words, we can define a function as *f(x)*, then ask it to give us the output for a specific input called *x0* by calculating *f(x0)* – note that we've replaced *x* with *x0* now that we want to get specific outputs.
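A small sketch of the placeholder idea: the function below is defined with the placeholder `x`, but it can be evaluated at inputs with any name. `Double`, `x0`, and `t` are hypothetical names used only for this illustration.

```python
# 'Double' is a hypothetical function used only for this illustration.
def Double(x):
    return 2*x

x0 = 5   # a specific input
t = 7    # another specific input with a completely different name

print(Double(x0), Double(t))
```

The name inside the parentheses of the definition only matters inside the function itself.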

## "Learning"

In machine learning, we work with data instead of explicit formulas for the functions like the ones we wrote above.

Each of the above models can be put in the form $y = f(x)$. Then if we have the inputs 

$$x_1 = 1, ~x_2 = 2,~ x_3 = 3,~ x_4 = 4$$

we have the outputs

$$y_1 = f(x_1),~ y_2 = f(x_2),~ y_3 = f(x_3),~ y_4 = f(x_4).$$

For example, using the "Area_Of_Square" model, we have 

$$(x_1,x_2,x_3,x_4) = (1,2,3,4) \quad \text{and} \quad (y_1,y_2,y_3,y_4) = (1,4,9,16).$$

This information (the data) can be used to create a table,

| $x$ | $y$ | 
| --- | --- |
|  1  |  1  |
|  2  |  4  |
|  3  |  9  |
|  4  |  16 |

and a scatter plot. In order to create the scatter plot, we need to import some more code, this time for plotting.

In [None]:
import matplotlib.pyplot as plt # Import the code we need to make plots
%matplotlib inline 
# The line above allows us to see plots.

We'll create an array of our inputs $x_i$, then compute our outputs $y_i$, where the inputs $x_i$ give the lengths and the outputs $y_i$ give the area of a square for those lengths. Then, we plot the input lengths and the output areas!

In [None]:
xi = np.array([1,2,3,4])
yi = Area_Of_Square(xi)
plt.plot(xi,yi,'o')

We engineered the models that were given above. In other words, we constructed what we thought was a reasonable model. The key difference between what we've done above and "Machine Learning" is that in Machine Learning, we use data to try to determine the model. This step is called *learning*.

Another way to say this is that in Machine Learning, you have the table

| $x$ | $y$ | 
| --- | --- |
|  1  |  1  |
|  2  |  4  |
|  3  |  9  |
|  4  |  16 |


and the goal is to *find* the function $f$ such that $y = f(x)$.

The $f$ that works for the above scatter plot is $f(x) = x^2$. We can plot this to check it.

In order to plot $f(x) = x^2$, we first need to define a bunch of $x$ points. We want to create a curve, so we define a bunch of points that are really close together. We can do this using numpy's "*arange*" function.

In [None]:
many_points = np.arange(0,4.1,0.1)
many_points

What did this function do? Discuss it with your friends. You can also check what a function does with the help function:

In [None]:
help(np.arange)
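As a side note, numpy also provides `linspace`, which asks for the number of points rather than the spacing between them. The sketch below is an alternative, not the approach used in this notebook, and the name `many_points_alt` is made up for this illustration.

```python
import numpy as np

# 41 evenly spaced points from 0 to 4, with both endpoints included.
many_points_alt = np.linspace(0, 4, 41)

print(len(many_points_alt), many_points_alt[0], many_points_alt[-1])
```

Either function is fine for drawing a smooth curve; `linspace` can be convenient when you care about exactly where the last point lands.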

In order to make the plot, we need to also get the y-values for many_points.

In [None]:
many_y_points = Area_Of_Square(many_points)

In [None]:
plt.plot(xi,yi,'o')
plt.plot(many_points,many_y_points,'r') # 'r' makes the color red
plt.legend(['data', 'f'])

We see that the curve above goes through each one of our data points, so we think we have found the model $f$. 
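Besides looking at the plot, one way to check a candidate model is to compare its outputs with the table numerically. The sketch below is one possible check, assuming the table values from above; it repeats the function definition so it can run on its own, and the names `yi_table` and `model_matches` are made up for this illustration.

```python
import numpy as np

def Area_Of_Square(length):
    return length*length

xi = np.array([1, 2, 3, 4])
yi_table = np.array([1, 4, 9, 16])   # the y column of the table above

# True only if the model's output agrees with every entry of the table.
model_matches = np.allclose(Area_Of_Square(xi), yi_table)
print(model_matches)
```

Note that agreeing with the data we have does not prove the model is right for inputs we have not seen; it only tells us the model is consistent with the table.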

Consider the following table:

| $x$ | $y$ | 
| --- | --- |
|  -10  |  -2  |
|  -8  |  -1  |
|  -7  |  -0.5  |
|  -4  |  1 |
|  0  |  3  |
|  1  |  3.5  |
|  2  |  4  |
|  4  |  5 |
|  6  |  6  |

Can you guess what function $y = f(x)$ this represents? To answer this, you may want to plot the data as we did above. Once you have determined the function $f$, plot the data and the curve together, as in the plot above.
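As a possible starting point, the sketch below enters the table into two arrays and draws the scatter plot; finding $f$ is still up to you. The names `x_data` and `y_data` are made up for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# The x and y columns of the table above (variable names are hypothetical).
x_data = np.array([-10, -8, -7, -4, 0, 1, 2, 4, 6])
y_data = np.array([-2, -1, -0.5, 1, 3, 3.5, 4, 5, 6])

plt.plot(x_data, y_data, 'o')   # 'o' draws one marker per data point
plt.xlabel('x')
plt.ylabel('y')
```

Once you have a guess, define it as a function with *def* and plot it over the data just as we did for the squares.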