# Python Basics for KP3150

In this notebook, we will ~~introduce~~ review some Python commands and resources that will be used during this course. We start with basic commands, then we look into some ``numpy`` resources to handle data (specifically arrays), move to ``matplotlib`` for plotting, and lastly we check out some numerical solvers from ``scipy``. 

This is supposed to be a reference for the notebooks we will work with. If you are having trouble understanding some functions, you should find them described here.

**Summary**

- [Basic commands](#basic-commands)
    - [Iteration and Sequences](#iterations-and-sequences)
    - [Functions](#functions)
        - [Lambda Functions](#lambda-functions)
- [Numpy](#numpy)
    - [Arrays](#arrays)
    - [Combining Arrays](#combining-arrays)
- [Plotting](#plotting)
    - [2D Plots](#2d-plots)
    - [3D Plots](#3d-plots)
- [Solvers](#solvers)
    - [fsolve](#fsolve)
    - [minimize](#minimize)
    - [curve_fit](#curve-fit)


<a name="basic-commands"></a>
### Basic commands

<a name="iterations-and-sequences"></a>
#### Iteration and Sequences

We begin with the ``for`` loop. Loops can iterate over a list or sequence of numbers. 

In [None]:
A = ["a", "b", "c", "d"]
for i in A:
    print(i)

Here we defined a list of strings ``A`` and iterated over it to print each element. 

We can create lists of any type of object. For example, a list of floats:

In [None]:
B = [1.2, 4.5, 7.3, 3.6]

We can add new elements to the end of a list using the method ``append``.

In [None]:
B.append(2.6)
print(B)

However, if a numerical sequence follows a regular pattern, we can use the function ``range`` to define it. We can also iterate over it directly if it does not need to be stored.

In [None]:
b = 0
for i in range(3):
    b += i          # this is a short way of writing b = b + i
print(b)

Why is the result __not__ 6? Let's look at the elements of ``range(3)``:

In [None]:
for i in range(3):
    print(i)

In Python, counting starts at 0. When we write ``range(3)``, we are telling Python to build a sequence of integers that starts at 0 and stops just before 3, so it has 3 elements: 0, 1 and 2. Their sum is therefore 3, not 6.

The ``range`` function, however, can be used with other arguments too. Let's take a look at the following output.

In [None]:
for i in range(2,5):
    print(i)

Here we are saying that we want a sequence that starts at 2 and stops just before 5. In Python, the value passed as the end of the sequence is not included.

In [None]:
C = range(2,8,2)
print("Elements:", len(C))      #   len is a function that returns the length of a range or list 
                                #   you can print different objects in the same line by separating them with commas
for i in C:
    print(i)

If we call ``range`` with 3 arguments, the first one is the beginning of the sequence, the second is the end of the sequence (not included), and the last is the step size. In this case, we built a sequence with 3 elements that starts at 2 and includes every second integer (2, 4 and 6) up to, but not including, 8.
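A quick way to inspect what a ``range`` contains is to convert it to a list. The sketch below does this for the three forms discussed above. The variable names and the last example, which uses a negative step, are illustrative additions.

```python
# Convert ranges to lists to see their elements explicitly
one_arg = list(range(3))            # start defaults to 0, the end (3) is excluded
two_args = list(range(2, 5))        # start at 2, the end (5) is excluded
three_args = list(range(2, 8, 2))   # start at 2, step of 2, the end (8) is excluded
countdown = list(range(5, 0, -1))   # a negative step counts down, the end (0) is excluded

print(one_arg)      # [0, 1, 2]
print(two_args)     # [2, 3, 4]
print(three_args)   # [2, 4, 6]
print(countdown)    # [5, 4, 3, 2, 1]
```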

<a name="functions"></a>
#### Functions

Functions are a way of dividing your code into blocks and reusing them. They also make the code more readable and organized. We define a function using the keyword ``def`` and creating a block as follows:

In [None]:
def hello():
    print("Hello from KP3150 students!")

When we run this piece of code, we are only registering a function called ``hello``. To use it, we need to call it.

In [None]:
hello()

This function takes no arguments (there is nothing between the parentheses after the function name). We can define functions that take an argument and use information we provide when calling them.

In [None]:
def personalized_hello(name):
    print(f"Hello from KP3150 students, {name}!")  # This is how you can add the value of a variable to a string (you need the 'f' in front of the string), but it only works for Python 3.6 and later

personalized_hello("Mary")

With functions, blocks of code can be run and give information back. For that, you use the keyword ``return`` at the end of the function, followed by the value to be returned.

In [None]:
def product(a, b):
    return a*b

This function takes two numerical arguments and returns their product. Try calling it with no arguments and with any two numbers to see what happens.

In [None]:
product()

You can store the returned value in a variable:

In [None]:
p = product(3,4)
print(p)

A useful feature for defining functions is that you can set default values for arguments. Let's redefine the ``product`` function with a default value for b.

In [None]:
def product(a, b = 2):
    return a*b

Now, we can call ``product`` with only one argument because ``b`` has a backup value in case the second argument is not passed when this function is called.

In [None]:
product(5)

However, we can still pass two arguments if we would like to set a different value for ``b``.

In [None]:
product(4,3)

For built-in functions in Python and its libraries, you can usually find default values in their documentation.
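Arguments can also be passed by name. This is how optional settings are usually given to library functions later in this notebook (for example ``label`` or ``axis``). The sketch below reuses the ``product`` function defined above; the result variable names are only illustrative.

```python
def product(a, b=2):
    return a * b

by_position = product(4, 3)    # a = 4, b = 3
by_keyword = product(4, b=3)   # same call, with b named explicitly
swapped = product(b=3, a=4)    # named arguments can be given in any order
default_b = product(4)         # b falls back to its default value of 2

print(by_position, by_keyword, swapped, default_b)   # 12 12 12 8
```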

<a name="lambda-functions"></a>
##### Lambda Functions

We can also define short functions inline using ``lambda`` functions. They are called anonymous functions, since we do not define them with a name. Instead, we store an expression that can take several arguments and return its output. 

The format is 

*name* = lambda *arg1*, *arg2*,  ..., *argn* : *expression*

such as

In [None]:
prod = lambda a, b : a*b

We can now call ``prod`` with two numerical arguments:

In [None]:
prod(2,3)

``lambda`` functions are most useful inside another function. Suppose you need different functions that modify the result of the product of two numbers. You can define the function:

In [None]:
def mod_product(n):
    return lambda a, b : a*b*n

This function returns a lambda function that modifies $a*b$ by multiplying it by ``n``. We can then create different expressions based on different values of ``n`` as follows:

In [None]:
negative_product  = mod_product(-1)
double_product =  mod_product(2)

print(negative_product(2,3))
print(double_product(2,3))

We assigned to ``negative_product`` a lambda function with expression $-1*a*b$ and to ``double_product`` a lambda function that doubles the product $a*b$. We then called these functions with two numerical values as arguments and obtained the negative and the double of the product, respectively.
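Lambda functions are also convenient when a function has to be passed as an argument to another function, which is how the solvers at the end of this notebook are used. Here is a minimal sketch of that idea; ``apply_to_pair`` is a hypothetical helper written only for this illustration.

```python
def apply_to_pair(func, a, b):
    # hypothetical helper: calls whatever function it receives with a and b
    return func(a, b)

total = apply_to_pair(lambda a, b: a + b, 2, 3)        # the lambda adds its arguments
difference = apply_to_pair(lambda a, b: a - b, 2, 3)   # the lambda subtracts them

print(total, difference)   # 5 -1
```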

<a name="numpy"></a>
### Numpy

``numpy`` is a library for working with arrays, providing many specific functionalities, such as linear algebra and matrix operations. Therefore, we usually use numpy arrays instead of native lists or ranges in Python when working with numbers.

Documentation for this library can be found [here](https://numpy.org/doc/stable/index.html)

<a name="arrays"></a>
#### Arrays

We start by importing the ``numpy`` library:

In [None]:
import numpy as np    # it is common practice to give a shorter alias to libraries

There are several ways we can create numpy arrays. For example, we can convert lists or ranges:

In [None]:
arr = np.array(range(5))
print(arr)

We can check that ``arr`` is indeed a numpy array by checking its type.

In [None]:
type(arr)

A better way of creating numpy arrays that are sequences is by using ``numpy``'s built-in function ``arange``.

In [None]:
arr = np.arange(6)
print(arr)

Indexing here follows the same rules as the native Python ``range`` function discussed above. Likewise, we can also pass start, end and step values:

In [None]:
arr2 = np.arange(3,6,0.5)
print(arr2)

Note that, unlike ``range``, ``arange`` also accepts non-integer values.
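With non-integer steps, rounding can make it hard to predict whether the last value ends up in the array. One alternative, offered here only as a suggestion, is ``np.linspace``, which takes the first value, the last value (included) and the number of points. The variable names below are illustrative.

```python
import numpy as np

by_step = np.arange(3, 6, 0.5)      # step of 0.5, the end value 6 is excluded
by_count = np.linspace(3, 5.5, 6)   # 6 evenly spaced points, the end value 5.5 is included

print(by_step)
print(by_count)
print(np.allclose(by_step, by_count))   # True
```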

We can also create new arrays by passing existing arrays to a function, which is then applied element by element:

In [None]:
prod(arr,arr2)

However, some functions do not accept arrays as arguments. One example is the ``log`` function from Python's built-in ``math`` library, which only works on single numbers. Let's start by importing it.

In [None]:
from math import log    # this is the syntax we can use when we do not wish to import the whole package but only a function 

If we try to pass ``arr2`` as argument to ``log``, we get the following error:

In [None]:
log(arr2)

But what can we do if we still need to build an array using the ``log`` function?

One way of doing that is to build a list by passing each element separately and convert it to a numpy array.

In [None]:
list_log = [log(i) for i in arr2]    # this is a compact syntax for creating a list by passing each element of an array to a function
print(list_log)
print(type(list_log))
array_log = np.array(list_log)
print(array_log)
print(type(array_log))

We could also have written everything in one line as follows

In [None]:
array_log = np.array([log(i) for i in arr2])
print(array_log)
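
For many mathematical functions, ``numpy`` also provides its own version that accepts arrays directly, for example ``np.log``. The sketch below compares it with the element-by-element approach used above; the variable names are illustrative.

```python
import numpy as np
from math import log

arr2 = np.arange(3, 6, 0.5)

log_by_loop = np.array([log(i) for i in arr2])   # math.log applied to one element at a time
log_by_numpy = np.log(arr2)                      # np.log applied to the whole array at once

print(log_by_numpy)
print(np.allclose(log_by_loop, log_by_numpy))    # True
```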

``numpy`` also has specific functions to build special types of arrays, such as arrays of only zeros or ones.

In [None]:
print(np.zeros(5))
print(np.ones(10))

<a name="combining-arrays"></a>
#### Combining Arrays

Sometimes we need to combine arrays. For instance, we can take two arrays of size 5 and create a matrix of size 2 $\times$ 5 or a longer array of size 10. The function we use for this job is ``concatenate``.

In [None]:
np.concatenate((arr,arr2))

Here we created a long array by concatenating ``arr`` and ``arr2``. 

The arrays we have seen so far are one-dimensional arrays. They can act as vectors, but there is no concept of row or column vector for one-dimensional arrays. Therefore, when we concatenate two one-dimensional arrays of sizes $n$ and $m$, they are joined end to end and we get an array of size $n+m$.

If we want to combine two or more one-dimensional arrays into two-dimensional matrices, we need to turn them into two-dimensional arrays first.

In [None]:
vec = np.reshape(arr,(1,-1))     # The second argument is the shape of the matrix (rows,columns)
vec2 = np.reshape(arr2, (1,-1))  # -1 is a placeholder. It means that the number of columns is worked out from
                                 # the given number of rows and the original size of the first argument
print(vec)
print(vec2)

Note that the arrays have double square brackets now. This means they are now row vectors, i.e. matrices of size 1 $\times$ 6.

If we want ``arr`` and ``arr2`` to be column vectors (6 $\times$ 1), we can define:

In [None]:
cvec = np.reshape(arr,(-1,1))
cvec2 = np.reshape(arr2,(-1,1))

print(cvec)
print(cvec2)

We can now use ``concatenate`` to combine these vectors to create a 2 $\times$ 6 matrix and a 6 $\times$ 2 matrix.

In [None]:
print(np.concatenate((vec, vec2), axis = 0))
print(np.concatenate((cvec, cvec2), axis = 1))

Note that we are passing a new argument called ``axis``, which tells the function how we should put these matrices together. 

For ``axis`` = 0, the matrices are stacked on top of each other: the numbers of rows are added and the number of columns is kept. Therefore, the matrices being concatenated must have the same number of columns in this case.

For ``axis`` = 1, the number of rows is kept and the matrices are concatenated side by side.

If we do not pass the ``axis`` argument, the default value is 0.

In [None]:
print(np.concatenate((cvec,cvec2)))

But what happens if the dimension that must match does not?

In [None]:
np.concatenate((cvec, vec), axis = 0)

As expected, an error is thrown letting you know about the issue (**remember to always read the error message**).
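When a concatenation fails, or to check what a reshape did, it helps to look at the ``shape`` attribute of each array. The sketch below rebuilds the arrays from this subsection and stores the shapes of the combined matrices; the names ``stacked_rows`` and ``side_by_side`` are illustrative.

```python
import numpy as np

arr = np.arange(6)
arr2 = np.arange(3, 6, 0.5)

vec = np.reshape(arr, (1, -1))      # row vector
vec2 = np.reshape(arr2, (1, -1))
cvec = np.reshape(arr, (-1, 1))     # column vector
cvec2 = np.reshape(arr2, (-1, 1))

stacked_rows = np.concatenate((vec, vec2), axis=0)
side_by_side = np.concatenate((cvec, cvec2), axis=1)

print(arr.shape)            # (6,)   one-dimensional
print(vec.shape)            # (1, 6) one row, six columns
print(cvec.shape)           # (6, 1) six rows, one column
print(stacked_rows.shape)   # (2, 6)
print(side_by_side.shape)   # (6, 2)
```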

<a name="plotting"></a>
### Plotting

In Python, plotting graphs is done using the library ``matplotlib``. It is a powerful tool and many types of graphs and charts can be created with this library. But here we will look into basic 2D and 3D plotting features. 

Documentation for this library can be found [here](https://matplotlib.org/stable/index.html).

---
**NOTE**: If you are writing a Python script to run from an IDE or the terminal, plotting should work normally with the functions described here. However, when we plot in a Jupyter notebook, how it works depends on the environment being used. In Anaconda, we are able to interact with the plot, such as zooming in and changing elements after the plot is created. In Google Colab, plots are static and do not allow this. Read the comments in the next code block and set up the first line according to your environment.

---

<a name="2d-plots"></a>
#### 2D Plots

As always, we need to begin by importing the library.

In [None]:
%matplotlib notebook
# %matplotlib inline
# the lines above are only required for plotting in a Jupyter notebook. If you're writing a script, they are not necessary
# if you are using Anaconda, keep the first line and comment out the second one
# if you are using Google colab, comment out the first line and keep the second one
# if you are using a different environment, try them both and see how it works

import matplotlib.pyplot as plt
import numpy as np    # in case you are starting from here and haven't imported it yet

Note that we are not importing the entire ``matplotlib`` library, only a smaller set of resources that we will use. To avoid having to write ``matplotlib.pyplot`` every time, we use the standard alias ``plt``.

Now, let's create the data that we will plot.

In [None]:
power = lambda x, n : x**n   # ** is the operator for raising a number to the power of n. Some other tools (e.g. MATLAB, Excel) use ^, which means something else in Python
xdata = np.arange(-5,5.2,0.2)
ydata = power(xdata, 2)
ydata2 = power(xdata, 3)

We defined a function that takes two numbers and raises the first one to the power of the second. We then defined an array for our independent variable $x$ and created two arrays for the dependent variable $y$, one being the square of $x$ and the other the cube of $x$.

Now that we have data to plot, let's do it. First, we need to create a figure with subplots.

In [None]:
fig, ax = plt.subplots(1,2) 

The figure is the area where the subplots are. A subplot consists of its elements, such as axes, lines and points representing the data, legend and so on.

When we called ``subplots``, we passed two arguments. The first is the number of rows of subplots we want in the figure and the second is the number of columns, like in a matrix. Here, the figure object is stored in ``fig``, while the subplots are stored in ``ax`` as an array.
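To see how the array of subplots is organized, we can look at its shape. This is a small sketch; it uses the separate names ``fig_row``, ``ax_row``, ``fig_grid`` and ``ax_grid`` (chosen only for this illustration) so that it does not overwrite ``fig`` and ``ax``.

```python
import matplotlib.pyplot as plt

fig_row, ax_row = plt.subplots(1, 2)     # 1 row and 2 columns of subplots
print(ax_row.shape)                      # (2,)   a single row gives a one-dimensional array

fig_grid, ax_grid = plt.subplots(2, 2)   # 2 rows and 2 columns of subplots
print(ax_grid.shape)                     # (2, 2) indexed as [row, column]

ax_grid[0, 1].set_title("row 0, column 1")   # top right subplot
ax_grid[1, 0].set_title("row 1, column 0")   # bottom left subplot
```

With a single row (or a single column) of subplots, one index is enough, as in ``ax[0]`` and ``ax[1]``.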

To add elements to the subplots, we need to specify the subplot with the corresponding index and call a method. Let's add the data we have to the subplots.

In [None]:
ax[0].plot(xdata, ydata, label="x**2")
ax[1].scatter(xdata, ydata2, label="x**3")

Note that we used different methods for each subplot. In the first subplot, we used ``plot`` to plot $x^2$ as a continuous line, while in the second one we used ``scatter`` to add only the points we used to calculate $x^3$. 

Both methods take the $x$ data points as first argument and the corresponding $y$ data points as second argument.

We also passed an extra argument called ``label`` that gives the data series a name. There are several properties we can define for the data series. The documentation pages for both [plot](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.plot.html) and [scatter](https://matplotlib.org/stable/api/_as_gen/matplotlib.pyplot.scatter.html) methods have further details.

We can now add other elements to both subplots, such as legend and labels.

In [None]:
ax[0].set_xlabel("x")
ax[0].set_ylabel("y")
ax[0].legend()

ax[1].set_xlabel("x")
ax[1].set_ylabel("y")
ax[1].legend()

The first two methods define the names of the x- and y-axes, respectively. ``legend`` shows the box that identifies each data series.

Note that the y-axis of the second subplot overlaps with the first subplot. An easy way to fix this issue is to use the function ``tight_layout``, which finds the best fit for all subplots within the figure.

In [None]:
plt.tight_layout()

---
If the interactive plotting did not work (for example, because you are in the Google Colab environment), here is a code block with all the plotting commands. The whole figure is assembled and displayed when all commands are run sequentially in the same code block.

**Remember to change the first line of the first code block of this section to** ``%matplotlib inline``!

In [None]:
fig, ax = plt.subplots(1,2) 
ax[0].plot(xdata, ydata, label="x**2")
ax[1].scatter(xdata, ydata2, label="x**3")

ax[0].set_xlabel("x")
ax[0].set_ylabel("y")
ax[0].legend()

ax[1].set_xlabel("x")
ax[1].set_ylabel("y")
ax[1].legend()

plt.tight_layout()
plt.show()    # sometimes the figure might not show up without this command

---

<a name="3d-plots"></a>
#### 3D Plots

3D plots work very similarly to 2D plots, with only slight differences in the plotting methods. What we need to be more careful with is the plotting data.

There are different types of 3D plots, such as contours and surfaces. Contours are 2D images that project the values of the third variable $z$ onto the xy-plane (see [this page](https://matplotlib.org/stable/gallery/images_contours_and_fields/contourf_demo.html?highlight=example%20contour) for an example). Here, we will work with surfaces (you can check an example [here](https://matplotlib.org/stable/gallery/mplot3d/surface3d_2.html?highlight=example%20surfaces) as well) and lines.

If you have not run the [2D Plots](#2d-plots) code blocks, you need to import the libraries.

In [None]:
%matplotlib notebook
# %matplotlib inline
# see the comments in the first code block of the 2D Plots subsection for an explanation of these lines

import matplotlib.pyplot as plt
import numpy as np 

Let's start with the data we will plot.

Suppose we have an expression like $z = x + y$. Here we have two independent variables, $x$ and $y$, and one dependent variable $z$. So let's write a lambda function for defining $z$ (we could write a standard function too).

In [None]:
f = lambda x, y : x + y

We can create arrays for the independent and dependent variables:

In [None]:
x = np.arange(10)
y = np.arange(10)
z = f(x,y)

Let's now create the figure and the 3D plot.

In [None]:
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')

Note that now we need to create the figure and the subplot objects separately. We use ``figure`` and add a subplot with ``add_subplot``. The first argument of this function is composed of 3 digits: the number of rows of subplots, the number of columns, and the position of the subplot being assigned to ``ax``. Because we only have one subplot, this last digit is 1. The second argument specifies that this subplot has 3 axes.

We can now plot the data we have:

In [None]:
ax.plot(x, y, z)

Since the data we passed is composed of 3 one-dimensional arrays, we used the ``plot`` function to plot a line in a three-dimensional space.  

Let's see what happens if we try to plot a surface using the ``plot_surface`` function with the one-dimensional arrays we have.

In [None]:
ax.plot_surface(x,y,z)

The error message (last line) tells us that the data for $z$ must be two-dimensional.

How can we do that?

Well, we need a value for $z$ for every pair of independent variables $(x,y)$.

We can create the pairs of independent variables with the numpy function ``meshgrid``.

In [None]:
X, Y = np.meshgrid(x,y)
print(X)
print(Y)

This function essentially builds two square matrices: in ``X``, every row is a copy of the ``x`` array, and in ``Y``, every column is a copy of the ``y`` array. So now we can take, for example, position (3,2) in both ``X`` and ``Y`` and obtain a pair. When we consider every position, we have every possible combination of ``x`` and ``y``.
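The layout of the two matrices is easier to see with small arrays of different sizes. The following sketch uses made-up values (``xs``, ``ys``, ``XS`` and ``YS`` are names chosen only for this illustration).

```python
import numpy as np

xs = np.array([0, 1, 2])    # 3 values for the first variable
ys = np.array([10, 20])     # 2 values for the second variable

XS, YS = np.meshgrid(xs, ys)

print(XS)         # [[0 1 2]
                  #  [0 1 2]]        every row is a copy of xs
print(YS)         # [[10 10 10]
                  #  [20 20 20]]     every column is a copy of ys
print(XS.shape)   # (2, 3): one row per value in ys, one column per value in xs

# The same position in XS and YS gives one (x, y) pair
pair = (XS[1, 2], YS[1, 2])   # x = 2, y = 20
```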

We can now pass these variables as arguments to the ``f`` function and get 2D data for $z$.

In [None]:
Z = f(X,Y)
print(Z)

We can now use ``X``, ``Y`` and ``Z`` to plot the surface representing function ``f``.

In [None]:
ax.plot_surface(X,Y,Z,alpha = 0.3)

The last argument is a parameter to set a degree of transparency to the surface. ``alpha`` = 1 is the default and it generates a solid surface. 

---
Again, if the interactive plot is not working, here is the condensed code for creating a static 3D plot.

**Remember to change the first line of the first code block of this section to** ``%matplotlib inline``!

In [None]:
fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
ax.plot(x, y, z)
X,Y = np.meshgrid(x,y)
Z = f(X,Y)
ax.plot_surface(X,Y,Z,alpha=0.3)

---

<a name="solvers"></a>
### Solvers

During this course we will use three types of solvers from the ``scipy`` library, specifically from the ``optimize`` submodule.

We will not go into details of how they work here. For more information, you can check the [documentation](https://docs.scipy.org/doc/scipy/reference/optimize.html).

We will look into syntax and briefly discuss what they do.

<a name="fsolve"></a>
#### ``fsolve``

``fsolve`` is a function that finds the roots of a function, i.e., the values of $x$ such that $F(x) = 0$ for a given function $F$.

Before defining that function $F$, we need to import the libraries we will use in this section.

(For details on the plotting commands, see [Section Plotting](#plotting))

In [None]:
%matplotlib inline
import numpy as np    # in case you haven't imported it in this session yet
import matplotlib.pyplot as plt    # in case you haven't imported it in this session yet
from scipy.optimize import fsolve

Consider $F(x) = x^3 + 4(x - 1)^2 - 8$. We first create a lambda function for it.

In [None]:
F = lambda x : x**3 + 4*(x - 1)**2 - 8


Let's see what this function looks like by plotting it:

In [None]:
X = np.arange(-8,5,0.5)
plt.plot(X, F(X))       # if we only need to see the data (no legend, no axis label, no title), 
                        # we can just use the plot function directly and the figure will be 
                        # created automatically
plt.plot(X, np.zeros(len(X)), linewidth = 0.5)  # this is the plot of a line for y = 0. The roots
                                                # are the x values for where this line crosses 
                                                # the original function 

As expected, this function has 3 roots: one close to $x = -6$, one just before $x = 0$, and one close to $x = 2$. 

When solving $F(x) = 0$ numerically, solvers usually return only one root. This happens because numerical solvers usually start from an initial guess $x_\text{init}$ and then take steps towards $F(x) = 0$ until this is satisfied within a certain tolerance. If the function has more than one root, the solver usually converges to one near the initial guess, although this is not guaranteed.

When using ``fsolve``, we need to provide this initial value. 

Let's first try to find the root close to -6 by using $x_\text{init} = -6$.

In [None]:
sol1 = fsolve(F, -6)   
print(sol1)

When we call ``fsolve``, the first argument is the function and the second is the initial value for $x$. It returns the solution as an array, which we stored in ``sol1`` here.

We can see that, indeed, the solver found the solution closest to -6. 

To find the other solutions, we can use different initial values.

In [None]:
sol2 = fsolve(F, 0)
print(sol2)

sol3 = fsolve(F, 2)
print(sol3)

Let's now take another look at our function $F(x) = x^3 + 4(x - 1)^2 - 8$. Suppose we need it to be more generic and we parametrize it and write it as $F(x) = x^3 + \alpha(x - 1)^2 - \beta$. We can say that so far we worked with a function with parameters $\alpha = 4$ and $\beta = 8$.

If we want the function to take the parameter values as an argument too, we need to define it as follows:

In [None]:
def F_par(x, par):
    alpha, beta = par   # here we know par has 2 elements and we are assigning each element to an individual variable
    return x**3 + alpha*(x - 1)**2 - beta

Here we defined ``F_par`` as taking two arguments: the first is the independent variable and the second contains the parameters of the function. The second line unpacks the parameters; this means we need to pass all the parameters together as a list or array. For readability, we give them names before writing the mathematical expression. We could also have written the expression as ``x**3 + par[0]*(x-1)**2 - par[1]`` and avoided the second line. The function then returns the value of its mathematical expression.

If we want to use ``fsolve`` to find the roots of this expression, we now need to pass the parameters as arguments too.

In [None]:
sol = fsolve(F_par, 2, [4,8])
print(sol)
print(sol3)

Note that by passing ``[4,8]`` we defined the same parameters as before, so we got the same root as before for $x_\text{init} = 2$. 
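In the ``scipy`` documentation, the extra arguments are given through a parameter named ``args``, which is normally a tuple with one entry per extra argument of the function. Since ``F_par`` has a single extra argument (the list ``par``), a more explicit way of writing the same call is sketched below. The names ``sol_positional`` and ``sol_keyword`` are illustrative.

```python
import numpy as np
from scipy.optimize import fsolve

def F_par(x, par):
    alpha, beta = par
    return x**3 + alpha*(x - 1)**2 - beta

sol_positional = fsolve(F_par, 2, [4, 8])           # the form used above
sol_keyword = fsolve(F_par, 2, args=([4, 8],))      # a tuple with one element: the list par

print(sol_positional)
print(sol_keyword)
print(np.allclose(sol_positional, sol_keyword))     # True
```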

But we can change the parameters, for example $\alpha = 2$ and $\beta = 4$.

In [None]:
sol = fsolve(F_par, 2, [2,4])
print(sol)

As expected, the function changed its shape and the roots are now different (you can try different initial values to find the other roots and even plot the new function if you would like to see it for yourself).
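A simple way to check a result from ``fsolve`` is to evaluate the function at the returned value, which should be very close to 0. Because the original $F$ is a polynomial, $x^3 + 4(x-1)^2 - 8 = x^3 + 4x^2 - 8x - 4$, we can also compare with ``np.roots``, which takes the polynomial coefficients. This check is an addition for illustration, and the variable names are assumptions.

```python
import numpy as np
from scipy.optimize import fsolve

F = lambda x : x**3 + 4*(x - 1)**2 - 8

# One call per initial guess; fsolve returns an array, so we take its first element
roots = [fsolve(F, x_init)[0] for x_init in (-6, 0, 2)]
residuals = [F(r) for r in roots]      # F evaluated at each root, should be close to 0

poly_roots = np.sort(np.roots([1, 4, -8, -4]))   # roots of x^3 + 4x^2 - 8x - 4

print(roots)
print(residuals)
print(poly_roots)
```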

<a name="minimize"></a>
#### ``minimize``

``minimize`` is another type of solver that finds a (local) minimum value of a function and the corresponding $x$ value. This is essentially the goal of an entire field called [Optimization](https://en.wikipedia.org/wiki/Mathematical_optimization). Here we will keep to the very basics.

Before we see how it works, we need to import this function from the ``scipy`` library.

(If you are starting from here, run the previous code blocks in this [Section](#solvers))

In [None]:
from scipy.optimize import minimize

We will keep working with the same function $F(x) = x^3 + \alpha(x - 1)^2 - \beta$.

Let's take a look at its plot one more time.

In [None]:
X = np.arange(-8,5,0.5)
plt.plot(X, F_par(X, [4,8]))

If you remember from Calculus, a minimum corresponds to a point where the derivative is 0, i.e. $\frac{dF}{dx} = 0$, and the curve is convex (the second derivative is positive). We can see there is such a point between $x = 0$ and $x = 2$. Let's find it.

``minimize`` works similarly to ``fsolve`` in the sense that we need to pass the initial value of $x$ and the function parameters as arguments.

In [None]:
sol = minimize(F_par, 0, [4,8])
print(sol)

The object returned by ``minimize`` contains more information than just the value of $x$. We will look at the three fields most relevant for us.

``success`` tells us if the solver was able to find a minimum. In this case, it did.

``fun`` is the value of $F(x)$ at the minimum.

``x`` is the value of $x$ at the minimum.

You can retrieve a specific value of the solution as follows:

In [None]:
sol.fun

This solver starts from an initial value (in this example we used 0) and takes steps in descent directions until it finds a minimum. This initial value is very important and we need to be careful when defining it.

For instance, see what happens when we set $x_\text{init} = -6$:

In [None]:
sol = minimize(F_par, -6, [4,8])
print(sol)

We see that ``success`` is ``False``, so the solver was not able to find a minimum. This happened because the descent direction leads towards $-\infty$, so there is no actual minimum in that direction.

What if we want to find a maximum value for $F(x)$? By inspection, we know there is a maximum value close to $-4$. 

We can use the same function and just look for a minimum of $-F(x)$. Let's plot $-F(x)$ and check it.

In [None]:
X = np.arange(-8,5,0.5)
plt.plot(X, -F_par(X, [4,8]))   # we can just add the minus sign here because it 
                            # calculates F(x) and then it multiplies by -1

Note that we just mirrored the curve about the x-axis (it is now upside down). So now we can find a minimum of $-F(x)$ since it corresponds to a maximum of $F(x)$.

In [None]:
sol = minimize(lambda y : -F_par(y, [4,8]), -6)
print(sol)

To get the negative of $F(x)$ we needed to create a new function. The easiest way is to use a lambda function, which takes a value ``y``, passes it to ``F_par`` along with the parameters, and multiplies the returned value by -1.

We then passed this negative function to ``minimize`` along with the initial value for $x$. 

Note that the independent variable here is now called ``y``, but the output always calls it ``x``.
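As a check on both results, we can compare them with the points where the derivative is zero. For $\alpha = 4$ the derivative is $\frac{dF}{dx} = 3x^2 + 8(x - 1) = 3x^2 + 8x - 8$, whose roots can be obtained with ``np.roots``. This comparison is an addition for illustration, and the variable names are assumptions. Remember that the value returned for the maximum belongs to $-F(x)$, so its sign has to be changed back.

```python
import numpy as np
from scipy.optimize import minimize

def F_par(x, par):
    alpha, beta = par
    return x**3 + alpha*(x - 1)**2 - beta

sol_min = minimize(F_par, 0, [4, 8])                   # local minimum of F
sol_max = minimize(lambda y : -F_par(y, [4, 8]), -6)   # local maximum of F, found as a minimum of -F

x_min = sol_min.x[0]     # sol.x is an array, so we take its first element
x_max = sol_max.x[0]
F_at_max = -sol_max.fun  # undo the sign change to get F at the maximum

stationary = np.sort(np.roots([3, 8, -8]))   # points where dF/dx = 0

print(x_max, x_min)
print(stationary)
print(F_at_max)
```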

<a name="curve-fit"></a>
#### ``curve_fit``

The last solver we will take a look at is ``curve_fit``. It works like the ['add a trendline'](https://support.microsoft.com/en-us/office/add-a-trend-or-moving-average-line-to-a-chart-fa59f86c-5852-4b68-a6d4-901a745842ad) functionality in Excel, but we can provide any mathematical expression we wish for it to fit.

Essentially, this function takes data representing $x$ and $F(x)$ and calculates the values of the parameters of an expression that best describe the data. 

Here we also begin by importing the solver.

(If you are starting from here, run the first code blocks in the beginning of this [Section](#solvers))

In [None]:
from scipy.optimize import curve_fit

As before, we will keep using the expression we have been using in this Section, $F(x) = x^3 + \alpha(x - 1)^2 - \beta$.

However, when using ``curve_fit`` we need to declare it a little differently than we did before. Here, we **cannot** group the parameters together in a list or array. Instead, they need to be defined as individual arguments.

In [None]:
def F_c(x, alpha, beta):
    return x**3 + alpha*(x - 1)**2 - beta

We now need data points for finding the parameters. 

Let's create them with the ``F_c`` function using the same parameters $\alpha = 4$ and $\beta = 8$ we have been using. 

In [None]:
X = np.arange(-8,5,1)
Y = F_c(X, 4, 8)
plt.scatter(X,Y)

Here we created 13 data points. 

We can now pass them to ``curve_fit`` along with ``F_c`` to find the values of $\alpha$ and $\beta$ for this data set.

In [None]:
pars, cov = curve_fit(F_c,X,Y)
print(pars)

We can see that this function was able to find the original parameters.

The other information that ``curve_fit`` returns is the covariance matrix of the parameters. It is a measure of the uncertainty of the parameters, but we will not look into that in this course.

This kind of procedure for finding parameters of a mathematical expression is usually done with experimental data and experimental data **always** contain errors. So let's simulate that. 

In [None]:
Y = F_c(X, 4, 8) + np.random.randn(len(X))*10
plt.scatter(X,Y)

Here we added random noise with a normal distribution to ``Y`` after calculating $F(x)$ to simulate experimental errors. We can see that the data points are not aligned in a smooth curve anymore.

Let's use this data set to estimate the parameters now.

In [None]:
pars, cov = curve_fit(F_c,X,Y)
print(pars)

We can see now that we do not get the exact original parameters anymore. The value of $\alpha$ is usually close to the original, but $\beta$ can be further from it. Because we generated random numbers to define the new ``Y``, every time we run the code block that creates this ``Y``, we will get a different data set and different parameter values (you can try re-running the last two code blocks).

Although they are not the original values of the parameters, we can argue they represent the data well (and that is what we are looking for).
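If you need the same noisy data set every time, for example to compare results with a colleague, one option is to use a random number generator with a fixed seed. The sketch below assumes ``np.random.default_rng`` is available in your ``numpy`` version; the seed value 0 and the variable names are arbitrary choices.

```python
import numpy as np
from scipy.optimize import curve_fit

def F_c(x, alpha, beta):
    return x**3 + alpha*(x - 1)**2 - beta

X = np.arange(-8, 5, 1)

rng = np.random.default_rng(seed=0)    # the seed 0 is an arbitrary choice
Y_noisy = F_c(X, 4, 8) + rng.normal(scale=10, size=len(X))
pars_first, _ = curve_fit(F_c, X, Y_noisy)

rng = np.random.default_rng(seed=0)    # same seed, so the same noise is generated again
Y_repeat = F_c(X, 4, 8) + rng.normal(scale=10, size=len(X))
pars_repeat, _ = curve_fit(F_c, X, Y_repeat)

print(pars_first)
print(pars_repeat)    # identical to the line above
```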

Let's plot and see the data set and $F(x)$ with the parameters calculated with ``curve_fit``. 

In [None]:
X_est = np.arange(-8,5,0.5)
Y_est = F_c(X_est, pars[0], pars[1])
plt.scatter(X,Y)
plt.plot(X_est,Y_est, color = "orange")

Just a last remark about the solvers we have worked with. We used a function that takes only one independent variable $x$. But $x$ could have been an array and we could have something like $F(x) = x_1^2 + x_2^2 + x_3^2$, meaning that we can have multiple independent variables.

For ``curve_fit``, our ``X`` data would then be a 2-dimensional array with rows representing the independent variables and columns containing the data points.
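As a minimal sketch of that layout, consider a made-up model with two independent variables, $F(x_1, x_2) = a x_1 + b x_2^2$. The model, its name ``F_two``, the data and the parameter values $a = 2$ and $b = -1$ are all assumptions chosen only for this illustration.

```python
import numpy as np
from scipy.optimize import curve_fit

def F_two(X, a, b):
    x1, x2 = X               # each row of X holds one independent variable
    return a*x1 + b*x2**2

x1 = np.linspace(0, 1, 20)
x2 = np.linspace(1, 3, 20)
X = np.array([x1, x2])       # shape (2, 20): 2 variables, 20 data points
Y = F_two(X, 2.0, -1.0)      # noise-free data generated with a = 2 and b = -1

pars, cov = curve_fit(F_two, X, Y)
print(X.shape)   # (2, 20)
print(pars)      # close to [ 2. -1.]
```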