# Plotting in Python with Matplotlib

Plotting is an essential tool in science. We will use the `matplotlib` package, a widely used standard for plotting in Python.

In particular, the module `matplotlib.pyplot` has many useful plotting tools. The simplest is `plot(x,y)`, which makes a line plot from a list (or array) of x coordinates and a list (or array) of y coordinates.

In [None]:
#make a very simple plot

import matplotlib.pyplot as plt
import numpy as np

#the plot(x,y) function takes as its arguments two lists or arrays and plots them x (first) vs y (second)
plt.plot(np.arange(10), np.arange(10))

Note that the tickmarks are already labeled for you! How convenient! Let's try getting a little more complicated, using what we learned about functions to make a more fun plot.

In [None]:
# Plotting a function

def f(x): # define our function
    return x**3 - x**2 - x + 3

myxspace = np.linspace(-2,2,100) # 100 points between -2 and 2
plt.plot(myxspace, f(myxspace))

## Customizing Your Plot

There's a lot more to do to make your plot readable, such as labeling axes, making titles, and creating legends.


In [None]:
#Add some axes labels

plt.plot(myxspace,f(myxspace))
plt.xlabel("x")
plt.ylabel("f(x)")

Plotting more than one curve on the same axes is very simple. Just add extra `plot` calls as follows:

In [None]:
plt.plot(myxspace,f(myxspace))
plt.plot(myxspace,-f(myxspace))
plt.plot(myxspace,f(myxspace)-2)
plt.xlabel("x")
plt.ylabel("f(x)")

Again, `matplotlib` is pretty smart and chooses different colors for your lines automatically. How nice of it!

You may want to zoom into a particular area, and you may also want to save the resulting figure as a pdf file, so that you can use it in other documents.

In [None]:
plt.plot(myxspace,f(myxspace))
plt.plot(myxspace,-f(myxspace))
plt.plot(myxspace,f(myxspace)-2)
plt.xlabel("x")
plt.ylabel("f(x)")
plt.xlim(-1.5,0.5)
plt.ylim(-6,6)
plt.savefig("Myfig.pdf") #save a file in your current working directory
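`savefig` picks the output format from the file extension, so the same call can write a PNG instead of a PDF. The sketch below is one possible variation, not the only way to do it: it writes into a temporary scratch directory (an assumption made here so the example does not clutter your working directory) and uses the optional `dpi` and `bbox_inches` arguments. It also shows that calling `plt.xlim()` with no arguments returns the current limits.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-2, 2, 100)

fig = plt.figure()
plt.plot(x, x**3 - x**2 - x + 3)
plt.xlim(-1.5, 0.5)  # zoom in on the x axis
plt.ylim(-6, 6)      # zoom in on the y axis

# With no arguments, xlim() and ylim() return the current limits
current_xlim = plt.xlim()
current_ylim = plt.ylim()

# Scratch directory used only for this sketch
outdir = tempfile.mkdtemp()
png_path = os.path.join(outdir, "Myfig.png")

# The ".png" extension selects the format; dpi sets the resolution of
# raster output, and bbox_inches="tight" trims surrounding whitespace
plt.savefig(png_path, dpi=150, bbox_inches="tight")
plt.close(fig)
```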

Customising the style of the curves can be done by editing the `linestyle` property. This is a nice way to further differentiate between lines, especially when you find yourself running out of distinct colors!

In [None]:
plt.plot(myxspace,f(myxspace))
plt.plot(myxspace,-f(myxspace),linestyle="--")
plt.plot(myxspace,f(myxspace)-2,linestyle=":")
plt.xlabel("x")
plt.ylabel("f(x)")
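For reference, Matplotlib has four basic named line styles: solid (`"-"`), dashed (`"--"`), dash-dot (`"-."`), and dotted (`":"`). The following sketch draws one curve with each style so you can compare them. The curves themselves are arbitrary parabolas chosen only for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-2, 2, 100)
styles = ["-", "--", "-.", ":"]  # solid, dashed, dash-dot, dotted

fig = plt.figure()
lines = []
for offset, style in enumerate(styles):
    # plot() returns a list of line objects; we draw one line per call
    (line,) = plt.plot(x, x**2 + offset, linestyle=style,
                       label=f"linestyle='{style}'")
    lines.append(line)

plt.xlabel("x")
plt.ylabel("shifted parabolas")
plt.legend()
```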

As you can imagine, there are many more plotting options you can tune. For example, you might want to add a legend to your graph, make a bar chart, or add error bars. All of this is possible in Python, but it would be tedious to memorise every option, especially since you will use some of them rarely and others never.

The recommended approach is the following:

1. Learn some of the basic plotting commands, so that you can quickly see the data for yourself.
2. To make production-ready graphs, consult the [Matplotlib examples website](https://matplotlib.org/stable/gallery/index.html).
3. Try to find an example of the feature you are looking for.
4. Run the code provided (on the website) for that example unchanged.
5. Try to understand the code that they use to implement the feature in question.
6. Adapt your code so that it implements the feature you are looking for.
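As an illustration of the kind of feature you might look up this way, here is a minimal sketch of error bars using `plt.errorbar`. The measurement values and the uncertainty of 0.3 are invented purely for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up measurements, for illustration only
t = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([0.1, 1.9, 4.2, 5.8, 8.1])
y_err = np.full_like(y, 0.3)  # same (invented) uncertainty on every point

fig = plt.figure()
# fmt="o" draws round markers with no connecting line;
# capsize adds small caps to the ends of the error bars
container = plt.errorbar(t, y, yerr=y_err, fmt="o", capsize=3,
                         label="measurements")
plt.xlabel("t")
plt.ylabel("y")
plt.legend()
```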

Let's start making legends now.

In [None]:
def g(x): #define another function
    return x**2+10*x

plt.plot(myxspace,f(myxspace), label='f(x)') #plot each one, with their associated label
plt.plot(myxspace,g(myxspace), label='g(x)')
plt.xlabel('x')
plt.ylabel('Function Value')
plt.legend()

In [None]:
# you might want to change the location of the legend
plt.plot(myxspace,f(myxspace), label='f(x)')
plt.plot(myxspace,g(myxspace), label='g(x)')
plt.xlabel('x')
plt.ylabel('Function Value')
plt.legend(loc='lower right') # options are combinations of upper, lower, center, right, left

In [None]:
# You might also want to change the label sizes, add a title
plt.title('My Really Great Plot', fontsize='xx-large')
plt.plot(myxspace,f(myxspace), label='f(x)')
plt.plot(myxspace,g(myxspace), label='g(x)')
plt.xlabel('x', fontsize=20) # you can determine fontsize by a number (kind of guess and check)
plt.ylabel('Function Value', fontsize=20)
plt.legend(loc='lower right', fontsize='x-large') # or by a string like 'small', 'large', 'xx-large'
plt.yticks(fontsize=10) #We can even change the font size of the tick labels!
plt.xticks(fontsize=10)

## Colors

You can set the colors yourself, or let matplotlib use the defaults. The defaults should be colorblind friendly and generally nice. But you may want to change the order, or still prefer some specific colors.

You can see some colors available to you here: https://matplotlib.org/stable/gallery/color/named_colors.html

You can set the color of a line by passing the `color` argument to `plot()`, for example `color='b'` for blue.

Some basic colors are blue ('b'), red ('r'), cyan ('c'), green ('g'), magenta ('m'), yellow ('y'), and black ('k'). Orange has no single-letter code, so use the full name 'orange'.

You can also use the default colors in a different order by using the strings 'C0' (the first default), 'C1' (second), 'C2' (third), and so on.
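If you are unsure whether a string is a valid color, `matplotlib.colors` provides `is_color_like` to check it and `to_hex` to see which color a name stands for. The list of candidate strings below is just a sample chosen for illustration.

```python
from matplotlib.colors import is_color_like, to_hex

# A sample of strings to test, including one that is NOT a color ("o")
candidates = ["b", "r", "c", "g", "k", "o", "orange", "C2", "#ff7f0e"]

# Map each string to True/False depending on whether Matplotlib accepts it
valid = {name: is_color_like(name) for name in candidates}

for name, ok in valid.items():
    print(name, "is a color" if ok else "is NOT a color")

# Convert a few names to hex codes to see exactly what they mean
blue_hex = to_hex("b")
orange_hex = to_hex("orange")
print(blue_hex, orange_hex)
```

Note that `'o'` is rejected: in Matplotlib it is a marker style (a circle), not a color.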

In [None]:
# Set the color of each line explicitly
plt.title('My Really Great Plot', fontsize='xx-large')
plt.plot(myxspace,f(myxspace), label='f(x)', color='b') #blue
plt.plot(myxspace,g(myxspace), label='g(x)', color='r') #red
plt.plot(myxspace,g(myxspace)+100, label='g(x)+100', color='C2') #third default

plt.xlabel('x', fontsize=20) # you can determine fontsize by a number (kind of guess and check)
plt.ylabel('Function Value', fontsize=20)
plt.legend(loc='center right', fontsize='x-large') # or by a string like 'small', 'large', 'xx-large'
plt.yticks(fontsize=10) #We can even change the font size of the tick labels!
plt.xticks(fontsize=10)

# Loading Data from a File

This semester you'll be taking a lot of data, and Excel makes it easy to export that data to a CSV ("comma separated value") file. These files are easy to read in with Python!

We have an example file for you to work with called Canada.csv. It has data on tree rings over the last 400 years in Canada. The data was downloaded from the Internet at http://www.climatedata.info/proxies/data-downloads/.

You can take a look at what the file looks like in jupyter lab by double clicking on it. The next step is to open up the file and load its contents into python so we can plot it. For this we will use a built-in `numpy` function called `genfromtxt`. This will turn a csv data file into arrays.

This function takes a few important keywords. The `delimiter` keyword tells it how the values in the file are separated; in a COMMA separated value file this is ",". The other important keyword is `skip_header`, which tells the function how many lines at the top of the file to skip because they are "headers", i.e. they describe the data rather than being values themselves (for example, headers saying what each column represents). When you make your own files in Excel it is usually a good idea to label your columns! Just like comments in code, it will help you later and future-you will thank you.
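To see these two keywords in action without needing a file on disk, here is a small sketch that feeds `genfromtxt` a string wrapped in `io.StringIO`, which behaves like an open file. The CSV text is invented for illustration and is not taken from Canada.csv.

```python
from io import StringIO

import numpy as np

# A tiny stand-in for a CSV file: two header lines, then three data rows.
# Note the empty entry at the end of the first data row.
csv_text = """Example tree ring data (invented)
year,width_a,width_b
1600,1.0,
1601,1.5,0.8
1602,2.0,0.9
"""

# delimiter="," splits each line at the commas;
# skip_header=2 ignores the two header lines at the top
data = np.genfromtxt(StringIO(csv_text), delimiter=",", skip_header=2)

print(data.shape)  # (rows, columns)
print(data)
```

The empty entry in the first data row comes back as `nan`.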

In [None]:
# The function will load in the data into a 2-d array.

A=np.genfromtxt("Canada.csv",delimiter=",",skip_header=4)

In [None]:
A[0]

This is the first row of the data. Note the "nan" value, which stands for "Not A Number". `genfromtxt` fills in this value wherever an entry is missing. When you view the file in Jupyter, you can see that there are indeed a few entries with no data in these first few rows.
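Because `nan` values propagate through ordinary arithmetic, it helps to know how to find them and how to work around them. The sketch below uses an invented row of values (not the real first row of the file) with `np.isnan` and `np.nanmean`.

```python
import numpy as np

# Invented row with two missing entries, for illustration only
row = np.array([1600.0, np.nan, 1.2, np.nan, 0.9])

missing = np.isnan(row)      # boolean array: True where the entry is nan
n_missing = missing.sum()    # count the missing entries

widths = row[2:]                  # the last three entries: 1.2, nan, 0.9
plain_mean = widths.mean()        # an ordinary mean is "poisoned" by nan
safe_mean = np.nanmean(widths)    # nanmean skips the nan entries

print(missing)
print(n_missing, plain_mean, safe_mean)
```

When plotting, Matplotlib simply leaves a gap in the line wherever the data contains `nan`.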

In [None]:
A[:,0] #the first column

In [None]:
A[:,1] # second column

In [None]:
A.shape # there are 402 rows and 5 columns

Now we can finally make a plot of the tree ring width in Canada vs time!

In [None]:
plt.plot(A[:,0],A[:,1],label="20 year moving average")
plt.xlabel("year")
plt.ylabel("tree ring width in Canada")
plt.legend()

We can plot multiple columns together. Because we want the 20 year moving average line to stand out, I'll make it thicker using the `linewidth` keyword.

In [None]:

plt.plot(A[:,0],A[:,3],label="Campbell")
plt.plot(A[:,0],A[:,4],label="Mt Cain")

plt.xlabel("year")
plt.ylabel("tree ring width in Canada")

plt.plot(A[:,0],A[:,1],label="20 year moving average", color='k', linewidth=5)
plt.legend()

If you'd prefer not to work with a 2D array of data, you can *UNPACK* the data into separate variables by using the `unpack=True` keyword. This is a boolean value (remember those?) that defaults to `False`, but we can switch it to `True`.

This is handy because now you can give your variables more useful names that will make them more intuitive to work with.
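To make the effect of `unpack=True` concrete, the sketch below loads the same small, invented CSV text twice, once as a 2D array and once unpacked, and shows that each unpacked variable is just one column of the 2D array. The column names `year`, `a`, and `b` are placeholders chosen for this example.

```python
from io import StringIO

import numpy as np

# Invented CSV text: one header line, then two data rows
csv_text = """year,a,b
1600,1.0,0.5
1601,1.5,0.8
"""

# Default behaviour: one 2D array with a row per line of the file
table = np.genfromtxt(StringIO(csv_text), delimiter=",", skip_header=1)

# unpack=True: one 1D array per column, assigned to separate names
year, a, b = np.genfromtxt(StringIO(csv_text), delimiter=",",
                           skip_header=1, unpack=True)

print(table[:, 0], year)  # same values
print(table[:, 2], b)     # same values
```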

In [None]:
time, ave, canada, campbell, mtcain =np.genfromtxt("Canada.csv",delimiter=",",skip_header=4, unpack=True)

plt.plot(time,campbell,label="Campbell")
plt.plot(time,mtcain,label="Mt Cain")

plt.xlabel("year")
plt.ylabel("tree ring width in Canada")

plt.plot(time,ave,label="20 year moving average", color='k', linewidth=5)
plt.legend()

# Exercises

a) Using the data from `Canada.csv`, plot the difference between each of the three data sets and the 20 year moving average vs time. Label the plot correctly and create a legend. **No *for* or *while* loops please!**

b) Consider functions for the height $h(t)$ and velocity $v(t)$ vs time ($t\geq0$) of an object dropped straight down from an initial height $h_0$ at $t=0$. Assume $g=9.8$ m/s$^2$ and that $h$ is in units of meters. Generate a plot of velocity vs time for different initial heights. Generate another plot for height vs time. Make sure your results are realistic! Velocity should be 0 once the object hits the ground and the height cannot be negative! Label the axes and create a legend for your different lines that makes sense to a reader. **Try to avoid *for* and *while* loops and make some functions!**