# Visualizing your data

## Why visualization is important

Visualizing your data is very important. A common mistake is to keep it as numbers, or to process it only in aggregate, without _looking_ at the actual results of your experiments or data.

## Plotting with matplotlib

The base plotting package that (arguably) works best with Jupyter and Python is `matplotlib`.  You should install it with `mamba install matplotlib` in your terminal. The best way to integrate it into our notebooks is to use the IPython "magic" command `%matplotlib inline` like so:

In [None]:
%matplotlib inline

This tells Jupyter to put the plots _right in the notebook_ for us, allowing us to visualize the data as we are exploring it.  
Let's look at an initial simple plot.  
First we have to import the modules we need - numpy, pandas, and matplotlib.  If you haven't installed them yet, do so with `mamba install numpy pandas matplotlib` in your terminal.

Then we'll import them as:

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

Let's start by plotting some basic trigonometric functions!  
The first thing we have to do is set up x values from 0 to 2$\pi$ - let's make 50 values. To do this, we'll use the `linspace` function in numpy, which needs a start point, an end point, and the number of points:

In [None]:
rad=np.linspace(0.0, 2*np.pi, 50)
rad
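One detail worth checking is that `linspace` includes both the start and the end point, so 50 values give 49 equal steps. Here is a minimal sketch of that; the names `rad_again` and `step` are just illustrative and aren't used elsewhere in this notebook:

```python
import numpy as np

# 50 evenly spaced values from 0 to 2*pi, with both end points included
rad = np.linspace(0.0, 2 * np.pi, 50)

# retstep=True also returns the spacing between consecutive values
rad_again, step = np.linspace(0.0, 2 * np.pi, 50, retstep=True)

print(len(rad))          # number of points
print(rad[0], rad[-1])   # first and last values
print(step)              # spacing, equal to 2*pi / 49
```

If you would rather choose the step size than the number of points, `np.arange` is the usual alternative, though it leaves out the end point.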

Ok - now let's find the sine and cosine of these numbers:

In [None]:
y_sin=np.sin(rad)
y_sin

In [None]:
y_cos=np.cos(rad)
y_cos

It's nice to be able to look at these values, but wouldn't it be even better to **plot** them?  
Let's try matplotlib first

In [None]:
plt.plot(rad, y_sin)

But what if we want to plot sin _and_ cos on the same plot? Easy:

In [None]:
y_cos=np.cos(rad)
plt.plot(rad,y_sin)
plt.plot(rad,y_cos)

Now let's say I want to change _how_ it's plotted. Let's instead plot the cos as _red points_ and the sin as a _green line_.   
Because I can never remember the rules for this, I'll just look in the documentation for the `plot` command: <https://matplotlib.org/3.3.1/api/_as_gen/matplotlib.pyplot.plot.html>   
It turns out that for green I use `g` and for red I use `r`:

In [None]:
plt.plot(rad,y_sin,'g')
plt.plot(rad,y_cos,'r')

Now I want to have circular markers instead of a line for cosine - that's `o` and for sine I want a line - that's `-`.

In [None]:
plt.plot(rad,y_sin,'g-')
plt.plot(rad,y_cos,'ro')
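These short style strings are a shorthand. If I find them hard to read later, I can spell the same settings out with the `color`, `marker`, and `linestyle` keyword arguments of `plot`. A sketch of the long form for a green line and red circular markers (it rebuilds `rad`, `y_sin`, and `y_cos` so it runs on its own):

```python
import numpy as np
import matplotlib.pyplot as plt

rad = np.linspace(0.0, 2 * np.pi, 50)
y_sin = np.sin(rad)
y_cos = np.cos(rad)

# Long form of 'g-': green color, solid line
plt.plot(rad, y_sin, color='g', linestyle='-')

# Long form of 'ro': red color, circle markers, no connecting line
plt.plot(rad, y_cos, color='r', marker='o', linestyle='none')
```

The short form is quicker to type, while the keyword form is easier to read back months later. Either one works.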

Ok - now what if I don't want to plot the _entire_ range of my data, but instead want to zoom in on a specific part?  We can do that by setting the *limits* of the plot using `xlim` and `ylim` (<https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.xlim.html#matplotlib.pyplot.xlim>). Here I'll zoom in on just the region between 2 and 4 on the x-axis and between -1.1 and 0 on the y-axis:

In [None]:
plt.plot(rad,y_sin,'g-')
plt.plot(rad,y_cos,'ro')
plt.xlim(2,4)
plt.ylim(-1.1,0)
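Setting limits only changes what part of the plot is shown; it doesn't throw away any data. As a small sketch, calling `xlim` or `ylim` with no arguments hands back the current limits, which is handy for checking what I actually set (the names `x_limits` and `y_limits` are just illustrative):

```python
import numpy as np
import matplotlib.pyplot as plt

rad = np.linspace(0.0, 2 * np.pi, 50)
y_sin = np.sin(rad)

plt.plot(rad, y_sin, 'g-')
plt.xlim(2, 4)       # show only x values between 2 and 4
plt.ylim(-1.1, 0)    # show only y values between -1.1 and 0

# Called with no arguments, xlim and ylim return the current limits
x_limits = plt.xlim()
y_limits = plt.ylim()
print(x_limits, y_limits)

# The underlying array is untouched: it still holds all 50 values
print(len(rad))
```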

But these x-axis labels aren't super useful actually - we are plotting in radians, so we should probably label as $\pi$, etc.  We can do that by setting what the axis "tick-marks" are using `xticks` <https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.xticks.html#matplotlib.pyplot.xticks> - notice that we enclose the "list" of locations in square brackets:

In [None]:
plt.plot(rad,y_sin,'g-')
plt.plot(rad,y_cos,'ro')
plt.xticks([0, np.pi/2, np.pi, 3*np.pi/2, 2*np.pi])

But I don't like those numbers (though I do remember that $\pi \approx 3.142$), so I _also_ want to tell the plot what text to use for each tick label:

In [None]:
plt.plot(rad,y_sin,'g-')
plt.plot(rad,y_cos,'ro')
plt.xticks([0, np.pi/2, np.pi, 3*np.pi/2, 2*np.pi], ['0', '$\\pi$/2', '$\\pi$', '3$\\pi$/2', '2$\\pi$'])
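The doubled backslashes in those labels are there because a single `\` starts an escape sequence in an ordinary Python string. One way to make this easier to read is to use raw strings (`r'...'`) and to keep the positions and labels in named lists. This is only a sketch, and `tick_positions` and `tick_labels` are names I made up for it:

```python
import numpy as np
import matplotlib.pyplot as plt

rad = np.linspace(0.0, 2 * np.pi, 50)
plt.plot(rad, np.sin(rad), 'g-')

# Where the tick marks go along the x-axis
tick_positions = [0, np.pi/2, np.pi, 3*np.pi/2, 2*np.pi]

# Raw strings keep the backslash as-is, so r'$\pi$' is the same as '$\\pi$'
tick_labels = ['0', r'$\pi$/2', r'$\pi$', r'3$\pi$/2', r'2$\pi$']

plt.xticks(tick_positions, tick_labels)
```

The two lists need to have the same length, since each label is matched to the position at the same index.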

Now I want to put nice labels on the axes, because my PI never knows what the axes mean. We can do this with `xlabel` and `ylabel` <https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.xlabel.html#matplotlib.pyplot.xlabel>.  We can even put a title on the plot with `title`:

In [None]:
plt.plot(rad,y_sin,'g-')
plt.plot(rad,y_cos,'ro')
plt.xticks([0, np.pi/2, np.pi, 3*np.pi/2, 2*np.pi], ['0', '$\\pi$/2', '$\\pi$', '3$\\pi$/2', '2$\\pi$'])
plt.xlabel('Radians')
plt.ylabel('Values')
plt.title('My Trigonometry Plot')

This is fine, but I don't know what the red dots and the green lines are from just looking at the plot (though the beauty of python notebooks is that I can figure it out by looking at the code and my variable names).  Let's add a legend, using the `legend` command <https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.legend.html#matplotlib.pyplot.legend>: 

In [None]:
plt.plot(rad,y_sin,'g-')
plt.plot(rad,y_cos,'ro')
plt.xticks([0, np.pi/2, np.pi, 3*np.pi/2, 2*np.pi], ['0', '$\\pi$/2', '$\\pi$', '3$\\pi$/2', '2$\\pi$'])
plt.xlabel('Radians')
plt.ylabel('Values')
plt.title('My Trigonometry Plot')
plt.legend(['Sin', 'Cos'])
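Passing a list to `legend` works, but the names have to be in the same order as the `plot` calls, which is easy to get wrong once lines are added or reordered. An alternative sketch is to attach a `label` to each line as it is plotted and then call `legend` with no arguments:

```python
import numpy as np
import matplotlib.pyplot as plt

rad = np.linspace(0.0, 2 * np.pi, 50)

# Attach the name to each line when it is drawn
plt.plot(rad, np.sin(rad), 'g-', label='Sin')
plt.plot(rad, np.cos(rad), 'ro', label='Cos')

# With no arguments, legend() uses the label of each line
legend = plt.legend()
```

This keeps each name right next to the data it describes.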

Finally, we might want to not just have the plot in our notebook, but actually save it to a file to put in a presentation or in a paper. We can do that with `savefig` <https://matplotlib.org/3.1.1/api/_as_gen/matplotlib.pyplot.savefig.html#matplotlib.pyplot.savefig>.  `savefig` is pretty smart: it can figure out what type of file to make from the name of the file you give it. I suggest mostly using either `png` or `pdf`:

In [None]:
plt.plot(rad,y_sin,'g-')
plt.plot(rad,y_cos,'ro')
plt.xticks([0, np.pi/2, np.pi, 3*np.pi/2, 2*np.pi], ['0', '$\\pi$/2', '$\\pi$', '3$\\pi$/2', '2$\\pi$'])
plt.xlabel('Radians')
plt.ylabel('Values')
plt.title('My Trigonometry Plot')
plt.legend(['Sin', 'Cos'])
plt.savefig('trig.pdf')
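For a `png`, two optional arguments to `savefig` are often useful: `dpi` sets the resolution, and `bbox_inches='tight'` trims the extra white space around the plot. A sketch, where the file name `trig_example.png` and the value of 300 dpi are just choices I made for illustration:

```python
from pathlib import Path

import numpy as np
import matplotlib.pyplot as plt

rad = np.linspace(0.0, 2 * np.pi, 50)
plt.plot(rad, np.sin(rad), 'g-')
plt.plot(rad, np.cos(rad), 'ro')
plt.xlabel('Radians')
plt.ylabel('Values')

# Hypothetical output file, written to the current working directory
out = Path('trig_example.png')

# dpi controls the resolution; bbox_inches='tight' trims surrounding white space
plt.savefig(out, dpi=300, bbox_inches='tight')

print(out.exists())
```

Note that `savefig` has to be in the same cell as the plotting commands, because with `%matplotlib inline` the figure is closed once the cell finishes.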

### Exercise:
From what you have learned, plot $\sin^2$, $\cos^2$, and $\sin^2 + \cos^2$.  As a tip, look at `numpy.square` to square the values (<https://numpy.org/doc/stable/reference/generated/numpy.square.html>).