Python for Beginners Exercise 5: Plotting

Made by: Julian Liber

Date Created: 03/16/2020

## Hello Everyone!


#### This activity should teach you:
- How to use matplotlib
- Some basic plotting with histograms and scatterplots

Python is great for a lot of things, some of which you may want to plot!

I don't generally use Python plotting to present data, but it can be useful during your data analysis to ensure that everything looks right.

The preferred library for this is Matplotlib:

<img src="https://matplotlib.org/_static/logo2.svg" width=80% alt="Matplotlib logo"><p style="text-align: right;">From: https://matplotlib.org/_static/logo2.svg</p>

Generally, this works by importing `matplotlib.pyplot`, `numpy`, and `pandas`. Because we are using Jupyter/IPython, we also need an additional "magic" command, `%matplotlib inline`, which displays each plot directly below the cell that created it.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

Let's read in some data: `tip_tracking_speed_data.csv`

In [None]:
data = pd.read_csv("tip_tracking_speed_data.csv")
data.head()

For our first plot, we can make a histogram to show the distribution of our `Speed` variable.

In [None]:
plt.hist(data.Speed)

We can make this more readable by adding axis labels and a title, and by log-transforming the values so the skewed distribution spreads out across the bins.

In [None]:
plt.hist(np.log10(data.Speed), bins = 30)
plt.xlabel("Speed (Log 10 um/min)")
plt.ylabel("Count")
plt.title("Tip speed distribution")
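
To see why the log transform helps, here is a minimal sketch that puts the raw and log10 histograms side by side. It uses synthetic right-skewed values instead of the real `tip_tracking_speed_data.csv`; the lognormal parameters, the seed, and the name `speeds` are assumptions made only for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for a right-skewed speed variable (NOT the real data).
rng = np.random.default_rng(0)
speeds = rng.lognormal(mean=0.0, sigma=1.0, size=1000)

fig, (ax_raw, ax_log) = plt.subplots(1, 2, figsize=(8, 3))

# Raw scale: most values pile up in the first few bins.
raw_counts, raw_edges, _ = ax_raw.hist(speeds, bins=30)
ax_raw.set_xlabel("Speed (um/min)")
ax_raw.set_ylabel("Count")
ax_raw.set_title("Raw scale")

# Log10 scale: the same values spread across the bins more evenly.
log_counts, log_edges, _ = ax_log.hist(np.log10(speeds), bins=30)
ax_log.set_xlabel("log10 Speed (um/min)")
ax_log.set_ylabel("Count")
ax_log.set_title("Log10 scale")

fig.tight_layout()
```

`hist` returns the bin counts and bin edges along with the drawn bars, so you can inspect the numbers behind the picture if a plot looks odd.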

Now we can look at a single variable over time, in this case the median tip speed for each frame. Here we can either pass both `x` and `y`, or pass just `y`, in which case `x` will be the index (here the index is the same as time).

In [None]:
time_grouped = data.groupby("Time") # Each frame will have the same time from start
median_data = time_grouped.median(numeric_only=True) # Take the median of the numeric columns, because the distribution is right-skewed
plt.plot(median_data.Speed, label="Median Speed") # Plot median speed over time
plt.legend()
plt.xlabel("Time since start (min)")
plt.ylabel("Speed (um/min)")
plt.title("Tip speed over time")
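
The difference between passing only `y` and passing both `x` and `y` can be checked on a tiny table. This is a sketch with made-up values; the `toy` DataFrame and its numbers are hypothetical and only borrow the `Time` and `Speed` column names from the data above.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Tiny made-up table (hypothetical values, NOT the real data).
toy = pd.DataFrame({
    "Time": [0, 0, 5, 5, 10, 10],
    "Speed": [1.0, 3.0, 2.0, 6.0, 4.0, 4.0],
})

# Grouping by Time makes Time the index of the result.
toy_median = toy.groupby("Time").median()

fig, ax = plt.subplots()
# y only: the index of the Series (Time) is used for the x-axis.
(line_y_only,) = ax.plot(toy_median.Speed, label="y only")
# x and y given explicitly: same points, drawn dashed on top.
(line_xy,) = ax.plot(toy_median.index, toy_median.Speed, "--", label="x and y")
ax.set_xlabel("Time since start (min)")
ax.set_ylabel("Speed (um/min)")
ax.legend()
```

Both lines land on the same points, because after `groupby("Time")` the index already holds the time values.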

We can also make scatterplots, most easily by using the `plt.scatter()` function. Here, we are displaying the directions of individual tip movements, which fall on a unit circle.

In [None]:
plt.scatter(x = data.X_component, y = data.Y_component, alpha=0.1)
plt.axis("square"); # Use equal axis scaling so the circle looks circular; the trailing ";" hides the printed return value.
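
If you want to see why these points fall on a unit circle, the sketch below builds direction components from angles. It assumes `X_component` and `Y_component` are the cosine and sine of each movement's direction, which this exercise does not state explicitly; the angles here are randomly generated, not taken from the data.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up movement directions in radians (NOT the real data).
rng = np.random.default_rng(1)
angles = rng.uniform(0, 2 * np.pi, size=500)

# Unit-vector components: every (x, y) pair has length 1.
x_component = np.cos(angles)
y_component = np.sin(angles)

fig, ax = plt.subplots()
# A low alpha lets overlapping points show up as darker regions.
ax.scatter(x_component, y_component, alpha=0.1)
ax.set_aspect("equal")  # same scale on both axes, so the circle is not stretched
ax.set_xlabel("X component")
ax.set_ylabel("Y component")
```

With real data, darker arcs on the circle would point to directions that many tips share.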

While there are many more ways to plot with matplotlib (see the [gallery](https://matplotlib.org/gallery/index.html)), I would still recommend the functionality in R stats for plotting.

### Thanks for doing Exercise 5!

#### More will follow soon!