# Basic Plotting in Python

Making exploratory plots is a common task in data science, and good presentations usually feature excellent plots.

For us the most important plotting package is `matplotlib`, which is Python's attempt to copy MATLAB's plotting functionality. Also of note is the package `seaborn`, but we won't be using it nearly as much as `matplotlib`. We'll briefly touch on a `seaborn` feature that I like, but won't go beyond that.

First let's check that you have both packages installed.

In [None]:
## It is standard to import matplotlib.pyplot as plt
import matplotlib.pyplot as plt

## It is standard to import seaborn as sns
import seaborn as sns

In [None]:
## Let's perform a version check
import matplotlib

# I had 3.3.2 when I wrote this
print("Your matplotlib version is",matplotlib.__version__)

# I had 0.11.0 when I wrote this
print("Your seaborn version is",sns.__version__)

##### Be sure you can run both of the above code chunks before continuing with this notebook. Again, it should be fine if your package version is slightly different from mine.

##### As a second note, you'll be able to run a majority of the notebook with just matplotlib. I'll put the seaborn content at the bottom of the notebook.

In [None]:
## We'll be using what we learned in the 
## previous two notebooks to help
## generate data
import numpy as np
import pandas as pd

## A First Plot 

Before getting into the nitty gritty, let's look at a first plot made with `matplotlib`.

In [None]:
## Here's our data
x = [0,1,2,3,4,5,6,7,8,9,10]
y = [2*i - 3 for i in x]

## plt.plot will make the plot
## First put what you want on the x-axis, then the y-axis
plt.plot(x,y)

## Always end your plotting block with plt.show
## in jupyter this makes sure that the plot displays 
## properly
plt.show()

##### What Happened?

So what happened when we ran the above code?

`matplotlib` creates a figure object, and on that object it places a subplot object, and finally it places the points on the subplot then connects the points with straight lines.

We'll return to the topic of subplots later in the notebook.
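
To make those objects visible, the same plot can be written with the figure and subplot created by hand. This is only a sketch of roughly what `plt.plot` sets up for us, not a line-by-line account of `matplotlib`'s internals. The names `fig`, `ax`, `x_demo`, and `y_demo` are just illustrative.

```python
import matplotlib.pyplot as plt

## illustrative data, the same line as in the first plot
x_demo = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y_demo = [2*i - 3 for i in x_demo]

## make the figure object explicitly
fig = plt.figure()

## place a single subplot on the figure
## (1 row, 1 column, first position)
ax = fig.add_subplot(1, 1, 1)

## plot on that subplot, this returns a list
## holding the line object that was drawn
lines = ax.plot(x_demo, y_demo)

plt.show()
```

Holding on to `fig` and `ax` like this is what the subplot code later in the notebook does as well.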

Now you try plotting the following `x` and `y`.

In [None]:
## Run this code first
## np.linspace(-5,5,100) makes an array of
## 100 evenly spaced points from -5 to 5
## multiplying by 10 stretches it to run from -50 to 50
x = 10*np.linspace(-5,5,100)
y = x**2 - 3

In [None]:
## Your code
## Plot y against x





## Getting More Control of your Figures

The simple code above is enough to generate figures, but the best presentations typically have polished graphics that clearly show the result. So let's learn how to control our figures a little bit more.

This process typically involves explicitly defining a figure and subplot object. Let's see.

In [None]:
## plt.figure() will make the figure object
## figsize can control how large it is (width,height)
## here we make a figure 10 inches wide and 12 inches tall
plt.figure(figsize = (10,12))

## This still creates the subplot object
## that we plot on
plt.plot(x,y)

## we can add axis labels
## and control their fontsize
## A good rule of thumb is the bigger the better
## You want your plots to be readable
## As a note: matplotlib can use LaTeX commands
## so if you place math text in dollar signs it will
## be in a LaTeX environment
plt.xlabel("$x$", fontsize = 16)
plt.ylabel("$y$", fontsize = 16)

## we can set the plot axis limits like so
## This makes the x axis bounded between -20 and 20
plt.xlim((-20,20))

## this makes the y axis bounded between -100 and 100
plt.ylim(-100,100)

## Also a title
## again make it large font
plt.title("A Plot Title", fontsize = 20)

## Now we show the plot
plt.show()

#### Controlling How the Plotted Data Looks

We can control the appearance of what is plotted. Here's a quick cheatsheet of easy-to-use options:



| Color           | Description  |
| :-------------: |:------------:|
| r               | red          |
| b               | blue         |
| k               | black        |
| g               | green        |
| y               | yellow       |
| m               | magenta      |
| c               | cyan         |
| w               | white        |

|Line Style | Description   |
|:---------:|:-------------:|
| -         | Solid line    |
| --        | Dashed line   |
| :         | Dotted line   |
| -.        | Dash-dot line |

| Marker | Description    |
|:------:|:--------------:|
| o      | Circle         |
| +      | Plus sign      |
| *      | Star           |
| .      | Point          |
| x      | Cross          |
| s      | Square         |
| d      | Thin diamond   |
| ^      | Up triangle    |
| <      | Left triangle  |
| >      | Right triangle |
| p      | Pentagon       |
| h      | Hexagon        |
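
As a small sketch of how these codes combine, one character from each table can be packed into a single format string, or the same settings can be spelled out with the `color`, `linestyle`, and `marker` keyword arguments. The data and the names `line_short` and `line_long` below are made up for illustration.

```python
import matplotlib.pyplot as plt

## illustrative data
x_demo = [0, 1, 2, 3]
y_demo = [0, 1, 4, 9]
y_shifted = [v + 1 for v in y_demo]

## format string: "r" (red) + "--" (dashed line) + "o" (circle markers)
line_short = plt.plot(x_demo, y_demo, "r--o")[0]

## the same look written out with keyword arguments
line_long = plt.plot(x_demo, y_shifted,
                     color="r",
                     linestyle="--",
                     marker="o")[0]

plt.show()
```

The format string is quicker to type, while the keyword form is easier to read back later.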

Let's try the above plot one more time, but using some of these to jazz it up.

In [None]:
## plt.figure() will make the figure object
## figsize can control how large it is (width,height)
plt.figure(figsize = (10,12))

## The third argument to plot(), 'mp' here
## tells matplotlib to make the points magenta
## and to use pentagons, the absence of a line character
## means there will be no line connecting these points
## we can also add a label, and insert a legend later
plt.plot(x,y,'mp', label="points")

## We can even plot two things on the same plot
## here the third argument tells matplotlib to make a
## green dashed line
plt.plot(x+10,y-100,'g--', label="shifted line")

## we can add axis labels
## and control their fontsize
plt.xlabel("$x$", fontsize = 16)
plt.ylabel("$y$", fontsize = 16)

## Also a title
plt.title("A Plot Title", fontsize = 20)

## plt.legend() adds the legend to the plot
## This will display the labels we had above
plt.legend(fontsize=14)


## Now we show the plot
plt.show()

In [None]:
## Run this code first
## Redefine x and y to be this data
x = 10*np.random.random(100) - 5
y = x**3 - x**2 + x

In [None]:
## Your code
## Plot y against x here
## play around with different colors and markers











## Subplots

Sometimes you'll want to plot multiple things in the same Figure. Luckily `matplotlib` has the functionality to create subplots.

In [None]:
## plt.subplots makes a figure object
## then populates it with subplots
## the first number is the number of rows
## the second number is the number of columns
## so this makes a 2 by 2 subplot matrix
## fig is the figure object
## axes is a 2 x 2 numpy array containing the four subplots
fig, axes = plt.subplots(2, 2, figsize = (10,8))

## We can plot like before but instead of plt.plot
## we use axes[i,j].plot
## A random cumulative sum on axes[0,0]
axes[0,0].plot(np.random.randn(20).cumsum(),'r--')
## note I didn't have an x, y pair here
## so what happened was, matplotlib populated
## the x-values for us, and used the input
## as the y-values.



## I can set x and y labels on subplots like so
## Notice that here I must use set_xlabel instead of 
## simply xlabel
axes[0,0].set_xlabel("$x$", fontsize=14)
axes[0,0].set_ylabel("$y$", fontsize=14)

## show the plot
plt.show()

In [None]:
## plt can also make a number of other useful graph types


fig, axes = plt.subplots(2, 2, figsize = (10,8))


axes[0,0].plot(np.random.randn(20).cumsum(),'r--')

## like scatter plots
## for these put the x, then the y
## you can then specify the "c"olor, "s"ize, and "marker"shape
## it is also good practice to let long code go onto multiple lines
## in python, you can go to a new line following a comma in a
## function call
axes[0,1].scatter(np.random.random(10),  # start a new line now
                  np.random.random(10),
                  c = "purple", # color
                  s = 50, # marker size
                  marker = "*") # marker shape

## or histograms
## this can be done with .hist
## you input the data you want a histogram of
## and you can specify the number of bins with
## bins
axes[1,0].hist(np.random.randint(0,100,100), bins = 40)


## and text
## for this you call .text()
## you input the x, y position of the text
## then the text itself, then you can specify the fontsize
axes[1,1].text(.5, .5, "Hi Mom!", fontsize=20)

plt.show()

As a note all of the plotting capabilities shown above (`hist()`, `scatter()`, and `text()`) are available outside of subplots as well. You'd just call `plt.hist()`, `plt.scatter()` or `plt.text()` instead.
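
For instance, a minimal sketch of the same three plot types called straight from `plt` could look like the following. The variable names (`points`, `note`, `counts`, `bin_edges`, `patches`) are just illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

plt.figure(figsize=(8, 6))

## a scatter plot drawn directly with plt
points = plt.scatter(np.random.random(10),
                     np.random.random(10),
                     c="purple",
                     s=50,
                     marker="*")

## text placed at the point (0.5, 0.5) on the same plot
note = plt.text(0.5, 0.5, "Hi Mom!", fontsize=20)

plt.show()

## a histogram in its own figure
## plt.hist returns the bin counts, the bin edges,
## and the drawn bars
plt.figure(figsize=(8, 6))
counts, bin_edges, patches = plt.hist(np.random.randint(0, 100, 100), bins=40)

plt.show()
```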

In [None]:
## Your code
## Make a 2 x 2 subplot
## Use numpy to generate data and plot 
## a cubic function in the 0,0 plot
## a scatter plot of two sets of 100 pulls from a random normal distribution
## in the 0,1 plot
## a histogram of 1000 pulls from the random normal distribution
## in the 1,0 plot
## and whatever text you'd like in the 1,1 plot









### Saving a Figure

We can also save a figure after we've plotted it with `plt.savefig(figure_name)`.

In [None]:
## We'll make a simple figure
## then save it
plt.figure(figsize=(8,8))

plt.plot([1,2,3,4], [1,2,3,4], 'k--')


## all you'll need is the figure name
## the default is to save the image as a png file
plt.savefig("my_first_matplotlib_plot.png")

plt.show()

If you check your repository (the folder this notebook runs in) you should now see `my_first_matplotlib_plot.png`. Open it up to admire its beauty.
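
`plt.savefig` also takes a few optional arguments that are handy for presentations. The sketch below assumes you want a higher resolution png and a pdf copy. The file names are hypothetical, so use whatever names you like.

```python
import os
import matplotlib.pyplot as plt

plt.figure(figsize=(8, 8))

plt.plot([1, 2, 3, 4], [1, 2, 3, 4], 'k--')

## dpi controls the resolution of the saved image
## bbox_inches="tight" trims extra whitespace around the plot
plt.savefig("my_high_res_plot.png", dpi=300, bbox_inches="tight")

## the file extension picks the format, here a pdf
plt.savefig("my_plot_copy.pdf")

plt.show()

## check that both files were written
print(os.path.exists("my_high_res_plot.png"), os.path.exists("my_plot_copy.pdf"))
```

Note that `plt.savefig` is called before `plt.show()`, just as in the cell above.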


That's really all we'll need to know for making plots in the boot camp. Of course we've come nowhere close to understanding the totality of `matplotlib`, so if you're interested check out the documentation, <a href="https://matplotlib.org/">https://matplotlib.org/</a>.

## `seaborn`

`seaborn` is a pretty user-friendly package that can make nice plots quickly. We won't explore it much in this notebook, but we will introduce a useful function that gives your plots gridlines for easier reading.

For those interested in seeing fun `seaborn` plots check out this link, <a href="https://seaborn.pydata.org/examples/index.html">https://seaborn.pydata.org/examples/index.html</a>.

In [None]:
## Let's recall this plot from before
x = 10*np.linspace(-5,5,100)
y = x**2 - 3

plt.figure(figsize = (10,12))


plt.plot(x,y,'mp', label="points")

plt.plot(x+10,y-100,'g--', label="shifted line")

plt.xlabel("$x$", fontsize = 16)
plt.ylabel("$y$", fontsize = 16)


plt.title("A Plot Title", fontsize = 20)


plt.legend(fontsize=14)


plt.show()

Now we can use `seaborn` to add gridlines to the figure, which will allow for easier reading of plots like the one above.

In [None]:
## Run this code
sns.set_style("whitegrid")

In [None]:
## Now rerun the plot
x = 10*np.linspace(-5,5,100)
y = x**2 - 3

plt.figure(figsize = (10,12))


plt.plot(x,y,'mp', label="points")

plt.plot(x+10,y-100,'g--', label="shifted line")

plt.xlabel("$x$", fontsize = 16)
plt.ylabel("$y$", fontsize = 16)


plt.title("A Plot Title", fontsize = 20)


plt.legend(fontsize=14)


plt.show()

See the difference?
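
If you only have `matplotlib` installed, a rough alternative is to switch gridlines on for a single plot with `plt.grid(True)`. This is a sketch of that option, not a replacement for `seaborn`'s styles: it only adds the grid and leaves the rest of the plot's look alone. The names `x_demo`, `y_demo`, and `ax` are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

## the same data as the plot above
x_demo = 10*np.linspace(-5, 5, 100)
y_demo = x_demo**2 - 3

plt.figure(figsize=(10, 12))

plt.plot(x_demo, y_demo, 'mp', label="points")

## turn gridlines on for this plot only
plt.grid(True)

plt.legend(fontsize=14)

## keep a handle on the subplot in case we want it later
ax = plt.gca()

plt.show()
```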

In [None]:
## Run this code
## see what this does to your plots
sns.set_style("darkgrid")

In [None]:
## Now rerun the plot
x = 10*np.linspace(-5,5,100)
y = x**2 - 3

plt.figure(figsize = (10,12))


plt.plot(x,y,'mp', label="points")

plt.plot(x+10,y-100,'g--', label="shifted line")

plt.xlabel("$x$", fontsize = 16)
plt.ylabel("$y$", fontsize = 16)


plt.title("A Plot Title", fontsize = 20)


plt.legend(fontsize=14)


plt.show()






## That's it!

That's all for this notebook. You now have a firm grasp of the basics of plotting figures with `matplotlib`. With a little practice you'll be a `matplotlib` pro in no time.

This notebook was written for the Erdős Institute Cőde Data Science Boot Camp by Matthew Osborne, Ph.D., 2021.

Redistribution of the material contained in this repository is conditional on acknowledgement of Matthew Tyler Osborne, Ph.D.'s original authorship and sponsorship of the Erdős Institute as subject to the license (see License.md).