## Python Week 7: Visualization with Matplotlib and Seaborn

Data visualization is at the heart of science. What is the point of doing experiments if we can't share our conclusions in attractive, easily digestible, and clear figures? Many of you are familiar with GraphPad Prism. If you only ever need to make bar charts, that software is fine. But the beauty of using an actual programming language is that we have far more control. Need to create a volcano plot, visualize the predicted travel from a naive Bayes classifier, or show representative calcium traces? All of these can be accomplished easily in matplotlib and seaborn.

In [None]:
# as always, let's import the necessary libraries
import pandas as pd
import numpy as np
import seaborn as sb
import matplotlib.pyplot as plt
%matplotlib inline

In [None]:
"""A quick note about interactive plotting in PyCharm: if you want an interactive figure that pops out of the PyCharm plot window, run the
two commands below. I've commented them out so that they don't interfere with the notebook's inline plotting, but they are here as an FYI.
The 'osx' backend is macOS-only; on other systems substitute a backend such as 'qt' or 'tk'."""
# from IPython import get_ipython
# get_ipython().run_line_magic('matplotlib', 'osx')

I want to do something a little less typical when teaching visualization libraries in Python: I want to start with seaborn. Why? Because I want to teach an important lesson about the use cases for these libraries. Seaborn completely replaces Prism for plotting statistical relationships, and it is very easy to use. The point is that seaborn is plug-and-play. You can create publication-quality figures in a single line of code.

## Categorical Plots

In [None]:
# import dataset
mp = pd.read_csv('Data_Cortex_Nuclear.csv')

In [None]:
# the humble barplot, the most useless of all categorical plots

sb.set()
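# ci=68 draws ~68% bootstrap confidence intervals (about +/- 1 SEM); seaborn >= 0.12 deprecates ci in favor of errorbar=('ci', 68)
sb.barplot(x='Genotype', y='pCFOS_N', hue='Treatment', data=mp, capsize=0.08, ci=68)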
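
To make the error bars less of a black box, here is a minimal sketch of computing the mean and standard error of the mean (SEM) by hand and drawing them with plain matplotlib. For roughly normal data, a 68% confidence interval is close to the mean plus or minus one SEM. The data frame `toy_sem` and its group labels are made up for illustration; they are not taken from `Data_Cortex_Nuclear.csv`.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical toy measurements, invented for this example
toy_sem = pd.DataFrame({
    "Group": ["WT"] * 4 + ["KO"] * 4,
    "Value": [1.0, 2.0, 3.0, 4.0, 2.0, 4.0, 6.0, 8.0],
})

# Mean and SEM (sample standard deviation / sqrt(n)) for each group
summary = toy_sem.groupby("Group")["Value"].agg(["mean", "sem"])

# Draw the bars with the SEM as symmetric error bars
fig_sem, ax_sem = plt.subplots()
ax_sem.bar(list(summary.index), summary["mean"], yerr=summary["sem"], capsize=4)
ax_sem.set_ylabel("Value (arbitrary units)")
```

If you are on seaborn 0.12 or later, passing `errorbar="se"` to `sb.barplot` should give you SEM bars directly without the manual step.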

In [None]:
# the box plot, slightly more useful

sb.boxplot(x='Genotype', y='pCFOS_N', data=mp, hue='Treatment')

In [None]:
# the violin plot, most under-utilized plot in all of science

sb.violinplot(x='Genotype', y='pCFOS_N', data=mp, hue='Treatment')

In [None]:
# the strip plot, kinda useless on its own but handy when overlaid on a violin plot

sb.violinplot(x='Genotype', y='pCFOS_N', data=mp, hue='Treatment')
sb.stripplot(x='Genotype', y='pCFOS_N', data=mp, hue='Treatment', dodge=True)

In [None]:
# swarm plot, similar to strip but with a better representation of the distribution
sb.violinplot(x='Genotype', y='pCFOS_N', data=mp, hue='Treatment')

sb.swarmplot(x='Genotype', y='pCFOS_N', data=mp, hue='Treatment', dodge=True, size=3)

In [None]:
# a point plot

# some real fake data for this, whoa, look at that bidirectional freezing
freezing_df = pd.DataFrame({"% Freezing": [20, 31, 42, 15, 23, 60, 78, 43, 74, 80, 15, 35, 33, 45, 29], 
               "Animal": ["M1", "M2", "M3", "M4", "M5", "M1", "M2", "M3", "M4", "M5", "M1", "M2", "M3", "M4", "M5"], 
               "Time Epoch": ["0-2 min.", "0-2 min.", "0-2 min.", "0-2 min.", "0-2 min.", "2-4 min.", "2-4 min.", "2-4 min.", "2-4 min.", "2-4 min.", "4-6 min.", "4-6 min.", "4-6 min.", "4-6 min.", "4-6 min."]})
sb.pointplot(x="Time Epoch", y="% Freezing", data=freezing_df, capsize=0.1)
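
The point plot collapses the five animals into a mean and an error bar per epoch. Since this is a repeated-measures design, it can help to also see each animal's own trajectory. The following is one possible sketch using pandas and plain matplotlib; it rebuilds the same made-up scores under the hypothetical name `scores` so the cell runs on its own.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Same made-up freezing scores as in the cell above
scores = pd.DataFrame({
    "% Freezing": [20, 31, 42, 15, 23, 60, 78, 43, 74, 80, 15, 35, 33, 45, 29],
    "Animal": ["M1", "M2", "M3", "M4", "M5"] * 3,
    "Time Epoch": ["0-2 min."] * 5 + ["2-4 min."] * 5 + ["4-6 min."] * 5,
})

# Reshape to one row per epoch and one column per animal
wide = scores.pivot(index="Time Epoch", columns="Animal", values="% Freezing")
epochs = list(wide.index)

fig_rm, ax_rm = plt.subplots()
for animal in wide.columns:
    # one faint line per animal
    ax_rm.plot(epochs, wide[animal].to_numpy(), color="gray", alpha=0.5)
# group mean drawn on top
ax_rm.plot(epochs, wide.mean(axis=1).to_numpy(), color="black", marker="o", linewidth=2)
ax_rm.set_xlabel("Time Epoch")
ax_rm.set_ylabel("% Freezing")
```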

## Relational Plots

In [None]:
# scatter plot 

sb.scatterplot(x="pCFOS_N", y="ARC_N", data=mp, hue='Genotype')

In [None]:
from numpy import random

# ah yes, lines
data_trace = pd.DataFrame({"Time": list(np.linspace(0,100, 200)), "Response": list(abs(random.randn(200)))})

sb.lineplot(x="Time", y="Response", data=data_trace)

In [None]:
# linear regression: lmplot fits and draws a separate regression line for each hue level

sb.lmplot(x="pCREB_N", y="pCFOS_N", data=mp, hue="Genotype")

In [None]:
# heatmap

heat = random.rand(10,10)
cmap = sb.diverging_palette(230, 20, as_cmap=True)

sb.heatmap(heat, cmap=cmap)
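
A diverging palette is most informative when the data have a meaningful midpoint. As a rough sketch, here is a correlation matrix drawn with the same palette, pinned so that zero sits at the center of the color scale. The data frame `toy_corr` and its column names are invented for illustration.

```python
import numpy as np
import pandas as pd
import seaborn as sb

rng = np.random.default_rng(0)

# Hypothetical data: 50 samples of four made-up variables
toy_corr = pd.DataFrame(rng.normal(size=(50, 4)), columns=["A", "B", "C", "D"])
toy_corr["B"] = toy_corr["A"] + 0.3 * toy_corr["B"]    # make B track A
toy_corr["D"] = -toy_corr["C"] + 0.3 * toy_corr["D"]   # make D oppose C

# Pairwise Pearson correlations: values lie in [-1, 1], with 1 on the diagonal
corr = toy_corr.corr()

cmap_div = sb.diverging_palette(230, 20, as_cmap=True)
# center=0 puts the neutral color at zero; vmin/vmax fix the scale to the full range
ax_heat = sb.heatmap(corr, cmap=cmap_div, center=0, vmin=-1, vmax=1, annot=True, fmt=".2f")
```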

## Figure Aesthetics in Seaborn

In [None]:
# we can very quickly change the overall look of a plot in seaborn with set_style
# darkgrid, whitegrid, ticks, dark, white

sb.set_style('white')
sb.barplot(x='Genotype', y='pCFOS_N', data=mp, hue='Treatment', capsize=0.08, ci=68)

In [None]:
# we can very quickly change the scale of plot elements in seaborn with set_context
# paper, notebook, talk, poster
# each context scales font sizes and line widths relative to the 'notebook' default

sb.set_context('notebook')
sb.barplot(x='Genotype', y='pCFOS_N', data=mp, hue='Treatment', capsize=0.08, ci=68)

In [None]:
# we can easily remove the spines from a figure (despine removes the top and right ones by default)

sb.set_style('white')
sb.barplot(x='Genotype', y='pCFOS_N', data=mp, hue='Treatment', capsize=0.08, ci=68)
sb.despine()

In [None]:
# seaborn has a wide variety of color palettes; color_palette lets you preview any of the presets or a custom palette
sb.color_palette('colorblind')

## Matplotlib: What is it good for?

Control. This isn't to say that seaborn lacks customizability; it has plenty. However, matplotlib has a lot of bells and whistles that let you make incredibly complex graphs. All is not lost, though: we won't be doomed to writing 500 lines of code for a single plot. Why? Because seaborn is built on matplotlib. It is like a wrapper around the insane latitude of control that matplotlib affords you, so if you need to change the legend location of a seaborn graphic, matplotlib can help.
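
To make the "wrapper" idea concrete: seaborn's axes-level functions draw onto a matplotlib Axes, so any Axes method can be applied afterwards. The sketch below assumes a tiny made-up data frame (`toy_scores`, with hypothetical columns `Group` and `Score`) rather than the course dataset.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sb

# Hypothetical toy data, invented for illustration
toy_scores = pd.DataFrame({
    "Group": ["A", "A", "A", "B", "B", "B"],
    "Score": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
})

fig_wrap, ax_wrap = plt.subplots(figsize=(4, 3))
sb.barplot(x="Group", y="Score", data=toy_scores, ax=ax_wrap)  # seaborn draws onto our Axes

# From here on it is plain matplotlib
ax_wrap.set_title("Toy scores by group")
ax_wrap.set_ylabel("Score (a.u.)")
ax_wrap.set_ylim(0, 8)
ax_wrap.axhline(3.5, linestyle="--", color="gray")  # horizontal reference line
```

For legends specifically, seaborn 0.11.2 and later also ships a helper, `sb.move_legend(ax, "upper left")`, which may be the simplest route when a plot has a `hue` legend.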

In [None]:
# here's a basic matplotlib graph

y = random.randn(10)
x = np.linspace(1, 10, 10)

plt.plot(x, y)

In [None]:
# we can give any matplotlib graph the seaborn "feel" if we so choose (or a ggplot2 feel, etc.)

sb.set()
plt.plot(x, y)


In [None]:
# matplotlib has lots of preset styles to suit your specific wants/needs; here is a mock ggplot2 style ported from R
plt.style.use('ggplot')
fig = plt.figure()
plt.plot(x,y)
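
Note that `plt.style.use` changes the style for every figure that follows. If you only want a style for one figure, one option is the `plt.style.context` context manager, sketched below with the built-in `dark_background` style. The names `t_style` and `signal_style` are placeholders for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

t_style = np.linspace(1, 10, 10)
signal_style = np.sin(t_style)

# Remember the current setting so we can confirm it is restored
facecolor_before = plt.rcParams["axes.facecolor"]

# The style applies only to figures created inside the with-block
with plt.style.context("dark_background"):
    fig_style, ax_style = plt.subplots()
    ax_style.plot(t_style, signal_style)

# Outside the block the previous settings are back in effect
facecolor_after = plt.rcParams["axes.facecolor"]
```

The full list of style names installed on your machine is available in `plt.style.available`.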

In [None]:
# matplotlib uses a Figure object when we need to do many operations on a figure, or when we want to make several figures at once
# that we can reference later. Every plotting call draws on the current figure until a new one is created
fig = plt.figure()
plt.plot(x,y)
fig2 = plt.figure()
plt.plot(y,x)
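
Relying on the "current figure" gets confusing once several figures are open. A common alternative is to keep a handle on each Axes and call its methods directly. A minimal sketch, with made-up variable names:

```python
import matplotlib.pyplot as plt
import numpy as np

t_oo = np.linspace(0, 1, 50)

fig_a, ax_a = plt.subplots()
ax_a.plot(t_oo, t_oo ** 2)

fig_b, ax_b = plt.subplots()    # fig_b is now the "current" figure
ax_b.plot(t_oo, np.sqrt(t_oo))

# Go back to the first figure by name, regardless of which one is current
ax_a.set_title("Quadratic")
ax_a.set_xlabel("t")
```

Here `plt.title(...)` would have labeled the second figure, whereas `ax_a.set_title(...)` targets the first one explicitly.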

In [None]:
# with a figure instance we can add subplots, specify the size of the plot, etc. Please check out the documentation for a full list of features;
# it is incredibly extensive

# here's an example of a figure rendered at 600 dpi (dots per inch) and sized 5x7 inches (width x height)
fig = plt.figure(dpi=600, figsize=(5,7))
plt.plot([1,2,3,4],[5,6,7,8])
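
As a quick sanity check on what `figsize` and `dpi` mean together: `figsize` is in inches, so the pixel dimensions of the rendered image are inches multiplied by dots per inch. A small sketch (the name `fig_hi` is arbitrary):

```python
import matplotlib.pyplot as plt

fig_hi = plt.figure(figsize=(5, 7), dpi=600)  # (width, height) in inches
plt.plot([1, 2, 3, 4], [5, 6, 7, 8])

# Pixel dimensions are inches multiplied by dots per inch
width_px, height_px = fig_hi.get_size_inches() * fig_hi.dpi
print(f"{width_px:.0f} x {height_px:.0f} pixels")
```

When writing to disk, `fig_hi.savefig("figure.png", dpi=600)` accepts its own `dpi` argument; the filename here is only a placeholder.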


In [None]:
# we can also create subplots with matplotlib
fig, ax = plt.subplots(2,2)

ax[0,0].plot(x,y)
ax[0,1].plot(y,x)
ax[1,0].plot(x,x)
ax[1,1].plot(y,y)
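
For larger grids, indexing every panel by hand gets tedious. One possible pattern is to loop over the array of Axes, give each panel a title, and share the axis scales so the panels are directly comparable. The curves below are placeholders chosen only to have something to draw.

```python
import matplotlib.pyplot as plt
import numpy as np

t_grid = np.linspace(0, 2 * np.pi, 100)
curves = {
    "sin": np.sin(t_grid),
    "cos": np.cos(t_grid),
    "sin squared": np.sin(t_grid) ** 2,
    "cos squared": np.cos(t_grid) ** 2,
}

# sharex/sharey give every panel the same axis limits
fig_grid, axes = plt.subplots(2, 2, sharex=True, sharey=True, figsize=(6, 4))

# axes is a 2x2 NumPy array; .flat walks through it row by row
for panel, (name, values) in zip(axes.flat, curves.items()):
    panel.plot(t_grid, values)
    panel.set_title(name)

fig_grid.tight_layout()  # keep titles and tick labels from overlapping
```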

