### [Matplotlib](https://matplotlib.org/)
- The most widely-used Python plotting library
- Initially modeled on MATLAB's plotting system
- Designed to provide complete control over a plot

In [None]:
# this line enables plots in the notebook
%matplotlib inline
import matplotlib.pyplot as plt

In [None]:
# using arange and linspace to create arrays
import numpy as np
x = np.arange(10)
y = np.linspace(0, 5, 10)

In [None]:
# plot y vs x array
plt.plot(x, y, marker=".")


In [None]:
# to change the color, markers, or line style, pass additional arguments to plt.plot, e.g. color, marker, linestyle (see the documentation)
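
One possible way to approach this exercise, as a minimal sketch: `color`, `marker`, and `linestyle` are keyword arguments of `plt.plot`. The specific values chosen here (red, circles, dashed) are only illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

# same arrays as in the cells above
x = np.arange(10)
y = np.linspace(0, 5, 10)

plt.figure()
# plt.plot returns a list of Line2D objects; unpack the single line
(line,) = plt.plot(x, y, color="red", marker="o", linestyle="--")
```

Keeping a reference to the returned line makes it possible to inspect or change its properties later, e.g. with `line.set_color("blue")`.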

In [None]:
# set plt.ylim to change the y axis range
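
A short sketch of how the y axis range could be set, assuming the same `x` and `y` arrays as above. The limits `-1` and `6` are arbitrary values picked for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(10)
y = np.linspace(0, 5, 10)

plt.figure()
plt.plot(x, y, marker=".")
# set the lower and upper limit of the y axis
plt.ylim(-1, 6)
# calling plt.ylim() without arguments returns the current limits
y_limits = plt.ylim()
```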

In [None]:
# explore gallery and run https://matplotlib.org/examples/subplots_axes_and_figures/subplot_demo.html

### [Pandas](https://pandas.pydata.org/)
- a popular Python package for data science
- fast and efficient support for DataFrame objects
- loads data from different file formats
- allows grouping data for aggregation and transformation

In [None]:
import pandas as pd

In [None]:
# reading the table from a file
df = pd.read_table('https://s3.amazonaws.com/fcp-indi/data/Projects/ABIDE2/RawData/ABIDEII-KKI_1/participants.tsv', na_values="n/a")
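
To see what `na_values` does without downloading anything, here is a minimal sketch that reads a tiny tab-separated table from a string. The table contents and the name `toy_df` are made up for illustration and are not taken from the ABIDE data.

```python
import io
import pandas as pd

# a tiny, made-up TSV table; "n/a" marks a missing entry
tsv_text = (
    "participant_id\tviq\tpiq\n"
    "sub-01\t110\tn/a\n"
    "sub-02\tn/a\t95\n"
)

# entries equal to "n/a" are read as NaN
toy_df = pd.read_table(io.StringIO(tsv_text), na_values="n/a")
# count the missing values in each column
n_missing = toy_df.isna().sum()
```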

The data is described here: http://fcon_1000.projects.nitrc.org/indi/abide/ABIDE_LEGEND_V1.02.pdf

In [None]:
# explore the data frame: use head, and keys
df.head()

In [None]:
df.keys()

In [None]:
# remove columns that have NaNs: use dropna 
sub_df = df.dropna(axis=1)
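
The effect of the `axis` argument can be checked on a small example. This is a sketch with a made-up DataFrame (`toy_df` and its values are hypothetical): `axis=1` drops every column that contains a NaN, while the default `axis=0` drops rows instead.

```python
import numpy as np
import pandas as pd

# made-up values; only the "piq" column contains a NaN
toy_df = pd.DataFrame({
    "viq": [110.0, 95.0, 102.0],
    "piq": [100.0, np.nan, 98.0],
    "dx_group": [1, 2, 1],
})

# drop columns with at least one NaN (keeps all rows)
without_nan_columns = toy_df.dropna(axis=1)
# drop rows with at least one NaN (keeps all columns)
without_nan_rows = toy_df.dropna()
```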

In [None]:
sub_df.head()

In [None]:
sub_df.keys()

In [None]:
# 1. use plt.scatter to plot 'age_at_scan ' vs "viq"
# 2. use figure to set the figsize
# 3. add xlabel, ylabel and the title


In [None]:
# 1. use plt.scatter to plot 'age_at_scan ' vs "viq"


In [None]:
# 2. use figure to set the figsize
# plt.figure(figsize=(10, 5))
# plt.scatter(sub_df["age_at_scan "], sub_df["viq"])

# 3. add xlabel, ylabel and the title
# plt.xlabel("age")
# plt.ylabel("viq")
# plt.title("VIQ vs age")

In [None]:
# use `groupby` to group the data by diagnostic group ("dx_group") and calculate the mean of each group
# note: select several columns with a list (double brackets)
sub_df.groupby("dx_group")[["viq", "piq"]].mean()
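
To make the grouping step easier to verify by hand, here is a minimal sketch on a made-up DataFrame (`toy_df` and its values are hypothetical). Each group's mean is computed separately for every selected column.

```python
import pandas as pd

# made-up values: two rows per diagnostic group
toy_df = pd.DataFrame({
    "dx_group": [1, 1, 2, 2],
    "viq": [100, 110, 120, 130],
    "piq": [90, 100, 110, 120],
})

# the result has one row per group and one column per selected variable
group_means = toy_df.groupby("dx_group")[["viq", "piq"]].mean()
```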

In [None]:
# plot the scatter plot again, this time separate the groups

plt.figure(figsize=(10, 5))
# one scatter plot per diagnostic group, each in its own color
colors = ['blue', 'green']
for i, (s, grp) in enumerate(sub_df.groupby('dx_group')):
    plt.scatter(grp['age_at_scan '], grp['viq'], c=colors[i])
plt.legend(["autism", "control"])
plt.xlabel("age")
plt.ylabel("viq")
plt.title("VIQ vs age")
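
An alternative sketch for the legend: passing `label=` to each `scatter` call lets the legend take its entries from the group keys, so the labels always follow the order in which the groups were plotted. The DataFrame `toy_df` and the label text are made up for illustration.

```python
import pandas as pd
import matplotlib.pyplot as plt

# made-up values with the same column names as sub_df
toy_df = pd.DataFrame({
    "age_at_scan ": [8.5, 9.1, 10.2, 11.0],
    "viq": [100, 110, 120, 130],
    "dx_group": [1, 2, 1, 2],
})

fig, ax = plt.subplots(figsize=(10, 5))
for group_id, grp in toy_df.groupby("dx_group"):
    # each call gets its own default color and its own legend entry
    ax.scatter(grp["age_at_scan "], grp["viq"], label=f"dx_group = {group_id}")
ax.legend()
ax.set_xlabel("age")
ax.set_ylabel("viq")

legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
```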

In [None]:
# use plt.subplots to plot "viq" and "piq" next to each other
# use examples from matplotlib galleries, 
# e.g. https://matplotlib.org/3.1.1/gallery/subplots_axes_and_figures/figure_title.html#sphx-glr-gallery-subplots-axes-and-figures-figure-title-py

In [None]:
# Set up a figure with 2 columns
fig, axes = plt.subplots(1, 2, figsize=(15, 4))

# setting everything for the first plot
axes[0].scatter(sub_df["age_at_scan "], sub_df["viq"])
axes[0].set_xlabel("age")
axes[0].set_ylabel("viq")
axes[0].set_title("VIQ vs age")

# setting everything for the second plot
axes[1].scatter(sub_df["age_at_scan "], sub_df["piq"])
axes[1].set_xlabel("age")
axes[1].set_ylabel("piq")
axes[1].set_title("PIQ vs age")
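
Since both panels are set up the same way, the repeated lines could also be written as a loop over the axes. This is a sketch on a made-up DataFrame (`toy_df` is hypothetical), not a replacement for the explicit version above.

```python
import pandas as pd
import matplotlib.pyplot as plt

# made-up values with the same column names as sub_df
toy_df = pd.DataFrame({
    "age_at_scan ": [8.5, 9.1, 10.2, 11.0],
    "viq": [100, 110, 120, 130],
    "piq": [90, 105, 110, 125],
})

fig, axes = plt.subplots(1, 2, figsize=(15, 4))
# pair each Axes object with the column it should display
for ax, column in zip(axes, ["viq", "piq"]):
    ax.scatter(toy_df["age_at_scan "], toy_df[column])
    ax.set_xlabel("age")
    ax.set_ylabel(column)
    ax.set_title(f"{column.upper()} vs age")

titles = [ax.get_title() for ax in axes]
```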


#### Matplotlib Pros
- provides low-level control over virtually every element of a plot
- close integration with numpy
- large and active community
- big range of functionality (figure compositing, layering, annotation, coordinate transformations, color mapping, etc.)

#### Matplotlib Cons
- steep learning curve
- API is extremely unpredictable -- redundancy and inconsistency are common
- some simple things are hard; some complex things are easy
- default styles are kind of ugly
- documentation is often hard to use (check the gallery!)

#### Pandas as an alternative
- DataFrame integration
- often the easiest approach for simple data exploration

In [None]:
# using pandas for a simple scatter plot
sub_df.plot('age_at_scan ', "piq", kind="scatter")
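
The pandas `plot` method returns a Matplotlib Axes object, so the usual Matplotlib methods can still be used to adjust the result. A minimal sketch on a made-up DataFrame (`toy_df` is hypothetical):

```python
import pandas as pd
import matplotlib.pyplot as plt

# made-up values with the same column names as sub_df
toy_df = pd.DataFrame({
    "age_at_scan ": [8.5, 9.1, 10.2, 11.0],
    "piq": [90, 105, 110, 125],
})

# plot returns the Axes the scatter plot was drawn on
ax = toy_df.plot(x="age_at_scan ", y="piq", kind="scatter", figsize=(10, 5))
# customize it with regular Matplotlib calls
ax.set_xlabel("age")
ax.set_title("PIQ vs age")
```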

### [Seaborn](https://seaborn.pydata.org/index.html) as an alternative
- data visualization library based on matplotlib
- provides a high-level interface for drawing attractive and informative statistical graphics
- generates beautiful plots in very little code (default style is much nicer than in Matplotlib)
- Very good documentation! :-)


In [None]:
import seaborn as sns

In [None]:
# reset the default parameters
sns.set()
# http://seaborn.pydata.org/tutorial/aesthetics.html#scaling-plot-elements
sns.set_context("paper")
# http://seaborn.pydata.org/tutorial/aesthetics.html#seaborn-figure-styles
sns.set_style("dark")

In [None]:
# this is just a copy of the previous plot, but uses seaborn styles
plt.figure(figsize=(10, 5))
plt.scatter(sub_df['age_at_scan '], sub_df.viq)
plt.xlabel('Age at scan')
plt.ylabel('Verbal IQ')
plt.title('Comparing Age and Verbal IQ')

In [None]:
# use seaborn jointplot
sns.jointplot(x='age_at_scan ', y='viq', data=sub_df)

In [None]:
# take an example from the seaborn gallery (e.g. https://seaborn.pydata.org/examples/multiple_regression.html)
# and adapt it to use sub_df

In [None]:
sns.set()
sns.set_context("paper")
sns.set_style("dark")

# Plot VIQ as a function of age at scan, with a separate regression line per diagnostic group
g = sns.lmplot(x="age_at_scan ", y="viq", hue="dx_group",
               truncate=True, height=5, data=sub_df)

# Use more informative axis labels than are provided by default
g.set_axis_labels("Age [year]", "VIQ")


#### Examples taken from various great resources, mostly from:
- scipy-lectures: https://scipy-lectures.org/intro/numpy/array_object.html
- notebooks on neuro data science visualization: 
    - https://github.com/neurohackweek/visualization-in-python
    - https://github.com/neuro-data-science/neuroviz