# Plotting data with Python - `matplotlib`

* Python has [LOTS](https://pyviz.org/overviews/index.html) of data visualization (plotting) libraries
* Lots of them are built on top of `matplotlib`
* `matplotlib` tries to make easy things easy and hard things possible

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from astropy.table import QTable

### Simple Plotting

$$\large
[\ 0 < x < 3\pi\ ] \hspace{1cm}
y = e^{-x/3} \cos(\pi x)
$$

In [None]:
my_x = np.linspace(0, 3*np.pi, 250)
my_y = np.cos(np.pi*my_x) * np.exp(-my_x/3)

In [None]:
my_x[::20]

In [None]:
my_y[::20]

In [None]:
plt.plot(my_x, my_y)
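
A quick sanity check on the function plotted above: because $|\cos(\pi x)| \le 1$, the curve should never leave the band $\pm e^{-x/3}$. The sketch below rebuilds the same arrays with `numpy` only; the names `x`, `envelope`, and `inside_band` are illustrative and not part of the notebook.

```python
import numpy as np

# Same sampling as the notebook: 250 points from 0 to 3*pi (both endpoints included)
x = np.linspace(0, 3 * np.pi, 250)

envelope = np.exp(-x / 3)             # Decaying amplitude
y = envelope * np.cos(np.pi * x)      # Damped oscillation

# |cos| <= 1, so every sample must lie between -envelope and +envelope
inside_band = bool(np.all(np.abs(y) <= envelope))

print(inside_band)
print(x[0], x[-1], len(x))
```

Plotting `envelope` and `-envelope` on top of the curve is one way to make the decay visible.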

### Simple plotting - with *style*

* The default style of `matplotlib` is a bit lacking in style.
* `matplotlib` ships with a number of built-in styles that you can use in place of the default.
* Changing the style will affect all of the remaining plots in the notebook.

Examples of the various styles can be found [here](http://matplotlib.org/examples/style_sheets/style_sheets_reference.html)

In [None]:
plt.style.available

In [None]:
plt.style.use('ggplot')

In [None]:
plt.plot(my_x, my_y);

Adding the `;` at the end of the last line suppresses the `Out[]` line.
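
If you only want a style for a single plot, one option is `plt.style.context()`, which applies the style inside a `with` block and restores the previous settings afterwards. A minimal sketch, assuming the `'ggplot'` style is available (it appears in `plt.style.available`); the `face_*` variables are only there to show the setting being restored.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 3 * np.pi, 250)
y = np.cos(np.pi * x) * np.exp(-x / 3)

face_before = plt.rcParams['axes.facecolor']     # Setting before the block

with plt.style.context('ggplot'):                # Style applies only inside the block
    fig, ax = plt.subplots()
    ax.plot(x, y)
    face_inside = plt.rcParams['axes.facecolor']

face_after = plt.rcParams['axes.facecolor']      # Restored once the block ends

print(face_before, face_inside, face_after)
```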

---

# Better plotting

* Simple plots made with `plt.plot()` are great for a quick look at data, but they do not provide much control over the plot.
* The `fig,ax` interface lets you control everything!

In [None]:
fig,ax = plt.subplots(1,1)                    # One window
fig.set_size_inches(8,5)                      # (width,height)
fig.tight_layout()                            # Make better use of space on plot

ax.set_xlim(0.0, 6.0)
ax.set_ylim(-1.5, 1.5)

ax.set_xlabel('Time (s)')
ax.set_ylabel('Voltage (mV)')
ax.set_title('Circuit Output')

ax.plot(my_x, my_y);
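
`tight_layout()` only adjusts the spacing at the moment it is called, so one option is to call it after the labels and title exist. A minimal sketch under that assumption, using the same damped cosine (the names `x` and `y` are illustrative):

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 3 * np.pi, 250)
y = np.cos(np.pi * x) * np.exp(-x / 3)

fig, ax = plt.subplots(1, 1)                  # One window
fig.set_size_inches(8, 5)                     # (width, height)

ax.set_xlabel('Time (s)', fontsize=18)
ax.set_ylabel('Voltage (mV)', fontsize=18)
ax.set_title('Circuit Output', fontsize=24)
ax.plot(x, y)

fig.tight_layout()                            # Called last, so it accounts for the labels
```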

---

#### Colors, Markers, and Linestyles

##### [Complete Marker List](https://matplotlib.org/api/markers_api.html)

---

### In addition, you can specify colors in many different ways:

- Grayscale intensities: `color = '0.8'`
- RGB triplets: `color = (0.3, 0.1, 0.9)`
- RGBA tuples (RGB plus transparency): `color = (0.3, 0.1, 0.9, 0.4)`
- Hex strings: `color = '#7fff00'`
- [HTML color names](https://en.wikipedia.org/wiki/Web_colors): `color = 'Chartreuse'`
- A name from the [xkcd color survey](https://xkcd.com/color/rgb/): `color = 'xkcd:poison green'`
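
One way to check what a color specification resolves to is `matplotlib.colors.to_rgba()`, which converts any valid spec to an `(r, g, b, a)` tuple. The sketch below runs one example of each kind from the list above; the dictionary keys are just labels chosen for this example.

```python
from matplotlib import colors as mcolors

specs = {
    'grayscale': '0.8',
    'rgb':       (0.3, 0.1, 0.9),
    'rgba':      (0.3, 0.1, 0.9, 0.4),
    'hex':       '#7fff00',
    'html name': 'Chartreuse',
    'xkcd name': 'xkcd:poison green',
}

# Convert every spec to an (r, g, b, a) tuple with values between 0 and 1
resolved = {kind: mcolors.to_rgba(spec) for kind, spec in specs.items()}

for kind, rgba in resolved.items():
    print(f"{kind:10s} -> {rgba}")
```

The hex string and the HTML name in this example describe the same color, so they resolve to the same tuple.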

---

### Font stuff (not all fonts/sizes have all properties)

* `fontfamily` {FONTNAME, 'serif', 'sans-serif', 'monospace'}
* `fontsize` {size in points, 'xx-small', 'x-small', 'small', 'medium', 'large', 'x-large', 'xx-large'}
* `fontstyle` {'normal', 'italic', 'oblique'}
* `fontweight` {a numeric value in range 0-1000, 'ultralight', 'light', 'normal', 'regular', 'book', 'medium', 'roman', 'semibold', 'demibold', 'demi', 'bold', 'heavy', 'extra bold', 'black'}

---

In [None]:
fig,ax = plt.subplots(1,1)                    # One window
fig.set_size_inches(8,5)                      # (width,height)
fig.tight_layout()                            # Make better use of space on plot

ax.set_xlim(0.0, 6.0)
ax.set_ylim(-1.5, 1.5)

ax.set_xlabel('Time (s)',
              fontfamily = 'monospace',
              fontsize = 18)

ax.set_ylabel('Voltage (mV)',
              fontfamily = 'serif',
              fontstyle = 'italic',
              fontsize = 18)

ax.set_title('Circuit Output',
             fontsize = 24, 
             fontweight = 'bold')

ax.plot(my_x, my_y,
        color = 'MidnightBlue',
        marker = 'None',
        linestyle = '--');

---

## Histograms

In [None]:
grade_table = pd.read_csv('./Data/Grades.csv')

In [None]:
grade_table.head(3)

In [None]:
fig,ax = plt.subplots(1,1)                    # One window
fig.set_size_inches(8,5)                      # (width,height)
fig.tight_layout()                            # Make better use of space on plot

ax.hist(grade_table['Exam1'],
        bins = 30,
        facecolor = 'MediumOrchid',
        label = "Exam 1")

ax.hist(grade_table['Exam2'],
        bins = 30,
        histtype = 'step',
        color = 'MidnightBlue',
        linewidth = 4,
        label = "Exam 2")

ax.legend(loc=0, shadow=True);

### Side Topic - Histogram Bins

* Plotting a histogram of two datasets with a different number of elements using the same number of bins can lead to a misleading plot
* You can fix this by defining the bin edges yourself
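
Why this happens: with `bins = 30`, each dataset is split into 30 equal-width bins over *its own* range, so a subset with a narrower range gets narrower bins. A minimal `numpy`-only sketch, assuming made-up uniform scores rather than the real grade table:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
all_scores = rng.uniform(20, 94, size=300)        # Hypothetical scores, not the class data
subset = all_scores[(all_scores > 40) & (all_scores < 80)]

# bins=30 -> 30 equal-width bins spanning each dataset's own min..max
edges_all = np.histogram_bin_edges(all_scores, bins=30)
edges_subset = np.histogram_bin_edges(subset, bins=30)

width_all = edges_all[1] - edges_all[0]
width_subset = edges_subset[1] - edges_subset[0]
print(width_all, width_subset)                    # The subset's bins are narrower

# Shared, explicit bin edges make the two histograms directly comparable
shared_edges = np.arange(18, 95, 2)
counts_all, _ = np.histogram(all_scores, bins=shared_edges)
counts_subset, _ = np.histogram(subset, bins=shared_edges)
```

With shared edges, each bin of the subset can never be taller than the same bin of the full dataset.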

In [None]:
# Create a sub-set of the data

some_grades = (
    grade_table
    .query("Exam2 > 40")
    .query("Exam2 < 80")
)

some_grades.head(2)

In [None]:
grade_table['Quarter'].count()

In [None]:
some_grades['Quarter'].count()

In [None]:
fig,ax = plt.subplots(1,1)                    # One window
fig.set_size_inches(8,5)                      # (width,height)
fig.tight_layout()                            # Make better use of space on plot

ax.hist(grade_table['Exam2'],
        bins = 30,
        facecolor = 'MediumOrchid',
        label = "All Exams")

ax.hist(some_grades['Exam2'],
        bins = 30,
        histtype = 'step',
        color = 'MidnightBlue',
        linewidth = 4,
        label = "Score [40 - 80]")

ax.legend(loc=0, shadow=True);

In [None]:
my_bins = np.arange(18,95,2)
my_bins

In [None]:
fig,ax = plt.subplots(1,1)                    # One window
fig.set_size_inches(8,5)                      # (width,height)
fig.tight_layout()                            # Make better use of space on plot

ax.hist(grade_table['Exam2'],
        bins = my_bins,
        facecolor = 'MediumOrchid',
        label = "All Exams")

ax.hist(some_grades['Exam2'],
        bins = my_bins,
        histtype = 'step',
        color = 'MidnightBlue',
        linewidth = 4,
        label = "Score [40 - 80]")

ax.legend(loc=0, shadow=True);

---

## Adding text and lines

* `.vlines(x, ymin, ymax)`
* `.hlines(y, xmin, xmax)`
* `.text(X, Y, 'text')`
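
A small sketch of `.hlines()` and `.text()` together, assuming made-up normally distributed scores (not the class data). It draws a horizontal line at half the height of the tallest bin and labels it; `half_max` is a name invented for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(seed=1)
scores = rng.normal(60, 12, size=200)             # Hypothetical scores

fig, ax = plt.subplots(1, 1)
fig.set_size_inches(8, 5)

# ax.hist returns the bin counts, the bin edges, and the drawn patches
counts, edges, _ = ax.hist(scores,
                           bins = np.arange(18, 105, 2),
                           facecolor = 'MediumOrchid')

half_max = counts.max() / 2

ax.hlines(half_max, edges[0], edges[-1],          # (y, xmin, xmax)
          color = 'Navy',
          linewidth = 3,
          linestyle = '--')

ax.text(edges[0] + 1, half_max + 0.5,             # (X, Y, 'text')
        'Half of tallest bin',
        color = 'Navy',
        fontsize = 14)
```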

In [None]:
my_average = grade_table['Exam2'].mean()
my_std = grade_table['Exam2'].std()

In [None]:
my_average, my_std

In [None]:
fig,ax = plt.subplots(1,1)                    # One window
fig.set_size_inches(8,5)                      # (width,height)
fig.tight_layout()                            # Make better use of space on plot

ax.hist(grade_table['Exam2'],
        bins = my_bins,
        facecolor = 'MediumOrchid')

ax.vlines(my_average, 0, 28,
          color = 'LawnGreen',
          linewidth = 5,
          linestyle = '-')

ax.vlines(my_average + (1.5 * my_std), 0, 14,
          color = 'Navy',
          linewidth = 3,
          linestyle = '--')

ax.text(75, 8,
        '1.5-Sigma',
        color = 'HotPink',
        fontsize = 24);

----

## Subplots
- `plt.subplots(rows, columns)`
- Access each subplot like a matrix: `ax[row, column]`
- Labels have to be added to each subplot separately
- For example: `plt.subplots(2,2)` makes four panels with the coordinates `[0,0]`, `[0,1]` (top row) and `[1,0]`, `[1,1]` (bottom row):
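
A minimal sketch that labels each panel with its own index, which can help when you forget which coordinate is which. The titles are generated only for this illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots(2, 2)                # 2 rows 2 columns
fig.set_size_inches(8, 6)                   # width, height

print(ax.shape)                             # ax is a 2-D array of Axes objects

# First index = row (top to bottom), second index = column (left to right)
for row in range(2):
    for col in range(2):
        ax[row, col].set_title(f'ax[{row},{col}]')

fig.tight_layout()
```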

In [None]:
fig, ax = plt.subplots(2,2)                # 2 rows 2 columns
fig.set_size_inches(11,8.5)                # width, height

# w_pad (width), h_pad (height) Extra space around subplots

fig.tight_layout(w_pad=3.5, h_pad=3.5)

# Plot at [0,0]

ax[0,0].plot(my_x, my_y,
             color = 'b',
             marker = 'None',
             linestyle = '--')

ax[0,0].set_xlabel('Time (s)')
ax[0,0].set_ylabel('Voltage (mV)')
ax[0,0].set_title('Circuit Output')

# Plot at [0,1]

ax[0,1].hist(grade_table['Exam2'],
             bins = 30,
             facecolor = 'MediumOrchid')

ax[0,1].set_xlabel('Exam 2 Score')
ax[0,1].set_ylabel('Number of Students')
ax[0,1].set_title('Exam 2 Grades')

# Plot at [1,0]

ax[1,0].hist(grade_table['Exam2'],
             bins = 30,
             linewidth = 5,
             histtype = 'step')

ax[1,0].vlines(my_average, 0, 30.5,
               color = 'LawnGreen',
               linewidth = 5,
               linestyle = '-')

ax[1,0].set_title('Exam 2 Grades')
ax[1,0].set_xlabel('Exam 2 Score')
ax[1,0].set_ylabel('Number of Students')

# Plot at [1,1] - x-axis set to log

ax[1,1].set_xscale('log')

ax[1,1].plot(my_x, my_y,
             color = 'r',
             marker = 'None',
             linestyle = '--')

ax[1,1].set_xlabel('Time (s)')
ax[1,1].set_ylabel('Voltage (mV)')
ax[1,1].set_title('Circuit Output (log)');

### Subplots with one row (or one column) do not need a first index (`[0,0] -> [0]`)

In [None]:
fig, ax = plt.subplots(1,2)              # 1 row 2 columns
fig.set_size_inches(11,4)                # width, height

# A little extra width padding

fig.tight_layout(w_pad = 15.0) # <- Made this large to show effect


# Plot at [0]

ax[0].plot(my_x, my_y,
           color = 'b',
           marker = 'None',
           linestyle = '--')

ax[0].set_xlabel('Time (s)')
ax[0].set_ylabel('Voltage (mV)')
ax[0].set_title('Circuit Output')

# Plot at [1]

ax[1].hist(grade_table['Exam2'],
           bins = 30,
           facecolor = 'MediumOrchid')

ax[1].set_title('Exam 2 Grades')
ax[1].set_xlabel('Exam 2 Score')
ax[1].set_ylabel('Number of Students');
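
The single index works because `plt.subplots()` drops length-one dimensions by default. If you would rather always use `[row, column]` indexing, one option is the `squeeze=False` argument. A short sketch comparing the two (the names `ax_1d` and `ax_2d` are illustrative):

```python
import matplotlib.pyplot as plt

# Default: a single row gives a 1-D array, indexed as ax_1d[0], ax_1d[1]
fig_a, ax_1d = plt.subplots(1, 2)
print(ax_1d.shape)

# squeeze=False: always a 2-D array, indexed as ax_2d[0,0], ax_2d[0,1]
fig_b, ax_2d = plt.subplots(1, 2, squeeze=False)
print(ax_2d.shape)

ax_1d[1].set_title('Second panel')
ax_2d[0, 1].set_title('Second panel')
```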

---
## An Astronomical Example - Color Magnitude Diagram

<img style="float: right;" src="./images/M15_Image.jpg" width="200"/>

M15 is a globular cluster in the constellation Pegasus. 

It is one of the most densely packed globulars known in the Milky Way galaxy, and at an estimated 12.5 billion years old, it is one of the oldest known globular clusters.

#### Read in table using `QTable`

In [None]:
star_table = QTable.read('./Data/M15_Bright.csv', format='ascii.csv')

star_table[0:2]

#### Create a `B-V` column

In [None]:
star_table['BV'] = star_table['Bmag'] - star_table['Vmag']

In [None]:
star_table[0:2]

#### Create a subset of the data where `V < 16.25` and `B-V < 0.55`

In [None]:
red_clump = star_table[
    (star_table['Vmag'] < 16.25) & 
    (star_table['BV'] < 0.55)
]

In [None]:
red_clump[0:2]
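
The subset above is selected with a boolean mask: each comparison gives an array of `True`/`False`, and `&` combines them element by element. The parentheses matter because `&` binds more tightly than `<`. A sketch of the same pattern with plain `numpy` arrays, using made-up magnitudes rather than the M15 data:

```python
import numpy as np

v_mag = np.array([15.2, 16.8, 15.9, 17.4, 16.0])   # Hypothetical V magnitudes
b_v   = np.array([0.10, 0.30, 0.80, 0.20, 0.40])   # Hypothetical B-V colors

# Each comparison is wrapped in parentheses before combining with &
mask = (v_mag < 16.25) & (b_v < 0.55)

print(mask)
print(v_mag[mask])          # Only the entries where both conditions are True
```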

In [None]:
fig, ax = plt.subplots(1,1)
fig.set_size_inches(15,10)
fig.tight_layout()

ax.set_xlim(-0.25,1.5)
ax.set_ylim(12,19)

ax.set_aspect(1/6)         # Make 1 unit in X = 6 units in Y
ax.invert_yaxis()          # Brighter stars have smaller magnitudes, so flip the y-axis

ax.set_xlabel("B-V color")
ax.set_ylabel("V mag")
ax.set_title("M15 CMD")

ax.plot(star_table['BV'],star_table['Vmag'],
        color = "SteelBlue",
        marker = "o",
        linestyle = "None",
        markersize = 5,
        label = "All Data")

ax.plot(red_clump['BV'],red_clump['Vmag'],
        color = "DeepPink",
        marker = "*",
        linestyle = "None",
        markersize = 10,
        label = "Red Clump")

ax.legend(loc=0, shadow=True);

In [None]:
fig.savefig('M15_CMD.png', bbox_inches='tight')
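
`savefig()` picks the file format from the extension, and the `dpi` argument sets the resolution of raster formats such as PNG. A small sketch, assuming a throwaway figure and a temporary output directory (both invented for this example):

```python
import os
import tempfile

import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 1)
ax.plot([0, 1, 2], [0, 1, 4])

out_dir = tempfile.mkdtemp()                       # Hypothetical output location
png_path = os.path.join(out_dir, 'demo.png')
pdf_path = os.path.join(out_dir, 'demo.pdf')

fig.savefig(png_path, dpi=200, bbox_inches='tight')   # Raster image, 200 dots per inch
fig.savefig(pdf_path, bbox_inches='tight')            # Vector format, scales cleanly

print(os.path.getsize(png_path), os.path.getsize(pdf_path))
```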

---
# Alternative Projections

## Polar Plots (`r`, $\theta$)

In [None]:
my_theta = np.linspace(0, 2*np.pi, 1000)

#### The axis (`ax`) command is different for alternative projections:

In [None]:
fig = plt.figure()
ax = fig.add_subplot(111, projection='polar')

fig.set_size_inches(6,6)
fig.tight_layout()

# In a polar plot the first argument is the angle (theta) and the second is the radius (r)

ax.plot(my_theta, my_theta/15.0,
        label="spiral")

ax.plot(my_theta, np.cos(8*my_theta),
        label="flower")

ax.legend(loc=0, shadow=True);
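
If you prefer to keep the `fig, ax = plt.subplots()` pattern used in the rest of these notes, the projection can also be passed through the `subplot_kw` argument. A minimal sketch (the name `theta` is illustrative):

```python
import matplotlib.pyplot as plt
import numpy as np

theta = np.linspace(0, 2 * np.pi, 1000)

# subplot_kw is forwarded to each subplot as it is created
fig, ax = plt.subplots(1, 1, subplot_kw={'projection': 'polar'})
fig.set_size_inches(6, 6)

ax.plot(theta, theta / 15.0,          # (angle, radius)
        label = "spiral")

ax.legend(loc=0, shadow=True)
```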

## 3D plots `(X,Y,Z)`

In [None]:
from mpl_toolkits.mplot3d import Axes3D   # Only required for matplotlib older than 3.2

In [None]:
fig = plt.figure()
ax = fig.add_subplot(111,projection='3d')

fig.set_size_inches(9,9)
fig.tight_layout()

ax.set_xlabel('This is X')
ax.set_ylabel('This is Y')
ax.set_zlabel('This is Z')

my_x = my_theta
my_y = np.cos(3 * my_theta)
my_z = np.sin(2 * my_theta)

ax.plot(my_x, my_y, my_z,
        color = 'Firebrick',
        marker = 'None',
        linestyle = '--');

ax.view_init(azim = 30, elev = 40)

---
## Tons of examples of `matplotlib` plots can be found [here](https://matplotlib.org/stable/gallery/index.html)

---
# Plotting data with Python - `Seaborn`

* `Seaborn` is a library for making statistical graphics in Python
* `Seaborn` is tightly integrated with the `pandas` library
* [Seaborn Tutorial](https://seaborn.pydata.org/tutorial.html)

In [None]:
import seaborn as sns

### `relplot()` =  relationship plot

In [None]:
grade_table.head(3)

In [None]:
sns.relplot(x="Exam1", y="Exam2",
            data = grade_table,
            height = 8);

In [None]:
sns.relplot(x="Exam1", y="Exam2",
            data = grade_table,
            height = 8,
            hue = "Quarter");

In [None]:
sns.relplot(x="Exam1", y="Exam2",
            data = grade_table,
            height = 8,
            hue = "Quarter",
            style = "Quarter");
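
`relplot()` returns a `FacetGrid` object rather than a `matplotlib` axis, but the underlying axis is still reachable, so the labeling commands from earlier still apply. A short sketch, assuming a small made-up table (the scores and quarter names below are invented, not the class data):

```python
import pandas as pd
import seaborn as sns

demo_table = pd.DataFrame({                        # Hypothetical data
    "Exam1":   [55, 62, 71, 80, 90, 47],
    "Exam2":   [50, 65, 68, 85, 88, 52],
    "Quarter": ["Fall", "Fall", "Winter", "Winter", "Spring", "Spring"],
})

g = sns.relplot(x="Exam1", y="Exam2",
                data = demo_table,
                height = 4,
                hue = "Quarter")

g.set_axis_labels("Exam 1 score", "Exam 2 score")  # FacetGrid method
g.ax.set_title("Exam 2 vs. Exam 1")                # g.ax is the matplotlib axis
```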

### `catplot()` =  category plot

In [None]:
sns.catplot(x="Quarter", y="Exam2",
            data=grade_table,
            height = 8,
            kind='swarm');

In [None]:
sns.catplot(x="Quarter", y="Exam2",
            data=grade_table,
            height = 8,
            kind='box');

In [None]:
sns.catplot(x="Quarter", y="Exam2",
            data=grade_table,
            height = 8,
            kind='boxen');

In [None]:
sns.catplot(x="Quarter", y="Exam2",
            data=grade_table,
            height = 8,
            kind='violin');

----

## All of the matplotlib plots can be made interactive. See the **bonus lecture** on the class Canvas page for instructions.