# Plotting exercises using matplotlib and seaborn - a

## The basics

`matplotlib` and `seaborn` are plotting libraries that work well with other packages such as `numpy` and `pandas`. In these exercises we will go through the basics of how to install, import and use them.

### Installing and importing

In [None]:
! pip install matplotlib seaborn

In Colab, you will see that these tools are already installed.

In [None]:
# The usual way to import them:
import matplotlib.pyplot as plt
import seaborn as sns

# Let's also import numpy and pandas, which we use to build and load the data:
import numpy as np
import pandas as pd

### My first `matplotlib` plot

The most basic `matplotlib` plot uses two `numpy` arrays.

**Exercise 1:** Plot the figure for the function `f(x) = sin(x)` for `x` between `0` and `2*np.pi`.

In [None]:
# sine plot
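
One possible approach to Exercise 1 is sketched below. The number of sample points (100) and the use of `np.linspace` are assumptions made for illustration; any sufficiently dense grid of `x` values works. In a notebook the figure renders at the end of the cell; in a script you would also call `plt.show()`.

```python
import numpy as np
import matplotlib.pyplot as plt

# 100 evenly spaced points on [0, 2*pi]; the count is an arbitrary choice
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

# plt.subplots() with no arguments returns one figure and one axes
fig, ax = plt.subplots()
ax.plot(x, y)
```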

**Exercise 2:** Change the line style to be dotted, and the color to be black.

In [None]:
# sine plot with styling
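
A minimal sketch of Exercise 2, assuming the same sine curve as in Exercise 1. The `linestyle` and `color` keyword arguments of `plot` control the styling; the shorthand format string `"k:"` should give the same result.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
# ":" is matplotlib's code for a dotted line
ax.plot(x, np.sin(x), linestyle=":", color="black")
```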

**Exercise 3:** Use a `for` loop to plot `sin(x + k)` for each integer `k` from `0` to `5`.

In [None]:
# multiple sin functions
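
One way to approach Exercise 3 is to call `plot` repeatedly on the same axes, so that every curve lands in one figure. The legend labels are an addition for readability and are not required by the exercise.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
# range(6) yields k = 0, 1, ..., 5
for k in range(6):
    ax.plot(x, np.sin(x + k), label=f"k = {k}")

ax.legend()
```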

**Exercise 4:** In the previous plot, change the `y` axis such that only `-1` and `1` appear in the ticks. Set an appropriate title, y-label and x-label.

In [None]:
# multiple sin with styling
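
A hedged sketch of Exercise 4, using the axes methods `set_yticks`, `set_title`, `set_xlabel` and `set_ylabel`. The title and label texts are placeholders chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 2 * np.pi, 100)

fig, ax = plt.subplots()
for k in range(6):
    ax.plot(x, np.sin(x + k), label=f"k = {k}")

# Keep only the two extreme values as y ticks
ax.set_yticks([-1, 1])

# Placeholder wording; pick whatever describes your plot best
ax.set_title("Shifted sine curves")
ax.set_xlabel("x")
ax.set_ylabel("sin(x + k)")
ax.legend()
```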

## More advanced plots

In these next plots, we will use more methods of `matplotlib` and more keyword arguments therein.

**Exercise 5:** Create a vector of 1000 samples from a uniform distribution on `[0, 1)` using `np.random.random`, and plot a histogram of them.

In [None]:
# histogram of 1000 samples from U(0, 1)
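
A minimal sketch of Exercise 5. `hist` returns the bin counts, the bin edges and the drawn rectangles, which is handy for checking the result. The number of bins is left at matplotlib's default here, which is an assumption; you may prefer to pass `bins=` explicitly.

```python
import numpy as np
import matplotlib.pyplot as plt

# 1000 draws from the uniform distribution on [0, 1)
samples = np.random.random(1000)

fig, ax = plt.subplots()
counts, edges, patches = ax.hist(samples)
```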

**Exercise 6:** Change the color of the previous plot, and change the width of the rectangles.

In [None]:
# histogram of 1000 samples from U(0, 1) with styling
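
One possible take on Exercise 6, assuming that "width of the rectangles" refers to the drawn bar width relative to the bin width, which `hist` exposes as `rwidth`. Changing `bins` is another reasonable reading of the exercise. The specific color and `rwidth=0.8` are arbitrary choices.

```python
import numpy as np
import matplotlib.pyplot as plt

samples = np.random.random(1000)

fig, ax = plt.subplots()
# rwidth=0.8 draws each bar at 80% of its bin width, leaving gaps between bars
counts, edges, patches = ax.hist(samples, color="tab:orange", rwidth=0.8)
```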

Let's talk about creating **subplots**. For these exercises we will load a dataset from `https://data.cityofnewyork.us/api/views/7yig-nj52/rows.csv`

In [None]:
math_scores = pd.read_csv("https://data.cityofnewyork.us/api/views/7yig-nj52/rows.csv")
math_scores.shape

**Exercise 7:**
1. What are the columns and data types in the DataFrame `math_scores`?

In [None]:
# columns and data types
math_scores.info()
math_scores.head()

Let's modify this dataframe to focus on only one district:

In [None]:
math_scores = math_scores[math_scores["District"] == 1]

2. Use `plt.subplots` to create a 2x2 plot grid.
3. In these 4 subplots, plot the column `Number Tested` against `Year` for grades 5, 6, 7 and 8.

In [None]:
fig, axes = plt.subplots(2, 2, sharey=True, sharex=True)

# A silly way of doing it.
# df = math_scores[math_scores["Grade"] == "5"]
# axes[0, 0].plot(df["Year"], df["Number Tested"])
# axes[0, 0].set_title("Grade 5")

# df = math_scores[math_scores["Grade"] == "6"]
# axes[0, 1].plot(df["Year"], df["Number Tested"])
# axes[0, 1].set_title("Grade 6")

# df = math_scores[math_scores["Grade"] == "7"]
# axes[1, 0].plot(df["Year"], df["Number Tested"])
# axes[1, 0].set_title("Grade 7")

# df = math_scores[math_scores["Grade"] == "8"]
# axes[1, 1].plot(df["Year"], df["Number Tested"])
# axes[1, 1].set_title("Grade 8")


# A better way of doing it
# axes.flat iterates over the 2x2 grid row by row
for ax, grade in zip(axes.flat, ["5", "6", "7", "8"]):
    df = math_scores[math_scores["Grade"] == grade]
    ax.plot(df["Year"], df["Number Tested"])
    ax.set_title(f"Grade {grade}")

plt.tight_layout()

## Using seaborn

`seaborn` is a plotting library built on top of `matplotlib` that makes common statistical plots (e.g. violin plots, box plots, scatter plots, kernel density estimates...) easier to create.

In the following examples we will use a DataFrame provided by `seaborn` itself:

In [None]:
df_penguins = sns.load_dataset("penguins")

**Exercise 8:**
1. What are the columns and data types in `df_penguins`?
2. Remove the rows that contain `NaN` values from this DataFrame (hint: use `dropna()`). How many rows were lost?
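
The pattern for Exercise 8 can be sketched on a tiny stand-in DataFrame. Note that `df_demo` and its values are made up for illustration and are not the real penguins data; with the real dataset you would apply the same calls to `df_penguins`.

```python
import numpy as np
import pandas as pd

# Made-up stand-in for df_penguins (illustrative values, not real data)
df_demo = pd.DataFrame({
    "species": ["Adelie", "Adelie", "Gentoo", "Chinstrap"],
    "bill_length_mm": [39.1, np.nan, 46.1, 49.5],
    "sex": ["Male", None, "Female", None],
})

# Columns and data types
print(df_demo.dtypes)

# dropna() returns a new DataFrame without the rows that contain missing values
rows_before = len(df_demo)
df_clean = df_demo.dropna()
rows_lost = rows_before - len(df_clean)
print(f"Rows lost: {rows_lost}")
```

Comparing `len()` before and after is the simplest way to count the dropped rows; `df_demo.isna().any(axis=1).sum()` should give the same number without creating the cleaned copy.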

**Exercise 9:** Use `sns.pairplot` to understand the relationship between the columns `"bill_length_mm", "bill_depth_mm", "flipper_length_mm"`. Color the points by species (hint: use the `hue` argument).

In [None]:
sns.pairplot(
    data=df_penguins,
    hue="species",
    vars=["bill_length_mm", "bill_depth_mm", "flipper_length_mm"]
)