# Week 6, Class 3: Introduction to Seaborn and Cartopy

## Matplotlib color codes

### Ways to specify a color

| Format           | Example                             | Notes                                                        |
| ---------------- | ----------------------------------- | ------------------------------------------------------------ |
| Single-letter    | `'r'`, `'g'`, `'b'`                 | Legacy shorthands for 8 basics below.                        |
| Named color      | `'red'`, `'gold'`, `'slategray'`    | CSS4 names; XKCD names need the `xkcd:` prefix.              |
| Hex (RGB)        | `'#1f77b4'`                         | `#RRGGBB`.                                                   |
| Hex (RGBA)       | `'#1f77b480'`                       | `#RRGGBBAA` (AA = alpha).                                    |
| Grayscale string | `'0.0'` … `'1.0'`                   | `'0'`=black → `'1'`=white.                                   |
| RGB / RGBA tuple | `(0.12, 0.47, 0.71)`, `(1,0,0,0.5)` | Floats 0–1.                                                  |

### Single-letter shorthands (and their hex)

| Code | Name    | Hex       |
| ---- | ------- | --------- |
| `b`  | blue    | `#0000FF` |
| `g`  | green   | `#008000` |
| `r`  | red     | `#FF0000` |
| `c`  | cyan    | `#00BFBF` |
| `m`  | magenta | `#BF00BF` |
| `y`  | yellow  | `#BFBF00` |
| `k`  | black   | `#000000` |
| `w`  | white   | `#FFFFFF` |

Usage:
```python
ax.bar(xs, ys, color='#9467bd')
ax.hist(data, color=(0.2, 0.4, 0.6, 0.8))
```
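
To check how Matplotlib interprets a color specification, the helpers in `matplotlib.colors` can normalize any of the formats above. The sketch below is only illustrative; the colors are arbitrary picks rather than values used elsewhere in this lesson.

```python
import matplotlib.colors as mcolors

# Every format resolves to an RGBA tuple of floats in the 0-1 range
print(mcolors.to_rgba('r'))          # single-letter shorthand
print(mcolors.to_rgba('slategray'))  # CSS4 name
print(mcolors.to_rgba('#1f77b480'))  # hex with alpha
gray_rgba = mcolors.to_rgba('0.5')   # grayscale string

# to_hex goes the other way, from any spec back to '#rrggbb'
green_hex = mcolors.to_hex('g')

# same_color compares two specs after normalizing both
red_matches = mcolors.same_color('r', '#FF0000')
print(gray_rgba, green_hex, red_matches)
```

This is a quick way to confirm which hex value a shorthand or named color actually maps to.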

## Matplotlib colormaps

### **Perceptually Uniform Sequential**

| Name      | Description           | Notes                        |
| --------- | --------------------- | ---------------------------- |
| `viridis` | Blue → Green → Yellow | Default since Matplotlib 2.0 |
| `plasma`  | Purple → Yellow       | High contrast                |
| `inferno` | Black → Red → Yellow  | Good for visibility          |
| `magma`   | Black → Red → White   | Darker variant               |
| `cividis` | Blue → Yellow         | Colorblind-friendly          |

### **Sequential**

| Name      | Description             | Notes     |
| --------- | ----------------------- | --------- |
| `Greys`   | White → Black           | Grayscale |
| `Purples` | Light → Dark Purple     |           |
| `Blues`   | Light → Dark Blue       |           |
| `Greens`  | Light → Dark Green      |           |
| `Oranges` | Light → Dark Orange     |           |
| `Reds`    | Light → Dark Red        |           |
| `YlOrBr`  | Yellow → Orange → Brown |           |
| `YlOrRd`  | Yellow → Orange → Red   |           |
| `OrRd`    | Orange → Red            |           |
| `PuRd`    | Purple → Red            |           |
| `BuPu`    | Blue → Purple           |           |
| `GnBu`    | Green → Blue            |           |
| `PuBu`    | Purple → Blue           |           |
| `YlGnBu`  | Yellow → Green → Blue   |           |
| `PuBuGn`  | Purple → Blue → Green   |           |
| `BuGn`    | Blue → Green            |           |
| `YlGn`    | Yellow → Green          |           |

### **Diverging**

| Name       | Description          | Notes |
| ---------- | -------------------- | ----- |
| `PiYG`     | Pink ↔ Green         |       |
| `PRGn`     | Purple ↔ Green       |       |
| `BrBG`     | Brown ↔ Blue-Green   |       |
| `PuOr`     | Purple ↔ Orange      |       |
| `RdGy`     | Red ↔ Gray           |       |
| `RdBu`     | Red ↔ Blue           |       |
| `RdYlBu`   | Red ↔ Yellow ↔ Blue  |       |
| `RdYlGn`   | Red ↔ Yellow ↔ Green |       |
| `Spectral` | Multicolor diverging |       |

### **Cyclic**

| Name               | Description            | Notes                                      |
| ------------------ | ---------------------- | ------------------------------------------ |
| `twilight`         | Cyclic purple → orange | Smooth transitions                         |
| `twilight_shifted` | Shifted variant        |                                            |
| `hsv`              | Hue circle             | Red at both ends; not perceptually uniform |

### **Qualitative**

| Name                        | Description              | Notes         |
| --------------------------- | ------------------------ | ------------- |
| `Pastel1`, `Pastel2`        | Pastel tones             |               |
| `Paired`                    | Strong contrasting pairs |               |
| `Accent`                    | Bold accents             |               |
| `Dark2`                     | Darker tones             |               |
| `Set1`, `Set2`, `Set3`      | Categorical sets         |               |
| `tab10`                     | 10 Tableau colors        | Default cycle |
| `tab20`, `tab20b`, `tab20c` | 20 Tableau colors        | Larger sets   |

### **Miscellaneous**

| Name                  | Description                       | Notes        |
| --------------------- | --------------------------------- | ------------ |
| `flag`                | Red, White, Blue, Black repeating |              |
| `prism`               | High-saturation rainbow           |              |
| `ocean`               | Blue shades                       |              |
| `gist_earth`          | Earth tones                       |              |
| `terrain`             | Green → Brown → White             |              |
| `gist_stern`          | Blue/Purple hues                  |              |
| `gnuplot`, `gnuplot2` | Replicates GNUplot                |              |
| `CMRmap`              | Replicates CMR map                |              |
| `cubehelix`           | Monotonic-lightness helix         |              |
| `brg`                 | Blue → Red → Green                |              |
| `gist_rainbow`        | Rainbow variant                   |              |
| `rainbow`             | Full rainbow                      |              |
| `jet`                 | Classic (but discouraged) rainbow | Non-uniform! |


Usage:

```python
plt.scatter(x, y, c=z, cmap="viridis")
plt.imshow(data, cmap="RdBu_r")  # "_r" for reversed
```
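
A colormap is also a callable object, which helps when you need individual colors from it (for example, one color per line). The following is a minimal sketch using `plt.get_cmap`; the choice of `viridis` and of five sample positions is arbitrary.

```python
import matplotlib.pyplot as plt
import numpy as np

cmap = plt.get_cmap("viridis")

# Calling a colormap with a float in [0, 1] returns an RGBA tuple
low_color = cmap(0.0)
high_color = cmap(1.0)

# An array of positions returns one RGBA row per position
five_colors = cmap(np.linspace(0, 1, 5))  # shape (5, 4)

# The "_r" variant runs the same colors in the opposite order
reversed_low = plt.get_cmap("viridis_r")(0.0)
print(low_color, high_color, reversed_low)
```

The first color of `viridis_r` matches the last color of `viridis`, which is all the `_r` suffix does.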

```python
import matplotlib.pyplot as plt
import numpy as np

# Create a grid of data for a 2D function, representing a 3D surface
x = np.linspace(-5, 5, 50)
y = np.linspace(-5, 5, 50)
X, Y = np.meshgrid(x, y)
Z = np.sin(np.sqrt(X**2 + Y**2))  # A radially symmetric wave

# Create a contour plot
plt.figure(figsize=(8, 6))
contour_plot = plt.contourf(X, Y, Z, cmap='viridis')  # `contourf` fills the contours
plt.colorbar(contour_plot, label='Function Value (Z)')  # Add a colorbar to show the scale
plt.title("Contour Plot of a 3D Function")
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.show()
```

## 1. What is Seaborn?
**Seaborn** is a statistical plotting library that is deeply integrated with Pandas DataFrames. Its main goal is to make it easy to create plots that tell a story about your data, especially for common statistical tasks.

* **Matplotlib vs. Seaborn**: Think of Matplotlib as a blank canvas with a paintbrush, giving you complete control. Seaborn is like a pre-designed template that automatically applies great aesthetics and computes statistical summaries for you, allowing you to create complex plots with just a single function call. You'll often use both together.

By convention, we import Seaborn as `sns`.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
import cartopy.crs as ccrs
import cartopy.feature as cfeature

# Apply Seaborn's aesthetics to all our plots.
# set_theme sets the style and the palette in one call,
# so a separate sns.set_style("whitegrid") is not needed.
sns.set_theme(style="whitegrid", palette="muted")
```

## 2. Plotting with a `DataFrame`: The Core of Seaborn
Seaborn is designed to work with tidy data, where each variable is a column and each observation is a row. This makes it a perfect companion for Pandas DataFrames.

Let's create a sample DataFrame for our examples.

```python
# Create a sample DataFrame of mock experimental data
data = {
    'dose_mg': [5, 10, 15, 20, 25, 5, 10, 15, 20, 25],
    'treatment': ['A', 'A', 'A', 'A', 'A', 'B', 'B', 'B', 'B', 'B'],
    'response': [12, 25, 38, 45, 55, 15, 28, 40, 52, 60],
    'time_min': [10, 20, 30, 40, 50, 10, 20, 30, 40, 50]
}
exp_df = pd.DataFrame(data)

print("Our sample DataFrame:")
print(exp_df)
```
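
Data often arrives in a "wide" layout, with one column per group, rather than the tidy layout described above. As a rough sketch, `DataFrame.melt` can reshape it; the `wide_df` table here is a made-up example, not part of the lesson's dataset.

```python
import pandas as pd

# Hypothetical wide table: one response column per treatment
wide_df = pd.DataFrame({
    'dose_mg': [5, 10, 15],
    'A': [12, 25, 38],
    'B': [15, 28, 40],
})

# melt stacks the treatment columns into (treatment, response) pairs,
# giving one row per observation
tidy_df = wide_df.melt(id_vars='dose_mg', var_name='treatment', value_name='response')
print(tidy_df)
```

After reshaping, each row is a single observation, so `treatment` can be passed straight to Seaborn arguments such as `hue`.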

## 3. Relational Plots: Visualizing Relationships

Seaborn's `relplot` is a flexible function for visualizing statistical relationships. The most common types are scatter plots and line plots.

### 3.1. Scatter Plot (`kind='scatter'`)

We can easily create a scatter plot of `response` versus `dose_mg`, and even use another variable, `treatment`, to separate the data with different colors or markers.

```python
# Create a scatter plot of response vs. dose, with points colored by treatment
sns.relplot(x='dose_mg', y='response', data=exp_df, hue='treatment', style='treatment', kind='scatter')

plt.title("Dose-Response Curve by Treatment")  # We can still use Matplotlib functions
plt.xlabel("Dose (mg)")
plt.ylabel("Response")
plt.show()
```

## 4. Categorical Plots: Visualizing Grouped Data
When one of your main variables is categorical (like `treatment` or `species`), Seaborn's categorical plots are very useful. `catplot` is a high-level function that provides a unified interface for many types of categorical plots.

### 4.1. Bar Plots (`kind='bar'`)
A bar plot is great for showing the central tendency (e.g., mean) of a numerical variable for each category.

```python
# Create a bar plot of the mean response for each treatment
sns.catplot(x='treatment', y='response', data=exp_df, kind='bar')

plt.title("Average Response by Treatment")
plt.xlabel("Treatment Type")
plt.ylabel("Average Response")
plt.show()
```
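
To see the numbers behind the bars, the per-treatment means can be computed directly with pandas. This is a sketch that rebuilds the relevant columns of the sample data under a hypothetical name (`bar_df`) so it is self-contained. The `sem` column is included only for comparison: Seaborn's default error bars are a bootstrapped 95% confidence interval, so they will not match it exactly.

```python
import pandas as pd

# Same treatment/response values as the lesson's sample DataFrame
bar_df = pd.DataFrame({
    'treatment': ['A'] * 5 + ['B'] * 5,
    'response': [12, 25, 38, 45, 55, 15, 28, 40, 52, 60],
})

# mean = the height of each bar
# sem  = standard error of the mean (one possible error-bar measure)
summary = bar_df.groupby('treatment')['response'].agg(['mean', 'sem'])
print(summary)
```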

### 4.2. Box Plots (`kind='box'`)
A box plot (or box-and-whisker plot) provides a visual summary of the distribution of numerical data through its quartiles. It is excellent for comparing distributions across different categories.

```python
# Let's create some data with more variation to demonstrate
data_with_noise = {
    'treatment': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'response': [10, 12, 14, 15, 17, 19, 20, 21, 22]
}
exp_df_noise = pd.DataFrame(data_with_noise)

# Create a box plot of the response for each treatment
sns.catplot(x='treatment', y='response', data=exp_df_noise, kind='box')

plt.title("Response Distribution by Treatment")
plt.xlabel("Treatment Type")
plt.ylabel("Response")
plt.show()
```
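
The box in each box plot spans the first to third quartile, with a line at the median. As a small sketch, the same quartiles can be computed with pandas; `box_df` is a hypothetical name for a copy of the data used above.

```python
import pandas as pd

box_df = pd.DataFrame({
    'treatment': ['A', 'A', 'A', 'B', 'B', 'B', 'C', 'C', 'C'],
    'response': [10, 12, 14, 15, 17, 19, 20, 21, 22],
})

# 25th, 50th (median) and 75th percentiles for each treatment;
# unstack turns the quantile levels into columns
quartiles = (
    box_df.groupby('treatment')['response']
    .quantile([0.25, 0.5, 0.75])
    .unstack()
)
print(quartiles)
```

For treatment A this gives 11, 12 and 13, which should correspond to the bottom edge, middle line, and top edge of the first box.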

## 5. Distribution Plots: Histograms and KDEs
These plots are used to understand the distribution of a single variable.

### 5.1. Histograms (`histplot`)
A histogram shows the frequency of data points within specified bins.

```python
# Generate some random data for a histogram
measurements = np.random.normal(loc=50, scale=5, size=200)

# Create a histogram of the measurements
sns.histplot(measurements, bins=15, kde=True)  # kde=True adds a kernel density estimate line

plt.title("Distribution of Measurements")
plt.xlabel("Measurement Value")
plt.ylabel("Frequency")
plt.show()
```
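
The bar heights in a histogram are just counts per bin, which NumPy can compute without plotting. The sketch below uses a seeded random generator (an addition for repeatability, not part of the cell above) and 15 equal-width bins; it is meant to approximate what `histplot` does with `bins=15`, not to reproduce its internals.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded so the numbers are repeatable
values = rng.normal(loc=50, scale=5, size=200)

# 15 equal-width bins spanning the data range:
# counts has one entry per bin, edges has one more entry than counts
counts, edges = np.histogram(values, bins=15)

print("Bin edges:", np.round(edges, 1))
print("Counts:   ", counts)
print("Total:    ", counts.sum())
```

Every data point falls into exactly one bin, so the counts add up to the sample size.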

### 5.2. Kernel Density Estimate (`kdeplot`)
A KDE plot is a smoothed, continuous version of a histogram. It estimates the probability density function of a variable.

```python
# Using the same `measurements` data from above
sns.kdeplot(measurements, fill=True)  # fill=True shades the area under the curve

plt.title("Kernel Density Estimate of Measurements")
plt.xlabel("Measurement Value")
plt.ylabel("Density")
plt.show()
```
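
To make "estimates the probability density function" concrete, here is a hand-rolled Gaussian KDE in NumPy: it places a small bell curve on every sample and averages them. This is a sketch under assumptions: the bandwidth follows Scott's rule of thumb (which, as far as I know, is also Seaborn's default, though the library's own implementation may differ in detail), and the data are freshly generated with a seed.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
samples = rng.normal(loc=50, scale=5, size=200)

# Scott's rule-of-thumb bandwidth for 1D data: std * n^(-1/5)
bandwidth = samples.std(ddof=1) * len(samples) ** (-1 / 5)

# Evaluation grid, padded so the tails of the estimate are included
grid = np.linspace(samples.min() - 4 * bandwidth,
                   samples.max() + 4 * bandwidth, 500)

# One Gaussian bump per sample, averaged over all samples
z = (grid[:, None] - samples[None, :]) / bandwidth
density = np.exp(-0.5 * z**2).sum(axis=1) / (
    len(samples) * bandwidth * np.sqrt(2 * np.pi)
)

# A density should integrate to about 1 (simple Riemann sum)
area = (density * (grid[1] - grid[0])).sum()
print(f"bandwidth = {bandwidth:.2f}, area under curve = {area:.3f}")
```

The y-axis of a KDE plot is therefore a density, not a count: the area under the curve, not the height, carries the probability.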

## 6. Heatmaps: Visualizing 2D Data
A heatmap is a graphical representation of data where the individual values in a matrix are represented as colors. It's fantastic for visualizing a correlation matrix, which is a common task in scientific analysis.

```python
# Create some mock data
data_for_corr = {
    'var1': np.random.randn(50),
    'var2': np.random.randn(50) * 2,
    'var3': np.random.randn(50) * 0.5
}
df_corr = pd.DataFrame(data_for_corr)

# Calculate the correlation matrix
correlation_matrix = df_corr.corr()

# Create a heatmap of the correlation matrix
sns.heatmap(correlation_matrix, annot=True, cmap='viridis')  # annot=True shows the values on the plot

plt.title("Correlation Matrix Heatmap")
plt.show()
```
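
Correlations range from -1 to 1 with a meaningful midpoint at 0, which is the case diverging colormaps are designed for. As one possible variation (a design suggestion, not the lesson's original choice), the sketch below fixes the color limits to that range and uses `RdBu_r`; the data are freshly generated with a seed so the block runs on its own.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(seed=1)
demo_df = pd.DataFrame(rng.normal(size=(50, 3)),
                       columns=['var1', 'var2', 'var3'])
corr = demo_df.corr()

# Pin the color limits to [-1, 1] so zero lands on the neutral middle
# color of the diverging map
ax = sns.heatmap(corr, annot=True, cmap='RdBu_r', vmin=-1, vmax=1)
ax.set_title("Correlation Matrix (diverging colormap)")
plt.show()
```

With fixed limits, the same color always means the same correlation, which makes heatmaps from different datasets comparable.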

## 7. Geospatial Visualization with Cartopy
Seaborn and Matplotlib are excellent for general plotting, but geospatial data (anything located by latitude and longitude) calls for a specialized tool. Cartopy is a library dedicated to geospatial visualization: it draws maps and plots data on them in a choice of map projections.

### 7.1. What is Cartopy?
Cartopy is a Python library for creating maps and other geospatial data plots. It is built on top of Matplotlib and is a perfect tool for scientists working with climate data, geology, or any data with latitude and longitude coordinates.

### 7.2. Creating a Basic Map
The core of Cartopy is the concept of a **coordinate reference system (CRS)**, which defines how real-world coordinates are mapped to a 2D plane. We'll use a simple projection to create a world map, add some basic features like coastlines and borders, and then plot some sample data points.

```python
import matplotlib.pyplot as plt
import cartopy.crs as ccrs
import cartopy.feature as cfeature

# Create a new figure and axes with a specific projection
fig = plt.figure(figsize=(10, 8))
ax = fig.add_subplot(1, 1, 1, projection=ccrs.PlateCarree())
ax.set_title("Geospatial Plot with Cartopy")

# Add some built-in features to the map
ax.add_feature(cfeature.COASTLINE)
ax.add_feature(cfeature.BORDERS, linestyle=':')
ax.add_feature(cfeature.LAND, edgecolor='black')
ax.add_feature(cfeature.OCEAN)
ax.add_feature(cfeature.LAKES, edgecolor='black')
ax.add_feature(cfeature.RIVERS)

# Set the extent of the map (optional)
ax.set_extent([-10, 40, 30, 60], crs=ccrs.PlateCarree())  # [min_lon, max_lon, min_lat, max_lat]

# Approximate city coordinates, as separate longitude and latitude lists
lons = [-0.1, 2.3, 12.5, 28.9]  # London, Paris, Rome, Istanbul
lats = [51.5, 48.8, 41.9, 41.0]

# Plot the points on the map; `transform` states the CRS the data is in
ax.scatter(lons, lats, color='red', marker='o', transform=ccrs.PlateCarree(), label='Cities')

# Add a legend
ax.legend()

plt.show()
```

## Summary and Key Takeaways
* **Seaborn** is a powerful, high-level library for creating statistical plots from DataFrames.
* Seaborn plots are often more aesthetically pleasing and require less code than raw Matplotlib.
* Use `sns.relplot()` for relational plots (scatter, line).
* Use `sns.catplot()` for categorical plots (bar, box).
* Use `sns.histplot()` and `sns.kdeplot()` for visualizing distributions.
* Use `sns.heatmap()` to visualize 2D data like a correlation matrix.
* **Cartopy** is a specialized library for creating maps and is essential for geospatial data. You create a map by defining a projection and adding features to it.

## Exercises
Complete the following exercises in a new Python script or a new Jupyter Notebook.

1. Categorical Bar Plot with Error Bars:
    * Create a DataFrame with columns `Treatment` and `Result`.
    * Use `sns.catplot()` with `kind='bar'` to visualize the average `Result` for each `Treatment`.
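    * By default, the plot adds error bars showing a bootstrapped 95% confidence interval for the mean. In seaborn 0.12 or later, pass `errorbar='se'` to show the standard error of the mean instead.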
    * Add a title and axis labels.

2. Visualizing a Distribution:
    * Create a NumPy array of 100 random numbers from a normal distribution (`np.random.randn(100)`).
    * Create a `sns.kdeplot` of this data.
    * Add a title: "Kernel Density Estimate of Random Data".
    * Add `fill=True` to fill the area under the curve.

3. Create a Correlation Heatmap:
    * Create a DataFrame with at least 4 columns of random numerical data (e.g., use `np.random.rand(20, 4)`).
    * Calculate the correlation matrix using `df.corr()`.
    * Create a `sns.heatmap` of the correlation matrix.
    * Add a title: "Random Data Correlation".
    * Make sure to use `annot=True` so the correlation values are displayed.

4. Create a Map with Cartopy:
    * Create a new figure and axes with the `ccrs.PlateCarree()` projection.
    * Add `cfeature.LAND`, `cfeature.OCEAN`, and `cfeature.COASTLINE` to the map.
    * Define a list of longitudes and latitudes for a few cities or scientific sites.
    * Plot these points on the map using a scatter plot (`ax.scatter()`), making sure to use `transform=ccrs.PlateCarree()`.
    * Add a title to the map: "Global Sites of Interest".
    * Show the plot.
      
5. Create a Contour Plot:
    * Generate a grid of X and Y values using `np.meshgrid()` for `x` and `y` ranging from -10 to 10.
    * Define a function for the Z-values, for example, `Z = np.sin(X) + np.cos(Y)`.
    * Create a filled contour plot (`plt.contourf`) of this data.
    * Use a colormap of your choice.
    * Add a colorbar and a title to the plot.
    * Show the plot.