
# Day 3: Data Visualization with Matplotlib and Seaborn

**Prepared By:** Dr. Kenechi Omeke  
**Date:** November 2024  

---

## Aim
Teach students how to visualize data effectively using the Python libraries Matplotlib and Seaborn.

## Intended Learning Outcomes
- Create basic and advanced plots using Matplotlib and Seaborn.
- Interpret data visualizations to extract insights.
- Customize and style plots for clarity and impact.

## Topics Covered
- Introduction to Data Visualization
- Matplotlib Basics (with code examples)
- Seaborn for Statistical Visualization (with code examples)
- Customizing and Styling Plots
- Mini-Project: Visualizing Real Data

---

## 1. Why Visualize Data?
- Spot trends, patterns, and outliers
- Communicate findings clearly
- Make data-driven decisions

**Common plot types:** line, bar, scatter, histogram, box, heatmap

---

## 2. Matplotlib Basics
Matplotlib is the plotting library that most of the Python visualization ecosystem, including Seaborn, builds on. Let's start with the essentials.

In [None]:
import matplotlib.pyplot as plt

# Line plot
x = [1, 2, 3, 4, 5]
y = [10, 15, 13, 20, 18]
plt.plot(x, y)
plt.title('Basic Line Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()

In [None]:
# Scatter plot
x = [5, 7, 8, 7, 2, 17, 2, 9, 4, 11, 12, 9, 6]
y = [99, 86, 87, 88, 100, 86, 103, 87, 94, 78, 77, 85, 86]
plt.scatter(x, y)
plt.title('Basic Scatter Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.show()

In [None]:
# Bar plot
categories = ['A', 'B', 'C', 'D']
values = [5, 7, 3, 8]
plt.bar(categories, values)
plt.title('Basic Bar Plot')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.show()

In [None]:
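# Histogram
import numpy as np

# Draw 100 samples from a standard normal distribution (mean 0, standard deviation 1).
# The fixed seed (an arbitrary value) makes the plot reproducible across runs.
rng = np.random.default_rng(42)
data = rng.normal(0, 1, 100)
plt.hist(data, bins=10)
plt.title('Histogram')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.show()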
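
The cells above all use the `pyplot` state-based interface. As a minimal sketch, the same kind of plot can also be drawn through Matplotlib's object-oriented interface, where `plt.subplots` returns the figure and its axes explicitly. This is handy when several plots share one figure. The line data is reused from the first example; the seed and figure size are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Same values as the line plot above
x = [1, 2, 3, 4, 5]
y = [10, 15, 13, 20, 18]

# Seeded generator so the histogram looks the same on every run (seed is arbitrary)
rng = np.random.default_rng(0)
data = rng.normal(0, 1, 100)

# One figure with two side-by-side axes
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

ax_line.plot(x, y)
ax_line.set_title('Line Plot')
ax_line.set_xlabel('X-axis')
ax_line.set_ylabel('Y-axis')

ax_hist.hist(data, bins=10)
ax_hist.set_title('Histogram')
ax_hist.set_xlabel('Value')
ax_hist.set_ylabel('Frequency')

fig.tight_layout()  # avoid overlapping labels between the two panels
plt.show()
```

Each `ax` object has its own title and labels, so the panels can be styled independently.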


## Exercise: Matplotlib Practice
1. Create a list of 10 numbers and plot them as a line plot.
2. Make a bar plot of your favorite fruits and their quantities.

## 3. Seaborn for Statistical Visualization
Seaborn builds on Matplotlib and offers a high-level interface for statistical plots, with sensible default styling and direct support for pandas DataFrames.

In [None]:
import seaborn as sns
# Load a sample dataset
df = sns.load_dataset('iris')
# Box plot
sns.boxplot(x='species', y='sepal_length', data=df)
plt.title('Box Plot of Sepal Length by Species')
plt.show()
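
To interpret a box plot, it helps to know what each element encodes. The sketch below computes those quantities by hand with NumPy, assuming the default whisker rule of 1.5 times the interquartile range that Matplotlib and Seaborn use. The `values` array is hypothetical and is not taken from the Iris data.

```python
import numpy as np

# Hypothetical sepal lengths in cm (illustrative values, not taken from the Iris data)
values = np.array([4.3, 4.8, 5.0, 5.1, 5.4, 5.8, 6.1, 6.4, 7.9])

# The box spans the first to third quartile; the line inside is the median
q1, median, q3 = np.percentile(values, [25, 50, 75])
iqr = q3 - q1

# Whiskers extend to the most extreme points within 1.5 * IQR of the box
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Points beyond the fences are drawn individually as outliers
outliers = values[(values < lower_fence) | (values > upper_fence)]

print(f"Q1={q1:.2f}, median={median:.2f}, Q3={q3:.2f}, IQR={iqr:.2f}")
print("Outliers:", outliers)
```

Here the largest value lies above the upper fence, so a box plot of this data would show it as a separate point.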

In [None]:
# Histogram with KDE
sns.histplot(df['sepal_length'], bins=10, kde=True)
plt.title('Distribution of Sepal Length')
plt.show()

In [None]:
# Scatter plot with grouping
sns.scatterplot(x='sepal_length', y='sepal_width', hue='species', data=df)
plt.title('Scatter Plot of Sepal Dimensions')
plt.show()

In [None]:
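# Pair plot: scatter plots for every pair of numeric columns, distributions on the diagonal
sns.pairplot(df, hue='species')
# y=1.02 places the title slightly above the grid so it does not overlap the top row
plt.suptitle('Pair Plot of Iris Dataset', y=1.02)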
plt.show()
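
Heatmaps are listed among the common plot types in Section 1 but are not shown above. A minimal sketch with `sns.heatmap` follows, plotting a correlation matrix. The DataFrame `demo` and its columns `a`, `b`, and `c` are synthetic and purely illustrative: `b` is constructed to correlate with `a`, while `c` is independent noise.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic data (hypothetical columns): 'b' depends on 'a', 'c' is independent noise
rng = np.random.default_rng(1)
a = rng.normal(0, 1, 200)
demo = pd.DataFrame({
    'a': a,
    'b': 2 * a + rng.normal(0, 0.5, 200),
    'c': rng.normal(0, 1, 200),
})

corr = demo.corr()  # pairwise Pearson correlations

# Fix the color scale to [-1, 1] so colors are comparable between heatmaps
ax = sns.heatmap(corr, annot=True, cmap='coolwarm', vmin=-1, vmax=1)
ax.set_title('Correlation Heatmap')
plt.show()
```

The same pattern should work for the numeric columns of the Iris DataFrame, for example after dropping the `species` column.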


## Exercise: Seaborn Practice
1. Use the Titanic dataset (`sns.load_dataset('titanic')`) to make a box plot of passenger age (`age`) by ticket class (`class`).
2. Make a histogram of fare prices (the `fare` column).

## 4. Customizing and Styling Plots
- Add titles, labels, and legends
- Change colors and styles
- Use Seaborn themes and palettes

In [None]:
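# Set a global theme and color palette; both apply to every plot drawn afterwards
sns.set_style('whitegrid')
sns.set_palette('coolwarm')
# barplot shows the mean petal length per species, with error bars for its uncertainty
sns.barplot(x='species', y='petal_length', data=df)
plt.title('Styled Bar Plot')
plt.show()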
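
The first two bullets above (titles, labels, legends, colors, and styles) can also be handled in plain Matplotlib. The following is a minimal sketch: the two series and their names are hypothetical, and the specific colors, markers, and legend position are arbitrary choices.

```python
import matplotlib.pyplot as plt

# Hypothetical values for two series (illustrative data only)
months = [1, 2, 3, 4, 5]
series_a = [10, 15, 13, 20, 18]
series_b = [8, 11, 14, 13, 17]

fig, ax = plt.subplots(figsize=(6, 4))

# color, linestyle, and marker control how each line looks; label feeds the legend
ax.plot(months, series_a, color='tab:blue', linestyle='-', marker='o', label='Series A')
ax.plot(months, series_b, color='tab:orange', linestyle='--', marker='s', label='Series B')

ax.set_title('Customized Line Plot')
ax.set_xlabel('Month')
ax.set_ylabel('Value')
ax.legend(loc='upper left')
ax.grid(True, alpha=0.3)  # light grid to help read values

plt.show()
```

Distinct line styles and markers keep the series distinguishable even when the plot is printed in grayscale.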

## 5. Mini-Project: Visualizing Real Data
1. Load a dataset (Iris, Titanic, or your own CSV).
2. Create at least three different plot types, chosen from line, bar, scatter, box, histogram, and pair plots.
3. Customize your plots for clarity.
4. Write a short summary of insights from your visualizations.
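
For step 1 with your own CSV, one possible starting point is `pandas.read_csv`. The sketch below reads a small in-memory CSV so that it runs without any file on disk. The column names, the values, and the file name `your_file.csv` in the comment are all hypothetical.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for a CSV file on disk; the columns and values are hypothetical
csv_text = """city,temperature_c
Lagos,31
Glasgow,12
Nairobi,24
Lagos,29
Glasgow,10
Nairobi,26
"""

# With a real file, pass its path instead, e.g. pd.read_csv('your_file.csv')
data = pd.read_csv(io.StringIO(csv_text))

# Mean temperature per city, ready for a bar plot
mean_temp = data.groupby('city')['temperature_c'].mean()

fig, ax = plt.subplots()
ax.bar(mean_temp.index, mean_temp.values)
ax.set_title('Mean Temperature by City')
ax.set_xlabel('City')
ax.set_ylabel('Temperature (°C)')
plt.show()
```

Aggregating with `groupby` before plotting keeps the chart to one bar per category.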

## Reflection & Next Steps
- Which plot type did you find most useful?
- Try visualizing a dataset from your own field or interest.
- Explore more in the Matplotlib and Seaborn documentation.

---

## References
- [Matplotlib Documentation](https://matplotlib.org/stable/contents.html)
- [Seaborn Documentation](https://seaborn.pydata.org/)