# Matplotlib: A Beginner's Guide to Data Visualization

Welcome to this introductory guide to Matplotlib, one of the most popular and versatile data visualization libraries in Python. This notebook is designed for B.S. and M.S. students who are new to data visualization and want to build a strong foundation in creating plots and charts.

**What is Matplotlib?**

Matplotlib is a low-level plotting library that provides a great deal of flexibility and control over your visualizations. It's the foundation upon which many other data visualization libraries, like Seaborn, are built. By understanding Matplotlib, you'll gain a deeper appreciation for how plotting works in Python.

**Why Learn Matplotlib?**

*   **Control:** Matplotlib gives you fine-grained control over every aspect of your plots, from colors and line styles to labels and annotations.
*   **Versatility:** You can create a wide variety of plots, including line plots, bar charts, scatter plots, histograms, and more.
*   **Foundation:** Understanding Matplotlib will make it easier to learn and use other data visualization libraries.
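
To make the "control" point concrete, here is a minimal sketch that sets the color, line style, line width, labels, and an annotation explicitly. The sine-wave data, the label text, and the annotation position are made up for illustration; only standard Matplotlib and NumPy calls are used.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: one cycle of a sine wave
x = np.linspace(0, 2 * np.pi, 100)
y = np.sin(x)

fig, ax = plt.subplots(figsize=(8, 5))

# Color, line style, and line width are all set explicitly
line, = ax.plot(x, y, color='tab:blue', linestyle='--', linewidth=2, label='sin(x)')

# Annotate the highest point with an arrow (text position chosen by hand)
peak_index = np.argmax(y)
ax.annotate('Peak',
            xy=(x[peak_index], y[peak_index]),
            xytext=(3, 0.8),
            arrowprops=dict(arrowstyle='->'))

ax.set_title('Customizing a Plot')
ax.set_xlabel('x')
ax.set_ylabel('sin(x)')
ax.legend()

plt.show()
```

This sketch uses the `fig, ax = plt.subplots()` style, which is an alternative to the `plt.` calls used in the rest of this notebook; both produce the same kind of figure.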

In this notebook, we'll explore some of the most common plots and learn how to create them using Matplotlib. We'll also discuss when to use each type of plot and how to interpret the results.

## 1. Line Plots: Visualizing Trends Over Time

**What is a Line Plot?**

A line plot is a simple yet powerful way to visualize data that changes over a continuous interval, such as time. It's ideal for showing trends, patterns, and relationships between two variables.

**When to Use a Line Plot:**

*   Tracking stock prices over a year
*   Monitoring temperature changes throughout a day
*   Visualizing website traffic over a month

Let's create a simple line plot to visualize the growth of a plant over several weeks.

In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Sample data: Plant growth over 10 weeks
weeks = np.arange(1, 11)
height = np.array([2, 4, 7, 11, 15, 19, 23, 26, 29, 31])

# Create the line plot
plt.figure(figsize=(8, 5))
plt.plot(weeks, height, marker='o', linestyle='-', color='g')

# Add titles and labels
plt.title('Plant Growth Over 10 Weeks')
plt.xlabel('Weeks')
plt.ylabel('Height (cm)')

# Add a grid for better readability
plt.grid(True)

# Show the plot
plt.show()
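
Line plots are also a natural way to compare several trends on the same axes. The sketch below assumes a second plant whose heights (`height_b`) are invented for illustration; calling `plt.plot` twice and adding a legend is one simple way to do this.

```python
import matplotlib.pyplot as plt
import numpy as np

weeks = np.arange(1, 11)
height_a = np.array([2, 4, 7, 11, 15, 19, 23, 26, 29, 31])  # same values as the example above
height_b = np.array([1, 3, 5, 8, 12, 17, 22, 27, 31, 34])   # hypothetical second plant

plt.figure(figsize=(8, 5))

# Each call to plt.plot adds one line; `label` is the text shown in the legend
plt.plot(weeks, height_a, marker='o', linestyle='-', color='g', label='Plant A')
plt.plot(weeks, height_b, marker='s', linestyle='--', color='b', label='Plant B')

plt.title('Growth of Two Plants Over 10 Weeks')
plt.xlabel('Weeks')
plt.ylabel('Height (cm)')
plt.grid(True)

# The legend maps each line style and color to its label
legend = plt.legend()

plt.show()
```

Different markers and line styles keep the two series distinguishable even when the plot is printed in grayscale.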

## 2. Bar Charts: Comparing Categorical Data

**What is a Bar Chart?**

A bar chart is used to compare the values of different categories. It represents each category as a bar whose height (or length, for horizontal bars) corresponds to the value.

**When to Use a Bar Chart:**

*   Comparing the sales of different products
*   Showing the number of students in different majors
*   Visualizing the population of different countries

Let's create a bar chart to compare the number of students enrolled in different courses.

In [None]:
# Sample data: Student enrollment in different courses
courses = ['Math', 'Science', 'English', 'History', 'Art']
students = [80, 95, 70, 55, 65]

# Create the bar chart
plt.figure(figsize=(8, 5))
plt.bar(courses, students, color=['blue', 'green', 'red', 'purple', 'orange'])

# Add titles and labels
plt.title('Student Enrollment in Different Courses')
plt.xlabel('Courses')
plt.ylabel('Number of Students')

# Show the plot
plt.show()
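
When category names are long, or when ranking matters more than the original order, a sorted horizontal bar chart can be easier to read. The following is a sketch of one possible variation on the example above, using `plt.barh` and writing each value next to its bar; the sorting step and the label offset of 1 are illustrative choices.

```python
import matplotlib.pyplot as plt

courses = ['Math', 'Science', 'English', 'History', 'Art']
students = [80, 95, 70, 55, 65]

# Sort the (course, count) pairs by enrollment, smallest first
pairs = sorted(zip(courses, students), key=lambda pair: pair[1])
sorted_courses = [course for course, _ in pairs]
sorted_students = [count for _, count in pairs]

plt.figure(figsize=(8, 5))
bars = plt.barh(sorted_courses, sorted_students, color='steelblue')

# Write the value just to the right of each bar
for bar, count in zip(bars, sorted_students):
    plt.text(count + 1, bar.get_y() + bar.get_height() / 2, str(count), va='center')

plt.title('Student Enrollment in Different Courses')
plt.xlabel('Number of Students')
plt.ylabel('Courses')

plt.show()
```

Because `barh` draws the first category at the bottom, sorting from smallest to largest places the course with the highest enrollment at the top.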

## 3. Scatter Plots: Exploring Relationships

**What is a Scatter Plot?**

A scatter plot is used to visualize the relationship between two numerical variables. Each point on the plot represents an observation from the dataset.

**When to Use a Scatter Plot:**

*   Investigating the correlation between advertising spending and sales
*   Exploring the relationship between a student's study hours and their exam scores
*   Analyzing the connection between a car's weight and its fuel efficiency

Let's create a scatter plot to examine the relationship between hours spent studying and exam scores.

In [None]:
# Sample data: Study hours and exam scores
study_hours = np.array([2, 3, 5, 6, 8, 9, 10, 12])
exam_scores = np.array([65, 70, 75, 80, 85, 88, 90, 92])

# Create the scatter plot
plt.figure(figsize=(8, 5))
plt.scatter(study_hours, exam_scores, color='purple', marker='*')

# Add titles and labels
plt.title('Relationship Between Study Hours and Exam Scores')
plt.xlabel('Study Hours')
plt.ylabel('Exam Scores')

# Add a grid
plt.grid(True)

# Show the plot
plt.show()
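
A scatter plot shows a relationship visually; a number can back up what the eye sees. The sketch below is an addition to the example above: it computes the Pearson correlation coefficient with `np.corrcoef` and overlays a straight-line fit from `np.polyfit`. A linear fit is an assumption made here for illustration, not a claim about how study time and scores are related in general.

```python
import matplotlib.pyplot as plt
import numpy as np

study_hours = np.array([2, 3, 5, 6, 8, 9, 10, 12])
exam_scores = np.array([65, 70, 75, 80, 85, 88, 90, 92])

# Pearson correlation coefficient: close to +1 means a strong positive linear relationship
r = np.corrcoef(study_hours, exam_scores)[0, 1]

# Least-squares straight line: exam_score is approximately slope * hours + intercept
slope, intercept = np.polyfit(study_hours, exam_scores, 1)

plt.figure(figsize=(8, 5))
plt.scatter(study_hours, exam_scores, color='purple', marker='*', label='Observations')
plt.plot(study_hours, slope * study_hours + intercept, color='gray', linestyle='--', label='Linear fit')

plt.title(f'Study Hours vs. Exam Scores (r = {r:.2f})')
plt.xlabel('Study Hours')
plt.ylabel('Exam Scores')
plt.grid(True)
plt.legend()

plt.show()
```

A high correlation describes how closely the points follow a line. It does not show that more study hours cause higher scores.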

## 4. Histograms: Understanding Distributions

**What is a Histogram?**

A histogram is used to visualize the distribution of a single numerical variable. It divides the data into a series of intervals (or "bins") and shows the number of observations that fall into each bin.

**When to Use a Histogram:**

*   Analyzing the distribution of student grades in a class
*   Visualizing the age distribution of a population
*   Examining the distribution of house prices in a city

Let's create a histogram to understand the distribution of exam scores in a class.

In [None]:
# Sample data: simulated exam scores of 100 students
# A fixed seed makes the random draw reproducible between runs
rng = np.random.default_rng(42)
exam_scores = rng.normal(loc=75, scale=10, size=100)

# Create the histogram
plt.figure(figsize=(8, 5))
plt.hist(exam_scores, bins=10, color='skyblue', edgecolor='black')

# Add titles and labels
plt.title('Distribution of Exam Scores')
plt.xlabel('Exam Scores')
plt.ylabel('Frequency')

# Show the plot
plt.show()
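
The number of bins changes how a histogram looks: too few bins hide detail, while too many make the shape noisy. The sketch below draws the same simulated scores with 5 and 20 bins side by side. The seed value and the two bin counts are arbitrary choices for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Simulated exam scores; the seed is arbitrary and only makes the result repeatable
rng = np.random.default_rng(0)
scores = rng.normal(loc=75, scale=10, size=100)

fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

for ax, n_bins in zip(axes, [5, 20]):
    # hist returns the count in each bin, the bin edges, and the drawn bars
    counts, bin_edges, _ = ax.hist(scores, bins=n_bins, color='skyblue', edgecolor='black')
    ax.set_title(f'{n_bins} bins')
    ax.set_xlabel('Exam Scores')
    # n bins always have n + 1 edges, and the counts add up to the sample size
    print(n_bins, 'bins:', len(bin_edges), 'edges,', int(counts.sum()), 'observations')

axes[0].set_ylabel('Frequency')
plt.tight_layout()

plt.show()
```

Sharing the y-axis (`sharey=True`) makes the bar heights of the two panels directly comparable.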

## Conclusion and Next Steps

Congratulations! You've learned how to create four of the most common plots using Matplotlib: line plots, bar charts, scatter plots, and histograms. These plots cover most everyday tasks, from showing trends and comparing categories to exploring relationships and distributions.

**Exercises:**

1.  **Create a line plot:** Visualize the temperature changes over a 24-hour period. You can create your own dummy data or find a small dataset online.
2.  **Create a bar chart:** Compare the populations of five different cities.
3.  **Create a scatter plot:** Explore the relationship between a person's age and their income.
4.  **Create a histogram:** Analyze the distribution of heights in a group of people.

Feel free to experiment with different datasets and customize your plots with different colors, styles, and labels. The more you practice, the more comfortable you'll become with Matplotlib.

In the next notebook, we'll explore **Seaborn**, a library built on top of Matplotlib that makes it even easier to create beautiful and informative statistical plots.