# Visualizing Data with Matplotlib

Visualizing data is a powerful way to uncover interesting patterns.

As we saw during the pandemic, understanding public health data is crucial for informed decision-making and scientific literacy. By visualizing COVID-19 data, we can gain insight into the dynamics of the pandemic.

## Getting Started: Importing Libraries and Loading Data

First, we'll import `matplotlib.pyplot` for plotting and `pandas` for data manipulation. Most Matplotlib functions live in the `pyplot` submodule, which is usually imported as `plt`. Then, we'll load our COVID-19 dataset.
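As a minimal sketch of this step, the block below imports both libraries and reads a small table with `pd.read_csv`. The dataset's file name is not given here, so the example reads from an in-memory string instead of a file; the column names (`date`, `new_cases`, `positivity_rate`) and all values are made up for illustration.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for a real CSV file: column names and values are invented
# for illustration only. With a file on disk you would pass its path
# to pd.read_csv instead of this in-memory text.
csv_text = io.StringIO(
    "date,new_cases,positivity_rate\n"
    "2020-04-01,120,0.18\n"
    "2020-04-02,135,0.20\n"
    "2020-04-03,150,0.22\n"
)

# parse_dates converts the "date" column from text to datetime values,
# which lets Matplotlib place the dates correctly on a time axis
covid = pd.read_csv(csv_text, parse_dates=["date"])

print(covid.head())   # first rows of the table
print(covid.dtypes)   # data type of each column
```

Checking `head()` and `dtypes` right after loading is a quick way to confirm that the columns you plan to plot were read as numbers and dates rather than text.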

## Line Plots: Tracking Trends Over Time

Line plots are ideal for visualizing how variables change sequentially, often over time. This helps us observe trends, peaks, and valleys in the data. Here, we'll plot the daily new COVID-19 cases and the test positivity rate.

### How a Plot is Constructed: A Step-by-Step Guide

When you create a plot using Matplotlib, you're essentially building an image layer by layer. Let's break down the fundamental components and the typical workflow:

**1. The Data**

The data typically comes in the form of lists, arrays, or columns of a pandas DataFrame.

**2. The Plotting Function: Drawing Your Data**

This is where you tell Matplotlib how to represent your data. You choose a specific function based on the type of relationship or distribution you want to show. Here we will create a line plot with the `plot()` function, which takes the x-values and the y-values as its first two arguments.
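A minimal sketch of steps 1 and 2 together, using plain lists as the data. The values are illustrative placeholders, not real case counts, and the axis labels and title are optional additions.

```python
import matplotlib.pyplot as plt

# Step 1: the data (illustrative values only, not real COVID-19 figures)
days = [1, 2, 3, 4, 5, 6, 7]
new_cases = [120, 135, 150, 170, 160, 180, 210]

# Step 2: plot(x, y) connects the (x, y) pairs with a line, in order.
# It returns a list of the line objects it created.
lines = plt.plot(days, new_cases)

# Axis labels and a title make the plot readable on its own
plt.xlabel("Day")
plt.ylabel("Daily new cases")
plt.title("Daily new cases (illustrative data)")

plt.show()
```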

**3. Customizations**

#### Legend

If you have multiple data series on one plot, `plt.legend()` uses the `label=` arguments from your plotting functions to create a key.
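For example, the sketch below labels two series and then calls `plt.legend()`. The series names and values are made up for illustration.

```python
import matplotlib.pyplot as plt

# Illustrative values only
days = [1, 2, 3, 4, 5]
new_cases = [120, 135, 150, 170, 160]
new_tests = [600, 640, 700, 720, 750]

# The label= text is what appears in the legend
plt.plot(days, new_cases, label="New cases")
plt.plot(days, new_tests, label="Tests performed")

# legend() collects the labels from the plotting calls above
legend = plt.legend()

plt.show()
```

A series plotted without `label=` is left out of the legend.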

#### Figure Size

Control the overall size of your plot window (the "Figure") using `plt.figure(figsize=(width, height))` before any plotting commands. The width and height are given in inches. This is useful for making plots larger for presentations or smaller for embedding.
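A short sketch of setting the figure size, with illustrative data:

```python
import matplotlib.pyplot as plt

# Create the Figure first: 10 inches wide, 4 inches tall
fig = plt.figure(figsize=(10, 4))

# Plotting commands that follow draw into that Figure
# (values are illustrative only)
plt.plot([1, 2, 3, 4, 5], [120, 135, 150, 170, 160])

plt.show()
```

A wide, short figure like this one often suits time series, since it gives the horizontal axis more room.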

#### Appearance

These options are passed as keyword arguments to `plot()`, after `x` and `y`:

- Colors: `color='red'`
- Line styles: `linestyle='--'` (dashed), `linestyle='-'` (solid)
- Markers: `marker='o'` (circles), `marker='s'` (squares)

A grid is added with a separate call rather than a `plot()` argument: `plt.grid(True)`
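The options above can be combined in a single call. A minimal sketch with illustrative data:

```python
import matplotlib.pyplot as plt

# Illustrative values only
days = [1, 2, 3, 4, 5]
new_cases = [120, 135, 150, 170, 160]

# A red dashed line with a circle marker at every data point
lines = plt.plot(days, new_cases, color="red", linestyle="--", marker="o")

# The grid is switched on separately from the plot() call
plt.grid(True)

plt.show()
```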


### Plotting Multiple Lines
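
Each call to `plot()` made before `plt.show()` draws onto the same axes, so several lines can share one plot. The sketch below loops over a dictionary of series; the region names and values are hypothetical and serve only to show the pattern.

```python
import matplotlib.pyplot as plt

days = [1, 2, 3, 4, 5]

# Hypothetical series, one per region; all values are made up
series = {
    "Region A": [120, 135, 150, 170, 160],
    "Region B": [80, 95, 90, 110, 130],
    "Region C": [40, 42, 55, 60, 58],
}

plt.figure(figsize=(8, 4))

# One plot() call per series; Matplotlib picks a new color each time
for name, values in series.items():
    plt.plot(days, values, marker="o", label=name)

plt.xlabel("Day")
plt.ylabel("Daily new cases")
plt.legend()
plt.grid(True)

ax = plt.gca()  # the axes that now hold all three lines
plt.show()
```

Lines plotted together share one y-axis, so this works best when the series are measured in the same units and fall in a similar range.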

## Scatter Plots: Exploring Relationships Between Variables

Scatter plots are used to visualize the relationship between two numerical variables. Each point represents an observation, and the pattern of points can suggest a correlation (positive, negative, or none). Here, let's look at the relationship between demographic variables and the COVID-19 positivity rate across Chicago ZIP codes during the early days of the pandemic.
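A minimal sketch using `plt.scatter`, with one point per ZIP code. The choice of median income as the demographic variable is an assumption made for this example, and every value is invented, so the pattern in the plot says nothing about the real data.

```python
import matplotlib.pyplot as plt

# Made-up values, one entry per hypothetical ZIP code
median_income = [32, 45, 51, 60, 75, 88, 102]            # thousands of dollars
positivity_rate = [0.31, 0.27, 0.24, 0.20, 0.16, 0.12, 0.09]

# scatter(x, y) draws one unconnected marker per (x, y) pair
points = plt.scatter(median_income, positivity_rate)

plt.xlabel("Median household income (thousands of dollars)")
plt.ylabel("Test positivity rate")
plt.title("Income vs. positivity rate (illustrative data)")

plt.show()
```

Unlike `plot()`, `scatter()` does not connect the points, which is what you want when the observations have no natural order.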

## Histograms: Understanding Data Distribution

Histograms are used to show the distribution of a single numerical variable. They divide the data into bins and count how many observations fall into each bin, helping us understand the frequency and spread of values. Let's examine the distribution of daily new COVID-19 cases.
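A minimal sketch using `plt.hist`. The values are illustrative, and the choice of 5 bins is arbitrary; try a few values to see how the shape changes.

```python
import matplotlib.pyplot as plt

# Illustrative daily counts, not real data
new_cases = [120, 135, 150, 170, 160, 180, 210, 95, 110, 140,
             155, 165, 175, 190, 200, 130, 145, 150, 160, 185]

# hist() splits the range of the data into equal-width bins and counts
# the observations in each; it returns the counts, the bin edges, and
# the drawn bars
counts, bin_edges, patches = plt.hist(new_cases, bins=5)

plt.xlabel("Daily new cases")
plt.ylabel("Number of days")
plt.title("Distribution of daily new cases (illustrative data)")

plt.show()
```

The returned `counts` always add up to the number of observations, and there is one more bin edge than there are bins.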

___
## 💪 **Exercise** 💪

Look at the datasets in the 'data' folder. Create a **line plot**, a **scatter plot**, and a **histogram** from these datasets. You do not have to use the same dataset for all three visualizations.
___