# Matplotlib
## Overview
### Introduction
Matplotlib is a Python library for visualizing data, from one-dimensional sequences such as Python lists to two-dimensional, tabular data such as the contents of CSV files or Excel sheets.

### What you'll learn
In this section, you'll learn


1.   The matplotlib object hierarchy and examples of matplotlib objects.
2.   How to create and manipulate a subplot.



### Prerequisites
Before starting this section, you should have an understanding of


1.   Python



### Setup
Run the code block below to set up your environment for this activity.

In [0]:
# RUN ME
!pip3 install matplotlib

## Getting Started with Matplotlib

From `matplotlib`, we can import `pyplot`. We can use `pyplot` to create or update a Figure or a set of Axes. Its functions act on one Figure and one Axes at a time: the "current" ones, which are usually the most recently created.
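
To illustrate how `pyplot` tracks which Figure and Axes its functions act on, here is a minimal sketch. The names `fig_a`, `fig_b`, and the sample numbers are made up for illustration; `plt.gcf()` and `plt.gca()` return the current Figure and the current Axes.

```python
from matplotlib import pyplot as plt

# Create two Figures. pyplot treats the most recently created one as "current".
fig_a = plt.figure()
fig_b = plt.figure()
current_is_b = plt.gcf() is fig_b

# plt.plot draws on the current Axes, creating one on the current Figure if needed.
plt.plot([0, 1, 2], [3, 1, 2])  # made-up sample data
axes_on_a = len(fig_a.axes)     # fig_a was not touched
axes_on_b = len(fig_b.axes)     # the new Axes landed on fig_b

# Passing an existing Figure's number to plt.figure makes that Figure current again.
plt.figure(fig_a.number)
current_is_a = plt.gcf() is fig_a

print(current_is_b, axes_on_a, axes_on_b, current_is_a)

# Release the Figures so they do not linger in memory.
plt.close("all")
```

Every `plt.` call in the rest of this section relies on this notion of a current Figure and Axes.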

In [0]:
# RUN ME
from matplotlib import pyplot as plt

### The Matplotlib Object Hierarchy
A matplotlib plot contains a series of matplotlib objects. This is shown below.

![Chart](https://files.realpython.com/media/fig_map.bc8c7cabd823.png)

The outermost container is the Figure, and the Axes are the second outermost. The name is misleading: in matplotlib, an Axes is not a pair of x- and y-axes but an individual plot or graph, so a Figure can contain multiple Axes. The hierarchy continues with smaller objects, such as individual lines, text boxes, and legends. Below is another image showing more of the smaller matplotlib objects.

![Chart: anatomy of a figure](https://files.realpython.com/media/anatomy.7d033ebbfbc8.png)
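
The hierarchy can also be explored in code. The sketch below is only an illustration, with made-up variable names and data: it builds one Figure holding two Axes, draws a line on one of them, and then checks how the objects refer to each other.

```python
from matplotlib import pyplot as plt

# One Figure containing two Axes, arranged in one row and two columns
fig, (left_ax, right_ax) = plt.subplots(1, 2)

# Axes.plot returns a list of line objects; we drew one line, so unpack it.
(line,) = left_ax.plot([0, 1, 2], [53, 52, 50])  # made-up sample data
left_ax.set_title("Left plot")

figure_type = type(fig).__name__             # the outermost container
axes_count = len(fig.axes)                   # the Figure knows its Axes
axes_parent_is_fig = left_ax.figure is fig   # each Axes knows its Figure
line_in_axes = line in left_ax.lines         # the line lives inside the Axes
title_text = left_ax.title.get_text()        # the title is a text object on the Axes

print(figure_type, axes_count, axes_parent_is_fig, line_in_axes, title_text)

plt.close(fig)
```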

### Creating our first subplot
Now let's apply what we've been learning! Calling pyplot's `subplots` function creates a Figure containing one Axes. The cells below use functions such as `plt.plot` instead, which create a Figure and an Axes implicitly when none exists yet.
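
As a minimal sketch of the `subplots` call mentioned above, the block below creates a Figure with a single Axes and draws on that Axes directly. The names `sample_hours` and `sample_temps` and their values are placeholders, not part of the activity's dataset.

```python
from matplotlib import pyplot as plt

# With no arguments, plt.subplots returns a Figure and a single Axes.
fig, ax = plt.subplots()

# Placeholder data for illustration only
sample_hours = [0, 1, 2, 3]
sample_temps = [53, 52, 52, 50]

# Axes methods mirror the pyplot functions used in the rest of this section.
ax.plot(sample_hours, sample_temps)
ax.set_xlabel("Hour")
ax.set_ylabel("Temperature (F)")
```

Holding on to `fig` and `ax` makes it explicit which plot each call modifies, which helps once a Figure contains more than one Axes.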



In [0]:
# RUN ME
hourly_temperature_dict = {
    0: 53,
    1: 52,
    2: 52,
    3: 50,
    4: 49,
    5: 50,
    6: 51,
    7: 53,
    8: 57,
    9: 59,
    10: 62,
    11: 66,
    12: 68,
    13: 70,
    14: 71,
    15: 72,
    16: 71,
    17: 70
}

hours = list(hourly_temperature_dict.keys())
temperatures = list(hourly_temperature_dict.values())

In [0]:
# Run this to see how the data is represented right now
print(hours)
print(temperatures)

In [0]:
# Set the style of the plot. 
# There's a full list here - https://tonysyu.github.io/raw_content/matplotlib-style-gallery/gallery.html
# Take a look and see if there's a style you like!

plt.style.use('fivethirtyeight')

In [0]:
# Plot the temperature data
plt.plot(hours, temperatures)

We have a graph! However, there's definitely room for improvement here. Let's start by adding a title and labeling the axes so that our audience can interpret our data.

In [0]:
# Set the title of the graph
plt.title("Fake Temperature Data, 00:00 - 17:00")

# Label the x-axis
plt.xlabel("Hour")

# Label the y-axis
plt.ylabel("Temperature (F)")

# Plot the temperature data
plt.plot(hours, temperatures)

The current hour increments are a bit awkward. Let's improve our graph so that each tick on the x-axis represents one hour.

In [0]:
# Set the title of the graph
plt.title("Fake Temperature Data, 00:00 - 17:00")

# Label the x-axis
plt.xlabel("Hour")

# Label the y-axis
plt.ylabel("Temperature (F)")

# Make each hour a tick on the x-axis
plt.xticks(hours)

# Plot the temperature data
plt.plot(hours, temperatures)

Now, let's get rid of that unnecessary whitespace before x=0 and after x=17.

In [0]:
# Set the title of the graph
plt.title("Fake Temperature Data, 00:00 - 17:00")

# Label the x-axis
plt.xlabel("Hour")

# Label the y-axis
plt.ylabel("Temperature (F)")

# Make each hour a tick on the x-axis
plt.xticks(hours)

# Adjust the graph's limits. 
# X-axis: The lower limit is 0, and the upper limit is 17
# Y-axis: The lower limit is the smallest value in temperatures, the upper limit is the largest value in temperatures
plt.axis([0, 17, min(temperatures), max(temperatures)])

# Plot the temperature data
plt.plot(hours, temperatures)

That's better, but now it looks like the min and max values are cut off. Let's make a slight adjustment by decreasing the lower limit by 1 and increasing the upper limit by 1.

In [0]:
# RUN ME

# Set the title of the graph
plt.title("Fake Temperature Data, 00:00 - 17:00")

# Label the x-axis
plt.xlabel("Hour")

# Label the y-axis
plt.ylabel("Temperature (F)")

# Make each hour a tick on the x-axis
plt.xticks(hours)

# Adjust the graph's limits.
# X-axis: The lower limit is 0, and the upper limit is 17
# Y-axis: The lower limit is one below the smallest value in temperatures, the upper limit is one above the largest value
plt.axis([0, 17, min(temperatures) - 1, max(temperatures) + 1])

# Plot the temperature data
plt.plot(hours, temperatures)

Let's say the temperatures we've been using were recorded on a given day in 2018. We are now given the temperatures for the same day in 2019:

In [0]:
# RUN ME

# Rename the variables for clarity/consistency
hours2018 = hours
temperatures2018 = temperatures

hourly_temperature_dict2019 = {
    0: 60,
    1: 59,
    2: 58,
    3: 58,
    4: 57,
    5: 57,
    6: 59,
    7: 61,
    8: 64,
    9: 68,
    10: 69,
    11: 72,
    12: 75,
    13: 79,
    14: 82,
    15: 83,
    16: 80,
    17: 79
}

hours2019 = list(hourly_temperature_dict2019.keys())
temperatures2019 = list(hourly_temperature_dict2019.values())

Let's add this new data to our graph as a separate line!

In [0]:
# Set the title of the graph
plt.title("Fake Temperature Data, 00:00 - 17:00")

# Label the x-axis
plt.xlabel("Hour")

# Label the y-axis
plt.ylabel("Temperature (F)")

# Make each hour a tick on the x-axis
plt.xticks(hours)

# Adjust the graph's limits.
# X-axis: The lower limit is 0, and the upper limit is 17
# Y-axis: The lower limit is one below the smallest value in temperatures, the upper limit is one above the largest value
plt.axis([0, 17, min(temperatures) - 1, max(temperatures) + 1])

# Plot the 2018 data
plt.plot(hours2018, temperatures2018)

# Plot the 2019 data
plt.plot(hours2019, temperatures2019)

We now have two issues with this plot. The first issue is that the 2019 data doesn't fit inside the current limits. To fix this, we have to improve our logic for determining the graph's limits.

Since we now have two temperature datasets, we need to calculate the minimum and maximum temperatures across the two datasets - it's possible that the minimum temperature is in a different dataset than the maximum temperature.
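
A small worked example may make this concrete. The two short lists below are made up for illustration: the lowest value sits in one list and the highest in the other, so taking the minimum and maximum of a single list would give the wrong limits.

```python
# Made-up sample values for illustration only
temps_a = [53, 49, 72]
temps_b = [60, 57, 83]

# For lists, + concatenates; it does not add element by element.
combined = temps_a + temps_b

lowest = min(combined)    # comes from temps_a
highest = max(combined)   # comes from temps_b

print(lowest, highest)  # 49 83
```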

In [0]:
# Set the title of the graph
plt.title("Fake Temperature Data, 00:00 - 17:00")

# Label the x-axis
plt.xlabel("Hour")

# Label the y-axis
plt.ylabel("Temperature (F)")

# Make each hour a tick on the x-axis
plt.xticks(hours)

# Adjust the graph's limits.
# X-axis: The lower limit is 0, and the upper limit is 17
# Y-axis: The lower limit is one below the smallest value across both datasets, the upper limit is one above the largest value
ymin = min(temperatures2018 + temperatures2019) - 1
ymax = max(temperatures2018 + temperatures2019) + 1

plt.axis([0, 17, ymin, ymax])

# Plot the 2018 data
plt.plot(hours2018, temperatures2018)

# Plot the 2019 data
plt.plot(hours2019, temperatures2019)

The second issue is that now our audience doesn't know what each line represents. Let's fix that by adding a legend.

In [0]:
# Set the title of the graph
plt.title("Fake Temperature Data, 00:00 - 17:00")

# Label the x-axis
plt.xlabel("Hour")

# Label the y-axis
plt.ylabel("Temperature (F)")

# Make each hour a tick on the x-axis
plt.xticks(hours)

# Adjust the graph's limits.
# X-axis: The lower limit is 0, and the upper limit is 17
# Y-axis: The lower limit is one below the smallest value across both datasets, the upper limit is one above the largest value
ymin = min(temperatures2018 + temperatures2019) - 1
ymax = max(temperatures2018 + temperatures2019) + 1

plt.axis([0, 17, ymin, ymax])

# Plot the 2018 data
plt.plot(hours2018, temperatures2018, label="2018-10-02")  # Note the new label

# Plot the 2019 data
plt.plot(hours2019, temperatures2019, label="2019-10-02")  # Note the new label

# Have matplotlib generate our legend
plt.legend()

Now, let's do one last thing. Let's say we want a line representing the average temperature for each hour across the two datasets. First, let's write a function that takes two lists of equal length and returns a new list containing the average of the two values at each index.

In [0]:
def average_list(list1: list, list2: list) -> list:
    avg_list = []
    
    for i in range(len(list1)):
        index_avg = (list1[i] + list2[i]) / 2
        avg_list.append(index_avg)
        
    return avg_list

# The hours for the average dataset are the same as hours2018 and hours2019, so we can just set hours_avg equal to either.
hours_avg = hours2018

# Find the average temperatures by hour
temperatures_avg = average_list(temperatures2018, temperatures2019)

print(temperatures_avg)
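
The same averaging can be written more compactly. The variant below is only a sketch, not a replacement for the function above: the name `average_list_zip` is made up, it pairs values with the built-in `zip`, and it adds a length check that the original does not perform.

```python
def average_list_zip(list1: list, list2: list) -> list:
    """Return the element-wise average of two equal-length lists."""
    # Added check (an assumption, not in the original function):
    # zip would otherwise silently stop at the shorter list.
    if len(list1) != len(list2):
        raise ValueError("Both lists must have the same length")

    # zip pairs the values at each index: (list1[0], list2[0]), (list1[1], list2[1]), ...
    return [(a + b) / 2 for a, b in zip(list1, list2)]


print(average_list_zip([53, 52], [60, 59]))  # [56.5, 55.5]
```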

Now that we have the average data, it's time to plot it. To emphasize that it's the average of the two datasets, let's do two things:

1. Make the line color a combination of the two line colors (this activity uses the fivethirtyeight style by default, which has blue and red lines as the first two colors).
2. Make the line a dotted line rather than a solid line.

In [0]:
# Set the title of the graph
plt.title("Fake Temperature Data, 00:00 - 17:00")

# Label the x-axis
plt.xlabel("Hour")

# Label the y-axis
plt.ylabel("Temperature (F)")

# Make each hour a tick on the x-axis
plt.xticks(hours)

# Adjust the graph's limits.
# X-axis: The lower limit is 0, and the upper limit is 17
# Y-axis: The lower limit is one below the smallest value across both datasets, the upper limit is one above the largest value
ymin = min(temperatures2018 + temperatures2019) - 1
ymax = max(temperatures2018 + temperatures2019) + 1

plt.axis([0, 17, ymin, ymax])

# Plot the 2018 data
plt.plot(hours2018, temperatures2018, label="2018-10-02")

# Plot the 2019 data
plt.plot(hours2019, temperatures2019, label="2019-10-02")

# Plot the average data. 
#
# To find a color, refer to
#     1. https://matplotlib.org/2.0.2/api/colors_api.html for the default matplotlib colors
#     2. https://www.google.com/search?q=color+picker for a hexadecimal color picker
#
# To find a linestyle, refer to https://matplotlib.org/3.1.0/gallery/lines_bars_and_markers/linestyles.html

plt.plot(hours_avg, temperatures_avg, label="average", color="#ff00ff", linestyle='dotted')  # Note the new color and linestyle

# Have matplotlib generate our legend
plt.legend()