# Module 4.1 - Plotting Basics

Matplotlib is a Python 2D plotting library which can produce publication quality figures.

You can generate plots, histograms, power spectra, bar charts, error charts, scatter plots, etc., with just a few lines of code.

See the example gallery for more details: https://matplotlib.org/stable/gallery/index.html

-----------------------------------------------------------------------

## Creating a basic plot

- Use the following line to display plots inside the Jupyter notebook: `%matplotlib inline`
- After this, we need to import the `pyplot` module from matplotlib:
```python
  %matplotlib inline
  import matplotlib.pyplot as plt
```
- First, we create a new figure and new axes by using the code: `fig, ax = plt.subplots()`
- The figure is the whole image (the canvas), while the axes is the area inside the figure where the data is plotted.
- We can now plot a line by passing an array of x-coordinates and an array of y-coordinates to the `.plot()` function of `ax`, like this: `ax.plot([1, 2, 3, 4], [2, 4, 6, 12])`

In [None]:
# The following line allows the plots to be displayed as part of the Jupyter notebook
%matplotlib inline

# Import matplotlib's pyplot module to make simple plots
import matplotlib.pyplot as plt # Rename the module to plt to avoid typing the long name

fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [2, 4, 6, 12]) 

## Saving a plot

Once you have created a figure (using the `fig, ax = plt.subplots()` function) and plotted some data (using the `ax.plot(SOME_DATA)` function), you can save your figure to a file. 

There are multiple filetypes you can save your plot in. Here are three common ones:
- Saving your figure as .png: `fig.savefig('basic_plot.png')`
- Saving your figure as .jpg: `fig.savefig('basic_plot.jpg')`
- Saving your figure as .pdf: `fig.savefig('basic_plot.pdf')` (Use this one for publication.)

Run the three cells below and try to find the created files in your file explorer.
![image.png](attachment:image.png)

In [None]:
# fig.savefig() writes the figure to a file; a plain filename is saved in the current working folder (normally the folder of the notebook)
fig.savefig('basic_plot.png') # Save figure as .png!

In [None]:
# Now let's save our plot as a .jpg file
fig.savefig('basic_plot.jpg') # Save figure as .jpg!

In [None]:
# And as .pdf (the one most useful for publication!)
fig.savefig('basic_plot.pdf') # Save figure as .pdf!
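
If the files are hard to find in the file explorer, a short check like the sketch below can confirm that they were written and where. It assumes a temporary output folder (`out_dir`, a name made up for this example) so that nothing in your working folder is overwritten; replacing it with `'.'` would use the current folder instead.

```python
import os
import tempfile
import matplotlib.pyplot as plt

# Recreate the basic plot from above
fig, ax = plt.subplots()
ax.plot([1, 2, 3, 4], [2, 4, 6, 12])

# Hypothetical output folder for this sketch (not part of the original notebook)
out_dir = tempfile.mkdtemp()

saved_files = []
for extension in ['png', 'jpg', 'pdf']:
    path = os.path.join(out_dir, 'basic_plot.' + extension)
    fig.savefig(path)  # the file type follows from the extension
    saved_files.append(path)
    print(path, 'exists:', os.path.exists(path))
```

The printed paths show the full location of each file, which is usually enough to locate them in the file explorer.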

## Plotting data

When plotting data using the `ax.plot()` function, you must give data as input to the function. There are two options:
1. You input only one data series, then this is considered as data for the y-axis and the data for the x-axis will be by default `[0, 1, 2, 3, 4, 5, ...]`, with the same length as the y-axis data. 
    - Example: `ax.plot([1, 1, 2, 3, 5, 8, 13])`


2. You input two data series, one for the x-axis and one for the y-axis. 

    - Example: `ax.plot([0, 2, 4, 6, 8, 10], [1, 2, 4, 8, 16, 32])`
    
In the cells below the examples are shown.

In [None]:
# Example with only y-axis data
fig1, ax = plt.subplots()
ax.plot([1, 1, 2, 3, 5, 8, 13])

In [None]:
# Example with both x-axis and y-axis data
fig2, ax = plt.subplots()
ax.plot([0, 2, 4, 6, 8, 10], [1, 2, 4, 8, 16, 32])
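
The default x-values from option 1 can be checked by reading the data back from the plotted line. This is a small sketch; the variable names `lines`, `line`, `x_used` and `y_used` are chosen for this example only.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# ax.plot() returns a list with one line object per plotted line
lines = ax.plot([1, 1, 2, 3, 5, 8, 13])  # only y-axis data
line = lines[0]

# Read back the coordinates matplotlib actually uses for this line
x_used = list(line.get_xdata())
y_used = list(line.get_ydata())

print('x:', x_used)
print('y:', y_used)
```

The x-values count up from 0 and there are exactly as many of them as there are y-values.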

## Plotting data - Exercise
1. Create two lists with random numbers, name them `x` and `y`, and make sure they have the same length! 
2. Create a new fig and ax using the function `plt.subplots()` and plot only `y` using the function `ax.plot()`
3. Create a new fig and ax using the function `plt.subplots()` and plot both `x` and `y` using the function `ax.plot()`
    - When supplying both x and y data, always first input the x-axis data and then the y-axis data, `ax.plot(x, y)` 
    
    
4. Save the last plot as a PDF file, using the name `'my_plot.pdf'`

In [None]:
# 1. Create two lists with random numbers and name them x and y
x = [1000, 1337, 1411, 1448, 1485, 1559, 1600, 1596, 1633, 1707, 1744, 1781, 2000]
y = [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233]

In [None]:
# 2. Create a new fig and ax using the function plt.subplots() and plot only y using the function ax.plot()
fig, ax = plt.subplots()
ax.plot(y)

In [None]:
# 3. Create a new fig and ax using the function plt.subplots() and plot both x and y using the function ax.plot()
fig, ax = plt.subplots()
ax.plot(x, y)

In [None]:
# 4. Save the last plot as a PDF file, using the name 'my_plot.pdf'
fig.savefig('my_plot.pdf')

## Titles, labels and plotting multiple lines

- There are many ways to change and upgrade your plot
- The axes has many functions that start with `ax.set_`, and each of them sets one property of your plot (type `ax.set_` and press **tab** to see the options)
- Here are some examples:
    - Give the x-axis a label: `ax.set_xlabel('Hours')`
    - Give the y-axis a label: `ax.set_ylabel('Temperature')`
    - Give the plot a title: `ax.set_title('Changing temperature over time')`
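
If tab completion is not available, the same overview of `ax.set_` functions can be produced with plain Python. The sketch below uses the built-in `dir()` function; the name `setters` is made up for this example.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Collect the names of all attributes of ax that start with 'set_'
setters = [name for name in dir(ax) if name.startswith('set_')]

print(len(setters), 'names start with set_')
print(setters[:10])  # show only the first ten
```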


In [None]:
# Create a figure and ax
fig, ax = plt.subplots()

# Plot a line
import numpy as np
x = range(25)
y = np.random.randint(-10, 45, 25)
ax.plot(x, y) 

# Add Label for the x-axis 
ax.set_xlabel('Hours') 

# Add Label for the y-axis 
ax.set_ylabel('Temperature') 

# Add Title to the plot 
ax.set_title('Changing temperature over time') 

### Plotting multiple lines

It is possible to plot multiple lines in the same plot. To distinguish the lines you should give every line a label. Matplotlib automatically gives every line a different color. It is also possible to add a legend to your plot.  
- To plot an extra line you just repeat again the plotting function: `ax.plot()`
- To give the line a label, add the `label=` argument to the plot function and provide a name: `ax.plot(x1, y1, label='Data 1')`
- You can add a legend to the plot by using the function `ax.legend()` (It will only contain the lines that have a label.)

Here are two examples:

In [None]:
# Example 1

# Creating fig and ax
fig_example1, ax = plt.subplots()

# Creating x and y
x = np.array(range(100))
y = np.array(range(100))

# Plot variations of x and y
ax.plot(x, y, label='Original data')
ax.plot(x, y*2, label='Data doubled')
ax.plot(x, y*3, label='Data tripled')
ax.plot(x, y*4, label='Data quadrupled')

# Add a legend
ax.legend()

In [None]:
# Example 2
fig_example2, ax = plt.subplots()

# First line
x1 = np.array(range(10))
y1 = [0] * 10
ax.plot(x1, y1, label='Baseline')

# Second line
x2 = np.array(range(10))
y2 = np.random.randint(-2, 3, 10)
ax.plot(x2, y2, label='Random line')

# Third line
x3 = np.random.randint(0, 10, 10)
y3 = np.random.uniform(-1.0, 1.0, 10)
ax.plot(x3, y3, label='Chaos line')

# Adding title, axis labels and a legend to the plot
ax.set_title('Three lines')
ax.set_xlabel('x-axis')
ax.set_ylabel('y-axis')
ax.legend()
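
The remark that the legend only contains lines with a label can be checked with a small sketch. The data and the name `legend_labels` are made up for this example.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

ax.plot([0, 1, 2], [0, 1, 2], label='Data 1')
ax.plot([0, 1, 2], [0, 2, 4])                  # no label given
ax.plot([0, 1, 2], [0, 3, 6], label='Data 3')

# ax.legend() returns the legend object, so we can read its entries
legend = ax.legend()
legend_labels = [text.get_text() for text in legend.get_texts()]

print('Lines in the plot:', len(ax.get_lines()))
print('Entries in the legend:', legend_labels)
```

Three lines are drawn, but the line without a `label=` argument is left out of the legend.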

## Plotting multiple lines - Exercise

Given are the lists `time`, `temp_max`, and `temp_min`. Complete the following exercises:
1. Create a fig and ax 
2. Plot `time` vs. `temp_max` and give the line a label (use the argument `label=`)
3. Now on the same axes also plot `time` vs. `temp_min` and again provide a label
4. Give the plot a title
5. Set the labels of the x-axis and y-axis
6. Add a legend to the plot

In [None]:
# Use these lists to create your plot
time = range(12)
temp_max = [18.3, 22.2, 26.7, 34.0, 37.1, 39.9, 41.0, 38.4, 36.0, 33.9, 31.5, 28.7]
temp_min = [14.1, 18.9, 23.2, 29.0, 31.6, 34.6, 34.0, 33.3, 31.9, 28.8, 27.1, 23.6]

In [None]:
# Create your plot here

#1
fig, ax = plt.subplots()

#2
ax.plot(time, temp_max, label='Maximum temperature')

#3
ax.plot(time, temp_min, label='Minimum temperature')

#4
ax.set_title('Maximum and minimum temperature')

#5
ax.set_xlabel('Time (hours)')
ax.set_ylabel('Temperature (degrees Celsius)')

#6
ax.legend()

## Changing line visuals

As we have seen before, a function can take different arguments. This is a way to influence what the function does. The function for plotting a line (`ax.plot()`) also can take many arguments. In the previous part we saw the argument `label=`. Now we will look at arguments to change how the line looks. 

We can change many settings of a line:
1. The color of a line using the arguments `color=` or `c=`
    - A few of the basic colors:
    ![image-5.png](attachment:image-5.png)
    - For more color options, go to this website and scroll down, https://matplotlib.org/stable/gallery/color/named_colors.html
    
    
    
2. The style of a line using the arguments `linestyle=` or `ls=`
    - Some basic linestyles:
    ![image-7.png](attachment:image-7.png)
    - Instead of the names you can also use the following symbols as a string: `-`, `:`, `--`, `-.`
    
    

3. The width of a line using the arguments `linewidth=` or `lw=`
    - Provide a number to set the width (the default linewidth is 1.5)
    
    
    
4. Adding markers and setting marker shape using the argument `marker=`
    - Every datapoint gets its own marker. Some of the options are:
    ![image-11.png](attachment:image-11.png)

Let's go over some examples to see the different options. Read through the code carefully and **try to predict** what type of lines you will see when you run the code. 

In [None]:
# Start with creating fig and ax
fig, ax = plt.subplots()

# The first line. What will be the color and the linestyle?
x1 = np.array(range(14)) / 2
y1 = np.cos(x1)
ax.plot(x1, y1, label='Cosine', color='b', linestyle=':')

# The second line. What will be the color and the linestyle?
y2 = np.sin(x1)
ax.plot(x1, y2, label='Sine', c='y', ls='--')

# Adding title and legend
ax.set_title('Cosine and Sine')
ax.legend()

In [None]:
# Again start with creating fig and ax
fig, ax = plt.subplots()

# The first line. Markers? Marker type? Linewidth? 
x1 = np.array(range(-10, 11)) / 2
y1 = x1**2
ax.plot(x1, y1, label='Parabola 1', marker='v', linewidth=1)

# The second line. Markers? Marker type? Linewidth? 
x1 = np.array(range(-10, 11)) / 2
y1 = x1**2 + 2*x1 - 5
ax.plot(x1, y1, label='Parabola 2', marker='*', lw=2)

# The third line. Markers? Marker type? Linewidth? 
x1 = np.array(range(-10, 11)) / 2
y1 = -x1**2 + 2*x1 + 10
ax.plot(x1, y1, label='Parabola 3', lw=10)

# Adding title and legend
ax.set_title('Some parabolas')
ax.legend()
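
The long and short argument names (`color=`/`c=`, `linestyle=`/`ls=`, `linewidth=`/`lw=`) are interchangeable. As a small sketch with made-up data, the settings can be read back from the line objects to compare them:

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = [0, 1, 2, 3]

# ax.plot() returns a list of lines; the "(name,) =" form unpacks the single line
(line_long,) = ax.plot(x, [0, 1, 4, 9], color='g', linestyle=':', linewidth=3, marker='o')
(line_short,) = ax.plot(x, [0, 2, 8, 18], c='g', ls=':', lw=3, marker='o')

for line in (line_long, line_short):
    print(line.get_color(), line.get_linestyle(), line.get_linewidth(), line.get_marker())
```

Both lines report the same color, linestyle, linewidth and marker.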

## Changing line visuals - Exercise

In the cell below there are 4 lines plotted. Change the code of the plot to complete the following exercises:
1. Change the colors of lines 1, 2, 3, and 4 to blue, green, red, and yellow, respectively
2. Change the linestyles of line 2 and 4 to dotted and dashed, respectively
3. Add star shaped markers to line 4
4. Set the linewidth of lines 1, 2, and 3 to a width of 5, 10 and 15, respectively

In [None]:
fig, ax = plt.subplots()

x = range(0, 40, 2)
y1 = np.array(range(0, 20, 1))**2
y2 = np.array(range(20, 0, -1))**2
y3 = np.array(range(0, 20, 1))**2-100
y4 = range(0, 160, 8)

# Line 1
ax.plot(x, y1, label='Line 1', c='b', lw=5)

# Line 2
ax.plot(x, y2, label='Line 2', c='g', ls=':', lw=10)

# Line 3
ax.plot(x, y3, label='Line 3', c='r', lw=15)

# Line 4
ax.plot(x, y4, label='Line 4', c='y', ls='dashed', marker='*')

ax.legend()

## Limits and ticks 

Matplotlib automatically sets the limits of the x-axis and y-axis and the ticks and tick labels. When you create a plot these settings will be determined by matplotlib:
```python
ax.plot([1, 2, 4, 8, 16, 32, 64, 128])
```
![image.png](attachment:image.png)

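Matplotlib chose automatically:
- x limit runs from about 0 up to about 7 (with a small margin on both sides)
- y limit runs from about 0 up to about 130 (with a small margin on both sides)
- x ticks are [0, 1, 2, 3, 4, 5, 6, 7]
- y ticks are [0, 20, 40, 60, 80, 100, 120]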


However, in some cases these automatic limits, ticks and tick labels are not useful. Therefore we can change these.
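
The automatic choices can also be read back from the axes instead of from the picture. This is a minimal sketch using `ax.get_xlim()`, `ax.get_ylim()`, `ax.get_xticks()` and `ax.get_yticks()`; the exact numbers can differ slightly with the matplotlib version and style, and the tick lists may include a position just outside the visible limits.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([1, 2, 4, 8, 16, 32, 64, 128])

# Limits chosen by matplotlib (a little wider than the data itself)
x_limits = ax.get_xlim()
y_limits = ax.get_ylim()
print('x limit:', x_limits)
print('y limit:', y_limits)

# Tick positions chosen by matplotlib
print('x ticks:', list(ax.get_xticks()))
print('y ticks:', list(ax.get_yticks()))
```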

### Changing x limit and y limit

We can set the x limit and y limit ourselves: 

- This can be done using the functions `ax.set_xlim()` and `ax.set_ylim()`
- These functions take two inputs: the value where the axis has to start and the value where it has to stop (left and right for the x-axis, bottom and top for the y-axis)


Let's say we have some scores over time. The maximum score that can be achieved is 50 and the maximum time is 20 minutes. However, when we plot the first set of scores, the automatic limits only cover roughly 0-15 for the x-axis and 0-26 for the y-axis. 

1. Run the code to see how matplotlib sets the limits
2. Now remove the `#` in front of the `ax.set_xlim()` and `ax.set_ylim()` functions. How does the plot change?

In [None]:
time = range(15)
score = np.array([0, 13, 19, 15, 20, 19, 14, 25, 16, 12, 12, 26, 16, 21, 13])

fig, ax = plt.subplots()
ax.plot(time, score)

ax.set_xlabel('Minutes')
ax.set_ylabel('Score')

# Remove the '#' from the two lines below to set the xlim and ylim by ourselves
ax.set_xlim(0, 20)
ax.set_ylim(0, 50)

### Changing ticks and tick labels

You can also set the ticks and tick labels by yourself. 
- Ticks can be set by using the functions `ax.set_yticks()` and `ax.set_xticks()`
- The labels can be set by using the functions `ax.set_yticklabels()` and `ax.set_xticklabels()`

Let's say that for the x-axis we don't want the default ticks, but we want a tick every 2 minutes. We can change this using the function `ax.set_xticks()` with `range(0, 22, 2)` as input. 

For the y-axis, instead of showing the score, we want to show the percentages, with a score of 50 being 100%. We can change these ticks and tick labels by first setting the yticks ourselves (`ax.set_yticks([10, 20, 30, 40, 50])`). Then we can use the function `ax.set_yticklabels()` with a list of labels as input. Let's use the labels `['20%', '40%', '60%', '80%', '100%']`.

In the cell below the code from the previous cell is copied and some functions added.  
1. Remove the `#` in front of the function that sets the x ticks and run the cell
2. Remove the `#` in front of the functions that set the y ticks and y labels and run the cell

In [None]:
time = range(15)
score = np.array([0, 13, 19, 15, 20, 19, 14, 25, 16, 12, 12, 26, 16, 21, 13])

fig, ax = plt.subplots()
ax.plot(time, score)

ax.set_xlabel('Minutes')
ax.set_ylabel('Score')

ax.set_xlim(0, 20)
ax.set_ylim(0, 50)

# Remove the '#' in front of the function, run the cell and see what changes
ax.set_xticks(range(0, 22, 2))

# Now also remove the '#' in front of these two functions and see what changes
ax.set_yticks([10, 20, 30, 40, 50])
ax.set_yticklabels(['20%', '40%', '60%', '80%', '100%'])
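
Instead of typing the percentage labels by hand, they can also be calculated from the tick values. The sketch below assumes a maximum score of 50, as in the text above; the names `max_score`, `y_ticks` and `y_labels` are made up for this example.

```python
import matplotlib.pyplot as plt

max_score = 50
y_ticks = [10, 20, 30, 40, 50]

# Convert every tick value into a percentage of the maximum score
y_labels = [str(tick * 100 // max_score) + '%' for tick in y_ticks]
print(y_labels)

fig, ax = plt.subplots()
ax.set_ylim(0, max_score)
ax.set_yticks(y_ticks)          # first set the tick positions ...
ax.set_yticklabels(y_labels)    # ... then the text shown at each position
```

Setting the ticks before the tick labels keeps the number of labels equal to the number of ticks.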

## Secondary axis

Sometimes you want to plot two data series in the same figure, but they have very different scales. 

Let's look, for example, at the plot below.

In [None]:
time = range(12)
temperature = [18.3, 22.2, 26.7, 34.0, 37.1, 39.9, 41.0, 38.4, 36.0, 33.9, 31.5, 28.7]
humidity = [90.0, 77.3, 85.2, 83.6, 85.1, 86.8, 97.7, 79.2, 86.5, 76.1 , 76.6, 80.9]

fig, ax = plt.subplots()
ax.plot(time, temperature, label='Temperature')
ax.plot(time, humidity, label='Humidity', ls=':')

ax.set_title('Temperature and Humidity')
ax.set_xlabel('Hours')
ax.set_ylabel('% humidity and degrees Celsius')
ax.legend()

Now the values for the temperature and humidity are plotted on the same y-axis. This causes the two lines to be far apart from each other. Ideally, we would have two y limits and two different ranges of y ticks. 

This can be done by creating a secondary axis. The method `.twinx()` creates a new axes that shares the x-axis of the axes on which you call the method, but has a separate (secondary) y-axis.

Let's recreate the plot above, only now with a secondary axis for humidity. 

In [None]:
# Create the first ax
fig, ax = plt.subplots()
ax.plot(time, temperature, label='Temperature')

ax.set_title('Temperature and Humidity')
ax.set_xlabel('Hours')
ax.set_ylabel('degrees Celsius')
ax.set_ylim(15, 50)

# Create the secondary axis
ax_hum = ax.twinx()

# Plot on the secondary axis
ax_hum.plot(time, humidity, label='Humidity', ls=':', c='y')

# Set the y-axis label for the humidity
ax_hum.set_ylabel('% humidity')

# Because we have two axes, we need to use fig to add a legend
fig.legend()
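
`fig.legend()` is the approach used in this module. As an alternative sketch (not part of the original material), the legend entries of both axes can be collected and combined into one legend that is placed inside the plot area. The names `handles_temp`, `handles_hum` and `combined_labels` are made up for this example.

```python
import matplotlib.pyplot as plt

time = range(12)
temperature = [18.3, 22.2, 26.7, 34.0, 37.1, 39.9, 41.0, 38.4, 36.0, 33.9, 31.5, 28.7]
humidity = [90.0, 77.3, 85.2, 83.6, 85.1, 86.8, 97.7, 79.2, 86.5, 76.1, 76.6, 80.9]

fig, ax = plt.subplots()
ax.plot(time, temperature, label='Temperature')

ax_hum = ax.twinx()
ax_hum.plot(time, humidity, label='Humidity', ls=':', c='y')

# Collect the labelled lines of each axes separately
handles_temp, labels_temp = ax.get_legend_handles_labels()
handles_hum, labels_hum = ax_hum.get_legend_handles_labels()

# Draw one combined legend on the secondary axes
legend = ax_hum.legend(handles_temp + handles_hum, labels_temp + labels_hum, loc='upper left')
combined_labels = [text.get_text() for text in legend.get_texts()]
print(combined_labels)
```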

## Limits, ticks and secondary axis - Exercise

Given are the datasets `days`, `costs`, and `customers` and a simple plot. Complete the following exercises:


1. Instead of plotting the customers on `ax`, create a secondary axis to plot the customer data on
    - Use the function `.twinx()`
2. Change the y limit of the secondary axis, making it start at 0 and end at 80
    - Use the function `.set_ylim()`
3. Set a label on the y-axis of the secondary axis
    - Use the function `.set_ylabel()`

Instead of `[1, 2, 3, 4, 5, 6, 7]` as x ticks, we want the days of the week as x ticks (`['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat']`). 

4. Set the x ticks to 1, 2, 3, 4, 5, 6, 7
    - Use the function `.set_xticks()`
5. Now change the tick labels into the days of the week 
    - Use the function `.set_xticklabels()`

In [None]:
days = [1, 2, 3, 4, 5, 6, 7]
costs = [2300, 3600, 4500, 5500, 6500, 7700, 9000]
customers = [13, 20, 28, 34, 42, 52, 61]

fig, ax = plt.subplots()
ax.plot(days, costs, label='Costs', c='y', ls='dashed')
ax.set_title('Costs and customers of my restaurant')
ax.set_ylabel('Costs (Birr)')
ax.set_xlabel('Time (days)')

# 1.
ax_cust = ax.twinx()
ax_cust.plot(days, customers, label='Customers')

# 2. 
ax_cust.set_ylim(0, 80)

# 3. 
ax_cust.set_ylabel('Number of customers')

# 4. 
ax.set_xticks([1, 2, 3, 4, 5, 6, 7])

# 5. 
ax.set_xticklabels(['Sun', 'Mon', 'Tue', 'Wed', 'Thu', 'Fri', 'Sat'])

fig.legend()

## Additional exercises

Last module we worked with the dataset called `Module3_3_meteoData.csv`. Now that we know how to make plots, we can use these skills to plot the data in `meteoData`. (Plotting `meteoData` will also be the main subject of module 4.2 later this week.)

1. Load the file `Module3_3_meteoData.csv` using the pandas function for reading CSV files.
2. Create a new column called `Timestamp` that contains the dates from column `Datetime` converted into timestamps. Use the pandas function that converts a string into a timestamp (hint: don't forget to set `dayfirst=True`)

In [None]:
#1
import pandas as pd
meteoData = pd.read_csv('Module3_3_meteoData.csv')

#2
meteoData['Timestamp'] = pd.to_datetime(meteoData.Datetime, dayfirst=True)
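
To see why `dayfirst=True` matters, consider a date string in which the day and the month could be swapped. The string below is a made-up value for illustration; the actual format of the `Datetime` column is not shown here.

```python
import pandas as pd

date_string = '03/01/2018 14:30'  # made-up example value

# With dayfirst=True the first number is read as the day
day_first = pd.to_datetime(date_string, dayfirst=True)

# Without it, pandas reads the first number as the month
month_first = pd.to_datetime(date_string)

print('dayfirst=True :', day_first)
print('default       :', month_first)
```

The same string becomes 3 January in the first case and 1 March in the second.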


Here is an example of how we can plot a column of `meteoData` vs time. We will plot the values of `Temp_Out` for the first 7 days of 2018 against time.
- Firstly, we select the data that we want to plot
- Secondly, we plot the data
- Thirdly, we make some changes to the plot

In [None]:
# Firstly, we select the data that we want to plot
starttime = pd.to_datetime('2018')
endtime = pd.to_datetime('2018') + pd.to_timedelta(1, unit='W')

Temp_Time_Jan_2018 = meteoData.loc[(meteoData.Timestamp >= starttime) & 
                                   (meteoData.Timestamp < endtime), ['Timestamp', 'Temp_Out']]

timestamps = Temp_Time_Jan_2018.Timestamp
temperatures = Temp_Time_Jan_2018.Temp_Out

# Secondly, we plot the data
fig, ax = plt.subplots(figsize=(12,6))
ax.plot(timestamps, temperatures, label='Temperatures', c='r')

# Thirdly, we make some changes to the plot
ax.set_title('Temperatures for the first 7 days of 2018')
ax.set_xlabel('Time')
ax.set_ylabel('Degrees Celsius')
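
The selection step above can be tried out on a small table that does not need the CSV file. In this sketch the DataFrame `example` and its values are made up (one row per day), and the week is written as 7 days; only the way of selecting rows is the same as above.

```python
import pandas as pd

# Made-up data: 12 days starting on 30 December 2017
example = pd.DataFrame({
    'Timestamp': pd.date_range('2017-12-30', periods=12, freq='D'),
    'Temp_Out': list(range(12)),
})

starttime = pd.to_datetime('2018')                   # 1 January 2018, 00:00
endtime = starttime + pd.to_timedelta(7, unit='D')   # 7 days later

# True for every row inside the time window, False otherwise
in_window = (example.Timestamp >= starttime) & (example.Timestamp < endtime)

selection = example.loc[in_window, ['Timestamp', 'Temp_Out']]
print(selection)
```

Because the end time is excluded (`<` instead of `<=`), the selection holds exactly the first 7 days of 2018.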

3. Plot `Temp_Hi` vs time for the first three days of September 2016
4. To the same plot add `Temp_Low` for the same time period and give this line a different color and style
5. Add a title, x-axis label and y-axis label to the plot

In [None]:
# 3, 4, and 5
# Selecting the data that we want to plot
starttime = pd.to_datetime('2016-09')
endtime = starttime + pd.to_timedelta(3, unit='D')

MD_sep2016 = meteoData.loc[(meteoData.Timestamp >= starttime) & (meteoData.Timestamp < endtime)]

# Plotting the data
fig, ax = plt.subplots(figsize=(12,6))

ax.plot(MD_sep2016.Timestamp, MD_sep2016.Temp_Hi, label='High Temperatures', c='r')
ax.plot(MD_sep2016.Timestamp, MD_sep2016.Temp_Low, label='Low Temperatures', c='b', ls='--')

# Making some changes to the plot
ax.set_title('Temperature and humidity for the first three days of September 2016')
ax.set_xlabel('Time')
ax.set_ylabel('Temperature (degrees Celsius)')

# 6, 7, and 8
ax_hum = ax.twinx()
ax_hum.plot(MD_sep2016.Timestamp, MD_sep2016.Hum_Out, label='Relative Humidity', c='y', ls=':')

ax_hum.set_ylim(0, 100)
ax_hum.set_ylabel('Relative Humidity (%)')

6. For the plot in the cell above, create a secondary axis and plot `Hum_Out` on that axis
7. Set the y limit for the secondary axis from 0 to 100
8. Give the secondary y-axis a label