# Lesson 5 Matplotlib Basics - tools for visualizing data

## Introduction

### The `matplotlib` **module** is the basic module for data visualization in Python.

### Other modules have been developed, e.g., seaborn (https://seaborn.pydata.org/), but many of these are built on top of `matplotlib`.

### The vast majority of data visualization tasks make use of a single submodule called `pyplot`.


## 5.1 Import the pyplot submodule

### It's common practice to import pyplot as plt

In [None]:
from matplotlib import pyplot as plt

### The above line imports the pyplot submodule of matplotlib and gives it the short name plt

### And as always, I'm going to import numpy

In [None]:
import numpy as np

### For the sake of today's exercise, I am going to make a sine and a cosine function

In [None]:
angle = np.linspace(0,2*np.pi,100) #I make 100 evenly spaced values going from 0 to 2*pi
C = np.cos(angle) #cosine function 
S = np.sin(angle) #sine function

## 5.2 Simple Line Plots

Perhaps not so surprisingly, the simplest command to make a plot in pyplot is called `plot`.  Here is a simple example.

In [None]:
plt.plot(angle,S)
plt.show() # this makes a clean display of the plot.   

### This is the simplest, easiest way to invoke a plot in python.  And when you just want a quick plot, this is the way to do it. 

### However, when you want to make a high quality plot, you want to use a different interface to pyplot.  Today's lesson uses that interface, which is often referred to as the **object-oriented** interface.

In [None]:
#object-oriented interface
fig = plt.figure()   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.plot(angle,S) #add a plot to the axes.  
plt.show()

### The code above had 3 parts to it. <br>
* We created a figure *object* named **fig**.  This object has properties which control the figure. <br>
* We created an axes *object* named **ax**.  This object has properties which control the axes. <br>
* We added a plot to the axes **ax**  <br>
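
The list passed to `add_axes` is a rectangle given as fractions of the figure size: left edge, bottom edge, width, height. As a minimal sketch of how those four numbers work, the cell below puts two axes on one figure. The names `main_ax` and `inset_ax` and the specific rectangle `[0.6, 0.6, 0.3, 0.3]` are illustrative choices, not part of the lesson.

```python
import numpy as np
from matplotlib import pyplot as plt

angle = np.linspace(0, 2*np.pi, 100)

fig = plt.figure()  # blank canvas
# Each rectangle is [left, bottom, width, height] in fractions of the figure size
main_ax = fig.add_axes([0, 0, 1, 1])           # fills the whole figure
inset_ax = fig.add_axes([0.6, 0.6, 0.3, 0.3])  # small axes toward the upper right (illustrative values)
main_ax.plot(angle, np.sin(angle))
inset_ax.plot(angle, np.cos(angle))

# get_position() reports the rectangle the axes occupies
print(inset_ax.get_position().bounds)
```

Add `plt.show()` at the end to display the figure. Changing any of the four numbers moves or resizes the corresponding axes without touching the other one.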

### When we use `plt.plot` we are in fact creating a figure and an axes behind the scenes, then adding a plot to them.  By calling `plt.plot` directly we simply accept the default values for both.
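
One way to see the figure and axes that a bare `plt.plot` call creates is to ask pyplot for its "current" figure and axes afterwards. This is a small sketch using `plt.gcf()` and `plt.gca()`, which the lesson itself does not use.

```python
import numpy as np
from matplotlib import pyplot as plt

angle = np.linspace(0, 2*np.pi, 100)

# A bare plt.plot call: pyplot creates a figure and an axes for us
lines = plt.plot(angle, np.sin(angle))

fig = plt.gcf()  # "get current figure"
ax = plt.gca()   # "get current axes"

# The line we just drew lives on that axes, and the axes lives on that figure
print(lines[0].axes is ax)
print(ax.figure is fig)
```

Both checks print `True`, which shows that the quick interface and the object-oriented interface end up working with the same kinds of objects.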

### We can add a second line to the same plot, by calling plot again.  



In [None]:
fig = plt.figure()   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.plot(angle,S) #add a plot to the axes.  
ax.plot(angle,C) #add a plot to the axes.  
plt.show()

## 5.3 Labeling the plot. 

### What's wrong with that plot?   It's missing the information needed to interpret it.

* Axes labels  - What is the x axis, what is the y axis? 
* Line labels  - which one is sine, and which one is cosine? 
* Title (optional) - maybe I need a title. 

In [None]:
fig = plt.figure()   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.plot(angle,S,label ='sine')  #add a plot to the axes.
                                #I also gave the line a label. 
ax.plot(angle,C,label = 'cosine' ) #add a plot to the axes.
                                #I also gave the line a label.  
ax.set_xlabel('Angle(radians)') #I added a x axis label 
ax.set_ylabel('Function value') #I added a y axis label
ax.legend() #I gave the figure a legend.  
ax.set_title('Trigonometric Functions') # Let's give the figure a title.
plt.show() 

## 5.4 Improving the communication of plots.  

### Scientifically, one might argue that we are done now, and all necessary information to understand the plot is given, but it still *sucks*. 

### Why?  

### 5.4.1 UNITS

In [None]:
angle = np.linspace(0,2*np.pi,100) #I make 100 evenly spaced values going from 0 to 2*pi
C = np.cos(angle) #cosine function 
S = np.sin(angle) #sine function
angle_in_degrees = angle*180/np.pi
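
The conversion above multiplies by 180/pi by hand. NumPy also provides `np.degrees` for the same job; the short check below is only a sketch confirming that the two approaches agree for these values.

```python
import numpy as np

angle = np.linspace(0, 2*np.pi, 100)  # radians, as in the lesson
angle_in_degrees = angle*180/np.pi    # manual conversion used in the lesson

# np.degrees performs the same radians-to-degrees conversion
print(np.allclose(angle_in_degrees, np.degrees(angle)))
```

Either form works; the built-in function just makes the intent easier to read.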

In [None]:
fig = plt.figure()   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.plot(angle_in_degrees,S,label ='sine')  #add a plot to the axes.
                                #I also gave the line a label. 
ax.plot(angle_in_degrees,C,label = 'cosine' ) #add a plot to the axes.
                                #I also gave the line a label.  
ax.set_xlabel('Angle(Degrees)') #I added a x axis label 
ax.set_ylabel('Function value') #I added a y axis label
ax.legend() #I gave the figure a legend.  
ax.set_title('Trigonometric Functions') # Let's give the figure a title.
plt.show() 

### 5.4.2 READING GRAPHS  
### We're almost there, but I'm still not happy. 
### When I look at a plot, I want to be able to *easily read out features of the data*.  
### I also want to be able to easily understand the domain and range of the data and find the minimum and maximum.  
### Here the domain is 0 to 360 and the range is -1 to 1. 
### The clear features of the curves are their maxima and minima at function values 1 and -1, but it's not easy to read off the angles at which they occur.  
### The solution is to take control of the x and/or y axis values.

In [None]:
fig = plt.figure()   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.plot(angle_in_degrees,S,label ='sine')  #add a plot to the axes.
                                #I also gave the line a label. 
ax.plot(angle_in_degrees,C,label = 'cosine' ) #add a plot to the axes.
                                #I also gave the line a label.  
ax.set_xlabel('Angle(Degrees)') #I added a x axis label 
ax.set_ylabel('Function value') #I added a y axis label
ax.legend() #I gave the figure a legend.  
ax.set_title('Trigonometric Functions') # Let's give the figure a title. 
xticklocations = np.linspace(0,360,9) # I chose 9 ticks on the x axis between 0 and 360 (inclusive), i.e. one every 45 degrees
ax.set_xticks(xticklocations)
#Note I could make a similar set of calls to `set_yticks` but I am happy with the default.  
plt.show()

### **OK I'm feeling better**  
### But, a plot can always be improved. How about this. 

In [None]:
fig = plt.figure()   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.plot(angle_in_degrees,S,label ='sine')  #add a plot to the axes.
                                #I also gave the line a label. 
ax.plot(angle_in_degrees,C,label = 'cosine' ) #add a plot to the axes.
                                #I also gave the line a label.  
ax.set_xlabel('Angle(Degrees)') #I added a x axis label 
ax.set_ylabel('Function value') #I added a y axis label
ax.legend() #I gave the figure a legend.  
ax.set_title('Trigonometric Functions') # Let's give the figure a title. 
xticklocations = np.linspace(0,360,9) # I chose 9 ticks on the x axis between 0 and 360 (inclusive), i.e. one every 45 degrees
ax.set_xticks(xticklocations)
#Note I could make a similar set of calls to `set_yticks` but I am happy with the default.  
ax.grid(True)  #Here I turned on grid lines on the axes, to improve readability.  
plt.show()

### Now we can start thinking about aesthetics.  I like the default blue, orange lines, but others may disagree.  
### Try this. 

In [None]:
fig = plt.figure()   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.plot(angle_in_degrees,S,'r-', linewidth = 2, label ='sine')  #add a plot to the axes.
                                #I also gave the line a label.
                                # I selected the color red and a solid line
                                # I set the linewidth to 2, default is 1.5.   
ax.plot(angle_in_degrees,C,'g--', linewidth = 2, label = 'cosine' ) #add a plot to the axes.
                                #I also gave the line a label.  
                                # I selected the color green and a dashed line
                                # I set the linewidth to 2, default is 1.5.  
ax.set_xlabel('Angle(Degrees)') #I added a x axis label 
ax.set_ylabel('Function value') #I added a y axis label
ax.legend() #I gave the figure a legend.  
ax.set_title('Trigonometric Functions') # Let's give the figure a title. 
xticklocations = np.linspace(0,360,13) # I chose 13 ticks on the x axis between 0 and 360 (inclusive), i.e. one every 30 degrees
ax.set_xticks(xticklocations)
#Note I could make a similar set of calls to `set_yticks` but I am happy with the default.  
ax.grid(True)  #Here I turned on grid lines on the axes, to improve readability.  
plt.show()
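
The short strings `'r-'` and `'g--'` used above pack a color and a line style into one code. As a sketch of what they stand for, the cell below draws the same curve twice, once with the short code and once with the explicit `color` and `linestyle` keywords, and checks that matplotlib stores the same style for both. The variable names `short` and `explicit` are only for illustration.

```python
import numpy as np
from matplotlib import pyplot as plt
from matplotlib.colors import to_rgba

angle_in_degrees = np.linspace(0, 360, 100)
S = np.sin(np.radians(angle_in_degrees))

fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])

# Short format string: 'r' = red, '--' = dashed
short, = ax.plot(angle_in_degrees, S, 'r--')
# The same style spelled out with keywords
explicit, = ax.plot(angle_in_degrees, S, color='red', linestyle='--')

# Compare the colors as RGBA tuples, and the line styles as stored on each line
print(to_rgba(short.get_color()) == to_rgba(explicit.get_color()))
print(short.get_linestyle() == explicit.get_linestyle())
```

The keyword form is longer but easier to read, and it accepts any named or hex color rather than only the single-letter codes.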

### Let's try to code something together now.  
### Make a plot that shows 

* $y = x$ , labeled 'linear'
* $y = x^2$, labeled 'quadratic'
* $y = x^3$, labeled 'cubic'

### over the interval from x = -2 to x = 2 

### label the axes 'x' and 'y'

### use a different color of your choice for each line.
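
Here is one possible solution, offered as a sketch rather than the only right answer. The number of points (100), the colors, and the added legend and grid are my own choices.

```python
import numpy as np
from matplotlib import pyplot as plt

x = np.linspace(-2, 2, 100)  # 100 evenly spaced values from -2 to 2

fig = plt.figure()               # blank canvas
ax = fig.add_axes([0, 0, 1, 1])  # axes filling the figure
ax.plot(x, x, 'b-', label='linear')        # blue
ax.plot(x, x**2, 'r-', label='quadratic')  # red
ax.plot(x, x**3, 'g-', label='cubic')      # green
ax.set_xlabel('x')
ax.set_ylabel('y')
ax.legend()    # show which line is which
ax.grid(True)  # grid lines make values easier to read off
```

Add `plt.show()` at the end to display the figure.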

## 5.5 Bar plots

### It makes sense to use a line plot when your domain (x-axis) is a continuous variable.  But it does not make sense to do this when your domain is a categorical variable.  


In [None]:
fruitnames = ['apple','pear','mango','rambutan'] #This is a list
fruitnumber = np.array([7,6,2,3]) #I converted a list of fruit counts into a numpy array. Actually not needed.


In [None]:
fig = plt.figure()   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.plot(fruitnames,fruitnumber)
ax.set_xlabel('Fruit')
ax.set_ylabel('Number')
plt.show()

In [None]:
fig = plt.figure()   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.plot(fruitnames,fruitnumber,'bo') #Here I forced it to use a blue circle for each data point 
ax.set_xlabel('Fruit')
ax.set_ylabel('Number')
plt.show()

### Clearly, the second plot is superior to the first.  Drawing a continuous line between categorical variables doesn't make sense because there really isn't a value in between, say, pear and mango.  

### But the best solution would be a `bar` graph.  


In [None]:
fig = plt.figure()   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.bar(fruitnames,fruitnumber) #Here I draw one bar per fruit, with its height given by the count 
ax.set_xlabel('Fruit')
ax.set_ylabel('Number')
plt.show()

In [None]:
fig = plt.figure()   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.bar(fruitnames,fruitnumber) #Here I draw one bar per fruit, with its height given by the count 
ax.set_xlabel('Fruit')
ax.set_ylabel('Number')
ax.grid(True,axis='y') #I added a grid and specified it should only be for the y axis 
plt.show()

In [None]:
fig = plt.figure()   #creates a blank canvas 
ax = fig.add_axes([0,0,1,1]) #creates an axes - (starting pt x, starting pt y, fractional size x, fractional size y)
ax.bar(fruitnames,fruitnumber) #Here I draw one bar per fruit, with its height given by the count 
ax.set_xlabel('Fruit')
ax.set_ylabel('Number')
ax.grid(True,axis='y',color='r')#I added a grid and specified it should only be for the y axis and set its color to red
plt.show()

## 5.6 Subplots


### Sometimes we want to put more than one graph in a figure.  In this case, we can divide the figure into multiple plots. 

### The syntax `plt.subplots(n,m)` used when I create the figure tells Python I want a figure with n rows and m columns, each cell of which can contain a plot.
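
As a quick sketch of what `plt.subplots` hands back, the cell below asks for a 2 x 3 grid (an arbitrary size chosen for illustration) and inspects the returned array of axes.

```python
from matplotlib import pyplot as plt

fig, a = plt.subplots(2, 3)  # 2 rows, 3 columns

# The axes come back as a NumPy array with one entry per grid cell
print(a.shape)  # (2, 3)

# Both indexing styles reach the same axes object: row 0, column 1
print(a[0, 1] is a[0][1])

a[1, 2].set_title('row 1, column 2')  # indices start at 0, so this is the bottom right cell
```

Note that when you ask for a single row or a single column, the returned array is one-dimensional by default, so it takes only one index.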

In [None]:
#Create some data for four plots
x = np.arange(1,10) 
y1 = x**2   
y2 = np.sqrt(x)
y3 = np.exp(x)
y4 = np.log(x)

In [None]:
fig,a = plt.subplots(2,2)  #Here I create both the figure (fig) and the axes (a) in a single step.  
                           #Any options that you would send into the figure call, you can send to subplots.  
                           # I've asked for 4 subplots in a 2 x 2 grid.   

a[0][0].plot(x,y1)   #notice the syntax for indexing the axes: each one has a row index and a column index. 
a[0][0].set_title('square')
a[0][1].plot(x,y2)
a[0][1].set_title('square root')
a[1][0].plot(x,y3)
a[1][0].set_title('exponential')
a[1][1].plot(x,y4)
a[1][1].set_title('log')

#fig.tight_layout()  #this is really cool and fixes many problems! 

### Let's place these plots in the 4 corners of a 3 x 3 grid. 
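
One possible way to do this, sketched under my own assumptions: create all nine axes with `plt.subplots(3,3)`, hide every one of them, and then switch the four corner axes back on and plot into them. The `curves` dictionary and `corners` list are helper names introduced here for illustration.

```python
import numpy as np
from matplotlib import pyplot as plt

# Same data as above
x = np.arange(1, 10)
curves = {'square': x**2,
          'square root': np.sqrt(x),
          'exponential': np.exp(x),
          'log': np.log(x)}

# (row, column) indices of the four corners of a 3 x 3 grid
corners = [(0, 0), (0, 2), (2, 0), (2, 2)]

fig, a = plt.subplots(3, 3)
for ax in a.flat:
    ax.set_axis_off()  # hide all nine axes first

for (row, col), (name, y) in zip(corners, curves.items()):
    a[row, col].set_axis_on()  # turn the corner axes back on
    a[row, col].plot(x, y)
    a[row, col].set_title(name)

fig.tight_layout()  # keeps titles and tick labels from overlapping
```

Add `plt.show()` at the end to display the figure. Hiding the unused axes is only one design choice; leaving them visible but empty would also satisfy the exercise.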