# Introduction to Matplotlib


**Matplotlib** is the foundational plotting library of the Python programming language and one of the most widely used Python visualization packages. It handles a wide range of plotting tasks and can produce publication-quality figures. It can export visualizations to common formats such as PDF, SVG, JPG and PNG. It can create popular visualization types: line plots, scatter plots, histograms, bar charts, error charts, pie charts, box plots, and many more. Matplotlib also supports 3D plotting. Many Python libraries build on Matplotlib. For example, Seaborn is built on top of it, and the plotting methods of pandas use it by default. They give access to Matplotlib's functionality with less code.


The **Matplotlib** project was started by John Hunter in 2002. It was originally written to visualize electrocorticography (ECoG) data of epilepsy patients during his post-doctoral research in neurobiology. The open-source tool has since become the most widely used plotting library for the Python programming language. It was used for data visualization during the landing of the Phoenix spacecraft in 2008.


In [None]:
# Import dependencies
import numpy as np
import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt 

### Plotting from a Jupyter notebook

Plots can be displayed within a Jupyter Notebook by using the **%matplotlib** magic command. There are two options for working with graphics in a Jupyter Notebook:


•	**%matplotlib notebook** – produces interactive plots embedded within the notebook.

•	**%matplotlib inline** – outputs static images of the plot embedded in the notebook.


After running **%matplotlib inline** (it needs to be done only once per kernel session), any cell within the notebook that creates a plot will embed a PNG image of the graphic.


In [None]:
%matplotlib inline

x1 = np.linspace(0, 10, 100)

# create a plot figure
fig = plt.figure()

plt.plot(x1, np.sin(x1), '-')
plt.plot(x1, np.cos(x1), '--');

### Matplotlib Object Hierarchy


In Matplotlib, a plot is a hierarchy of nested Python objects. 
A **hierarchy** means that there is a tree-like structure of Matplotlib objects underlying each plot.


A **Figure** object is the outermost container for a Matplotlib plot. It can contain one or more **Axes** objects, and each **Axes** represents an individual plot.


So, we can think of the **Figure** object as a box-like container holding one or more **Axes**. Each **Axes** object in turn contains smaller objects such as tick marks, lines, legends, titles and text boxes.


![matplotlib_Anatomy](https://raw.githubusercontent.com/fmmb/CEB/main/notebooks/images/matplotlib_anatomy.png)
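
The hierarchy described above can be inspected directly. The following is a minimal sketch, assuming a figure with a single Axes; the data values and the title text are made up for illustration.

```python
import matplotlib.pyplot as plt

# One Figure that contains one Axes
fig, ax = plt.subplots()

# Plotting a line and setting a title create child objects of the Axes
line, = ax.plot([0, 1, 2], [0, 1, 4])
ax.set_title('Hierarchy demo')

# The Figure keeps a list of its Axes
print(fig.axes[0] is ax)

# The Axes keeps a list of the lines drawn on it
print(ax.lines[0] is line)

# The title is a text object owned by the Axes
print(ax.title.get_text())
```

Each `print` call walks one level down the tree: from the Figure to its Axes, and from the Axes to the objects it contains.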

# Matplotlib API Overview



Matplotlib has two APIs to work with: a MATLAB-style state-based interface and a more powerful object-oriented (OO) interface. 
The former is called the **pyplot interface** and the latter is called the **Object-Oriented** interface.


## Pyplot API 


**matplotlib.pyplot** provides a MATLAB-style, procedural, state-machine interface to the underlying object-oriented library in Matplotlib. Each pyplot function makes some change to a figure, e.g., creates a figure or creates a plotting area in a figure.

**matplotlib.pyplot** is stateful because the underlying engine keeps track of the current figure and plotting area, and plotting functions change that state. To make it clearer: we did not use any object references during our plotting. We just issued a pyplot command, and the changes appeared in the figure.

This is really helpful for interactive plotting, because we can issue a command and see the result immediately. However, it becomes awkward in more complicated cases, such as figures with several subplots.


The following code produces sine and cosine curves using Pyplot API.

In [None]:
# create a plot figure
plt.figure()


# create the first of two panels and set current axis
plt.subplot(2, 1, 1)   # (rows, columns, panel number)
plt.plot(x1, np.sin(x1))


# create the second of two panels and set current axis
plt.subplot(2, 1, 2)   # (rows, columns, panel number)
plt.plot(x1, np.cos(x1));
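
To make the "current figure" and "current axes" bookkeeping visible, here is a small sketch that queries the pyplot state with `plt.gcf()` and `plt.gca()`. The variable names are illustrative only.

```python
import matplotlib.pyplot as plt

fig = plt.figure()

# Creating the first panel makes it the current Axes
ax_top = plt.subplot(2, 1, 1)
current_is_top = plt.gca() is ax_top

# Creating the second panel replaces it as the current Axes
ax_bottom = plt.subplot(2, 1, 2)
current_is_bottom = plt.gca() is ax_bottom

# plt.plot() draws on whatever Axes is current, here the bottom panel
plt.plot([0, 1], [0, 1])

print(plt.gcf() is fig, current_is_top, current_is_bottom)
print(len(ax_top.lines), len(ax_bottom.lines))
```

The line ends up on the bottom panel only, because that panel was the current Axes when `plt.plot()` was called.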


### plot() - A versatile command


**plot()** is a versatile command that accepts a variable number of arguments. For example, to plot x versus y, 
we can issue the following command:-

In [None]:
plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
plt.show()
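
As a further illustration of the flexible argument handling, the sketch below passes a single sequence to `plot()`. In that case Matplotlib treats the values as y and uses the indices 0, 1, 2, ... as x.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Only y values are given; x defaults to the index of each value
line, = ax.plot([1, 4, 9, 16])

# Inspect the x values that Matplotlib generated
print(list(line.get_xdata()))
print(list(line.get_ydata()))
```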

### State-machine interface

Pyplot provides the state-machine interface to the underlying object-oriented plotting library. The state-machine implicitly and automatically creates figures and axes to achieve the desired plot. For example:

In [None]:
x = np.linspace(0, 2, 100)

plt.plot(x, x, label='linear')
plt.plot(x, x**2, label='quadratic')
plt.plot(x, x**3, label='cubic')

plt.xlabel('x label')
plt.ylabel('y label')

plt.title("Simple Plot")

plt.legend()

plt.show()

### Formatting the style of plot


For every x, y pair of arguments, there is an optional third argument: a format string that sets the color, marker and line style of the plot. We can concatenate a color string with a marker or line style string. If no format string is given, Matplotlib draws a solid line in the next color of the current color cycle (older versions defaulted to 'b-', a solid blue line). For example, to plot the points from the earlier example with red circles, we would issue the following command:

In [None]:
plt.plot([1, 2, 3, 4], [1, 4, 9, 16], 'ro')
plt.axis([0, 6, 0, 20])
plt.show()

The **axis()** command in the example above takes a list of [xmin, xmax, ymin, ymax] and specifies the viewport of the axes.

In [None]:
t = np.arange(0., 5., 0.2)

# red dashes, blue squares and green triangles
plt.plot(t, t, 'r--', t, t**2, 'bs', t, t**3, 'g^')
plt.show()
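
A format string is shorthand for separate keyword arguments. The sketch below, using the same made-up points as above, draws one series with the `'ro'` format string and another with the equivalent `color`, `marker` and `linestyle` keywords, then compares the resulting line properties.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

xs = [1, 2, 3, 4]
ys = [1, 4, 9, 16]

fig, ax = plt.subplots()

# Short form: 'r' = red, 'o' = circle marker, no connecting line
short, = ax.plot(xs, ys, 'ro')

# Long form: the same settings spelled out as keyword arguments
long_, = ax.plot(xs, [y + 2 for y in ys],
                 color='red', marker='o', linestyle='None')

same_color = to_rgba(short.get_color()) == to_rgba(long_.get_color())
same_marker = short.get_marker() == long_.get_marker()
same_style = short.get_linestyle() == long_.get_linestyle()
print(same_color, same_marker, same_style)
```

The keyword form is longer but easier to read when many properties are set at once.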

## Object-Oriented API


The **Object-Oriented API** is available for more complex plotting situations. It allows us to exercise more control over the figure. In the Pyplot API, we depend on some notion of an "active" figure or axes. In the **Object-Oriented API**, by contrast, the plotting functions are methods of explicit Figure and Axes objects.

The following code produces sine and cosine curves using Object-Oriented API.

In [None]:
# First create a grid of plots
# ax will be an array of two Axes objects
fig, ax = plt.subplots(2)


# Call plot() method on the appropriate object
ax[0].plot(x1, np.sin(x1), 'b-')
ax[1].plot(x1, np.cos(x1), 'b-');


### Objects and Reference


The main idea with the **Object Oriented API** is to have objects that one can apply functions and actions on. The real advantage of this approach becomes apparent when more than one figure is created or when a figure contains more than one 
subplot.


We create a reference to the figure instance in the **fig** variable. Then, we create a new Axes instance **axes** using the 
**add_axes** method of the Figure instance fig as follows:

In [None]:
fig = plt.figure()

x2 = np.linspace(0, 5, 10)
y2 = x2 ** 2

axes = fig.add_axes([0.1, 0.1, 0.8, 0.8])

axes.plot(x2, y2, 'r')

axes.set_xlabel('x2')
axes.set_ylabel('y2')
axes.set_title('title');
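
The list passed to **add_axes** is `[left, bottom, width, height]`, expressed as fractions of the figure size. Because every Axes is an explicit object, a second one can be placed inside the first. The following sketch shows this; the inset position and the plotted functions are arbitrary choices for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()
x = np.linspace(0, 5, 50)

# Main Axes: starts 10% from the left and bottom, spans 80% of the figure
main_ax = fig.add_axes([0.1, 0.1, 0.8, 0.8])

# Inset Axes: a smaller plotting area placed inside the main one
inset_ax = fig.add_axes([0.2, 0.5, 0.3, 0.3])

main_ax.plot(x, x ** 2, 'r')
main_ax.set_title('main')

inset_ax.plot(x, np.sqrt(x), 'b')
inset_ax.set_title('inset');
```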

# Figure and Subplots



Plots in Matplotlib reside within a Figure object. As described earlier, we can create a new figure with plt.figure() 
as follows:-


`fig = plt.figure()`


Now, I create one or more subplots using fig.add_subplot() as follows:-


`ax1 = fig.add_subplot(2, 2, 1)`


The arguments (2, 2, 1) describe a grid of 2 rows and 2 columns, which gives four subplots in total (2 * 2 = 4), and select the first of them (numbered from 1).


I create the next three subplots using the fig.add_subplot() commands as follows:-


`ax2 = fig.add_subplot(2, 2, 2)`

`ax3 = fig.add_subplot(2, 2, 3)`

`ax4 = fig.add_subplot(2, 2, 4)`

The above commands create the remaining subplots. The layout of the subplots is shown in the following diagram:-


![Subplots.png](attachment:Subplots.png)
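
Put together, the commands above can be run as a single cell. This is a minimal sketch; the functions plotted in each panel and the panel titles are placeholders chosen for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()
x = np.linspace(0, 2 * np.pi, 100)

# Four subplots on a 2 x 2 grid, numbered from 1 row by row
ax1 = fig.add_subplot(2, 2, 1)
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)
ax4 = fig.add_subplot(2, 2, 4)

# Each Axes object is plotted on independently
ax1.plot(x, np.sin(x))
ax2.plot(x, np.cos(x))
ax3.plot(x, x)
ax4.plot(x, x ** 2)

for number, ax in enumerate([ax1, ax2, ax3, ax4], start=1):
    ax.set_title(f'subplot {number}')

# Reduce overlap between titles and tick labels
fig.tight_layout()
```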

# Example plots with Matplotlib


Now, we will look at the different types of plots we can create.

In [None]:
titanic = sn.load_dataset('titanic')
titanic

## Line Plot


We can use the following commands to draw the simple sinusoid line plot:-



In [None]:
# Create figure and axes first
fig = plt.figure()
ax = plt.axes()

# Declare a variable x3
x3 = np.linspace(0, 10, 1000)

# Plot the sinusoid function
ax.plot(x3, np.sin(x3), 'b-');

## Multiline Plots

Multiline plots draw more than one line on the same figure.
This is achieved by issuing all the plot() calls before calling show(). It can be done as follows:-


In [None]:
x4 = range(1, 5)

plt.plot(x4, [xi*1.5 for xi in x4])
plt.plot(x4, [xi*3 for xi in x4])
plt.plot(x4, [xi/3.0 for xi in x4])

plt.show()

## Scatter Plot

Another commonly used plot type is the scatter plot. Here the points are represented individually with a dot or a circle.

In [None]:
x5 = np.linspace(0, 10, 30)
y5 = np.sin(x5)

plt.plot(x5, y5, 'o', color = 'black');
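
Matplotlib also has a dedicated `scatter()` function, which allows the size and colour of each point to be set individually. The sketch below uses randomly generated data, so the values themselves carry no meaning.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data for illustration only
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = np.sin(x)
sizes = 20 + 100 * rng.random(50)

fig, ax = plt.subplots()

# s sets the marker size per point, c maps a value per point to a colour
points = ax.scatter(x, y, s=sizes, c=y, cmap='viridis', alpha=0.7)

# The colour bar shows how values of c map to colours
fig.colorbar(points, ax=ax, label='sin(x)');
```

For large datasets where all points look the same, `plot()` with a marker is usually the lighter option; `scatter()` is the one to reach for when the points differ in size or colour.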

## Histogram


Histogram charts are a graphical display of frequencies. They are represented as bars. They show what portion of the 
dataset falls into each category, usually specified as non-overlapping intervals. These categories are called bins.

The **plt.hist()** function can be used to plot a simple histogram as follows:-


In [None]:
# using a Series object (DataFrame column); drop missing ages before binning
data1 = titanic['age'].dropna()

plt.hist(data1);
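
The number of bins can be controlled with the `bins` argument, and `hist()` returns the counts and bin edges it computed. The sketch below uses synthetic, normally distributed values instead of the Titanic data, so that it runs without downloading a dataset.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for an age column (illustration only)
rng = np.random.default_rng(42)
ages = rng.normal(30, 12, 500)

fig, ax = plt.subplots()

# Ask for 20 equal-width bins; edgecolor makes the bars easier to tell apart
counts, edges, patches = ax.hist(ages, bins=20, edgecolor='black')

# 20 bins are delimited by 21 edges, and every value falls into one bin
print(len(counts), len(edges), int(counts.sum()))
```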

## Bar Chart


Bar charts display rectangular bars either in vertical or horizontal form. Their length is proportional to the values they represent. They are used to compare two or more values.


We can plot a bar chart using the plt.bar() function as follows:-


In [None]:
data2 = titanic.groupby('pclass')['age'].mean()

plt.bar(range(len(data2)), data2)

plt.xticks(range(len(data2)), data2.index)

plt.show()

## Horizontal Bar Chart


We can produce a horizontal bar chart using the plt.barh() function. It is the horizontal counterpart of the plt.bar() function.



In [None]:
plt.barh(range(len(data2)), data2)
plt.show() 

## Stacked Bar Chart


We can draw a stacked bar chart by using the **bottom** parameter of the plt.bar() function. It can be done as follows:- 

In [None]:
A = titanic.groupby(['sex', 'pclass'])['age'].count()['female']
B = titanic.groupby(['sex', 'pclass'])['age'].count()['male']

z1 = range(3)

plt.bar(z1, A, color = 'b', label = 'female')
plt.bar(z1, B, color = 'r', bottom = A, label='male')

plt.xticks(z1, B.index)

plt.legend()

plt.show()

The optional **bottom** parameter of the plt.bar() function allows us to specify a starting position for a bar. Instead of running from zero to a value, the bar runs from bottom to bottom plus the value. The first call to plt.bar() plots the blue bars. The second call to plt.bar() plots the red bars, with the bottom of the red bars being at the top of the blue bars.
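
The same idea extends to more than two series: each new layer starts at the running total of the layers beneath it. This is a minimal sketch with made-up counts for three hypothetical groups A, B and C.

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up values for three series (illustration only)
a = np.array([3, 5, 2])
b = np.array([4, 1, 6])
c = np.array([2, 2, 3])
positions = np.arange(3)

fig, ax = plt.subplots()

ax.bar(positions, a, label='A')
# Second layer starts where the first one ends
ax.bar(positions, b, bottom=a, label='B')
# Third layer starts at the combined height of the first two
bars_c = ax.bar(positions, c, bottom=a + b, label='C')

ax.legend();
```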

## Boxplot


Boxplot allows us to compare distributions of values by showing the median, quartiles, maximum and minimum of a set of values.

We can plot a boxplot with the **boxplot()** function as follows:-

In [None]:
#data3 = titanic['fare']
data3 = titanic['age'].fillna(titanic['age'].mean())

plt.boxplot(data3)

plt.show();

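The **boxplot()** function takes a set of values and computes the median, quartiles and other statistical quantities. With Matplotlib's default settings, the preceding boxplot can be read as follows:

•	The line inside the box marks the median of the distribution.

•	The box spans from the lower quartile to the upper quartile, so it covers the middle 50 percent of the data.

•	The whiskers extend to the most extreme values lying within 1.5 times the interquartile range (IQR) of the box.

•	Points beyond the whiskers are drawn individually as outliers.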
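
To check what a boxplot actually draws, the artists it returns can be compared with statistics computed by NumPy. The sketch below uses synthetic data and relies on the default box style, in which the box is drawn as a line through the two quartiles.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic values for illustration only
rng = np.random.default_rng(1)
values = rng.normal(50, 10, 200)

fig, ax = plt.subplots()

# boxplot() returns a dictionary of the artists it created
result = ax.boxplot(values)

# Height at which the median line was drawn
median_drawn = result['medians'][0].get_ydata()[0]

# Lower and upper edges of the box
box_y = result['boxes'][0].get_ydata()

q1, q3 = np.percentile(values, [25, 75])
print(np.isclose(median_drawn, np.median(values)))
print(np.isclose(box_y.min(), q1), np.isclose(box_y.max(), q3))
```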

## Styles with Matplotlib Plots


Matplotlib version 1.4, released in August 2014, added a very convenient `style` module. It includes a number of 
new default stylesheets, as well as the ability to create and package our own styles.

In [None]:
# View list of all available styles
print(plt.style.available)

We can set the **Styles** for Matplotlib plots as follows:-


`plt.style.use('seaborn-v0_8')`


In [None]:
# Set styles for plots
plt.style.use('seaborn-v0_8')

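I have set the **seaborn-v0_8** style, so the plots that follow use this Matplotlib style. In Matplotlib 3.6 and later, the bundled seaborn styles carry the `seaborn-v0_8` prefix, so older names such as `seaborn-bright` become `seaborn-v0_8-bright`.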
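
A style can also be applied temporarily with the `plt.style.context()` context manager, which restores the previous settings on exit. The sketch below uses the `ggplot` style purely as an example and inspects one rcParams entry to show the change.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 100)

# Remember one setting so the effect of the style is visible
before = plt.rcParams['axes.facecolor']

with plt.style.context('ggplot'):
    # Inside the block the style is active
    inside = plt.rcParams['axes.facecolor']
    fig, ax = plt.subplots()
    ax.plot(x, np.sin(x))

# Outside the block the previous settings are back
after = plt.rcParams['axes.facecolor']
print(before, inside, after)
```

This is handy when a single figure should look different without changing the style for the rest of the notebook.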



# Adding objects to our plots

## Grid

In [None]:
x6 = np.arange(1, 5)

plt.plot(x6, x6*1.5, x6, x6*3.0, x6, x6/3.0)

plt.grid(True)

plt.show()

## Axes


Matplotlib automatically sets the limits of the plot so that it contains the plotted data. Sometimes, we want to set the
axes limits ourselves. We can set the axes limits with the **axis()** function as follows:

In [None]:
x7 = np.arange(1, 5)

plt.plot(x7, x7*1.5, x7, x7*3.0, x7, x7/3.0)

plt.axis([0, 5, -1, 13])

plt.show()

If we execute **axis()** without parameters, it returns the current axis limits.

We can pass **axis()** a list of four values, [xmin, xmax, ymin, ymax], which sets the minimum and maximum limits for the X and Y axes respectively.


We can control the limits for each axis separately using the `xlim()` and `ylim()` functions. This can be done as follows:-

In [None]:
x8 = np.arange(1, 5)

plt.plot(x8, x8*1.5, x8, x8*3.0, x8, x8/3.0)

plt.xlim([1.0, 4.0])

plt.ylim([0.0, 12.0]);
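
In the object-oriented interface the same limits are set with the `set_xlim()` and `set_ylim()` methods of the Axes, and can be read back afterwards. A minimal sketch, reusing the data from the cell above:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(1, 5)

fig, ax = plt.subplots()
ax.plot(x, x * 1.5, x, x * 3.0, x, x / 3.0)

# Set the limits of each axis separately
ax.set_xlim(1.0, 4.0)
ax.set_ylim(0.0, 12.0)

# Read the limits back, per axis and as (xmin, xmax, ymin, ymax)
print(ax.get_xlim(), ax.get_ylim())
print(ax.axis())
```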

## X and Y ticks


Vertical and horizontal ticks are the small marks on the axes that, together with the axis labels, give a reference system
for the graph.

Matplotlib provides two basic functions to manage them - **xticks()** and **yticks()**.

Executed with no arguments, each tick function returns the current tick locations and the labels corresponding to each of them.

We can pass arguments (in the form of lists) to the tick functions. The arguments are:

1. Locations of the ticks
2. Labels to draw at these locations.

We can demonstrate the usage of the ticks functions in the code snippet below:

In [None]:
u = [5, 4, 9, 7, 8, 9, 6, 5, 7, 8]

plt.plot(u)

plt.xticks([2, 4, 6, 8, 10])
plt.yticks([2, 4, 6, 8, 10])

plt.show()
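
The cell above passes only tick locations. The sketch below also passes the second argument, a list of labels, and then calls **xticks()** with no arguments to read both back. The label texts are arbitrary placeholders.

```python
import matplotlib.pyplot as plt

u = [5, 4, 9, 7, 8, 9, 6, 5, 7, 8]

plt.figure()
plt.plot(u)

# First argument: where the ticks go; second argument: what they say
locations = [0, 3, 6, 9]
labels = ['start', 'early', 'late', 'end']
plt.xticks(locations, labels)

# With no arguments, xticks() returns the current locations and labels
current_locations, current_labels = plt.xticks()
print(list(current_locations))
print([text.get_text() for text in current_labels])
```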

## Labels


Another important piece of information to add to a plot is the axes labels, since they specify the type of data we are plotting.

In [None]:
plt.barh(range(len(data2)), data2)

plt.xlabel('Average age')
plt.ylabel('Class')

plt.yticks(range(3), [1,2,3])

plt.show() 

## Adding a title


The title of a plot describes the plot. Matplotlib provides a simple function **title()** to add a title to a plot.  

In [None]:
plt.barh(range(len(data2)), data2)

plt.xlabel('Average age')
plt.ylabel('Class')
plt.yticks(range(3), [1,2,3])

plt.title('Average passenger age by class')

plt.show() 

## Adding a legend

Legends can be added in two ways. One method is to use the **legend** method of the Axes object and 
pass a list/tuple of legend texts as follows:-

In [None]:
x9 = np.arange(1, 5)

fig, ax = plt.subplots()

ax.plot(x9, x9*1.5)
ax.plot(x9, x9*3.0)
ax.plot(x9, x9/3.0)

ax.legend(['Normal','Fast','Slow']);

The above method is error-prone and inflexible if lines are added to or removed from the plot.

A better method is to use the **label** keyword argument when plots are added to the figure. Then we use the **legend** method without arguments to add the legend to the figure. 

The advantage of this method is that if lines are added or removed from the figure, the legend is automatically updated accordingly.

In [None]:
x10 = np.arange(1, 5)

fig, ax = plt.subplots()

ax.plot(x10, x10*1.5, label='Normal')
ax.plot(x10, x10*3.0, label='Fast')
ax.plot(x10, x10/3.0, label='Slow')

ax.legend(loc=2);

The **legend** function takes an optional keyword argument **loc**, which specifies where the legend is drawn. 
**loc** accepts numerical codes for the various positions, passed as in `ax.legend(loc=2)`. The codes are as follows:-

| **loc** value | **Legend location** |
|---|---|
| 0 | let Matplotlib decide the optimal location |
| 1 | upper right corner |
| 2 | upper left corner |
| 3 | lower left corner |
| 4 | lower right corner |
| 5 | right |
| 6 | center left |
| 7 | center right |
| 8 | lower center |
| 9 | upper center |
| 10 | center |
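
Each numerical code also has a string name, which is usually easier to read. The sketch below uses `'upper left'`, the string form of code 2, with the same lines and labels as the earlier legend cell.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(1, 5)

fig, ax = plt.subplots()
ax.plot(x, x * 1.5, label='Normal')
ax.plot(x, x * 3.0, label='Fast')
ax.plot(x, x / 3.0, label='Slow')

# 'upper left' is the string equivalent of loc=2
legend = ax.legend(loc='upper left')

# The legend entries come from the label of each line
print([text.get_text() for text in legend.get_texts()])
```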

## Colours

We can draw different lines or curves in a plot with different colours. In the code below, we specify colour as the last argument to draw red, magenta and blue lines.

In [None]:
x11 = np.arange(1, 5)

plt.plot(x11, 'r')
plt.plot(x11+1, '#FF00FF')
plt.plot(x11+2, 'b')

plt.show()

The colour names and colour abbreviations are given in the following table:-

| **Colour abbreviation** | **Colour name** |
|---|---|
| b | blue |
| c | cyan |
| g | green |
| k | black |
| m | magenta |
| r | red |
| w | white |
| y | yellow |

There are several ways to specify colours, other than by colour abbreviations:

•	The full colour name, such as yellow

•	Hexadecimal string such as #FF00FF
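
The different notations can be checked against each other with `matplotlib.colors.to_rgba()`, which converts any colour specification to a red, green, blue, alpha tuple. In the sketch below the same red is given as an abbreviation, a full name, a hexadecimal string and, as one more option not listed above, an RGB tuple.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba

x = np.arange(1, 5)

fig, ax = plt.subplots()

# Four notations for the same colour, passed via the color keyword
ax.plot(x, color='r')
ax.plot(x + 1, color='red')
ax.plot(x + 2, color='#FF0000')
ax.plot(x + 3, color=(1.0, 0.0, 0.0))

# Convert each line's colour to RGBA to compare them
rgba_values = [to_rgba(line.get_color()) for line in ax.lines]
print(rgba_values)
```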


## Line styles


Matplotlib provides us different line style options to draw curves or plots. In the code below, we use different line styles to draw different plots.

In [None]:
x12 = np.arange(1, 5)

plt.plot(x12, '--', x12+1, '-.', x12+2, ':')

plt.show()

The above code snippet generates a dashed line, a dash-dotted line and a dotted line. As no colour is given, each line takes the next colour of the current colour cycle.

The basic line styles are listed in the following table:

| **Style abbreviation** | **Style** |
|---|---|
| `-` | solid line |
| `--` | dashed line |
| `-.` | dash-dot line |
| `:` | dotted line |
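
Each abbreviation also has a long name that can be passed through the `linestyle` keyword. The sketch below draws one line per named style and reads back the abbreviation that Matplotlib stores for it.

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(1, 5)
style_names = ['solid', 'dashed', 'dashdot', 'dotted']

fig, ax = plt.subplots()

# Draw one line per style, shifted upwards so they do not overlap
for offset, name in enumerate(style_names):
    ax.plot(x + offset, linestyle=name, label=name)

ax.legend()

# Matplotlib stores the short form of each named style
print([line.get_linestyle() for line in ax.lines])
```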