# Plotting with Matplotlib

In [18]:
%matplotlib notebook

In [19]:
import matplotlib as mpl
mpl.get_backend()

'nbAgg'

In [20]:
import matplotlib.pyplot as plt
plt.plot?

**plt.plot**'s signature: `plt.plot(*args, scalex=True, scaley=True, data=None, **kwargs)`

`*args` means you can pass any number of positional (unnamed) arguments.

`**kwargs` means you can pass any number of keyword (named) arguments.

This makes the _plot_ function very flexible, but it also makes it less obvious how each argument will be interpreted.

Basically, the positional arguments are interpreted as X, Y pairs, each optionally followed by a format string such as `'.'`.
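
As a minimal sketch of how the positional and keyword arguments are consumed, the cell below passes two X, Y groups with format strings in one call. The `Agg` backend and the sample values are assumptions made so the example runs outside a notebook.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no notebook needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Two (x, y, format) groups passed positionally in a single call.
# The keyword argument applies to every line the call creates.
lines = ax.plot([1, 2, 3], [1, 4, 9], "-o",
                [1, 2, 3], [1, 2, 3], "--r",
                linewidth=2)

print(len(lines))  # one Line2D object per (x, y) group
```

Each X, Y group becomes its own `Line2D` object, which is why **plot()** returns a list.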

In [6]:
plt.plot(3,2,'.')

[<matplotlib.lines.Line2D at 0x236ed433f08>]

Pyplot is a scripting interface that keeps track of the latest _**figure, subplot, and axes objects.**_

**plt.plot()** checks whether a figure already exists to work on; if not, it creates one.

**plt.gca()** gets the current axes. **plt.gcf()** gets the current figure.
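
A small sketch of that bookkeeping, assuming the `Agg` backend so it runs outside a notebook: after creating a figure and an axes explicitly, the pyplot accessors hand back those same objects.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig = plt.figure()          # becomes the current figure
ax = fig.add_subplot(111)   # becomes the current axes

# pyplot returns the objects it is tracking, not copies of them.
same_figure = plt.gcf() is fig
same_axes = plt.gca() is ax
print(same_figure, same_axes)
```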

In [7]:
plt.figure()
plt.plot(3,2,'o')
ax = plt.gca()
ax.axis([0, 6, 0, 10])

[0, 6, 0, 10]

**axis()** sets the x and y limits: _axis([xmin, xmax, ymin, ymax])_
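
To check that the limits were applied, they can be read back from the axes. This is a sketch using the same values as the cell above; the `Agg` backend is an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(3, 2, "o")
ax.axis([0, 6, 0, 10])  # [xmin, xmax, ymin, ymax]

# Read the limits back from the axes object.
xlim = ax.get_xlim()
ylim = ax.get_ylim()
print(xlim, ylim)
```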

In [8]:
ax=plt.gca()
ax.get_children()

[<matplotlib.lines.Line2D at 0x236eecb1448>,
 <matplotlib.spines.Spine at 0x236eeccf208>,
 <matplotlib.spines.Spine at 0x236eeccf1c8>,
 <matplotlib.spines.Spine at 0x236eecf3748>,
 <matplotlib.spines.Spine at 0x236eecf3688>,
 <matplotlib.axis.XAxis at 0x236eeccff48>,
 <matplotlib.axis.YAxis at 0x236ef385cc8>,
 Text(0.5, 1, ''),
 Text(0.0, 1, ''),
 Text(1.0, 1, ''),
 <matplotlib.patches.Rectangle at 0x236eecaa848>]

The Line2D object is the data point we plotted.

The spines are the renderings of the borders of the frame.

The Rectangle is the background of the axes.

Basically, **plot()** generates a series of lines that get rendered against an axes object. The pyplot module has other useful functions in the scripting layer, like **scatter()**.
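
The children can also be summarized by type rather than read off one by one. A minimal sketch, assuming the `Agg` backend and a default axes with a single plotted point:

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(3, 2, "o")

# Count the axes' children by class name.
counts = Counter(type(child).__name__ for child in ax.get_children())
for name, n in counts.items():
    print(name, n)
```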

# Scatterplots

A scatterplot is a two-dimensional plot similar to the line plot we've seen.
**scatter()** takes the x-axis values as its first argument and the y-axis values as its second.

In [9]:
import numpy as np 
x = np.array([1,2,3,4,5,6,7])
y = x

plt.figure()
plt.scatter(x,x)

<matplotlib.collections.PathCollection at 0x236ef13d948>

**scatter()** doesn't represent items as a series of individual records; each point isn't an object carrying its own _x, y, name, and color_.

Instead, we can pass a list of colors, one per point, to **scatter()** to color certain points differently.

In [10]:
colors=['green']*(len(x)-1)
colors.append('red')

plt.figure()
plt.scatter(x, x, s=100, c=colors)

<matplotlib.collections.PathCollection at 0x236ef47e2c8>

### Zip function and list unpacking
The **zip function** takes a number of iterables and creates tuples, matching elements by index.
In Python 3, zip returns a lazy iterator; to see the results we can pass it to **list()**.

In [15]:
zip_generator = zip([1,2,3,4,5],[6,7,8,9,10])
list(zip_generator)

[(1, 6), (2, 7), (3, 8), (4, 9), (5, 10)]

It's common to store data as a list of **tuples**, so it's important to know how to convert to and from them.

We can use argument unpacking with **_zip()_** to turn the tuples back into separate sequences (the results are tuples, not lists).

In [18]:
zip_generator = zip([1,2,3,4,5],[6,7,8,9,10])
x,y= zip(*zip_generator)
print(x)
print(y)

(1, 2, 3, 4, 5)
(6, 7, 8, 9, 10)
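
The round trip can be sketched with plain Python. The `points` data is made up for illustration. The last lines show why the zip object was rebuilt in the cell above: once consumed, it is empty.

```python
points = [(1, 6), (2, 7), (3, 8)]

# Unpack the list of pairs into two tuples...
xs, ys = zip(*points)

# ...and zip them back into the original pairs.
pairs = list(zip(xs, ys))

# A zip object can only be consumed once.
z = zip([1, 2], [3, 4])
first_pass = list(z)
second_pass = list(z)  # already exhausted, so this is empty

print(xs, ys, pairs, first_pass, second_pass)
```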


### Scatterplot: Axes and Legends

We'll plot the two lists as two data series after slicing them.

In [20]:
plt.figure()
plt.scatter(x[:2], y[:2], s=100, c='red', label='Tall Students')
plt.scatter(x[2:], y[2:], s=100, c='blue', label='Short Students')

<matplotlib.collections.PathCollection at 0x236ef05d208>

**Axes** generally have labels. **Charts** have titles as well. **Legends** are artists and can contain children.

In [21]:
plt.xlabel('The number of ball kicks')
plt.ylabel('The grade')
plt.title('Relationship between ball kicking and grades')

Text(0.5, 1, 'Relationship between ball kicking and grades')

In [22]:
plt.legend()

<matplotlib.legend.Legend at 0x236ef09de08>

In [26]:
# change the corner where the legend is displayed (loc=4 is lower right), remove the frame, and add a title
plt.legend(loc=4, frameon=False, title='Legend')

<matplotlib.legend.Legend at 0x236f0dc0248>

In [27]:
plt.gca().get_children()

[<matplotlib.collections.PathCollection at 0x236ef05de48>,
 <matplotlib.collections.PathCollection at 0x236ef05d208>,
 <matplotlib.spines.Spine at 0x236ef02e3c8>,
 <matplotlib.spines.Spine at 0x236ef02e748>,
 <matplotlib.spines.Spine at 0x236ef02e8c8>,
 <matplotlib.spines.Spine at 0x236ef040548>,
 <matplotlib.axis.XAxis at 0x236ef02e648>,
 <matplotlib.axis.YAxis at 0x236ef040908>,
 Text(0.5, 1, 'Relationship between ball kicking and grades'),
 Text(0.0, 1, ''),
 Text(1.0, 1, ''),
 <matplotlib.legend.Legend at 0x236f0dc0248>,
 <matplotlib.patches.Rectangle at 0x236ef045908>]

In [28]:
legend=plt.gca().get_children()[-2]

In [31]:
legend.get_children()[0].get_children()

[<matplotlib.offsetbox.TextArea at 0x236f0d9d048>,
 <matplotlib.offsetbox.HPacker at 0x236f0d9dd88>]
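
Rather than indexing into the children level by level, the whole tree under the legend can be walked recursively. The helper `walk_artists` below is hypothetical, written for this illustration and not part of Matplotlib. The `Agg` backend is also an assumption.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
from matplotlib.artist import Artist


def walk_artists(artist, depth=0):
    """Return (depth, class name) pairs for an artist and its descendants."""
    rows = []
    if isinstance(artist, Artist):
        rows.append((depth, type(artist).__name__))
        for child in artist.get_children():
            rows.extend(walk_artists(child, depth + 1))
    return rows


fig, ax = plt.subplots()
ax.scatter([1, 2], [6, 7], s=100, c="red", label="Tall Students")
ax.scatter([3, 4, 5], [8, 9, 10], s=100, c="blue", label="Short Students")
legend = ax.legend(loc=4, frameon=False, title="Legend")

rows = walk_artists(legend)
for depth, name in rows:
    print("  " * depth + name)  # indent by nesting depth
```

The exact set of nested artists may differ between Matplotlib versions, so the printed tree is best read as a snapshot.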

# Line Plots
Line plots are created using _**plot()**_. It plots different _series_ of _data points_ and connects the points in each series with a line.

Below, we'll plot the data points of two series: _linear_data_ and _quadratic_data_.

In [21]:
import numpy as np 
import matplotlib.pyplot as plt

linear_data= np.array([1,2,3,4,5,6,7,8])
quadratic_data= linear_data**2

plt.figure()
plt.plot(linear_data, '-o', quadratic_data, '-o')

[<matplotlib.lines.Line2D at 0x2a7661d3048>,
 <matplotlib.lines.Line2D at 0x2a7661dce88>]

We only gave the y-axis values to _**plot()**_; it was smart enough to use the indices of each series as the x-axis values. <br> Unlike with **_scatter()_**, we didn't have to handle each series separately: **_plot()_** recognized two series and drew each in its own color.
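
The default x values can be inspected on the returned line object. A minimal sketch with made-up y-values, assuming the `Agg` backend:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# Only y-values are given; plot() supplies the x-values itself.
(line,) = ax.plot([10, 20, 30], "-o")

xdata = line.get_xdata()
ydata = line.get_ydata()
print(list(xdata), list(ydata))
```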

To plot a dashed line, we can use the following:

In [22]:
plt.plot([21,44,55],'--r')

[<matplotlib.lines.Line2D at 0x2a76654b688>]

In [24]:
plt.xlabel('Some data')
plt.ylabel('Some Other data')
plt.title('A title')
plt.legend(['Europe','Africa','Americas'])

<matplotlib.legend.Legend at 0x2a7661dec88>

To fill the area between the linear and quadratic data, we call **fill_between()** with the **x range** of the data points, the **lower bounds**, the **upper bounds**, the **color**, and the **transparency**.

In [25]:
plt.gca().fill_between(range(len(linear_data)), 
                       linear_data, quadratic_data, 
                       facecolor='blue', 
                       alpha=0.25)

<matplotlib.collections.PolyCollection at 0x2a7661fff08>

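**np.arange()** can be used to generate a sequence of dates, but depending on the Matplotlib version they may not be handled well, as we can see in the graph. <br> We can use pandas' **to_datetime()** instead. It converts NumPy dates into pandas timestamps, which are compatible with the standard library datetimes that Matplotlib expects.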

In [36]:
plt.figure()

obs_dates = np.arange('2020-01-01', '2020-01-09', dtype='datetime64[D]')

plt.plot(obs_dates, linear_data, '-o',
        obs_dates, quadratic_data, '-o')

[<matplotlib.lines.Line2D at 0x2a769ac3748>,
 <matplotlib.lines.Line2D at 0x2a769af5fc8>]
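
The conversion with **to_datetime()** is not shown in the cell above. A sketch of how it might look follows, assuming pandas is installed and the `Agg` backend is used. The name `converted` is chosen here for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

linear_data = np.array([1, 2, 3, 4, 5, 6, 7, 8])
quadratic_data = linear_data ** 2

obs_dates = np.arange("2020-01-01", "2020-01-09", dtype="datetime64[D]")

# Convert the NumPy datetime64 array into a pandas DatetimeIndex.
converted = pd.to_datetime(obs_dates)

fig, ax = plt.subplots()
ax.plot(converted, linear_data, "-o",
        converted, quadratic_data, "-o")

print(converted[0], converted[-1])
```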

In [39]:
plt.gca().xaxis.get_children()

[Text(0.5, 26.430785546037878, ''),
 Text(1, 27.964350305663206, ''),
 <matplotlib.axis.XTick at 0x2a769abee48>,
 <matplotlib.axis.XTick at 0x2a769806088>,
 <matplotlib.axis.XTick at 0x2a769af59c8>,
 <matplotlib.axis.XTick at 0x2a769d89388>,
 <matplotlib.axis.XTick at 0x2a769d89a48>,
 <matplotlib.axis.XTick at 0x2a769d8d108>,
 <matplotlib.axis.XTick at 0x2a769d8d808>,
 <matplotlib.axis.XTick at 0x2a769d8d348>]

The tick labels show the dates. Rotating them keeps them from overlapping.

In [40]:
x = plt.gca().xaxis

for item in x.get_ticklabels():
    item.set_rotation(45)

In [41]:
plt.subplots_adjust(bottom=0.25)

Matplotlib supports a subset of LaTeX math markup, so we can use mathematical formulas in titles.

In [43]:
ax = plt.gca()
ax.set_xlabel("Date")
ax.set_ylabel("Units")
ax.set_title("Quadratic vs. Linear performance")

Text(0.5, 1, 'Quadratic vs. Linear performance')

In [44]:
ax.set_title('Quadratic ($x^2$) vs. Linear ($x$) performance')

Text(0.5, 1, 'Quadratic ($x^2$) vs. Linear ($x$) performance')

# Bar Charts
For bar charts, we pass in the **x positions** of the bars and **the heights of the bars**.
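
A minimal sketch of such a call, reusing the _linear_data_ values from the line plot section. The bar width of 0.3 and the `Agg` backend are assumptions made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

linear_data = np.array([1, 2, 3, 4, 5, 6, 7, 8])
xvals = range(len(linear_data))  # x positions 0..7

fig, ax = plt.subplots()

# First argument: x positions; second argument: bar heights.
bars = ax.bar(xvals, linear_data, width=0.3)

print(len(bars), bars[-1].get_height())
```

**bar()** returns a container with one rectangle per bar, so individual bars can be inspected or restyled afterwards.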
