
# Matplotlib

**Matplotlib** is a Python plotting library built on NumPy.
Most of its functionality is accessed through the **pyplot** interface.

In [None]:
# The following is a special notebook command that specifies how Matplotlib
# figures are displayed in the notebook
%matplotlib inline
# The standard pyplot import
from matplotlib import pyplot as plt

A basic concept in Matplotlib is the **figure**, which is the window where everything is drawn. Inside a figure there may be one or more **axes**, which contain the graphical objects: axis, lines, shapes, etc.

Figures are created with the *figure()* command. You do not have to add an axes manually: the first plotting command issued on a figure that has none (for example *plot()*) creates one automatically.
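
As a small illustration of how these objects nest (a sketch, not part of the original lesson): a figure keeps the list of its axes in `fig.axes`, and each axes keeps the lines drawn on it in `ax.lines`. `plt.gca()` returns the currently active axes.

```python
from matplotlib import pyplot as plt

plt.close('all')                  # start from a clean state
fig = plt.figure()
plt.plot([0, 1, 2], [0, 1, 4])    # draws one line on the current axes
ax = plt.gca()                    # the axes that plot() drew on

n_axes = len(fig.axes)            # axes contained in the figure
n_lines = len(ax.lines)           # lines contained in the axes
print(n_axes, n_lines)
```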

You can create multiple figures, but only one of them can be active at a time (by default the last that was created). plt commands always act on the currently active figure.

By default, Matplotlib assigns an increasing number to each newly created figure (starting from 1), which acts as an id for the figure; you can specify a different number upon creation.

If you call *figure()* with the number of an existing figure, no new figure is created; that figure becomes the currently active one instead.
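
The numbering rules above can be checked with a short sketch. It assumes no other figures are open, which is why it starts with `plt.close('all')`; `plt.get_fignums()` lists the numbers of all open figures.

```python
from matplotlib import pyplot as plt

plt.close('all')                 # close existing figures so numbering restarts at 1
fig_a = plt.figure()             # gets number 1
fig_b = plt.figure()             # gets number 2 and becomes the active figure
active_after_create = plt.gcf().number

plt.figure(1)                    # existing number: re-activates fig_a, creates nothing
reactivated = plt.gcf() is fig_a

plt.figure(10)                   # explicit, unused number: creates a new figure
fignums = plt.get_fignums()      # numbers of all open figures
print(active_after_create, reactivated, fignums)
plt.close('all')
```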

The basic command for plotting data in pyplot is **plot()**, which takes as input two arrays of the same size, an optional format string, and any number of keyword arguments that set the appearance of the plot.
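
A format string is a shorthand for the most common keyword arguments. The sketch below (illustrative data, not from the lesson) draws the same style twice, once with the format string `'g--o'` and once with explicit keywords, and reads the resulting properties back from the returned line objects.

```python
from matplotlib import pyplot as plt

plt.close('all')
x = [0, 1, 2, 3]
y = [0, 1, 4, 9]

# Format string: green ('g'), dashed line ('--'), circle markers ('o')
line_fmt, = plt.plot(x, y, 'g--o')
# The same style spelled out with keyword arguments
line_kw, = plt.plot(x, y, color='g', linestyle='--', marker='o')

# Both lines report the same color, line style and marker
for line in (line_fmt, line_kw):
    print(line.get_color(), line.get_linestyle(), line.get_marker())
```

Keyword arguments are more verbose but also more flexible, since they accept any named color and properties that the format string cannot express.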

In [None]:
from matplotlib import pyplot as plt
import numpy as np

# Generate some data
x = np.linspace(0., 10., 20)
y = 3 * x - 1
z = x**2

# Create a new (empty) figure.
plt.figure()
# Let's inspect its number: gcf (get current figure) returns the active figure
print(plt.gcf().number)

# Draw the data points. You can set the line and marker properties using
# the format string, or by passing keyword arguments
plt.plot(x, y, 'r^', label='straight line') # 'r^' -> red triangle markers
plt.plot(x, z, linestyle='dashed', color='blue', marker='o', label='parabola')

# Matplotlib will adjust the axis range so that all data points are shown
# If you are not happy with that, you can manually set the limits
plt.xlim(0., 10.)
plt.ylim(0., 100.)

# Give a label to the axis
plt.xlabel('x')
plt.ylabel('f(x)')

# Draw the legend (it automatically uses the label of each plot)
plt.legend()

# Set the figure title
plt.title('My data')

# Add a grid on the background
plt.grid(True)

# Save the figure
plt.savefig('data_image.png')

# Draw all the figures
plt.show()
# show() should only be called once, at the very end of your program.

In [None]:
# Let's check that the image file is there using a shell command
!ls -hl data_image.png
# And open it
from IPython.display import Image
Image('data_image.png')

You can create a figure with multiple panels using the *subplot()* command.
The syntax is *subplot(rows, columns, panel number)*, where the last argument selects the currently active panel; panels are numbered from 1, row by row.
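
To see how panel numbers map to positions in the grid, here is a minimal sketch that walks through every panel of a 2 x 3 layout with `plt.subplot()` and writes its number in the title. The `tight_layout()` call is an addition to keep titles and tick labels from overlapping.

```python
from matplotlib import pyplot as plt

plt.close('all')
rows, cols = 2, 3
axes = []
for panel in range(1, rows * cols + 1):      # panel numbers start at 1
    ax = plt.subplot(rows, cols, panel)      # select (and create) the panel
    ax.set_title(f'panel {panel}')           # label it with its number
    axes.append(ax)

plt.tight_layout()                            # avoid overlapping titles and ticks
n_panels = len(plt.gcf().axes)
print(n_panels)
```

Panels 1 to 3 fill the top row from left to right and panels 4 to 6 fill the bottom row.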

In [None]:
from matplotlib import pyplot as plt
import numpy as np

x = np.linspace(0., 1., 20)
# Create a figure divided into 2 x 3 panels and select the first one
# We don't need to call figure(), as it is automatically done for us
plt.subplot(2, 3, 1)
# Draw some data
plt.plot(x, np.exp(x), 'b^')
# Select different panels and draw something else
plt.subplot(2, 3, 5)
plt.plot(x, np.sin(x), 'g.')
plt.subplot(2, 3, 3)
plt.plot(x, np.cos(x), 'k-')
# Show
plt.show()

## Histograms

Histograms are a useful way to represent data, so it is worth seeing how they can be created and plotted with NumPy and Matplotlib. A histogram is characterized by its bins, that is, the intervals into which the x axis is divided; the y axis shows the number of values (or entries) falling into each bin.
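
The relation between bins and entries can be made concrete with a tiny hand-checkable sketch. It uses `np.histogram()` to compute the entries and bin edges without drawing anything, then draws the result with `plt.bar()`; the six input values are made up for illustration.

```python
import numpy as np
from matplotlib import pyplot as plt

plt.close('all')
values = np.array([0.5, 1.5, 1.7, 2.2, 2.9, 3.0])
# Three equal-width bins between 0 and 3 -> four bin edges
entries, edges = np.histogram(values, bins=3, range=(0., 3.))
print(edges)     # bin edges: one more than the number of bins
print(entries)   # number of values falling into each bin

# Draw the pre-computed histogram as bars spanning each bin
plt.bar(edges[:-1], entries, width=np.diff(edges), align='edge')
plt.xlabel('x')
plt.ylabel('entries/bin')
```

Each bin includes its left edge and excludes its right edge, except the last one, which includes both; this is why the value 3.0 is counted in the third bin.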

In [None]:
from matplotlib import pyplot as plt
import numpy as np
# Generate 5000 random values with gaussian distribution
x = np.random.normal(loc=1., scale=2., size=5000)
# Automatic binning: we specify the number of bins
# plt.hist() returns the entries in each bin, the bin edges and the graphical
# objects
entries, bins, patches = plt.hist(x, bins=15, color='green',
                                  edgecolor='none', alpha=0.3)
print(bins)
print(entries)
# We can also manually specify the binning
plt.hist(x, bins=np.linspace(-5., 7., 41), histtype='step', color='steelblue')

# Always set the axis labels!
plt.xlabel('x')
plt.ylabel('entries/bin')

# Show
plt.show()

In [None]:
# 2d histograms
from matplotlib import pyplot as plt
import numpy as np

# Generate some random data
mean = [0, 0]
cov = [[1, 1], [1, 2]]
x, y = np.random.multivariate_normal(mean, cov, 10000).T

# Draw with hist2d
plt.hist2d(x, y, bins=30, cmap='Blues')
cb = plt.colorbar()
cb.set_label('counts in bin')

# Show
plt.show()
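
Like `plt.hist()`, `plt.hist2d()` returns the binned data together with the graphical object. The following sketch repeats the cell above and inspects the return values; the seeded generator from `np.random.default_rng(0)` is an addition made here so that the result is reproducible.

```python
import numpy as np
from matplotlib import pyplot as plt

plt.close('all')
rng = np.random.default_rng(0)               # seeded generator, for reproducibility
x, y = rng.multivariate_normal([0, 0], [[1, 1], [1, 2]], 10000).T

counts, xedges, yedges, image = plt.hist2d(x, y, bins=30, cmap='Blues')
print(counts.shape)                          # one count per (x bin, y bin) cell
print(len(xedges), len(yedges))              # one more edge than bins on each axis
print(counts.sum())                          # every sample lands in some cell
```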