<div class="licence">
<span>Licence CC BY-NC-ND</span>
<span>Valérie Roy</span>
<span><img src="../media/ensmp-25-alpha.png" /></span>
</div>

# data visualization in Python

See also

- https://github.com/rougier/matplotlib-tutorial#introduction
- https://www.labri.fr/perso/nrougier/python-opengl/

## matplotlib

   - the project started in $\approx$ **2003**
   - it is inspired by **MATLAB**
   - it was one of the first **Python data visualization** libraries
   - and it is **for the time being** the **most popular** library
   - there is an **active developer community**
   - its **license** is **based** on the **Python Software Foundation** (PSF) **license**
   - https://matplotlib.org/

   - it is a **2D plotting** library
   - it can be used with the **Jupyter notebook**
   - **3D plotting** is provided by an **mpl toolkit** (*from mpl_toolkits.mplot3d import Axes3D*)

   - it has a **concise syntax**
   - it is rather **simple** and **powerful**
   - it **heavily** uses **numpy** for **performance** on **large arrays**
   - some other libraries are **built** on top of **matplotlib** (e.g. **seaborn**)
   - **pandas** has **wrappers** over **matplotlib**
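
As a minimal sketch of what "wrappers over matplotlib" means: `DataFrame.plot` hands the drawing over to matplotlib and returns a matplotlib `Axes`, which can then be customized as usual. The data frame below is illustrative only, it does not come from the course.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# illustrative data (an assumption, not taken from the course)
df = pd.DataFrame({"x": np.arange(5), "y": np.arange(5) ** 2})

# DataFrame.plot delegates the drawing to matplotlib
# and returns a matplotlib Axes object
ax = df.plot(x="x", y="y", kind="line")

# the returned Axes accepts the usual matplotlib customizations
ax.set_title("plotted through pandas")
```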

   - it offers the **classic** functionalities:  
    **plots**, **histograms**, **bar charts**, **scatterplots**, ...
   - https://matplotlib.org/gallery/index.html
   
   
   - that you can **customize** with **texts**, **grids**, **labels**, **legends**, ...   
   - **parameters** control **colors**, **line styles**, **font properties**, **axes properties**, ...
   
   
   - https://matplotlib.org/api/pyplot_summary.html

##  imports of the **libraries**
   - **plots** are done by the *matplotlib.pyplot* **functions**
   - **by convention** *matplotlib.pyplot* is **named** *plt*

In [None]:
# pyplot is the interface to the matplotlib plotting library
import matplotlib.pyplot as plt

# this is the magic to obtain plots in the notebook
# probably not required in recent versions of Jupyter
%matplotlib inline

In [None]:
import pandas as pd
import numpy as np

## line plot and scatter plot

   - we **create** an **array** *x* with **values** linearly spaced between $0$ and $2\pi$
   - we **get** a *numpy.ndarray* 

In [None]:
x = np.linspace(0, 2*np.pi, 50)

   - we **create** an **array** *y* by **computing** the **sine** of the values of *x*
   - we **create** an **array** *z* by **computing** the **cosine** of the values of *x*
   - we **get** two *numpy.ndarray* 

In [None]:
y = np.sin(x)
z = np.cos(x)

   - *pyplot.plot(x, y)* **plots** **y** versus **x** as a line; you can vary the **line width**, **color**, etc.
   - *pyplot.scatter(x, y)* draws a **scatter plot** of **y** versus **x**; you can vary the **marker** **shape**, **size** and **color**

In [None]:
plt.plot(x, y)
plt.scatter(x, z);

   - we obtain a **plot** with the **default** settings

### improving the plot

with **parameters** and **methods**, we can **add** to the **plot**:
   - a **title**
   - **labels** on the **axes**
   - **labels** for the **curves** (shown in the **legend**)
   - with different **font sizes**...
   
Note that you can use LaTeX markup in strings (e.g. "2 \pi" between dollar signs renders as $2 \pi$).

In [None]:
x = np.linspace(0, 2*np.pi, 50)
y = np.sin(x)
z = np.cos(x)

# raw string, so that the backslash of the LaTeX markup is kept as-is
plt.title(r'trigonometric functions of angles between 0 and $2 \pi$', fontsize=20)

plt.xlabel('x coordinate', fontsize=18) # name of axis x
plt.ylabel('y coordinate') # name of axis y

plt.plot(x, y, label='sine')
plt.scatter(x, z, label='cosine')

plt.legend(fontsize=12) # show the legend, i.e. labels of lines/markers
plt.show() # not mandatory in Jupyter notebooks

   - we can **vary** marker, color, size, linewidth

In [None]:
x = np.arange(1, 10)
y = np.power(x, 2)
plt.plot(x, y, color='orange',
         linestyle='--',
         linewidth=3);

In [None]:
# green, dashed line
plt.plot(x, y, 'g--', linewidth=4)
# red, square marker
plt.plot(x, y, 'rs', markersize=15)
# yellow, triangle marker
plt.plot(x, y, 'y^', markersize=6);
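
The short format strings used above (`'g--'`, `'rs'`, `'y^'`) combine a color letter with a line style or a marker. The sketch below, on the same illustrative data, compares the shorthand with the equivalent keyword arguments; the vertical offsets are only there to keep the curves apart.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 10)
y = np.power(x, 2)

# shorthand: 'g--' means green color, dashed line
(short,) = plt.plot(x, y, 'g--')

# the same style spelled out with keyword arguments
(explicit,) = plt.plot(x, y + 10, color='g', linestyle='--')

# 'rs' means red square markers, with no connecting line
(markers,) = plt.plot(x, y + 20, 'rs')
```

`plt.plot` returns a list of `Line2D` objects, hence the tuple unpacking; their `get_color`, `get_linestyle` and `get_marker` methods let you check what the shorthand was parsed into.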

varying colors and sizes with *plt.scatter*

parameter `c`:
   - a list of numbers is mapped to **colors**
   - through an underlying **colormap** (parameter `cmap`)
   - https://matplotlib.org/users/colormaps.html


parameter `s`:
   - a list of numbers is mapped to the **sizes** of the markers

In [None]:
x = np.arange(10)
y = x + 10 * np.random.randn(10)

# random values for colors
z = np.random.randint(100, 10000, 10)
# random values for size 
v = np.random.randint(100, 5000, 10)

plt.scatter(x, y, marker='o',
            c=z,
            s=v,
            cmap='Blues');
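
To make the mapping from the values of `c` to colors visible, a colorbar can be attached to the scatter plot. This is a sketch under assumptions: the seeded random generator and the colorbar label are additions for reproducibility and readability, not part of the original example.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # seeded so the figure is reproducible
x = np.arange(10)
y = x + 10 * rng.standard_normal(10)

z = rng.integers(100, 10000, 10)  # values mapped to colors
v = rng.integers(100, 5000, 10)   # marker areas, in points squared

points = plt.scatter(x, y, marker='o', c=z, s=v, cmap='Blues')

# the colorbar shows how the values of c are mapped to colors
cbar = plt.colorbar(points)
cbar.set_label('value passed as c')
```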

setting the limits of the **abscissa**
   - here from $-2\pi$ to $2\pi$ 
   - (the same for the **ordinate**, with *plt.ylim*)

In [None]:
plt.xlim(-2*np.pi, 2*np.pi)
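
A short sketch of both axes together: `plt.ylim` works like `plt.xlim`, and calling either function without arguments returns the current limits. The value of 1.5 for the ordinate is an arbitrary choice for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-2*np.pi, 2*np.pi, 100)
plt.plot(x, np.sin(x))

# set the limits of both axes
plt.xlim(-2*np.pi, 2*np.pi)
plt.ylim(-1.5, 1.5)

# without arguments, both functions return the current limits
xmin, xmax = plt.xlim()
ymin, ymax = plt.ylim()
```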


setting the number of **ticks** on the **abscissa** (the same for the **ordinate**)

In [None]:
plt.xticks(np.linspace(-2*np.pi, 2*np.pi, 10, endpoint=True));

setting **tick labels** (the same for the **ordinate**)


In [None]:
# raw strings keep the backslashes of the LaTeX markup as-is
plt.xticks([-2*np.pi, -np.pi, 0, np.pi/2, np.pi, 2*np.pi],
           [r'$-2\pi$', r'$-\pi$', '0', r'$\pi/2$', r'$\pi$', r'$2\pi$']);

example of the whole figure

In [None]:
x = np.linspace(-np.pi, np.pi, 50)
y = np.sin(x)

plt.xlim(-4, 4) 
plt.xticks(np.linspace(-4, 4, 10, endpoint=True))

plt.ylim(-1, 1)
plt.yticks(np.linspace(-1, 1, 5, endpoint=True))  # ticks kept inside the limits set by ylim

plt.plot(x, y);

### writing **text**   [**OPTIONAL SLIDE**]

In [None]:
plt.text(0.5, 0.5, 'I wrote here !', fontsize=20, bbox=dict(facecolor='red', alpha=0.5));

annotating features on a plot [**OPTIONAL SLIDE**]
   - with *plt.annotate*, **text** can be used to **point to** a **feature** of the **plot**
   - you give **two points**:
      - the **location** being **annotated** (parameter *xy*)
      - the **location** of the **text** (parameter *xytext*)
   - you can add an **arrow** (parameter *arrowprops*) that points from the **text** toward the annotated **point**

annotating example  [**OPTIONAL SLIDE**]

In [None]:
plt.scatter([0, 1, 2], [0, 1, 2], color='magenta')

plt.annotate('a point', xy=(0, 0), xytext=(0.25, 0.251),
             arrowprops=dict())

plt.annotate('not a point', xy=(1, 2), xytext=(0.15, 1.75),
             arrowprops=dict(arrowstyle='fancy'))

### you can **save** the **plots**  [**OPTIONAL SLIDE**]

In [None]:
# uncomment to get the doc
#plt.savefig?

In [None]:
x = np.linspace(-10, 10, 50)
y = np.power(x, 2)
plt.title('$y = x^2$', fontsize=20)
plt.xlabel('x')
plt.ylabel('$x^2$')
plt.plot(x, y, label='$x^2$')
plt.legend(fontsize=12)

plt.savefig('my_figure.png')
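
The output format is inferred from the file extension, and a few keyword arguments control the result. The sketch below is only an illustration: the temporary directory is an assumption made to avoid cluttering the working directory, and the values of `dpi` and `bbox_inches` are arbitrary.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-10, 10, 50)
plt.plot(x, np.power(x, 2), label='$x^2$')
plt.legend(fontsize=12)

# hypothetical output location: a temporary directory
out_dir = tempfile.mkdtemp()
png_path = os.path.join(out_dir, 'my_figure.png')
pdf_path = os.path.join(out_dir, 'my_figure.pdf')

# raster image: dpi sets the resolution,
# bbox_inches='tight' crops the surrounding white space
plt.savefig(png_path, dpi=150, bbox_inches='tight')

# vector graphics, chosen from the .pdf extension
plt.savefig(pdf_path)
```

Call `plt.savefig` before `plt.show()`: depending on the backend, the figure may already be closed once it has been shown.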

### plotting an array as an **image** [**OPTIONAL SLIDE**]

on a 2D regular raster

In [None]:
# we create an array
i = np.random.random((50, 100)) # 50 rows, 100 columns, values in the half-open interval [0, 1)

In [None]:
# we plot the array as an image
my_map = plt.imshow(i)

In [None]:
# color map, transparency (alpha)
plt.imshow(i, cmap=plt.cm.Blues, alpha=0.5) 
plt.colorbar();
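
Two other parameters of `plt.imshow` are often useful: `vmin` and `vmax` fix the range of values covered by the colormap, and `origin` chooses whether row 0 is drawn at the top or at the bottom. A minimal sketch, with a seeded random array as illustrative data:

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # seeded so the image is reproducible
img = rng.random((50, 100))     # 50 rows, 100 columns, values in [0, 1)

# vmin and vmax fix the range of values covered by the colormap;
# origin='lower' draws row 0 at the bottom instead of the top
im = plt.imshow(img, cmap='Blues', vmin=0, vmax=1, origin='lower')
plt.colorbar(im);
```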

***
this short first introduction to matplotlib can be complemented by the notebooks:
   - `3-13-dataviz-pandas-dataframe-plot-and-boxplot.ipynb`
   - `3-14-dataviz-3D-plots.ipynb`
   - `3-15-customizing-matplotlib.ipynb`
   - `3-16-dataviz-figure-and-axe.ipynb`