# Matplotlib Tutorial - EuroSciPy 2018

Alexandre de Siqueira, University of Campinas

`euroscipy_matplotlib2018.ipynb`

Support material for the Matplotlib Tutorial at EuroSciPy 2018.

* Author: Alexandre de Siqueira
* Contact: afdesiqueira [at] gmail.com

In order to cite this material, please use the reference below (this is a Chicago-like style):

* _de Siqueira, Alexandre Fioravante. "Support material for the Matplotlib Tutorial at EuroSciPy 2018". EuroSciPy. 2018, Aug 29. Access date: < ACCESS DATE >._

Copyright (C) 2018 Alexandre Fioravante de Siqueira

This program is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the Free
Software Foundation, either version 3 of the License, or (at your option)
any later version.

This program is distributed in the hope that it will be useful, but WITHOUT
ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
more details.

You should have received a copy of the GNU General Public License along
with this program. If not, see <http://www.gnu.org/licenses/>.

## What is Matplotlib?

* Python 2D plotting library.
* Produces publication quality figures.
* Can be used in:
    * Python scripts;
    * Python and IPython shells;
    * Jupyter notebook;
    * web application servers;
    * several (four) graphical user interface toolkits.

## Importing some packages

In [None]:
import matplotlib.pyplot as plt
import numpy as np

%matplotlib inline

* Using `%matplotlib inline`, the output of plotting commands is displayed directly below the code cell that produced it. The resulting plots are stored in the notebook document.

## Creating a figure

* A figure is the container that holds your plots.
* `figure()` accepts several (really this time) parameters. Check them at <https://matplotlib.org/api/_as_gen/matplotlib.pyplot.figure.html>.

In [None]:
plt.figure()

* We'll focus on one parameter: `figsize=(width, height)`. `width` and `height` are given in inches.
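
* A minimal sketch of `figsize` in action, assuming a standard Matplotlib install. It reads the size back with `get_size_inches()`, which is not used elsewhere in this tutorial and appears here only to confirm the units:

```python
import matplotlib.pyplot as plt

# Create a figure that is 15 inches wide and 10 inches high.
fig = plt.figure(figsize=(15, 10))

# Read the size back; get_size_inches() returns (width, height).
width, height = fig.get_size_inches()
print(width, height)
```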

## `plot()`, the Swiss Army knife

* In this example, we use the function `np.arange()` to create `x`, an array with the 11 integers from 0 to 10.

In [None]:
x = np.arange(start=0, stop=11)  # stop is excluded, so the last element of x is 10
print(x)
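
* The sketch below illustrates why `stop=11` is needed to reach 10. It also shows `np.linspace()` (used later in this tutorial) as an alternative that includes the stop value; the name `x_alt` is only illustrative:

```python
import numpy as np

# np.arange() excludes the stop value, so stop=11 yields 0, 1, ..., 10.
x = np.arange(start=0, stop=11)
print(len(x), x[0], x[-1])

# np.linspace() includes the stop value and takes the number of points instead.
x_alt = np.linspace(start=0, stop=10, num=11)
print(np.allclose(x, x_alt))
```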

* Let's then plot a sinusoid (a sine wave) using this array.

In [None]:
plt.figure(figsize=(15, 10))  # width = 15 inches; height = 10 inches
plt.plot(x, np.sin(x))

* Notice that our sinusoid isn't "wavy" enough: with so few points, the curve looks jagged. We need more points for that.

In [None]:
x = np.arange(start=0, stop=10.1, step=0.1)
print(x)

In [None]:
plt.figure(figsize=(15, 10))
plt.plot(x, np.sin(x))

## Changing the appearance of your plots

* Let's change the line styles and colors, and add titles and legends to our plots.

### Line styles

* Using the parameter `linestyle`, you can use the following line styles in your plots:

| Character | Description         |
|-----------|---------------------|
| `-`       | Solid line style    |
| `--`      | Dashed line style   |
| `-.`      | Dash-dot line style |
| `:`       | Dotted line style   |
| `steps`   | Stepped line style (removed in Matplotlib 3.3; use `drawstyle='steps'` there) |

* This example shows `log()` using a dash-dot line.

In [None]:
x = np.arange(start=0.1, stop=10.1, step=0.1)
plt.figure(figsize=(15, 10))
plt.plot(x, np.log(x), linestyle='-.')  # changing the line style.
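
* To compare the styles side by side, here is a small sketch that draws the same curve once per style. The vertical offsets and the list `styles` are only illustrative choices:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(start=0.1, stop=10.1, step=0.1)
styles = ['-', '--', '-.', ':']

plt.figure(figsize=(15, 10))
# Shift each curve vertically so the four styles do not overlap.
for offset, style in enumerate(styles):
    plt.plot(x, np.log(x) + offset, linestyle=style)

# Each line object remembers the line style it was drawn with.
drawn_styles = [line.get_linestyle() for line in plt.gca().get_lines()]
print(drawn_styles)
```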

* **Try it yourself!** Fill in the code below to plot `exp()` using a dashed line.

In [None]:
# plt.figure(figsize=(15, 10))
# plt.plot(x, )  # Fill this line

### Markers

* Using the parameter `marker`, you can use the following markers in your plots:

| Character           | Description            |
|---------------------|------------------------|
| `.`                 | Point marker           |
| `,`                 | Pixel marker           |
| `o`                 | Circle marker          |
| `v`                 | Triangle down marker   |
| `^`                 | Triangle up marker     |
| `<`                 | Triangle left marker   |
| `>`                 | Triangle right marker  |
| `1`                 | Tri_down marker        |
| `2`                 | Tri_up marker          |
| `3`                 | Tri_left marker        |
| `4`                 | Tri_right marker       |
| `s`                 | Square marker          |
| `p`                 | Pentagon marker        |
| `*`                 | Star marker            |
| `h` or `H`          | Hexagon marker         |
| `+`                 | Plus marker            |
| `x`                 | X marker               |
| `D`                 | Diamond marker         |
| `d`                 | Thin diamond marker    |
| <code>&#124;</code> | Vertical line marker   |
| `_`                 | Horizontal line marker |

* More at <https://matplotlib.org/api/markers_api.html>.
* This example shows `cos()` using the thin diamond marker. By default, `plot()` also draws a line connecting the markers; to show the markers alone, set the parameter `linestyle` to `''`. For more on that, please check <https://github.com/matplotlib/matplotlib/issues/4338/#issuecomment-93521497>.

In [None]:
x = np.arange(start=0.1, stop=10.1, step=0.1)
plt.figure(figsize=(15, 10))
plt.plot(x, np.cos(x), marker='d', linestyle='')  # changing the marker.
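
* A short sketch of the difference between the two settings, using a coarser `step` so that individual markers are visible. The variable names `with_line` and `markers_only` are only illustrative:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(start=0.1, stop=10.1, step=0.5)

plt.figure(figsize=(15, 10))
# Default line style: markers are drawn on top of a solid connecting line.
with_line, = plt.plot(x, np.cos(x), marker='d')
# Empty line style: only the markers are drawn (shifted down for clarity).
markers_only, = plt.plot(x, np.cos(x) - 1, marker='d', linestyle='')
```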

* **Try it yourself!** Fill in the code below to plot `cos(exp())` using the X marker.

In [None]:
# x = np.arange(start=0.1, stop=4.1, step=0.001)
# plt.figure(figsize=(15, 10))
# plt.plot(x, )  # Fill this line

### Colors

* Using the parameter `color`, you can use the following color abbreviations in your plots:

| Character | Color   |
|-----------|---------|
| `b`       | blue    |
| `g`       | green   |
| `r`       | red     |
| `c`       | cyan    |
| `m`       | magenta |
| `y`       | yellow  |
| `k`       | black   |
| `w`       | white   |

* If the color is the only part of the format string, you can also use hex strings (for example, `#008000`).
* More on colors at <https://matplotlib.org/api/_as_gen/matplotlib.pyplot.plot.html> > Notes > Colors and <https://matplotlib.org/api/colors_api.html#module-matplotlib.colors>.
* This example shows `sin(tan())` using a nice orange.

In [None]:
x = np.arange(start=0.1, stop=2.1, step=0.001)
plt.figure(figsize=(15, 10))
plt.plot(x, np.sin(np.tan(x)), color='#FF4300')  # changing the color.
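
* To see how Matplotlib interprets a color, the sketch below converts an abbreviation and a hex string to RGB triples with values in `[0, 1]`. It uses `matplotlib.colors.to_rgb()`, which this tutorial does not otherwise need:

```python
from matplotlib import colors as mcolors

# The single-letter abbreviation 'g' resolves to a medium green.
green_rgb = mcolors.to_rgb('g')
print(green_rgb)

# A hex string gives each channel as two hexadecimal digits (00 to FF).
orange_rgb = mcolors.to_rgb('#FF4300')
print(orange_rgb)
```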

* **Try it yourself!** Fill in the code below to plot `cos()**5` using a color different from the default.

In [None]:
# x = np.arange(start=0.1, stop=10.1, step=0.001)
# plt.figure(figsize=(15, 10))
# plt.plot(x, )  # Fill this line

* When using markers, `color` changes the color of the entire marker. To change only the inside color, use the parameter `markerfacecolor`. Check this example, which uses only `color`:

In [None]:
x = np.arange(start=0.05, stop=10.1, step=0.01)
plt.figure(figsize=(15, 10))
plt.plot(x, np.cos(np.log(x/16)),
         linestyle='',
         marker='*',
         color='y',
         markersize=10)

* Now the same example, combining `color` and `markerfacecolor`:

In [None]:
x = np.arange(start=0.05, stop=10.1, step=0.01)
plt.figure(figsize=(15, 10))
plt.plot(x, np.cos(np.log(x/16)),
         linestyle='',
         marker='*',
         color='k',
         markersize=10,
         markerfacecolor='y')

* **Try it yourself!** Change the previous plot using colors different from the ones used.

In [None]:
# x = np.arange(start=0.05, stop=10.1, step=0.01)
# plt.figure(figsize=(15, 10))
# plt.plot(x, )

### Width of the lines, size of the markers

* To increase or decrease the thickness of the plot lines, or the size of each marker, use the parameters `linewidth` and `markersize`, respectively. Using our sinusoids, for example:

In [None]:
plt.figure(figsize=(15, 10))
plt.plot(x, np.sin(x), linestyle='--', linewidth=10, color='r')

In [None]:
# Note: a cosine wave is also considered a sinusoid; it has the same shape,
# being only out of phase when compared to the sine wave.
# cos(x - pi/2) == sin(x)

x = np.arange(start=0.1, stop=10.1, step=0.1)
plt.figure(figsize=(15, 10))
plt.plot(x, np.cos(x), marker='d', linestyle='', markersize=15, color='g')

* **Try it yourself!** Change the previous plots using different sizes.

In [None]:
# x = np.arange(start=0.1, stop=10.1, step=0.1)
# plt.figure(figsize=(15, 10))
# plt.plot(x, )

### Limits in `x` and `y`

* To change the limits in your plot, use `xlim((xmin, xmax))` for `x` and `ylim((ymin, ymax))` for `y`. For example:

In [None]:
x = np.arange(start=0.1, stop=10, step=0.0001)
plt.figure(figsize=(15, 10))
plt.plot(x, np.sin(np.tan(x)), linewidth=5, color='m')
plt.xlim((1, 1.5))

In [None]:
x = np.arange(start=0.05, stop=10.1, step=0.01)
plt.figure(figsize=(15, 10))
plt.plot(x, np.cos(np.log(x/16)), linewidth=5, color='c')
plt.ylim((-1, 0))

* You can also use `axis((xmin, xmax, ymin, ymax))` to change all limits at once:

In [None]:
x = np.arange(start=0.1, stop=10, step=0.0001)
plt.figure(figsize=(15, 10))
plt.plot(x, np.sin(np.tan(x)), linestyle='--', linewidth=5, color='k')
plt.axis((1, 1.5, 0, 1.5))
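
* The same functions can also report the current limits. A minimal sketch, assuming that calling `xlim()` and `ylim()` without arguments returns the limits instead of changing them (a coarser `step` is used here only to keep it light):

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(start=0.1, stop=10, step=0.001)
plt.figure(figsize=(15, 10))
plt.plot(x, np.sin(np.tan(x)))

# Set all four limits at once...
plt.axis((1, 1.5, 0, 1.5))

# ...and read them back as (min, max) pairs.
x_limits = plt.xlim()
y_limits = plt.ylim()
print(x_limits, y_limits)
```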

### Adding a grid

* To add a grid to your plots, use `grid(True)`.

In [None]:
x = np.arange(start=0.1, stop=2.1, step=0.001)
plt.figure(figsize=(15, 10))
plt.plot(x, np.cos(np.sinh(x)))
plt.grid(True)

* **Try it yourself!** Limit the following plot to `(0, 5)` in `x` and `(-0.5, 0.5)` in `y`. Also add a grid.

In [None]:
# x = np.arange(start=-7.1, stop=7.1, step=0.001)
# plt.figure(figsize=(15, 10))
# plt.plot(x, np.cos(np.cosh(x)))

### Adding a title and axes labels

* You can add a title to your plots using `title()`. For example:

In [None]:
x = np.arange(start=0, stop=10.1, step=0.1)
plt.figure(figsize=(15, 10))
plt.plot(x, np.sin(x))
plt.xlim((0, 2*np.pi))
plt.title(r'A sinusoid in the interval $[0, 2\pi]$', fontsize=25)  # It accepts LaTeX entries! The raw string (r'...') keeps the backslash intact.

* The parameter `fontsize` controls the font size.
* To add labels to your axes, use `xlabel()` for `x` and `ylabel()` for `y`.

In [None]:
x = np.arange(start=0, stop=10.1, step=0.1)
plt.figure(figsize=(15, 10))
plt.plot(x, np.sin(x))
plt.xlim((0, 2*np.pi))
plt.title(r'A sinusoid in the interval $[0, 2\pi]$', fontsize=25)
plt.xlabel('Periodic time (t)', fontsize=22)
plt.ylabel('Amplitude', fontsize=22)

* **Try it yourself!** Set labels and a title for some of our previous plots. Also use the parameter `fontsize` to increase the font size.
* Check how you could add different text in the plot using <https://matplotlib.org/users/text_intro.html>.

In [None]:
# x = np.arange(start=0, stop=10.1, step=0.1)
# plt.figure(figsize=(15, 10))
# plt.plot(x, np.sin(x))
# plt.xlim((0, 2*np.pi))
# plt.title()
# plt.xlabel()
# plt.ylabel()

### Adding legends

* You can add a legend to your plot using `legend()`, as in the following example:

In [None]:
x = np.arange(start=0, stop=10.1, step=0.1)
plt.figure(figsize=(15, 10))
plt.plot(x, np.sin(x))
plt.xlim((0, 2*np.pi))
plt.legend([r'$\sin(x), x \in [0, 2\pi]$'], fontsize=20)

* You can also change the location of your legend, using the strings or codes below:

| Location String | Location Code |
|-----------------|---------------|
| `best`          | 0             |
| `upper right`   | 1             |
| `upper left`    | 2             |
| `lower left`    | 3             |
| `lower right`   | 4             |
| `right`         | 5             |
| `center left`   | 6             |
| `center right`  | 7             |
| `lower center`  | 8             |
| `upper center`  | 9             |
| `center`        | 10            |

* For instance, the legend on the previous plot could be placed in the lower left:

In [None]:
x = np.arange(start=0, stop=10.1, step=0.1)
plt.figure(figsize=(15, 10))
plt.plot(x, np.sin(x))
plt.xlim((0, 2*np.pi))
plt.legend([r'$\sin(x), x \in [0, 2\pi]$'], fontsize=20, loc='lower left')
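
* An alternative sketch: instead of passing a list of names to `legend()`, each curve can carry its own name through the `label` parameter of `plot()`. The variable `entries` is only there to inspect the result:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(start=0, stop=10.1, step=0.1)
plt.figure(figsize=(15, 10))

# Attach a label to each curve when it is drawn.
plt.plot(x, np.sin(x), label=r'$\sin(x)$')
plt.plot(x, np.cos(x), label=r'$\cos(x)$')
plt.xlim((0, 2*np.pi))

# legend() without names collects the labels given above.
legend = plt.legend(fontsize=20, loc='lower left')
entries = [text.get_text() for text in legend.get_texts()]
print(entries)
```

Keeping the label next to the data makes it harder to mismatch names and curves when plots are reordered.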

## Adding several plots in the same window

* Now, let's add more than one plot in the same window.
* We can do this in two ways:
    * Passing several `x, y` pairs to the same `plot()` call;
    * Calling `plot()` several times.
* The first case:

In [None]:
x = np.arange(start=-2*np.pi, stop=2*np.pi, step=0.1)
plt.figure(figsize=(15, 10))
plt.plot(x, np.sin(x), x, np.cos(x))
plt.legend([r'$\sin(x)$', r'$\cos(x)$'], fontsize=20)  # One legend with one entry per curve

* The second case:

In [None]:
x = np.arange(start=-2*np.pi, stop=2*np.pi, step=0.1)
plt.figure(figsize=(15, 10))
plt.plot(x, np.sin(x))
plt.plot(x, np.cos(x))
plt.legend([r'$\sin(x)$', r'$\cos(x)$'], fontsize=20)

* Now, let's put lots of plots in one window, using everything we studied so far:

In [None]:
x = np.arange(start=0.01, stop=4*np.pi, step=0.01)
plt.figure(figsize=(15, 10))

plt.plot(x, np.sin(x), linestyle='-', linewidth=3)
plt.plot(x, np.cos(x), linestyle='--', linewidth=3)
plt.plot(x, np.tan(x), linestyle='-.', linewidth=3)
plt.plot(x, np.exp(x), linestyle=':', linewidth=3)
plt.plot(x, np.log(x), drawstyle='steps', linewidth=3)  # linestyle='steps' only works in Matplotlib < 3.3

plt.legend(['Sine', 'Cosine', 'Tangent', 'Exponential', 'Logarithm'], fontsize=15, loc='upper right')

plt.axis([0, 10, -2, 4])

* **Try it yourself!** Plot different functions in the last example.

In [None]:
# x = np.arange(start=0.01, stop=4*np.pi, step=0.01)
# plt.figure(figsize=(15, 10))

# plt.plot()
# plt.plot()
# plt.plot()
# plt.plot()
# plt.plot()

# plt.legend()

# plt.axis()

## Adding plots in different areas of the same window

* Instead of drawing all plots in the same area of the window, we can use subplots.
* The function `subplots()` divides the window according to the parameters `nrows` and `ncols`, which receive the number of rows and columns, respectively.

In [None]:
x = np.arange(start=-2*np.pi, stop=2*np.pi, step=0.01)
fig, ax = plt.subplots(nrows=2, ncols=1, figsize=(15, 10))

* Notice that we divided the window into two rows. They are accessed through the variable `ax`:

In [None]:
fig, ax = plt.subplots(nrows=2, ncols=1, figsize=(15, 10))

# ax[0], plot number one
ax[0].plot(x, np.sin(x), linestyle=':', color='r', linewidth=3)
ax[0].set_title(r'$\sin(x)$')
ax[0].set_xlabel('Periodic time (t)')
ax[0].set_ylabel('Amplitude')

# ax[1], plot number two
ax[1].plot(x, np.cos(x), linestyle='-.', color='m', linewidth=3)
ax[1].set_title(r'$\cos(x)$')
ax[1].set_xlabel('Periodic time (t)')
ax[1].set_ylabel('Amplitude')

* Now the same plots in columns, instead of rows:

In [None]:
fig, ax = plt.subplots(nrows=1, ncols=2, figsize=(15, 8))

# ax[0], plot number one
ax[0].plot(x, np.sin(x), linestyle=':', color='r', linewidth=3)
ax[0].set_title(r'$\sin(x)$')
ax[0].set_xlabel('Periodic time (t)')
ax[0].set_ylabel('Amplitude')

# ax[1], plot number two
ax[1].plot(x, np.cos(x), linestyle='-.', color='m', linewidth=3)
ax[1].set_title(r'$\cos(x)$')
ax[1].set_xlabel('Periodic time (t)')
ax[1].set_ylabel('Amplitude')
ax[1].grid(True)

* When the window has more than one row and more than one column, `ax` is indexed with two coordinates: the first represents the row, and the second the column:

In [None]:
x = np.arange(start=0.01, stop=4*np.pi, step=0.1)
fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(15, 15))

ax[0, 0].plot(x, np.sin(x), linestyle=':', color='r', linewidth=3)
ax[0, 1].plot(x, np.cos(x), drawstyle='steps', color='m', linewidth=3)  # linestyle='steps' only works in Matplotlib < 3.3
ax[1, 0].plot(x, np.tan(x), linestyle='--', color='c', linewidth=3)
ax[1, 1].plot(x, np.exp(x), linestyle='-.', color='g', linewidth=3)
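
* Since `ax` is a NumPy array here, a loop can replace the repeated indexing. A possible sketch, where the list `functions` and the use of each function's name as a title are only illustrative:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(start=0.01, stop=4*np.pi, step=0.1)
fig, ax = plt.subplots(nrows=2, ncols=2, figsize=(15, 15))
print(ax.shape)  # (rows, columns)

functions = [np.sin, np.cos, np.tan, np.exp]

# ax.flat walks through the subplots row by row.
for axis, func in zip(ax.flat, functions):
    axis.plot(x, func(x), linewidth=3)
    axis.set_title(func.__name__)
```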

* For more complex subplot layouts, please check Jake VanderPlas's Python Data Science Handbook: <https://jakevdp.github.io/PythonDataScienceHandbook/04.08-multiple-subplots.html>.

## Different types of plots

* There are other functions you could use for your plots! Let's check some of them.

### Histograms

In [None]:
rand_arr = np.random.randint(low=0, high=101, size=100)
plt.figure(figsize=(15, 10))
plt.hist(rand_arr)
plt.title('Frequency distribution of random integers in $[0, 100]$')
plt.ylabel('Frequency')

* Your histogram will probably differ from mine. To make the results reproducible, let's set a seed using `np.random.seed()`:

In [None]:
np.random.seed(seed=0)  # Setting the seed 0
rand_arr = np.random.randint(low=0, high=101, size=100)

plt.figure(figsize=(15, 10))
plt.hist(rand_arr)
plt.title('Frequency distribution of random integers in $[0, 100]$')
plt.ylabel('Frequency')
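
* A quick check of what the seed does: resetting it to the same value makes `randint()` produce the same numbers again. The names `first` and `second` are only illustrative:

```python
import numpy as np

np.random.seed(seed=0)
first = np.random.randint(low=0, high=101, size=100)

# Resetting the same seed restarts the random sequence.
np.random.seed(seed=0)
second = np.random.randint(low=0, high=101, size=100)

print(np.array_equal(first, second))
```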

* You can choose the histogram colors using the parameters `facecolor` and `edgecolor`:

In [None]:
np.random.seed(seed=0)
rand_arr = np.random.randint(low=0, high=101, size=100)

plt.figure(figsize=(15, 10))
plt.hist(rand_arr, facecolor='lightblue', edgecolor='w')
plt.title('Frequency distribution of random integers in $[0, 100]$')
plt.ylabel('Frequency')
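
* `hist()` also returns the data behind the drawing. The sketch below keeps those return values and sets the number of bars with the parameter `bins`, which the examples above leave at its default; the choice of 5 bins is arbitrary:

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(seed=0)
rand_arr = np.random.randint(low=0, high=101, size=100)

plt.figure(figsize=(15, 10))
# counts: height of each bar; bin_edges: where the bars start and end;
# patches: the drawn rectangles.
counts, bin_edges, patches = plt.hist(rand_arr, bins=5,
                                      facecolor='lightblue', edgecolor='w')
print(counts.sum(), len(bin_edges))
```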

* **Try it yourself!** Define different seeds, face and edge colors for the previous histogram.

### Bars

* You can use the function `bar()` to obtain simple bar plots. Here, each row of the input array is one group of bars, and each column is one bar within that group. For example:

In [None]:
np.random.seed(seed=0)
counts = np.random.randint(low=0, high=101, size=(2, 5))

# Defining where counts will be plotted.
x_group1 = np.asarray((1.0, 3.0, 5.0, 7.0, 9.0))
x_group2 = x_group1 + 0.5
features = ('Year 1', 'Year 2', 'Year 3', 'Year 4', 'Year 5')

plt.figure(figsize=(15, 10))

# Now, defining where the ticks will be drawn.
# The function xticks() receives the location and the labels of the ticks.
plt.xticks((x_group1 + x_group2)/2, features)

plt.bar(x_group1, counts[0], width=0.5)
plt.bar(x_group2, counts[1], width=0.5)
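
* The bar positions above can also be computed instead of typed by hand. A sketch under the same layout (bars 0.5 wide, groups 2 units apart); the names `n_years`, `width` and `tick_positions`, and the labels `'Group 1'` and `'Group 2'`, are only illustrative:

```python
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(seed=0)
counts = np.random.randint(low=0, high=101, size=(2, 5))

n_years = counts.shape[1]
width = 0.5

# Left bars at 1, 3, 5, 7, 9; right bars one bar width further.
x_group1 = 1.0 + 2.0 * np.arange(n_years)
x_group2 = x_group1 + width
# Ticks sit halfway between the two bars of each pair.
tick_positions = (x_group1 + x_group2) / 2

plt.figure(figsize=(15, 10))
plt.xticks(tick_positions, ['Year {}'.format(i + 1) for i in range(n_years)])
plt.bar(x_group1, counts[0], width=width, label='Group 1')
plt.bar(x_group2, counts[1], width=width, label='Group 2')
plt.legend(fontsize=15)
```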

* **Try it yourself!** Define different colors and values for each bar. Also add a new variable.

In [None]:
np.random.seed(seed=0)
counts = np.random.randint(low=0, high=101, size=(2, 5))

# Defining where counts will be plotted.
x_group1 = np.asarray((1.0, 3.0, 5.0, 7.0, 9.0))
x_group2 = x_group1 + 0.5
features = ('Year 1', 'Year 2', 'Year 3', 'Year 4', 'Year 5')

plt.figure(figsize=(15, 10))

#plt.xticks((x_group1 + x_group2)/2, features)

#plt.bar(x_group1, counts[0], width=0.5)
#plt.bar(x_group2, counts[1], width=0.5)

### Stem plots

* Stem plots can be used to show a discrete sequence of data. For that, use the function `stem()`.
* Using, for example, our very first plot:

In [None]:
x = np.arange(start=0, stop=11)
plt.figure(figsize=(15, 10))
plt.stem(x, np.sin(x))

* We can use the function `setp()` to set the stem plot properties:

In [None]:
x = np.arange(start=0, stop=11)
plt.figure(figsize=(15, 10))

# Let's receive the variables markerline, stemlines, baseline from stem().
markerline, stemlines, baseline = plt.stem(x, np.sin(x))

# Now, let's use the function setp() to change them.
plt.setp(baseline, color='k', linewidth=2)
plt.setp(stemlines, color='k', linestyle='-.', linewidth=2)
plt.setp(markerline, color='k', markerfacecolor='y', marker='*', markersize=15)

### Pie plots

* We can plot simple pie charts using the function `pie()`:

In [None]:
# Inspired by Pie Demo2's example:
# <https://matplotlib.org/gallery/pie_and_polar_charts/pie_demo2.html#sphx-glr-gallery-pie-and-polar-charts-pie-demo2-py>

labels = ('Frogs', 'Hogs', 'Dogs', 'Logs')
fracs = (15, 30, 45, 10)

plt.figure(figsize=(12, 12))
plt.pie(fracs, labels=labels)

* To show the percentage of each slice, use the parameter `autopct` with a format string such as `'%.1f%%'`, which is filled in with the percentage value. Also, `pie()` accepts the parameters `explode` (offsets each slice from the center by the given fraction of the radius) and `shadow` (adds a shadow).

In [None]:
explode = (0, 0.05, 0, 0)

plt.figure(figsize=(12, 12))
plt.pie(fracs, labels=labels, explode=explode, autopct='%.1f%%', shadow=True)  # Try also '%.0f%%', '%.2f%%'!
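
* The `autopct` string follows Python's `%` formatting rules, where `%%` stands for a literal percent sign. The sketch below applies the three suggested formats to the slice percentages by hand, without drawing anything; the variable names are only illustrative:

```python
fracs = (15, 30, 45, 10)
total = sum(fracs)

# Percentage of the whole pie taken by each slice.
percentages = [100 * frac / total for frac in fracs]

for fmt in ('%.0f%%', '%.1f%%', '%.2f%%'):
    print(fmt, '->', [fmt % pct for pct in percentages])

one_decimal = ['%.1f%%' % pct for pct in percentages]
```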

### Scatter plots

* 2D scatter plots represent the relation between two variables. We can use the function `scatter()` to draw one:

In [None]:
np.random.seed(seed=0)
counts = np.random.randint(low=2, high=101, size=(2, 100))

plt.figure(figsize=(15, 10))
plt.scatter(counts[0], counts[1])

* We can control the size of each element using the parameter `s`:

In [None]:
x = np.linspace(start=1, stop=4*np.pi, num=200)
y = np.cos(np.exp(1/2)*x)

size_scale = np.linspace(start=1, stop=75, num=len(x))
plt.figure(figsize=(15, 10))
plt.scatter(x, y, s=size_scale)

* We can also use the parameter `c` to control the colors of each element:

In [None]:
x = np.linspace(start=1, stop=4*np.pi, num=200)
y = np.cos(np.exp(1/2)*x)

size_scale = np.linspace(start=10, stop=85, num=len(x))
color_scale = np.linspace(start=1, stop=10, num=len(x))

plt.figure(figsize=(15, 10))
plt.scatter(x, y, s=size_scale, c=color_scale)

* The color scale is given by the default colormap, `viridis`. We can change it using the function `set_cmap()`:

In [None]:
x = np.linspace(start=1, stop=4*np.pi, num=200)
y = np.cos(np.exp(1/2)*x)

size_scale = np.linspace(start=10, stop=85, num=len(x))
color_scale = np.linspace(start=1, stop=10, num=len(x))

plt.figure(figsize=(15, 10))
plt.scatter(x, y, s=size_scale, c=color_scale)
plt.set_cmap('tab20b')
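
* As an alternative sketch, the colormap can be passed directly to `scatter()` through its `cmap` parameter, and `colorbar()` can show how the values in `c` map to colors. Neither is used elsewhere in this tutorial:

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(start=1, stop=4*np.pi, num=200)
y = np.cos(np.exp(1/2)*x)

size_scale = np.linspace(start=10, stop=85, num=len(x))
color_scale = np.linspace(start=1, stop=10, num=len(x))

plt.figure(figsize=(15, 10))
# cmap affects only this scatter plot.
points = plt.scatter(x, y, s=size_scale, c=color_scale, cmap='tab20b')
# The color bar relates the values in color_scale to the colors drawn.
plt.colorbar(points)
```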

* A list of all available colormaps is given at <https://matplotlib.org/examples/color/colormaps_reference.html>. Another example:

In [None]:
t = np.linspace(start=0, stop=1, num=250)
x = np.exp(t) * np.sin(100*t)
y = np.exp(t) * np.cos(100*t)

scales = np.linspace(start=55, stop=65, num=len(x))

plt.figure(figsize=(15, 10))
plt.scatter(x, y, marker='^', s=scales, c=scales)
plt.set_cmap('rainbow')

## An example using lots of stuff we used here!

* Let's draw a scatter plot using data on Brazilian GDP per capita and life expectancy in 2013. First, we'll import the necessary packages.

In [None]:
import matplotlib.lines as mlines
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

* Now we use `pandas` to read the data:

In [None]:
data_brazil = pd.read_excel('data_ibge.xls', sheet_name=2)

* Let's make our data nicer using the color palette `5-class Dark2` from ColorBrewer2: <http://colorbrewer2.org/>

In [None]:
colors = ['#1b9e77',
          '#d95f02',
          '#7570b3',
          '#e7298a',
          '#66a61e']

* Now we define the function `attribute_color()`, which returns the color corresponding to each Brazilian region.

In [None]:
def attribute_color(region, colors):
    reg_colors = {
        'North': colors[0],
        'Northeast': colors[1],
        'Southeast': colors[2],
        'South': colors[3],
        'Central-West': colors[4]
    }
    return reg_colors.get(region, 'black')

* Now let's create the color list `color_region`, containing the color attributed to each row of the table.

In [None]:
color_region = list()
qty_states = len(data_brazil['Region'])

for state in range(qty_states):
    aux_color = attribute_color(data_brazil['Region'][state], colors)
    color_region.append(aux_color)

for idx, color in enumerate(color_region):
    print(data_brazil['UF'][idx],
          'is in region', data_brazil['Region'][idx],
          'and receives the color', color)
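
* The same mapping can be sketched without an explicit loop, using the pandas method `Series.map()`. The table `toy` below is a hypothetical stand-in with a few states, not the real `data_ibge.xls`; the last row has a made-up region to show the `'black'` fallback:

```python
import pandas as pd

colors = ['#1b9e77', '#d95f02', '#7570b3', '#e7298a', '#66a61e']
regions = ['North', 'Northeast', 'Southeast', 'South', 'Central-West']
reg_colors = dict(zip(regions, colors))

# Hypothetical toy table with the same column names as the tutorial data.
toy = pd.DataFrame({
    'UF': ['AM', 'BA', 'SP', 'RS', 'GO', '??'],
    'Region': ['North', 'Northeast', 'Southeast', 'South',
               'Central-West', 'Unknown'],
})

# map() looks each region up in the dictionary; regions that are not
# found become NaN and are replaced by 'black'.
color_region = toy['Region'].map(reg_colors).fillna('black').tolist()
print(color_region)
```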

* Starting the plot!

In [None]:
fig, ax = plt.subplots(figsize=(25, 20))
ax.scatter(x=data_brazil['LifeExpec'],
           y=data_brazil['GDPperCapita'],
           s=data_brazil['PopX1000'],
           c=color_region,
           alpha=0.6)

* Adding a title, labels and a grid.

In [None]:
ax.set_title('Brazilian development in 2013, according to each state',
             fontsize=22)
ax.set_xlabel('Life expectancy (years)', fontsize=22)
ax.set_ylabel('GDP per capita (R$)', fontsize=22)
ax.grid(True)

fig  # calling figure shows fig.

* Let's insert the state abbreviations into each circle.

In [None]:
for state in range(len(data_brazil['UF'])):
    ax.text(x=data_brazil['LifeExpec'][state],
            y=data_brazil['GDPperCapita'][state],
            s=data_brazil['UF'][state],
            fontsize=16)

fig  # calling figure shows fig.

* Now we build a legend presenting the regions, using `Line2D` objects with the colors we defined previously. First we define the regions:

In [None]:
regions = ['North',
           'Northeast',
           'Southeast',
           'South',
           'Central-West']

* Then we add the legend. Let's call it `legend1`.

In [None]:
legend1_line2d = list()
for step in range(len(colors)):
    legend1_line2d.append(mlines.Line2D([0], [0],
                                        linestyle='none',
                                        marker='o',
                                        alpha=0.6,
                                        markersize=15,
                                        markerfacecolor=colors[step]))

legend1 = ax.legend(legend1_line2d,
                    regions,
                    numpoints=1,
                    fontsize=22,
                    loc='lower right',
                    shadow=True)

fig  # calling figure shows fig.

* Now let's add another legend, presenting a comparison of the population size for each state. Let's call it `legend2`.

In [None]:
legend2_line2d = list()
legend2_line2d.append(mlines.Line2D([0], [0],
                                    linestyle='none',
                                    marker='o',
                                    alpha=0.6,
                                    markersize=np.sqrt(100),
                                    markerfacecolor='#D3D3D3'))
legend2_line2d.append(mlines.Line2D([0], [0],
                                    linestyle='none',
                                    marker='o',
                                    alpha=0.6,
                                    markersize=np.sqrt(1000),
                                    markerfacecolor='#D3D3D3'))
legend2_line2d.append(mlines.Line2D([0], [0],
                                    linestyle='none',
                                    marker='o',
                                    alpha=0.6,
                                    markersize=np.sqrt(10000),
                                    markerfacecolor='#D3D3D3'))

legend2 = ax.legend(legend2_line2d,
                    ['1', '10', '100'],
                    title='Population (in 100,000)',
                    numpoints=1,
                    fontsize=20,
                    loc='upper left',
                    frameon=False,  # no edges
                    labelspacing=3,  # increase space between labels
                    handlelength=5,  # increase space between objects
                    borderpad=4)  # increase the margins of the legend
fig.gca().add_artist(legend1)
plt.setp(legend2.get_title(), fontsize=22)  # increasing the legend font

fig  # calling figure shows fig.
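
* Why the square roots in `legend2`? In `scatter()`, the parameter `s` is a marker area (in points squared), while `markersize` in `Line2D` is a length (in points). Taking the square root of a population value therefore gives a legend marker comparable to a circle drawn with that value as `s`. A small sketch of the arithmetic, where the array `pop_x1000` holds the three reference values used in the legend:

```python
import numpy as np

# Reference populations, in thousands of inhabitants (as in 'PopX1000').
pop_x1000 = np.array([100, 1000, 10000])

# scatter() treats these as areas; Line2D needs the matching length.
marker_sizes = np.sqrt(pop_x1000)
print(marker_sizes)

# The same populations expressed in units of 100,000 inhabitants,
# which are the legend labels.
labels = [str(value // 100) for value in pop_x1000]
print(labels)
```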

* **Try it yourself!** Play with this plot! Change it as you'd like.

In [None]:
import matplotlib.lines as mlines
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

data_brazil = pd.read_excel('data_ibge.xls', sheet_name=2)

colors = ['#1b9e77',
          '#d95f02',
          '#7570b3',
          '#e7298a',
          '#66a61e']

def attribute_color(region, colors):
    reg_colors = {
        'North': colors[0],
        'Northeast': colors[1],
        'Southeast': colors[2],
        'South': colors[3],
        'Central-West': colors[4]
    }
    return reg_colors.get(region, 'black')

color_region = list()
qty_states = len(data_brazil['Region'])

for state in range(qty_states):
    aux_color = attribute_color(data_brazil['Region'][state], colors)
    color_region.append(aux_color)

for idx, color in enumerate(color_region):
    print(data_brazil['UF'][idx],
          'is in region', data_brazil['Region'][idx],
          'and receives the color', color)

fig, ax = plt.subplots(figsize=(25, 20))
ax.scatter(x=data_brazil['LifeExpec'],
           y=data_brazil['GDPperCapita'],
           s=data_brazil['PopX1000'],
           c=color_region,
           alpha=0.6)

ax.set_title('Brazilian development in 2013, according to each state',
             fontsize=22)
ax.set_xlabel('Life expectancy (years)', fontsize=22)
ax.set_ylabel('GDP per capita (R$)', fontsize=22)
ax.grid(True)

for state in range(len(data_brazil['UF'])):
    ax.text(x=data_brazil['LifeExpec'][state],
            y=data_brazil['GDPperCapita'][state],
            s=data_brazil['UF'][state],
            fontsize=16)

regions = ['North',
           'Northeast',
           'Southeast',
           'South',
           'Central-West']

legend1_line2d = list()
for step in range(len(colors)):
    legend1_line2d.append(mlines.Line2D([0], [0],
                                        linestyle='none',
                                        marker='o',
                                        alpha=0.6,
                                        markersize=15,
                                        markerfacecolor=colors[step]))

legend1 = ax.legend(legend1_line2d,
                    regions,
                    numpoints=1,
                    fontsize=22,
                    loc='lower right',
                    shadow=True)

legend2_line2d = list()
legend2_line2d.append(mlines.Line2D([0], [0],
                                    linestyle='none',
                                    marker='o',
                                    alpha=0.6,
                                    markersize=np.sqrt(100),
                                    markerfacecolor='#D3D3D3'))
legend2_line2d.append(mlines.Line2D([0], [0],
                                    linestyle='none',
                                    marker='o',
                                    alpha=0.6,
                                    markersize=np.sqrt(1000),
                                    markerfacecolor='#D3D3D3'))
legend2_line2d.append(mlines.Line2D([0], [0],
                                    linestyle='none',
                                    marker='o',
                                    alpha=0.6,
                                    markersize=np.sqrt(10000),
                                    markerfacecolor='#D3D3D3'))

legend2 = ax.legend(legend2_line2d,
                    ['1', '10', '100'],
                    title='Population (in 100,000)',
                    numpoints=1,
                    fontsize=20,
                    loc='upper left',
                    frameon=False,  # no edges
                    labelspacing=3,  # increase space between labels
                    handlelength=5,  # increase space between objects
                    borderpad=4)  # increase the margins of the legend
fig.gca().add_artist(legend1)
plt.setp(legend2.get_title(), fontsize=22)  # increasing the legend font