In [None]:
import numpy
import pandas
from matplotlib import pyplot
%matplotlib inline

#Import rcParams to set font styles
from matplotlib import rcParams

#Set font style and size 
rcParams['font.family'] = 'serif'
rcParams['font.size'] = 16

In [None]:
url = 'https://python-graph-gallery.com/wp-content/uploads/gapminderData.csv'
life_expect = pandas.read_csv(url)

In [None]:
life_expect[0:5]

In [None]:
life_expect.shape

In [None]:
life_expect.info()

In [None]:
life_expect['year'].value_counts()

Each year appears exactly 142 times in the dataframe. It is also clear that we have data every five years, starting in 1952 and ending in 2007.
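Both observations can also be checked programmatically rather than by eye. The sketch below uses a small made-up dataframe (`toy`, with hypothetical countries `A` and `B`) as a stand-in for `life_expect`, so the names and values are assumptions for illustration only.

```python
import pandas

# Toy stand-in for the real dataframe (hypothetical values)
toy = pandas.DataFrame({
    'country': ['A', 'B', 'A', 'B', 'A', 'B'],
    'year': [1952, 1952, 1957, 1957, 1962, 1962],
})

counts = toy['year'].value_counts()

# True when every year occurs the same number of times
same_count = bool(counts.nunique() == 1)

# Set of gaps between consecutive distinct years
years = sorted(toy['year'].unique())
steps = {int(b - a) for a, b in zip(years, years[1:])}

print(same_count, steps)
```

Applied to `life_expect['year']`, the same two checks should confirm the even counts and the five-year spacing.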

In [None]:
by_year = life_expect.groupby('year')

In [None]:
type(by_year)

In [None]:
by_year.first()

The first entry in each year group belongs to the first country in the dataset, Afghanistan, so this shows Afghanistan's data for every year.

In [None]:
by_year.count()

That means we have 142 entries in each year, and the distinct entries correspond to each country. [GroupBy.count](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.core.groupby.GroupBy.count.html) excludes missing values, so we can conclude that all the countries are represented in each year's data.
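To see why the behavior of `count()` matters here, consider a minimal sketch on a made-up dataframe with one missing value (the `toy` frame and its numbers are hypothetical). `size()` counts rows regardless of content, while `count()` skips missing entries, so the two differ only when data are missing.

```python
import numpy
import pandas

# Toy data with one missing life expectancy (hypothetical values)
toy = pandas.DataFrame({
    'year': [1952, 1952, 1957, 1957],
    'lifeExp': [28.8, numpy.nan, 30.3, 59.3],
})

grouped = toy.groupby('year')
n_rows = grouped.size()                # rows per group, missing or not
n_valid = grouped['lifeExp'].count()   # non-missing values per group

print(n_rows)
print(n_valid)
```

In this toy case the two results disagree for 1952, which flags the missing value. When `count()` returns the same number in every column and every year, as it does above for our dataframe, no values are missing.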

In [None]:
by_country = life_expect.groupby('country')

In [None]:
by_country.first()

In [None]:
# The first entry of each country group is that country's earliest year, 1952
year1952 = by_country.first()

In [None]:
type(year1952)

In [None]:
year1952[0:5]

In [None]:
year1952['pop'].min()

In [None]:
populations = year1952['pop'].values

In [None]:
# Bubble size scales with population; the divisor keeps markers a readable size
year1952.plot.scatter(figsize=(8,8), 
                       x='gdpPercap', y='lifeExp', s=populations/60000, 
                       title='Life expectancy in the year 1952',
                       edgecolors="white")
pyplot.xscale('log');

Matplotlib [colormaps](https://matplotlib.org/examples/color/colormaps_reference.html) offer several options for _qualitative_ data, using discrete colors mapped to a sequence of numbers. We'd like to use the `Accent` colormap to code countries by continent. We need a numeric code to assign to each continent, so it can be mapped to a color.
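As a small illustration of what such a numeric coding looks like, here is a sketch on a short made-up series of continent names (the `continents` series is hypothetical). `pandas.Categorical` sorts the unique labels and assigns each value the integer position of its label, so the categories also tell us which number stands for which continent.

```python
import pandas

# Toy series of continent labels (hypothetical values)
continents = pandas.Series(['Asia', 'Europe', 'Africa', 'Asia'])

cat = pandas.Categorical(continents)
categories = list(cat.categories)       # sorted unique labels
codes = cat.codes.tolist()              # integer code for each entry
code_to_name = dict(enumerate(categories))

print(categories)    # ['Africa', 'Asia', 'Europe']
print(codes)         # [1, 2, 0, 1]
print(code_to_name)  # {0: 'Africa', 1: 'Asia', 2: 'Europe'}
```

A lookup like `code_to_name` is handy for reading a plot colored by code, since the color scale only shows the numbers.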

In [None]:
#from matplotlib import cm

In [None]:
pandas.Categorical(year1952['continent'])

In [None]:
colors = pandas.Categorical(year1952['continent']).codes

In [None]:
# Same bubble plot, now colored by continent code using the Accent colormap
year1952.plot.scatter(figsize=(12,8), 
                         x='gdpPercap', y='lifeExp', s=populations/60000, c=colors, cmap='Accent',
                         title='Life expectancy in the year 1952',
                         edgecolors="white",
                         alpha=0.5)
pyplot.xscale('log');