# 5. Visualization with matplotlib and plotly

Starting with [matplotlib](https://matplotlib.org/stable/plot_types/index.html), we'll take a look at some basic charts, then add statistical plots with [seaborn](https://seaborn.pydata.org/examples/index.html) and more formatting and interactivity with [plotly](https://plotly.com/python/).

In [None]:
# import libraries for notebook

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import plotly.express as px


## Basic charts & formatting

Let's play around with some basic charts first, before we work with a dataset.

### Bar chart - showing comparison

In [None]:
# bar chart - showing comparison

x = np.arange(1, 6)
y = np.array([10, 25, 17, 20, 15])

plt.bar(x, y)

In [None]:
# turn it on its side with barh

x = np.arange(1, 6)
y = np.array([10, 25, 17, 20, 15])

plt.barh(x, y)

In [None]:
# add some formatting - change color

x = np.arange(1, 6)
y = np.array([10, 25, 17, 20, 15])

plt.bar(x, y, color = 'skyblue')


In [None]:
# add axis labels and title

x = np.arange(1, 6)
y = np.array([10, 25, 17, 20, 15])

plt.bar(x, y, color = 'skyblue')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Bar Chart')
plt.show()

In [None]:
# now add data labels to the bars

x = np.arange(1, 6)
y = np.array([10, 25, 17, 20, 15])

# Create a bar chart
plt.bar(x, y, color='skyblue')
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Bar Chart')

# Add data labels to the bars
for i, value in enumerate(y):
    plt.text(x[i], value + 1, str(value), ha='center', va='bottom', fontsize=12)

plt.show()
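
On matplotlib 3.4 or newer, `Axes.bar_label` can place these labels without a manual loop. A minimal sketch, assuming that version is installed and reusing the same sample values:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.arange(1, 6)
y = np.array([10, 25, 17, 20, 15])

# plt.subplots returns the figure and an Axes object we can call methods on
fig, ax = plt.subplots()
bars = ax.bar(x, y, color='skyblue')

# bar_label writes each bar's height above it; padding is the gap in points
labels = ax.bar_label(bars, padding=3, fontsize=12)

ax.set_xlabel('Categories')
ax.set_ylabel('Values')
ax.set_title('Bar Chart')
```

In a notebook the figure is drawn when the cell finishes; in a script, add `plt.show()` at the end.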

In [None]:
# Your turn - make a bar chart from this provided data, and format it how you wish

x = np.arange(1, 6)  # five x positions, to match the five y values
y = np.array([12, 15, 20, 5, 15])


### Line chart - showing change over time

In [None]:
# line chart - showing change over time

x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 3, 4, 5, 6, 7]

plt.plot(x, y)

In [None]:
# change color of the line 

x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 3, 4, 5, 6, 7]

plt.plot(x, y, color='red')

In [None]:
# change color and style of the line 

x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 3, 4, 5, 6, 7]

plt.plot(x, y, color='red', linestyle = 'dotted')

In [None]:
# change color, style and width of line

x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 3, 4, 5, 6, 7]

plt.plot(x, y, color='red', linestyle = '-.', linewidth = 5)

In [None]:
# add a marker

x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 3, 4, 5, 6, 7]

# try different markers by running these one at a time

plt.plot(x, y, marker='o') # dot marker
#plt.plot(x, y, marker='s') # square marker
#plt.plot(x, y, marker='D') # diamond marker
#plt.plot(x, y, marker='*') # star marker
#plt.plot(x, y, marker='o', markersize=12) # large dot marker



In [None]:
# customize the marker even more

x = [1, 2, 3, 4, 5, 6, 7]
y = [1, 2, 3, 4, 5, 6, 7]

plt.plot(x, y, marker='o', markersize=12, 
         markeredgecolor='red', markerfacecolor='yellow')

You can even make your own markers! Check out [this example](https://matplotlib.org/stable/gallery/lines_bars_and_markers/multivariate_marker_plot.html#sphx-glr-gallery-lines-bars-and-markers-multivariate-marker-plot-py) that uses emojis in a scatterplot.

In [None]:
# multiple lines - just add them all to the plot

x = ['Jan', 'Feb', 'March', 'Apr', 'May']
y1 = [1, 2, 3, 4, 5]
y2 = [2, 5, 9, 13, 18]
y3 = [2, 4, 6, 8, 10]

plt.plot(x, y1, x, y2, x, y3)

In [None]:
# add axis labels and plot title

x = ['Jan', 'Feb', 'March', 'Apr', 'May']
y1 = [1, 2, 3, 4, 5]
y2 = [2, 5, 9, 13, 18]
y3 = [2, 4, 6, 8, 10]

plt.plot(x, y1, x, y2, x, y3)
plt.xlabel('Month')
plt.ylabel('Sales')
plt.title('Sales to date this year')


In [None]:
# adding a legend and grid lines

x = ['Jan', 'Feb', 'March', 'Apr', 'May']
y1 = [1, 2, 3, 4, 5]
y2 = [2, 5, 9, 13, 18]
y3 = [2, 4, 6, 8, 10]

plt.plot(x, y1, x, y2, x, y3, marker='o')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.title('Monthly Sales')
plt.legend(labels=['Alice', 'Bob', 'Carol'], loc='upper left')
plt.grid(True)

In [None]:
# one more very important piece to add -- a source!

x = ['Jan', 'Feb', 'March', 'Apr', 'May']
y1 = [1, 2, 3, 4, 5]
y2 = [2, 5, 9, 13, 18]
y3 = [2, 4, 6, 8, 10]

plt.plot(x, y1, x, y2, x, y3, marker='o')
plt.xlabel('Month')
plt.ylabel('Sales')
plt.title('Monthly Sales')
plt.legend(labels=['Alice', 'Bob', 'Carol'], loc='upper left')
plt.figtext(0.90, 0.005, 'Source: Monthly sales data', horizontalalignment='right')
plt.grid(True)



In [None]:
# Your turn - make a line chart from this data and format how you wish

x = np.linspace(0, 10, 50)  # Create an array of 50 evenly spaced values from 0 to 10
y = np.random.randn(50)     # Generate 50 random data points





### Pie chart - showing parts of a whole

In [None]:
# pie chart - show percents of total

x = (125, 300)

plt.pie(x)

In [None]:
# add labels

plt.pie(x, labels=['Male', 'Female'])

In [None]:
# change the starting angle so it is easier to read

plt.pie(x, labels=['Male', 'Female'], startangle=90)

In [None]:
# separate a slice to emphasize it

plt.pie(x, labels=['Male', 'Female'], startangle=90, explode=[0.1, 0])


In [None]:
# add percentage labels for clarity

plt.pie(x, labels=['Male', 'Female'], startangle=90, autopct='%.1f%%')


### Histogram - showing distribution

In [None]:
# first using matplotlib

# Sample data for the histogram
data = np.random.randn(1000)

# Create a histogram
plt.hist(data, bins=20, color='lightgreen', edgecolor='black')
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram')
plt.show()
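
Behind the chart, a histogram is just a count of how many values land in each bin. A small sketch with `np.histogram` shows those counts directly (the seed is an assumption, used only so the sample is repeatable):

```python
import numpy as np

rng = np.random.default_rng(42)      # seeded generator, so results repeat
sample = rng.standard_normal(1000)   # 1000 draws from a standard normal

# counts has one entry per bin; edges has one more entry than counts
counts, edges = np.histogram(sample, bins=20)

print(counts)
print(edges.round(2))
```

Changing `bins` changes how finely the same 1000 values are grouped, which is why the shape of a histogram can look different at different bin counts.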


In [None]:
# now with seaborn

# distplot is deprecated in recent seaborn releases; histplot is its replacement
sns.histplot(data, bins=30)

# More with bar charts - Titanic passenger data using Seaborn


The sinking of the Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the widely considered “unsinkable” RMS Titanic sank after colliding with an iceberg. Unfortunately, there weren’t enough lifeboats for everyone onboard, resulting in the death of 1502 out of 2224 passengers and crew.

Let's use some visualizations to see some patterns in those who died or survived.

In [None]:
# load and preview data

titanic = pd.read_csv('https://raw.githubusercontent.com/dnmalan/advanced-data-journalism-23/main/data/titanic.csv')

titanic.head()

In [None]:
# bar chart using matplotlib

# first group by survival status and count the rows (this returns a Series)
survived = titanic.groupby('Survived').size()

# then create a bar chart from the counts
titanic_barplot = survived.plot.bar()
plt.ylabel("Counts")
plt.xlabel('Survived')
plt.xticks(rotation=0)
plt.show()  # plt.show() takes no figure or axes argument

In [None]:
# bar chart using seaborn (sns)
# it will automatically create the counts for you

sns.countplot(x='Survived', data=titanic)
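
To see what `countplot` computes, here is a sketch on a tiny made-up table (the values are hypothetical, not the Titanic data). Both pandas calls produce the counts that the bars would show:

```python
import pandas as pd

# hypothetical toy data: 1 = survived, 0 = died
toy = pd.DataFrame({'Survived': [0, 1, 1, 0, 0, 1, 0]})

# two equivalent ways to count rows per category
by_group = toy.groupby('Survived').size()
by_value = toy['Survived'].value_counts().sort_index()

print(by_group)
```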

In [None]:
# and it includes some nice set styles 

sns.set_style('whitegrid')

sns.countplot(x='Survived', data=titanic)

In [None]:
# you can also more easily create different kinds of bar charts
# such as this grouped bar chart

sns.set_style('whitegrid')
sns.countplot(x='Survived', hue='Pclass', data=titanic)

Compare with [the code](https://matplotlib.org/stable/gallery/lines_bars_and_markers/barchart.html) to create a grouped bar chart in matplotlib.
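
For a sense of the extra work, here is a rough matplotlib sketch of a grouped bar chart. The counts are made-up placeholders, not the real Titanic totals; the key step is shifting each group's bars sideways by a fixed offset.

```python
import numpy as np
import matplotlib.pyplot as plt

# hypothetical counts per passenger class for each survival outcome
groups = ['Died', 'Survived']
counts = {'1st': [8, 13], '2nd': [10, 9], '3rd': [37, 12]}

positions = np.arange(len(groups))  # one slot per outcome
width = 0.25                        # width of a single bar

fig, ax = plt.subplots()
for i, (pclass, values) in enumerate(counts.items()):
    # shift each class's bars so they sit side by side within a slot
    ax.bar(positions + i * width, values, width, label=pclass)

# center the tick labels under each cluster of three bars
ax.set_xticks(positions + width)
ax.set_xticklabels(groups)
ax.set_ylabel('Count')
ax.legend(title='Pclass')
```

Seaborn's `hue` argument handles the offsets, colors, and legend in a single call.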

In [None]:
# Your turn -- create a similar chart for gender

sns.set_style('whitegrid')

sns.countplot(x='XXXXX', hue='XXXX', data=XXXX)

In [None]:
#stacked bar chart - no great options in seaborn, we can use pandas

#create new data to put into chart
total = titanic.groupby(['Survived','Pclass'])['Name'].count().unstack()

# Simple one-liner using our total df
total.plot(kind='bar', stacked=True)

# Just add a title and rotate the x-axis labels to be horizontal.
plt.title('Survival by passenger class')
plt.xticks(rotation=0, ha='center')
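
The `unstack()` call is what turns the grouped counts into a table with one column per class, which is the shape `plot(kind='bar', stacked=True)` expects. A sketch on hypothetical toy rows:

```python
import pandas as pd

# hypothetical toy passengers
toy = pd.DataFrame({
    'Survived': [0, 0, 1, 1, 1, 0],
    'Pclass':   [3, 3, 1, 2, 1, 1],
    'Name':     ['A', 'B', 'C', 'D', 'E', 'F'],
})

# long form: one count per (Survived, Pclass) pair
long_counts = toy.groupby(['Survived', 'Pclass'])['Name'].count()

# wide form: Survived as rows, Pclass as columns; missing pairs become 0
wide_counts = long_counts.unstack(fill_value=0)
print(wide_counts)
```

Each row of the wide table becomes one stacked bar, and each column becomes one colored segment.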

In [None]:
# Show how I wrote this




In [None]:
# your turn -- make a stacked bar chart of survival and gender

#create new data to put into chart
total = titanic.groupby(['Survived','XXX'])['Name'].count().unstack()

# Simple one-liner using our total df
total.plot(kind='bar', stacked=XXXX)

# Just add a title and rotate the x-axis labels to be horizontal.
plt.title('XXXX')
plt.xticks(rotation=0, ha='center')

In [None]:
# histogram using seaborn

# distplot is deprecated in recent seaborn releases; histplot is its replacement
sns.histplot(titanic['Age'].dropna(), bins=30)


# More with line charts - using matplotlib & seaborn

In [None]:
# using matplotlib

year = [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010]
unemployment_rate = [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]

#create plot
plt.plot(year, unemployment_rate, color='red', marker='o')
plt.title('unemployment rate vs year', fontsize=14)
plt.xlabel('year', fontsize=14)
plt.ylabel('unemployment rate', fontsize=14)
plt.grid(True)
plt.show()

In [None]:
# Using seaborn

year = [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010]
unemployment_rate = [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]
  
# Set Seaborn style
sns.set(style="whitegrid")

# Create a line plot using Seaborn
plt.figure(figsize=(8, 4))  # Adjust the figure size as needed
sns.lineplot(x=year, y=unemployment_rate, marker='o', color='blue', linewidth=2)

# Add labels and title
plt.xlabel('Year')
plt.ylabel('Unemployment Rate (%)')
plt.title('Unemployment Rate Over Time')

# Display the plot
plt.xticks(rotation=45)  # Rotate x-axis labels for better readability
plt.tight_layout()  # Improve layout spacing
plt.show()

# Adding interactivity with Plotly

In [None]:

# Data
year = [1920, 1930, 1940, 1950, 1960, 1970, 1980, 1990, 2000, 2010]
unemployment_rate = [9.8, 12, 8, 7.2, 6.9, 7, 6.5, 6.2, 5.5, 6.3]

# Create a DataFrame
data = pd.DataFrame({'Year': year, 'Unemployment Rate (%)': unemployment_rate})

# Create an interactive line chart with Plotly Express
fig = px.line(data, x='Year', y='Unemployment Rate (%)', title='Unemployment Rate Over Time')

# Customize the layout (optional)
fig.update_layout(
    xaxis_title='Year',
    yaxis_title='Unemployment Rate (%)',
)

# Show the interactive plot
fig.show()

Let's redo the Olympic medals scatterplot with interactivity!

In [None]:
# Import data from a CSV file
df_olympics = pd.read_csv('https://raw.githubusercontent.com/dnmalan/advanced-data-journalism-23/main/data/olympics_medals_country_wise.csv')

# Display the first few rows of the DataFrame
df_olympics.head()

In [None]:
# reminder of original scatterplot using matplotlib

plt.figure(figsize=(10, 6))
plt.scatter(df_olympics['total_participation'], df_olympics['total_total '], alpha=0.5, color='b')
plt.xlabel('Combined Participations (Summer + Winter)')
plt.ylabel('Total Medals')
plt.title('Scatterplot of Total Medals vs. Combined Participations')
plt.grid(True)
plt.show()

In [None]:
# now in plotly

fig = px.scatter(df_olympics, x='total_participation', y='total_total ',
                 title='Scatterplot of Total Medals vs. Combined Participations',
                 labels={'total_participation': 'Combined Participations (Summer + Winter)',
                         'total_total ': 'Total Medals'})

# Customize the appearance (optional)
fig.update_traces(marker=dict(size=8, opacity=0.5), selector=dict(mode='markers'))

# Add grid lines (optional)
fig.update_xaxes(showgrid=True, gridwidth=1, gridcolor='lightgray')
fig.update_yaxes(showgrid=True, gridwidth=1, gridcolor='lightgray')

# Show the interactive plot
fig.show()


In [None]:
# add labels for country 

# add the "hover_data" parameter
fig = px.scatter(df_olympics, x='total_participation', y='total_total ', hover_data=['countries '],
                 title='Scatterplot of Total Medals vs. Combined Participations',
                 labels={'total_participation': 'Combined Participations (Summer + Winter)',
                         'total_total ': 'Total Medals', 'countries ': 'Country'})

# Customize the appearance (optional)
fig.update_traces(marker=dict(size=8, opacity=0.5), selector=dict(mode='markers'))

# Add grid lines (optional)
fig.update_xaxes(showgrid=True, gridwidth=1, gridcolor='lightgray')
fig.update_yaxes(showgrid=True, gridwidth=1, gridcolor='lightgray')

# Show the interactive plot
fig.show()


# Mapping in plotly

See the [plotly reference](https://plotly.com/python/maps/) for available maps and styles.

In [None]:
# start by creating a basic world map, using the built-in countries data in plotly

df = px.data.gapminder().query("year==2007")
fig = px.choropleth(df, locations="iso_alpha",
                    color="lifeExp") # lifeExp is a column of gapminder
fig.show()

In [None]:
# add some formatting

df = px.data.gapminder().query("year==2007")
fig = px.choropleth(df, locations="iso_alpha",
                    color="lifeExp", # lifeExp is a column of gapminder
                    hover_name="country", # column to add to hover information
                    color_continuous_scale=px.colors.sequential.Plasma)
fig.show()

In [None]:
# Plotting GDP by country

import plotly.graph_objects as go

df = pd.read_csv('https://raw.githubusercontent.com/plotly/datasets/master/2014_world_gdp_with_codes.csv')

fig = go.Figure(data=go.Choropleth( # type of map = choropleth
    locations = df['CODE'], # locations are the country code
    z = df['GDP (BILLIONS)'], # color by gdp 
))


fig.show()

In [None]:
# add formatting

fig = go.Figure(data=go.Choropleth(
    locations = df['CODE'],
    z = df['GDP (BILLIONS)'],
    text = df['COUNTRY'],
    colorscale = 'Blues',
    autocolorscale=False,
    reversescale=True,
    marker_line_color='darkgray',
    marker_line_width=0.5,
    colorbar_tickprefix = '$',
    colorbar_title = 'GDP<br>Billions US$',
))

fig.update_layout(
    title_text='2014 Global GDP',
    geo=dict(
        showframe=False,
        showcoastlines=False,
        projection_type='equirectangular'
    ),
    annotations = [dict(
        x=0.55,
        y=0.1,
        xref='paper',
        yref='paper',
        text='Source: <a href="https://www.cia.gov/library/publications/the-world-factbook/fields/2195.html">\
            CIA World Factbook</a>',
        showarrow = False
    )]
)

fig.show()

# Example - global youth unemployment

This example is adapted from the [Kaggle notebook](https://www.kaggle.com/code/arthurtok/generation-unemployed-interactive-plotly-visuals/notebook) by Anisotropic, a data scientist in London. The project explores youth unemployment around the world during a 5-year period.

## Project setup

In [None]:
# Import the relevant libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
import plotly.offline as py
py.init_notebook_mode(connected=True)
import plotly.graph_objs as go
import plotly.tools as tls
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

In [None]:
# load and preview the data

country = pd.read_csv('https://raw.githubusercontent.com/dnmalan/advanced-data-journalism-23/1cd04389879d1897c43de4a80bf851a34e693154/data/API_ILO_country_YU.csv')
country.head()

In [None]:
# check the shape

country.shape

## Clean data


In [None]:
# Check the list of country names
display(country['Country Name'].unique())

Some countries are not countries at all! They are regions, and one is the entire world. We need to get rid of these non-countries.

In [None]:
# Create the list of countries that we want to plot. It was built by hand rather than
# with any clever tricks (e.g. text processing) to filter out the regions.
country_list = ['Afghanistan','Angola','Albania','Argentina','Armenia','Australia'
,'Austria','Azerbaijan','Burundi','Belgium','Benin','Burkina Faso','Bangladesh','Bulgaria'
,'Bahrain','Bosnia and Herzegovina','Belarus','Belize','Bolivia','Brazil','Barbados','Brunei Darussalam'
,'Bhutan','Botswana','Central African Republic','Canada','Switzerland','Chile','China','Cameroon'
,'Congo','Colombia','Comoros','Cabo Verde','Costa Rica','Cuba','Cyprus','Czech Republic','Germany'
,'Denmark','Dominican Republic','Algeria','Ecuador','Egypt','Spain','Estonia','Ethiopia','Finland','Fiji'
,'France','Gabon','United Kingdom','Georgia','Ghana','Guinea','Greece','Guatemala','Guyana','Hong Kong'
,'Honduras','Croatia','Haiti','Hungary','Indonesia','India','Ireland','Iran','Iraq','Iceland','Israel'
,'Italy','Jamaica','Jordan','Japan','Kazakhstan','Kenya','Cambodia','Korea, Rep.','Kuwait','Lebanon','Liberia'
,'Libya','Sri Lanka','Lesotho','Lithuania','Luxembourg','Latvia','Macao','Morocco','Moldova','Madagascar'
,'Maldives','Mexico','Macedonia','Mali','Malta','Myanmar','Montenegro','Mongolia','Mozambique','Mauritania'
,'Mauritius','Malawi','Malaysia','Namibia','Niger','Nigeria','Nicaragua','Netherlands'
,'Norway','Nepal','New Zealand','Oman','Pakistan','Panama','Peru','Philippines','Papua New Guinea'
,'Poland','Puerto Rico','Portugal','Paraguay','Qatar','Romania','Russian Federation','Rwanda','Saudi Arabia'
,'Sudan','Senegal','Singapore','Solomon Islands','Sierra Leone','El Salvador','Somalia','Serbia','Slovenia'
,'Sweden','Swaziland','Syrian Arab Republic','Chad','Togo','Thailand','Tajikistan','Turkmenistan','Timor-Leste'
,'Trinidad and Tobago','Tunisia','Turkey','Tanzania','Uganda','Ukraine','Uruguay','United States','Uzbekistan'
,'Venezuela, RB','Vietnam','Yemen, Rep.','South Africa','Congo, Dem. Rep.','Zambia','Zimbabwe'
]

In [None]:
# Create a new dataframe with our cleaned country list
country_clean = country[country['Country Name'].isin(country_list)]
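
Because the list was typed by hand, it is worth checking for entries that match nothing in the data, for example because of a typo or stray whitespace. A sketch on hypothetical toy names:

```python
import pandas as pd

# hypothetical stand-in for the real table
toy = pd.DataFrame({'Country Name': ['Chile', 'World', 'Peru', 'Arab World']})

# hand-typed list with one deliberate problem: a trailing space
wanted = ['Chile', 'Peru ']

# names in the list that never appear in the data
unmatched = sorted(set(wanted) - set(toy['Country Name']))
print(unmatched)

# stripping whitespace before filtering avoids silently dropping rows
cleaned = [name.strip() for name in wanted]
toy_clean = toy[toy['Country Name'].isin(cleaned)]
```

An empty `unmatched` list means every name in the hand-typed list was found in the data.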

## Create plots

### Interactive globe

In [None]:
# Plotting 2010 and 2014 visuals
metricscale1=[[0, 'rgb(102,194,165)'], [0.05, 'rgb(102,194,165)'], 
              [0.15, 'rgb(171,221,164)'], [0.2, 'rgb(230,245,152)'], 
              [0.25, 'rgb(255,255,191)'], [0.35, 'rgb(254,224,139)'], 
              [0.45, 'rgb(253,174,97)'], [0.55, 'rgb(213,62,79)'], [1.0, 'rgb(158,1,66)']]
data = [ dict(
        type = 'choropleth',
        autocolorscale = False,
        colorscale = metricscale1,
        showscale = True,
        locations = country_clean['Country Name'].values,
        z = country_clean['2010'].values,
        locationmode = 'country names',
        text = country_clean['Country Name'].values,
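        marker = dict(
            line = dict(color = 'rgb(250,250,225)', width = 0.5)),
        # colorbar belongs to the trace, not to marker; the obsolete autotick
        # setting is dropped and the title uses <br>, plotly's line break
        colorbar = dict(tickprefix = '', title = dict(text = 'Unemployment<br>Rate'))
        )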
       ]

layout = dict(
    title = 'World Map of Global Youth Unemployment in the Year 2010',
    geo = dict(
        showframe = True,
        showocean = True,
        oceancolor = 'rgb(28,107,160)',
        #oceancolor = 'rgb(222,243,246)',
        projection = dict(
        type = 'orthographic',
            rotation = dict(
                    lon = 60,
                    lat = 10),
        ),
        lonaxis =  dict(
                showgrid = False,
                gridcolor = 'rgb(102, 102, 102)'
            ),
        lataxis = dict(
                showgrid = False,
                gridcolor = 'rgb(102, 102, 102)'
                )
            ),
        )
fig = dict(data=data, layout=layout)
py.iplot(fig, validate=False, filename='worldmap2010')

metricscale2=[[0, 'rgb(102,194,165)'], [0.05, 'rgb(102,194,165)'], 
              [0.15, 'rgb(171,221,164)'], [0.2, 'rgb(230,245,152)'], 
              [0.25, 'rgb(255,255,191)'], [0.35, 'rgb(254,224,139)'], 
              [0.45, 'rgb(253,174,97)'], [0.55, 'rgb(213,62,79)'], [1.0, 'rgb(158,1,66)']]
data = [ dict(
        type = 'choropleth',
        autocolorscale = False,
        colorscale = metricscale2,
        showscale = True,
        locations = country_clean['Country Name'].values,
        z = country_clean['2014'].values,
        locationmode = 'country names',
        text = country_clean['Country Name'].values,
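        marker = dict(
            line = dict(color = 'rgb(250,250,200)', width = 0.5)),
        # colorbar belongs to the trace, not to marker; the obsolete autotick
        # setting is dropped and the title uses <br>, plotly's line break
        colorbar = dict(tickprefix = '', title = dict(text = 'Unemployment<br>Rate'))
        )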
       ]

layout = dict(
    title = 'World Map of Global Youth Unemployment in the Year 2014',
    geo = dict(
        showframe = True,
        showocean = True,
        oceancolor = 'rgb(28,107,160)',
        projection = dict(
        type = 'orthographic',
            rotation = dict(
                    lon = 60,
                    lat = 10),
        ),
        lonaxis =  dict(
                showgrid = False,
                gridcolor = 'rgb(202, 202, 202)',
                gridwidth = 0.05  # the grid line property is gridwidth, and it takes a number
            ),
        lataxis = dict(
                showgrid = False,
                gridcolor = 'rgb(102, 102, 102)'
                )
            ),
        )
fig = dict(data=data, layout=layout)
py.iplot(fig, validate=False, filename='worldmap2014')

## Scatterplots, 2010 and 2014

In [None]:
# Scatter plot of 2010 unemployment rates
trace = go.Scatter(
    y = country_clean['2010'].values,
    mode='markers',
    marker=dict(
        size= country_clean['2010'].values,
        #color = np.random.randn(500), #set color equal to a variable
        color = country_clean['2010'].values,
        colorscale='Portland',
        showscale=True
    ),
    text = country_clean['Country Name'].values
)
data = [trace]

layout= go.Layout(
    autosize= True,
    title= 'Scatter plot of unemployment rates in 2010',
    hovermode= 'closest',
#     xaxis= dict(
#         title= 'Pop',
#         ticklen= 5,
#         zeroline= False,
#         gridwidth= 2,
#     ),
    yaxis=dict(
        title= 'Unemployment Rate',
        ticklen= 5,
        gridwidth= 2,
    ),
    showlegend= False
)
fig = go.Figure(data=data, layout=layout)
py.iplot(fig,filename='scatter2010')

# Scatter plot of 2014 unemployment rates
trace1 = go.Scatter(
    y = country_clean['2014'].values,
    mode='markers',
    marker=dict(
        size=country_clean['2014'].values,
        #color = np.random.randn(500), #set color equal to a variable
        color = country_clean['2014'].values,
        colorscale='Portland',
        showscale=True
    ),
    text = country_clean['Country Name']
)
data = [trace1]

layout= go.Layout(
    title= 'Scatter plot of unemployment rates in 2014',
    hovermode= 'closest',
    xaxis= dict(
        ticklen= 5,
        zeroline= False,
        gridwidth= 2,
    ),
    yaxis=dict(
        title= 'Unemployment Rate',
        ticklen= 5,
        gridwidth= 2,
    ),
    showlegend= False
)

fig = go.Figure(data=data, layout=layout)
py.iplot(fig,filename='scatter2014')

## Heatmap of change in unemployment 2010-2014

This heatmap uses the Seaborn library to visualize the countries that had the biggest change in unemployment (top 15 and bottom 15 countries).

In [None]:
# I first create an array containing the net movement in the unemployment rate.
diff = country_clean['2014'].values - country_clean['2010'].values

In [None]:
# filter to only the top and bottom 15 moving countries

x, y = (list(x) for x in zip(*sorted(zip(diff, country_clean['Country Name'].values), 
                                                            reverse = True)))

# Now extract the top 15 and bottom 15 countries
# (y[-15:] takes the last 15 entries, including the very last one)
Y = np.concatenate([y[0:15], y[-15:]])
X = np.concatenate([x[0:15], x[-15:]])
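
The same selection can also be sketched with pandas, which avoids the manual slicing. The toy values below are made up, and `n = 2` stands in for the 15 used above.

```python
import pandas as pd

# made-up stand-in for country_clean, indexed by country name
toy = pd.DataFrame({
    'Country Name': ['A', 'B', 'C', 'D', 'E', 'F'],
    '2010': [10.0, 20.0, 30.0, 15.0, 25.0, 12.0],
    '2014': [14.0, 18.0, 39.0, 15.5, 17.0, 11.0],
}).set_index('Country Name')

# change in rate per country
change = toy['2014'] - toy['2010']

n = 2
top = change.nlargest(n)      # biggest increases
bottom = change.nsmallest(n)  # biggest decreases

movers = list(top.index) + list(bottom.index)
print(movers)
```

`nlargest` and `nsmallest` also skip missing values, which a plain `sorted` call does not handle well.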

In [None]:
# Resize our dataframe first
keys = [c for c in country_clean if c.startswith('20')]
country_resize = pd.melt(country_clean, id_vars='Country Name', value_vars=keys, value_name='key')
country_resize['Year'] = country_resize['variable']

# Use boolean filtering to extract only our top 15 and bottom 15 moving countries
mask = country_resize['Country Name'].isin(Y)
country_final = country_resize[mask]

# Finally plot the seaborn heatmap
plt.figure(figsize=(12,10))
# pivot takes keyword arguments only in pandas 2.x
country_pivot = country_final.pivot(index="Country Name", columns="Year", values="key")
country_pivot = country_pivot.sort_values('2014', ascending=False)
ax = sns.heatmap(country_pivot, cmap='coolwarm', annot=False, linewidths=0, linecolor='white')
plt.title('Movement in Unemployment rate ( Warmer: Higher rate, Cooler: Lower rate )')
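
The reshaping steps are easier to follow on a tiny table. In this sketch (made-up values), `melt` stacks the year columns into long form and `pivot` spreads them back out into the country-by-year grid that `sns.heatmap` expects:

```python
import pandas as pd

# made-up wide table: one row per country, one column per year
wide = pd.DataFrame({
    'Country Name': ['A', 'B'],
    '2010': [10.0, 20.0],
    '2014': [14.0, 18.0],
})

# long form: one row per (country, year) pair
long_form = pd.melt(wide, id_vars='Country Name',
                    value_vars=['2010', '2014'],
                    var_name='Year', value_name='Rate')

# back to a grid with countries as rows and years as columns
grid = long_form.pivot(index='Country Name', columns='Year', values='Rate')
print(grid)
```

The long form is where filtering by country is easiest; the grid form is what gets colored cell by cell.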

# Example - World Happiness, 2008-21

See the [Kaggle notebook](https://www.kaggle.com/code/erglbozkurt/world-happiness-explanatory-data-analysis) by ERGÜLÜ BOZKURT for example data prep and visualizations.

# Example - Chinese population analysis

See the [Kaggle notebook](https://www.kaggle.com/code/masatoshikato/china-s-population-map) by MASATOSHI KATO for example data prep and visualizations.