## Treemap plot

This section plots a treemap of the coffee dataset, which lists the countries that produce coffee around the world.
The treemap has two levels: "Continent" at the top and "Country" nested inside each continent.

In [1]:
import pygal
import pandas as pd

**Coffee production can be visualized by country and continent.**

**Link to the dataset** - http://www.fao.org/faostat/en/#data/QC

In [2]:
coffee_df = pd.read_csv('../datasets/coffee_production_stats.csv')

In [3]:
coffee_df.shape

(85, 15)

In [4]:
coffee_df.head()

,Domain Code,Domain,Area Code,Area,Continent,Element Code,Element,Item Code,Item,Year Code,Year,Unit,Value,Flag,Flag Description
0,QC,Crops,7,Angola,Africa,5510,Production,656,"Coffee, green",2017,2017,tonnes,15436.0,Im,FAO data based on imputation methodology
1,QC,Crops,23,Belize,North America,5510,Production,656,"Coffee, green",2017,2017,tonnes,80.0,Im,FAO data based on imputation methodology
2,QC,Crops,53,Benin,Africa,5510,Production,656,"Coffee, green",2017,2017,tonnes,50.0,Im,FAO data based on imputation methodology
3,QC,Crops,19,Bolivia (Plurinational State of),South America,5510,Production,656,"Coffee, green",2017,2017,tonnes,21181.0,,Official data
4,QC,Crops,21,Brazil,South America,5510,Production,656,"Coffee, green",2017,2017,tonnes,2680515.0,,Official data


**Trim the dataset** by keeping only the columns needed for the plot.

In [5]:
coffee_df = coffee_df[['Area', 'Continent', 'Item', 'Value']]

#### Rename the columns to make them more descriptive.

In [6]:
coffee_df.columns = ['Country', 'Continent', 'Item', 'Value']

In [7]:
coffee_df.head()

,Country,Continent,Item,Value
0,Angola,Africa,"Coffee, green",15436.0
1,Belize,North America,"Coffee, green",80.0
2,Benin,Africa,"Coffee, green",50.0
3,Bolivia (Plurinational State of),South America,"Coffee, green",21181.0
4,Brazil,South America,"Coffee, green",2680515.0


###### Drop the rows with NA values from the dataframe.

In [8]:
coffee_df = coffee_df.dropna()

###### List all the continents in the dataset.

In [9]:
coffee_df['Continent'].unique()

array(['Africa', 'North America', 'South America', 'Asia', 'Oceania'],
      dtype=object)

Collect the rows for each continent into a separate list.

***process_df*** is a function that selects the rows of one continent and, for each row:

- ***builds a dictionary {'label': name_of_the_country, 'value': value_of_country}***
- ***appends it to a list, which is returned at the end.***

In [10]:
def process_df(continent_name):
    """Return a list of {'label': country, 'value': production} dicts for one continent."""

    continent_df = coffee_df[coffee_df['Continent'] == continent_name]
    countries_list = []
    
    for index, row in continent_df.iterrows():
        
        dict_country = {'label':row['Country'], 
                        'value':row['Value']}
        countries_list.append(dict_country)
        
    return countries_list

In [11]:
africa_list = process_df('Africa')
namerica_list = process_df('North America')
samerica_list = process_df('South America')
asia_list = process_df('Asia')
oceania_list = process_df('Oceania')
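
The five calls above could also be collapsed into a single pass over the rows. The sketch below shows the idea on plain Python tuples copied from the `coffee_df.head()` output; `group_by_continent` and `rows` are hypothetical names, not part of the notebook. In the notebook itself, the rows could presumably be taken from `coffee_df[['Country', 'Continent', 'Value']].itertuples(index=False)`.

```python
from collections import defaultdict

# Sample rows copied from the coffee_df.head() output above:
# (Country, Continent, Value)
rows = [
    ("Angola", "Africa", 15436.0),
    ("Belize", "North America", 80.0),
    ("Benin", "Africa", 50.0),
    ("Bolivia (Plurinational State of)", "South America", 21181.0),
    ("Brazil", "South America", 2680515.0),
]


def group_by_continent(records):
    """Return {continent: [{'label': country, 'value': value}, ...]}."""
    grouped = defaultdict(list)
    for country, continent, value in records:
        # Same dictionary shape that process_df builds for each row.
        grouped[continent].append({"label": country, "value": value})
    return dict(grouped)


continent_lists = group_by_continent(rows)
print(continent_lists["Africa"])
```

Each key of `continent_lists` is a continent name and each value has the same shape as the lists returned by `process_df`, so the result could be passed to the chart in a loop instead of one variable per continent.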

In [12]:
samerica_list

[{'label': 'Bolivia (Plurinational State of)', 'value': 21181.0},
 {'label': 'Brazil', 'value': 2680515.0},
 {'label': 'Colombia', 'value': 754376.0},
 {'label': 'Ecuador', 'value': 7564.0},
 {'label': 'Guyana', 'value': 401.0},
 {'label': 'Paraguay', 'value': 442.0},
 {'label': 'Peru', 'value': 346466.0},
 {'label': 'Suriname', 'value': 6.0},
 {'label': 'Trinidad and Tobago', 'value': 39.0},
 {'label': 'Venezuela (Bolivarian Republic of)', 'value': 46650.0}]
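
Since a treemap sizes each rectangle in proportion to its value, it can help to check the totals before plotting. The following is a minimal sketch that reuses the South America values printed above; the names `total`, `shares`, and `largest` are introduced here only for illustration.

```python
# Values copied from the samerica_list output above.
samerica_list = [
    {'label': 'Bolivia (Plurinational State of)', 'value': 21181.0},
    {'label': 'Brazil', 'value': 2680515.0},
    {'label': 'Colombia', 'value': 754376.0},
    {'label': 'Ecuador', 'value': 7564.0},
    {'label': 'Guyana', 'value': 401.0},
    {'label': 'Paraguay', 'value': 442.0},
    {'label': 'Peru', 'value': 346466.0},
    {'label': 'Suriname', 'value': 6.0},
    {'label': 'Trinidad and Tobago', 'value': 39.0},
    {'label': 'Venezuela (Bolivarian Republic of)', 'value': 46650.0},
]

# Total production for the continent.
total = sum(item['value'] for item in samerica_list)

# Fraction of the continent's total contributed by each country.
shares = {item['label']: item['value'] / total for item in samerica_list}

# Country with the largest share, i.e. the largest rectangle in its group.
largest = max(shares, key=shares.get)

print(f"South America total: {total:,.0f} tonnes")
print(f"Largest producer: {largest} ({shares[largest]:.1%})")
```

Very small producers such as Suriname contribute a tiny fraction of the total, so their rectangles are likely to be hard to see in the rendered chart.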

Define a helper function that displays the interactive chart in the notebook.

In [13]:
from IPython.display import display, HTML

html_skeleton = """
<!DOCTYPE html>
<html>
  <head>
  <script type="text/javascript" 
          src="http://kozea.github.com/pygal.js/javascripts/svg.jquery.js">
  </script>
  <script type="text/javascript" 
          src="https://kozea.github.io/pygal.js/2.0.x/pygal-tooltips.min.js">
  </script>
  </head>
  <body>
    <figure>
      {rendered_chart}
    </figure>
  </body>
</html>
"""

def display_chart(chart):
    rendered_chart = chart.render(is_unicode=True)
    plot_html = html_skeleton.format(rendered_chart=rendered_chart)
    display(HTML(plot_html))
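
The helper relies on `str.format` to drop the rendered SVG into the `{rendered_chart}` placeholder. One caveat worth illustrating: if the skeleton is ever extended with inline CSS or JavaScript, the literal braces have to be doubled, otherwise `format` treats them as placeholders. The sketch below uses a hypothetical `fake_svg` string in place of `chart.render(is_unicode=True)` and a hypothetical `skeleton_with_css` template, so it runs without pygal.

```python
# Hypothetical stand-in for chart.render(is_unicode=True); any SVG string works.
fake_svg = '<svg xmlns="http://www.w3.org/2000/svg" width="10" height="10"></svg>'

# A skeleton containing a CSS rule. The literal braces are doubled ({{ }})
# so that str.format leaves them alone and fills only {rendered_chart}.
skeleton_with_css = """
<html>
  <head>
    <style>figure {{ margin: 0; }}</style>
  </head>
  <body>
    <figure>
      {rendered_chart}
    </figure>
  </body>
</html>
"""

page = skeleton_with_css.format(rendered_chart=fake_svg)
print(page)
```

The doubled braces come out as single braces in `page`, and the placeholder is replaced by the SVG markup.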

Initialize the pygal treemap object, then assign a title and a value formatter.

In [None]:
from pygal.style import LightColorizedStyle

In [14]:
treemap = pygal.Treemap(width = 640,
                        height = 360,
                        explicit_size = True,
                        style = LightColorizedStyle)

In [15]:
treemap.title = 'Coffee Production in 2017 by Continent and Country'

In [16]:
treemap.value_formatter = lambda x: '{:,.10g} metric tons'.format(x)
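
To see what the format string does on its own, the formatter can be tried on a few values from the dataset. This is a small sketch that restates the lambda as a named function for readability; it does not need pygal. The `,` inserts thousands separators, and `.10g` keeps up to 10 significant digits while dropping trailing zeros.

```python
def value_formatter(x):
    # Same format string as the lambda assigned to treemap.value_formatter.
    return '{:,.10g} metric tons'.format(x)


# Values taken from the outputs shown earlier (Suriname, Angola, Brazil).
for value in (6.0, 15436.0, 2680515.0):
    print(value_formatter(value))
```

With this format, `2680515.0` is rendered as `2,680,515 metric tons`, without the trailing `.0`.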

Add the data to the treemap, one series per continent, and display the chart.

In [17]:
treemap.add('Africa', africa_list)
treemap.add('North America', namerica_list)
treemap.add('South America', samerica_list)
treemap.add('Asia', asia_list)
treemap.add('Oceania', oceania_list)

display_chart(treemap)

The same data can be redrawn with a different built-in style, for example `DarkSolarizedStyle`.

In [18]:
from pygal.style import DarkSolarizedStyle

In [19]:
treemap = pygal.Treemap(width = 640,
                        height = 360,
                        explicit_size = True,
                        style = DarkSolarizedStyle)

In [20]:
treemap.title = 'Coffee Production in 2017 by Continent and Country'
treemap.value_formatter = lambda x: '{:,.10g} metric tons'.format(x)

treemap.add('Africa', africa_list)
treemap.add('North America', namerica_list)
treemap.add('South America', samerica_list)
treemap.add('Asia', asia_list)
treemap.add('Oceania', oceania_list)

display_chart(treemap)