## Global Data Analysis: Advanced Layouts and Graphs

Code repo: https://github.com/mannyjrod/Dash_Apps_Sandlot

Author: Emmanuel Rodriguez

[emmanueljrodriguez.com/](https://emmanueljrodriguez.com/)

19DEC2023, Renton, Seattle, WA

Ref: Schroeder, "The Book of Dash" Ch. 5, p.71

### Run the World Bank Data Analysis Dash App

Run the World Bank Data Analysis dashboard application from: https://github.com/DashBookProject/Plotly-Dash/tree/master/Chapter-5
> Download the `.py` script to the project folder before running it.

> Note: You can run a .py script inside a Jupyter cell by using the magic command `%run name-of-script.py`

In [None]:
%run ./Chapter-5/worldbank.py

Dash is running on http://127.0.0.1:8050/

 * Serving Flask app "worldbank" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: on


# Reproduce the Code:

## Project Requirements

Build a Dash app that compares and analyzes world data on three metrics: internet usage, proportion of females in parliament, and carbon dioxide (CO2) emissions.

### Import Libraries

Two Python libraries are being introduced:

- `dash_bootstrap_components` - An add-on to the built-in Dash layout capabilities that provides Bootstrap-styled components, such as buttons and radio items, along with rows and columns for arranging and styling these elements.

- `pandas_datareader` - A pandas extension that retrieves data via *APIs* and creates DataFrames from that data.

In [1]:
from dash import Dash, dcc, html, Input, Output, callback, State
import plotly.express as px
import dash_bootstrap_components as dbc
from pandas_datareader import wb # This app uses World Bank data only, so the World Bank
# module `wb` is imported from the datareader extension.

import pandas as pd

### Read Data

#### Connecting to an API

- Connecting to an API lets the app read data dynamically: pandas datareader requests the data each time it is called, so the app's data can be refreshed on the fly.

In [2]:
# Download the country data the app needs: country names and ISO IDs for mapping on the choropleth
countries = wb.get_countries() # get_countries() returns information on every country and region the World Bank lists

# Print the first 10 rows
print(countries.head(10)[['name']])
#exit() ## **NOTE** calling this function will kill the kernel

                          name
0                        Aruba
1  Africa Eastern and Southern
2                  Afghanistan
3                       Africa
4   Africa Western and Central
5                       Angola
6                      Albania
7                      Andorra
8                   Arab World
9         United Arab Emirates


In [3]:
# Manipulate the data so that only countries are contained in the dataset
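# Aggregate regions have an empty capital city; mark those entries as missing.
# Assigning the result back avoids calling replace(..., inplace=True) on a column selection.
countries["capitalCity"] = countries["capitalCity"].replace({"": None})
countries = countries.dropna(subset=["capitalCity"]) # Drop rows without a capital city, leaving only countries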
countries = countries[["name", "iso3c"]] # ISO3 is a code designator used by Plotly to plot points on a map;
# Any other information that get_countries() returns is not needed, so the DataFrame is limited to the two necessary columns.

countries = countries[countries["name"] != "Kosovo"] # Exclude the Kosovo row (this is a corrupt row)
countries = countries.rename(columns={"name": "country"})
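
The filtering step above can be tried offline on a small hand-made table. The sketch below assumes the same column names that `get_countries()` returns (`name`, `iso3c`, `capitalCity`); the four sample rows and the helper name `keep_countries` are illustrative only, and nothing is downloaded from the World Bank.

```python
import pandas as pd

# Hand-made sample in the shape of the get_countries() result (illustrative rows only)
sample = pd.DataFrame(
    {
        "name": ["Aruba", "Africa Eastern and Southern", "Afghanistan", "Arab World"],
        "iso3c": ["ABW", "AFE", "AFG", "ARB"],
        "capitalCity": ["Oranjestad", "", "Kabul", ""],
    }
)


def keep_countries(frame):
    """Keep rows that have a capital city and return only the name and ISO3 columns."""
    has_capital = frame["capitalCity"].fillna("").str.strip() != ""
    result = frame.loc[has_capital, ["name", "iso3c"]]
    return result.rename(columns={"name": "country"})


only_countries = keep_countries(sample)
print(only_countries)
```

The aggregate regions drop out because their `capitalCity` entry is empty, while the original index labels of the remaining rows are preserved.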

In [4]:
# Display the first 10 rows of the new DataFrame
countries.head(10)

                 country  iso3c
0                  Aruba    ABW
2            Afghanistan    AFG
5                 Angola    AGO
6                Albania    ALB
7                Andorra    AND
9   United Arab Emirates    ARE
10             Argentina    ARG
11               Armenia    ARM
12        American Samoa    ASM
13   Antigua and Barbuda    ATG


### Identify Indicators

Extract the World Bank data tied to the three indicators: internet usage, females in parliament, and CO2 emissions.

The indicator names can be found by going to [https://data.worldbank.org/indicator](https://data.worldbank.org/indicator) > All Indicators and locating the indicators specified.

In [5]:
df = wb.get_indicators()[['id','name']]
df1 = df[df.name == 'Individuals using the Internet (% of population)']
df2 = df[df.name == 'Proportion of seats held by women in national parliaments (%)']
df3 = df[df.name == 'CO2 emissions (kt)']

#df = [df1, df2, df3]
#df # Print the indicator IDs

In [6]:
# Concatenate the dataframes
df = pd.concat([df1, df2, df3])

In [7]:
print(df)
print(type(df))
print(df.shape)
print(df.size)

                   id                                               name
8615   IT.NET.USER.ZS   Individuals using the Internet (% of population)
16097  SG.GEN.PARL.ZS  Proportion of seats held by women in national ...
6177   EN.ATM.CO2E.KT                                 CO2 emissions (kt)
<class 'pandas.core.frame.DataFrame'>
(3, 2)
6


In [8]:
# Convert dataframe to dictionary
indicators = df.to_dict()
print(indicators)

{'id': {8615: 'IT.NET.USER.ZS', 16097: 'SG.GEN.PARL.ZS', 6177: 'EN.ATM.CO2E.KT'}, 'name': {8615: 'Individuals using the Internet (% of population)', 16097: 'Proportion of seats held by women in national parliaments (%)', 6177: 'CO2 emissions (kt)'}}


In [9]:
indicators = {
    "IT.NET.USER.ZS": "Individuals using the Internet (% of population)",
    "SG.GEN.PARL.ZS": "Proportion of seats held by women in national parliaments (%)",
    "EN.ATM.CO2E.KT": "CO2 emissions (kt)",
}
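
The hand-typed `indicators` dictionary can also be derived from the nested dictionary that `df.to_dict()` printed earlier. The following is a minimal sketch using plain Python; the helper name `flatten_indicators` is hypothetical, and the nested dictionary is copied from the printed output rather than fetched again.

```python
# Nested dictionary in the shape produced by DataFrame.to_dict():
# {column name: {row label: value}}
nested = {
    "id": {8615: "IT.NET.USER.ZS", 16097: "SG.GEN.PARL.ZS", 6177: "EN.ATM.CO2E.KT"},
    "name": {
        8615: "Individuals using the Internet (% of population)",
        16097: "Proportion of seats held by women in national parliaments (%)",
        6177: "CO2 emissions (kt)",
    },
}


def flatten_indicators(nested_dict):
    """Pair each indicator ID with its name by matching row labels."""
    ids = nested_dict["id"]
    names = nested_dict["name"]
    return {ids[row]: names[row] for row in ids}


indicator_map = flatten_indicators(nested)
print(indicator_map)
```

The result has the same ID-to-name shape as the dictionary typed by hand, which is the form the later `rename(columns=indicators)` call expects.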

### Extract the Data

In [10]:
# Build a function that downloads historical data for the specified indicators
def update_wb_data():
    # Retrieve specific world bank data from API
    df = wb.download(
        indicator=(list(indicators)), country=countries["iso3c"],
        start=2005, end=2016
    )
    df = df.reset_index()
    df.year = df.year.astype(int)
    
    # Add country ISO3 ID to main df
    df = pd.merge(df, countries, on="country")
    df = df.rename(columns=indicators)
    return df

# The 'update_wb_data()' function will be called inside the first callback
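
To see what the reshaping steps inside `update_wb_data()` do without calling the API, the sketch below applies them to a tiny stand-in frame. It assumes the downloaded data is indexed by country and year with the year stored as text, which is why the function resets the index and casts the year to an integer; the numbers are placeholders, not World Bank values, and the helper name `tidy` is hypothetical.

```python
import pandas as pd

indicators = {"IT.NET.USER.ZS": "Individuals using the Internet (% of population)"}
countries = pd.DataFrame({"country": ["Aruba", "Angola"], "iso3c": ["ABW", "AGO"]})

# Stand-in for the downloaded data: (country, year) index, year stored as text.
# The values are placeholders for illustration only.
index = pd.MultiIndex.from_product(
    [["Aruba", "Angola"], ["2006", "2005"]], names=["country", "year"]
)
raw = pd.DataFrame({"IT.NET.USER.ZS": [10.0, 20.0, 30.0, 40.0]}, index=index)


def tidy(raw_frame, country_table, indicator_names):
    """Flatten the index, cast the year, attach ISO3 codes, and rename indicator columns."""
    df = raw_frame.reset_index()
    df["year"] = df["year"].astype(int)
    df = pd.merge(df, country_table, on="country")
    return df.rename(columns=indicator_names)


tidy_df = tidy(raw, countries, indicators)
print(tidy_df)
```

After these steps each row carries a country name, an integer year, the indicator under its readable name, and the ISO3 code needed for the map.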

### App Layout

The `dash_bootstrap_components` package will be used to style the app, create the layout, and add Bootstrap components (e.g., buttons and radio items).

In [11]:
app = Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP]) # Assign the Bootstrap theme to the 
# `external_stylesheets` parameter

In [12]:
app.layout = dbc.Container(
    [
        dbc.Row(
            dbc.Col(
                [
                    html.H1(
                        "Comparison of World Bank Country Data",
                        style={"textAlign": "center"},
                    ),
                    dcc.Graph(id="my-choropleth", figure={}),
                ],
                width=12,
            )
        ),
        dbc.Row(
            dbc.Col(
                [
                    dbc.Label(
                        "Select Data Set:",
                        className="fw-bold",
                        style={"textDecoration": "underline", "fontSize": 20},
                    ),
                    # Create radio buttons
                    dcc.RadioItems(
                        id="radio-indicator",
                        options=[{"label": i, "value": i} for i in indicators.values()],
                        value=list(indicators.values())[0],
                        inputClassName="me-2",
                    ),
                ],
                width=4,
            )
        ),
        dbc.Row(
            [
                dbc.Col(
                    [
                        dbc.Label(
                            "Select Years:",
                            className="fw-bold",
                            style={"textDecoration": "underline", "fontSize": 20},
                        ),
                        # Define range slider
                        dcc.RangeSlider(
                            id="years-range",
                            min=2005,
                            max=2016,
                            step=1,
                            value=[2005, 2006],
                            marks={
                                2005: "2005",
                                2006: "'06",
                                2007: "'07",
                                2008: "'08",
                                2009: "'09",
                                2010: "'10",
                                2011: "'11",
                                2012: "'12",
                                2013: "'13",
                                2014: "'14",
                                2015: "'15",
                                2016: "2016",
                            },
                        ),
                        # Create Submit button
                        dbc.Button(
                            id="my-button",
                            children="Submit",
                            n_clicks=0,
                            color="primary",
                            className="mt-4",
                        ),
                    ],
                    width=6,
                ),
            ]
        ),
        # Declare the `Store` component - this is used to save dashboard data in memory on the user's web browser so that
        # the data can be called and recalled quickly and efficiently.
        dcc.Store(id="storage", storage_type="session", data={}), # The callbacks will use the store component
        
        # Add a Dash `Interval` component, which is used to automatically update the app at the set time interval
        dcc.Interval(id="timer", interval=1000 * 60, n_intervals=0), # Every 60 seconds, the `Interval` reactivates
        # the callback to pull the data again and create a new DataFrame.
    ]
)
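
The `marks` dictionary passed to the range slider above is written out by hand. As a minimal sketch, the same labels can be generated with a small helper; the name `year_marks` is hypothetical and not part of Dash.

```python
def year_marks(first_year, last_year):
    """Label the end points with the full year and the years between with a two-digit form."""
    marks = {}
    for year in range(first_year, last_year + 1):
        if year in (first_year, last_year):
            marks[year] = str(year)
        else:
            marks[year] = f"'{year % 100:02d}"
    return marks


marks = year_marks(2005, 2016)
print(marks)
```

A helper like this keeps the labels in step with the slider's `min` and `max` if the year range ever changes.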

### Callbacks

Two callbacks will be used.
- The first callback is responsible for retrieving data from the World Bank through the pandas datareader API
- The second callback is responsible for creating and displaying the choropleth map on the app

In [13]:
# Data retrieval callback

@app.callback(
    Output(component_id="storage", component_property="data"), 
    Input(component_id="timer", component_property="n_intervals")
)
def store_data(n_time):
    # Runs each time the `Interval` fires; the records are saved in the `Store` component
    dataframe = update_wb_data()
    return dataframe.to_dict("records")
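
The first callback returns `to_dict("records")` rather than the DataFrame itself because data kept in a `Store` component has to be JSON-serializable. The sketch below imitates that round trip with the standard `json` module in place of the browser; the sample rows and the `value` column are placeholders.

```python
import json

import pandas as pd

# Placeholder rows in the shape the app stores (values are illustrative)
frame = pd.DataFrame(
    {
        "country": ["Aruba", "Angola"],
        "iso3c": ["ABW", "AGO"],
        "year": [2005, 2005],
        "value": [1.5, 2.5],
    }
)

# What the first callback returns: a list with one dictionary per row
records = frame.to_dict("records")

# Stand-in for the trip to the browser and back: serialize to JSON, then parse again
payload = json.dumps(records)
received = json.loads(payload)

# What the second callback does with the stored data
restored = pd.DataFrame.from_records(received)
print(restored)
```

The restored frame has the same columns and values as the original, which is what lets the second callback rebuild a DataFrame from the stored data.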

In [14]:
# Figure creation callback

@app.callback(
    Output("my-choropleth", "figure"),
    Input("my-button", "n_clicks"),
    Input("storage", "data"),
    State("years-range", "value"),
    State("radio-indicator", "value"),
)
def update_graph(n_clicks, stored_dataframe, years_chosen, indct_chosen):
    dff = pd.DataFrame.from_records(stored_dataframe)
    print(years_chosen)  # Debug output: prints the selected years each time the callback fires

    if years_chosen[0] != years_chosen[1]:
        # A range of years was selected: average the indicator per country over that range
        dff = dff[dff.year.between(years_chosen[0], years_chosen[1])]
        dff = dff.groupby(["iso3c", "country"])[indct_chosen].mean()
        dff = dff.reset_index()
    else:
        # A single year was selected: keep only that year's rows.
        # (Previously this branch was nested after `return fig` and could never run.)
        dff = dff[dff["year"].isin(years_chosen)]

    # Build the choropleth once, for either case
    fig = px.choropleth(
        data_frame=dff,
        locations="iso3c",
        color=indct_chosen,
        scope="world",
        hover_data={"iso3c": False, "country": True},
        labels={
            indicators["SG.GEN.PARL.ZS"]: "% parliament women",
            indicators["IT.NET.USER.ZS"]: "pop % using internet",
        },
    )
    fig.update_layout(
        geo={"projection": {"type": "natural earth"}},
        margin=dict(l=50, r=50, t=50, b=50),
    )
    return fig
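
The year filter and per-country average used in the figure callback can be checked without running the app. The sketch below is a pandas-only illustration with made-up numbers; the helper name `average_over_years` and the `value` column are hypothetical. It relies on `Series.between` being inclusive at both ends by default, so a slider value such as `[2007, 2007]` selects just that one year.

```python
import pandas as pd

# Made-up sample in the shape of the stored data (values are illustrative)
sample = pd.DataFrame(
    {
        "iso3c": ["ABW", "ABW", "ABW", "AGO", "AGO"],
        "country": ["Aruba", "Aruba", "Aruba", "Angola", "Angola"],
        "year": [2005, 2006, 2007, 2005, 2006],
        "value": [10.0, 20.0, 30.0, 1.0, 3.0],
    }
)


def average_over_years(frame, years_chosen, column):
    """Filter to the chosen year range (inclusive) and average the column per country."""
    start, end = years_chosen
    selected = frame[frame["year"].between(start, end)]
    return selected.groupby(["iso3c", "country"])[column].mean().reset_index()


ranged = average_over_years(sample, [2005, 2006], "value")
single = average_over_years(sample, [2007, 2007], "value")
print(ranged)
print(single)
```

Countries with no rows in the chosen range drop out of the result, so they would simply have no value on the map for that selection.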

In [15]:
# Run the application

if __name__ == "__main__":
    app.run_server(debug=True, use_reloader=False) # `debug=True` enables the Dash dev tools, including the callback diagram;
    # `use_reloader=False` keeps the reloader from restarting the process inside the notebook

Dash is running on http://127.0.0.1:8050/

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: on
[2005, 2006]

**12/21/23 12:00 Done!**