<img width="10%" alt="Naas" src="https://landen.imgix.net/jtci2pxwjczr/assets/5ice39g4.png?w=160"/>

# OWID - Visualize Population of Different Age Groups
<a href="https://app.naas.ai/user-redirect/naas/downloader?url=https://raw.githubusercontent.com/jupyter-naas/awesome-notebooks/master/OWID/OWID_Visualize_Population_of_Different_Age_Groups.ipynb" target="_parent"><img src="https://naasai-public.s3.eu-west-3.amazonaws.com/Open_in_Naas_Lab.svg"/></a><br><br><a href="https://bit.ly/3JyWIk6">Give Feedback</a> | <a href="https://github.com/jupyter-naas/awesome-notebooks/issues/new?assignees=&labels=bug&template=bug_report.md&title=OWID+-+Visualize+Population+of+Different+Age+Groups:+Error+short+description">Bug report</a>

**Tags:** #dash #dashboard #plotly #naas #asset #analytics #dropdown #callback #bootstrap #snippet

**Author:** [Zihui Ouyang](https://www.linkedin.com/in/zihui-ouyang-539626227/)

**Description:** This notebook creates an interactive plot with a Dash app, using the population by age group data shown in OWID's chart (downloaded here from the UN World Population Prospects 2022 workbook).

**References:**
- https://ourworldindata.org/grapher/population-by-age-group-with-projections
- https://stackoverflow.com/questions/70886359/dash-python-making-subplots-when-multiple-parameters-are-selected
- https://dash-example-index.herokuapp.com/line-charts

## Input

In [None]:
import os  # standard library, no install needed
try:
    import dash
except ImportError:
    !pip install dash --user
    import dash
try:
    import dash_bootstrap_components as dbc
except ImportError:
    !pip install dash_bootstrap_components --user
    import dash_bootstrap_components as dbc
import pandas as pd
from dash import Dash, html, dcc, callback, Output, Input
import plotly.express as px
import naas

### Setup Variables
- `DASH_PORT`: port number the Dash app listens on
- `url`: URL of the Excel workbook to download the data from
- `title`: App title

In [None]:
DASH_PORT = 8050
url = "https://population.un.org/wpp/Download/Files/1_Indicators%20(Standard)/EXCEL_FILES/2_Population/WPP2022_POP_F02_1_POPULATION_5-YEAR_AGE_GROUPS_BOTH_SEXES.xlsx"
title = "Population composition"

## Model

### Initialize Dash app
`os.environ.get("JUPYTERHUB_USER")` reads the `JUPYTERHUB_USER` environment variable, which is already set in your Naas Lab.

In [None]:
app = dash.Dash(
    title=title,  # use the app title defined in Setup Variables
    requests_pathname_prefix=f'/user/{os.environ.get("JUPYTERHUB_USER")}/proxy/{DASH_PORT}/',
    external_stylesheets=[dbc.themes.BOOTSTRAP],
    meta_tags=[
        {"name": "viewport", "content": "width=device-width, initial-scale=1.0"}
    ],
)

# If you are not running in Naas, use this instead:
# app = dash.Dash(title="Population composition")
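
As a minimal sketch of how the path prefix above is put together, the helper below builds the same `/user/<name>/proxy/<port>/` string. The function name `build_proxy_prefix` and the fallback to `"/"` when `JUPYTERHUB_USER` is not set are assumptions for illustration, not part of the notebook.

```python
import os

def build_proxy_prefix(port, user=None):
    """Return the pathname prefix for a Dash app served behind a JupyterHub proxy.

    Hypothetical helper: falls back to "/" when no JupyterHub user is known.
    """
    if user is None:
        user = os.environ.get("JUPYTERHUB_USER")
    if user is None:
        return "/"
    return f"/user/{user}/proxy/{port}/"

prefix = build_proxy_prefix(8050, user="alice")
```

The result could then be passed as `requests_pathname_prefix` when creating the app.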

### Get data from estimates up to 2021

In [None]:
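# Column names are on the 17th row of the sheet (header=16 is zero-based)
contents_df = pd.read_excel(url, sheet_name="Estimates", header=16)
# Clean data: drop four rows by index label, then renumber the index
contents_df = contents_df.drop([72, 649, 1154, 1587])
contents_df = contents_df.reset_index(drop=True)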
entity = contents_df["Region, subregion, country or area *"]
year = contents_df["Year"]

### Sort Data

In [None]:
# The workbook reports population in thousands; multiply by 1000 to get persons
under_five = contents_df["0-4"]*1000
under_fifteen = under_five.add(contents_df["5-9"]*1000).add(contents_df["10-14"]*1000)
under_twenty_five = under_fifteen.add(contents_df["15-19"]*1000).add(contents_df["20-24"]*1000)
twenty_five_to_sixty_four = (contents_df["25-29"].add(contents_df["30-34"]).add(contents_df["35-39"]).add(contents_df["40-44"]).add(contents_df["45-49"]).add(contents_df["50-54"]).add(contents_df["55-59"]).add(contents_df["60-64"]))*1000
sixty_five_plus = (contents_df["65-69"].add(contents_df["70-74"]).add(contents_df["75-79"]).add(contents_df["80-84"]).add(contents_df["85-89"]).add(contents_df["90-94"]).add(contents_df["95-99"]).add(contents_df["100+"]))*1000
total = under_twenty_five.add(twenty_five_to_sixty_four).add(sixty_five_plus) 
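
The same grouping can be sketched more compactly by summing lists of columns. This is only an illustration on a made-up two-row frame; the helper name `sum_age_bands` is hypothetical. One difference to keep in mind: `sum(axis=1)` skips missing values by default, whereas the chained `.add()` calls above propagate them.

```python
import pandas as pd

# Hypothetical sample that mimics the workbook's column names (values in thousands)
sample = pd.DataFrame({
    "0-4": [1.0, 2.0],
    "5-9": [1.0, 2.0],
    "10-14": [1.0, 2.0],
    "15-19": [1.0, 2.0],
    "20-24": [1.0, 2.0],
})

def sum_age_bands(df, bands, scale=1000):
    """Sum the given age-band columns row by row and convert thousands to persons."""
    return df[bands].sum(axis=1) * scale

under_fifteen_demo = sum_age_bands(sample, ["0-4", "5-9", "10-14"])
under_twenty_five_demo = sum_age_bands(sample, ["0-4", "5-9", "10-14", "15-19", "20-24"])
```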

### Get medium-variant projections from 2022 to 2100

In [None]:
contents_df1 = pd.read_excel(url, sheet_name = "Medium variant", header = 16)
contents_df1 = contents_df1.drop([79, 712, 1266, 1741])
contents_df1 = contents_df1.reset_index(drop=True)
entity_1 = contents_df1["Region, subregion, country or area *"]
year_1 = contents_df1["Year"]
under_five1 = contents_df1["0-4"]*1000
under_fifteen1 = under_five1.add(contents_df1["5-9"]*1000).add(contents_df1["10-14"]*1000)
under_twenty_five1 = under_fifteen1.add(contents_df1["15-19"]*1000).add(contents_df1["20-24"]*1000)
twenty_five_to_sixty_four1 = (contents_df1["25-29"].add(contents_df1["30-34"]).add(contents_df1["35-39"]).add(contents_df1["40-44"]).add(contents_df1["45-49"]).add(contents_df1["50-54"]).add(contents_df1["55-59"]).add(contents_df1["60-64"]))*1000
sixty_five_plus1 = (contents_df1["65-69"].add(contents_df1["70-74"]).add(contents_df1["75-79"]).add(contents_df1["80-84"]).add(contents_df1["85-89"]).add(contents_df1["90-94"]).add(contents_df1["95-99"]).add(contents_df1["100+"]))*1000
total1 = under_twenty_five1.add(twenty_five_to_sixty_four1).add(sixty_five_plus1)

### Combining the data

In [None]:
new_entity = pd.concat([entity, entity_1])
new_year = pd.concat([year, year_1])
new_under_five = pd.concat([under_five, under_five1])
new_under_fifteen = pd.concat([under_fifteen, under_fifteen1])
new_under_twenty_five = pd.concat([under_twenty_five, under_twenty_five1])
new_twenty_five_to_sixty_four = pd.concat([twenty_five_to_sixty_four, twenty_five_to_sixty_four1])
new_sixty_five_plus = pd.concat([sixty_five_plus, sixty_five_plus1])
new_total = pd.concat([total, total1])
new_total = new_total.reset_index(drop=True)
new_entity = new_entity.reset_index(drop=True)
new_year = new_year.reset_index(drop=True)
new_under_five = new_under_five.reset_index(drop=True)
new_under_fifteen = new_under_fifteen.reset_index(drop = True)
new_under_twenty_five = new_under_twenty_five.reset_index(drop = True)
new_twenty_five_to_sixty_four = new_twenty_five_to_sixty_four.reset_index(drop = True)
new_sixty_five_plus = new_sixty_five_plus.reset_index(drop = True)
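
Each concatenate-then-reset pair above can also be written in one step with `ignore_index=True`. A minimal sketch on made-up series (the names `estimates_demo` and `projections_demo` are illustrative only):

```python
import pandas as pd

estimates_demo = pd.Series([2020, 2021], name="Year")
projections_demo = pd.Series([2022, 2023], name="Year")

# ignore_index=True renumbers the result 0..n-1, like reset_index(drop=True)
combined = pd.concat([estimates_demo, projections_demo], ignore_index=True)
```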

### Create a long-format DataFrame for plotting

In [None]:
# One category label per row, repeated for every row of the combined data
n_rows = len(new_entity)
categories = ["Under 5"] * n_rows
categories1 = ["Under 15"] * n_rows
categories2 = ["Under 25"] * n_rows
categories3 = ["25-64"] * n_rows
categories4 = ["65+"] * n_rows
categories5 = ["Total"] * n_rows
    
new_dict = {"Entity": new_entity,
           "Year": new_year,
            "Categories": categories,
           "Population": new_under_five}
new_df = pd.DataFrame(data=new_dict)

new_dict1 = {"Entity": new_entity,
           "Year": new_year,
            "Categories": categories1,
           "Population": new_under_fifteen}
new_df1 = pd.DataFrame(data=new_dict1)

new_dict2 = {"Entity": new_entity,
           "Year": new_year,
            "Categories": categories2,
           "Population": new_under_twenty_five}
new_df2 = pd.DataFrame(data=new_dict2)

new_dict3 = {"Entity": new_entity,
           "Year": new_year,
            "Categories": categories3,
           "Population": new_twenty_five_to_sixty_four}
new_df3 = pd.DataFrame(data=new_dict3)

new_dict4 = {"Entity": new_entity,
           "Year": new_year,
            "Categories": categories4,
           "Population": new_sixty_five_plus}
new_df4 = pd.DataFrame(data=new_dict4)

new_dict5 = {"Entity": new_entity,
           "Year": new_year,
            "Categories": categories5,
           "Population": new_total}
new_df5 = pd.DataFrame(data=new_dict5)

new_df = pd.concat([new_df, new_df1, new_df2, new_df3, new_df4, new_df5])
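
The result is a long-format table with one row per entity, year and category. As a sketch of an alternative, assuming the age groups were first collected as columns of a single wide frame (the `wide` frame below is made up for illustration), `DataFrame.melt` would produce the same shape in one call:

```python
import pandas as pd

# Hypothetical wide frame: one column per age group
wide = pd.DataFrame({
    "Entity": ["WORLD", "WORLD"],
    "Year": [1950, 1951],
    "Under 5": [10, 11],
    "Total": [100, 110],
})

# Turn the age-group columns into (Categories, Population) pairs
long_df = wide.melt(
    id_vars=["Entity", "Year"],
    var_name="Categories",
    value_name="Population",
)
```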

### Create Dash app

In [None]:
app.layout = html.Div(
    [
        html.H4("Population of different age groups from 1950 to 2021 with medium projections from 2022 onwards"),
        html.P("Select country"),
        dcc.Dropdown(
            id="country",
            options=new_df.Entity.unique(),
            value="WORLD"
        ),
        dcc.RangeSlider(id='slider', min=1950, max=2100, value=[1950, 2100],
               marks={x: str(x) for x in [1950, 1975, 2000, 2025, 2050, 2075, 2100]}),
        dcc.Graph(id="Population", figure={}, style={'display': 'none'})
    ]
)

@callback(
    Output("Population", 'figure'),
    Output("Population", 'style'),
    Input('country', 'value'),
    Input('slider', 'value')
)
def update_graph(country, year):
    country_list = [country]
    # Rows for the selected country within the slider's year range
    in_range = (new_df["Entity"].isin(country_list)) & (new_df["Year"] <= year[1]) & (new_df["Year"] >= year[0])
    dff = in_range & (new_df["Year"] <= 2021)  # estimates
    dff1 = in_range & (new_df["Year"] > 2021)  # projections
    # Copy the projection rows so that renaming the categories does not modify new_df
    greater = new_df[dff1].copy()
    greater["Categories"] = greater["Categories"] + " Projections"
        
    figures = px.line(
        new_df[dff],
        x='Year',
        y="Population",
        color="Categories",
        markers=True
    ).update_layout(
        plot_bgcolor='rgba(0, 0, 0, 0)',
        height= 600
    )
    figures.add_traces(list(px.line(greater, x="Year", y="Population", color = "Categories").select_traces()))
    styles = {'display': 'block'}   
    return figures, styles
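
The filtering step inside the callback can be tried without Dash. The sketch below assumes a frame with the same `Entity`, `Year`, `Categories` and `Population` columns; the function name `split_estimates_and_projections` and the sample data are hypothetical.

```python
import pandas as pd

LAST_ESTIMATE_YEAR = 2021  # later years come from the projections sheet

def split_estimates_and_projections(df, country, year_range):
    """Return (estimates, projections) for one country within an inclusive year range."""
    low, high = year_range
    selected = df[(df["Entity"] == country) & df["Year"].between(low, high)]
    estimates = selected[selected["Year"] <= LAST_ESTIMATE_YEAR]
    # Copy before editing so the input frame is left unchanged
    projections = selected[selected["Year"] > LAST_ESTIMATE_YEAR].copy()
    projections["Categories"] = projections["Categories"] + " Projections"
    return estimates, projections

demo = pd.DataFrame({
    "Entity": ["WORLD", "WORLD", "WORLD", "Africa"],
    "Year": [2020, 2021, 2022, 2022],
    "Categories": ["Total"] * 4,
    "Population": [1, 2, 3, 4],
})
estimates_demo, projections_demo = split_estimates_and_projections(demo, "WORLD", (2021, 2100))
```

Each returned frame can then be passed to `px.line` separately, as the callback does with the estimates and the projections.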

## Output

### Generate URL and show logs

In [None]:
if __name__ == "__main__":
    app.run_server(proxy=f"http://127.0.0.1:{DASH_PORT}::https://app.naas.ai")