### Assignment #4: Basic UI

DS4003 | Spring 2024

Objective: Practice building basic UI components in Dash.

Task: Build an app that contains the following components using the gapminder dataset: `gdp_pcap.csv`. [Info](https://www.gapminder.org/gdp-per-capita/)

UI Components:
- A dropdown menu that allows the user to select `country`
    - The dropdown should allow the user to select multiple countries
    - The options should populate from the dataset (not be hard-coded)
- A slider that allows the user to select `year`
    - The slider should allow the user to select a range of years
    - The range should be from the minimum year in the dataset to the maximum year in the dataset
- A graph that displays the `gdpPercap` for the selected countries over the selected years
    - The graph should display the gdpPercap for each country as a line
    - Each country should have a unique color
    - Graph DOES NOT need to interact with dropdown or slider
    - The graph should have a title and axis labels in reader-friendly format

Layout:  
- Use a stylesheet
- There should be a title at the top of the page
- There should be a description of the data and app below the title (3-5 sentences)
- The dropdown and slider should be side by side above the graph and take up the full width of the page
- The graph should be below the dropdown and slider and take up the full width of the page

Submission: 
- There should be only one app in your submitted work
- Comment your code
- Submit the html file of the notebook, saved as `DS4003_A4_LastName.html`


**For help you may use the web resources and pandas documentation. No co-pilot or ChatGPT.**

In [13]:
#Import all needed dependencies
import pandas as pd
import numpy as np 
import plotly.express as px
import seaborn as sns
from dash import Dash, dcc, html, Input, Output

In [14]:
#Read in csv into a pandas data frame
df = pd.read_csv("gdp_pcap.csv")
# load the CSS stylesheet
stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css'] 
#Initialize the Dash app
app = Dash(__name__, external_stylesheets=stylesheets)

In [15]:
#View the first five rows of the data frame
print(df.head())
#Get number of rows and cols
print(df.shape)
#Get the data types of the cols
print(df.dtypes)

       country  1800  1801  1802  1803  1804  1805  1806  1807  1808  ...  \
0  Afghanistan   599   599   599   599   599   599   599   599   599  ...   
1       Angola   465   466   469   471   472   475   477   479   481  ...   
2      Albania   585   587   588   590   592   593   595   597   598  ...   
3      Andorra  1710  1710  1710  1720  1720  1720  1730  1730  1730  ...   
4          UAE  1420  1430  1430  1440  1450  1450  1460  1460  1470  ...   

    2091   2092   2093   2094   2095   2096   2097   2098   2099   2100  
0   4800   4910   5030   5150   5270   5390   5520   5650   5780   5920  
1  24.8k  25.3k  25.9k  26.4k  26.9k  27.4k    28k  28.5k  29.1k  29.6k  
2    54k  54.6k  55.2k  55.8k  56.4k  56.9k  57.5k  58.1k  58.7k  59.2k  
3  79.3k  79.5k  79.8k  80.1k  80.4k  80.7k    81k  81.2k  81.5k  81.8k  
4  92.5k  92.6k  92.6k  92.7k  92.8k  92.9k  92.9k    93k  93.1k  93.1k  

[5 rows x 302 columns]
(195, 302)
country    object
1800        int64
1801        int64
1802

In [16]:
# Define a function to convert the k to thousands
# Will take any instance of k and remove it and return the correct numeric version
# If k is not present it will return the proper numeric
def convert(x):                              
    if isinstance(x, str) and 'k' in x.lower():             
        return float(x[:-1]) * 1000                         
    return float(x)                
                         
# Convert all columns
convert_cols = {str(year): convert for year in range(1800, 2101)}      
# Reread in csv with the necessary converter 
df = pd.read_csv("gdp_pcap.csv", converters = convert_cols)   
df.head()

Unnamed: 0,country,1800,1801,1802,1803,1804,1805,1806,1807,1808,...,2091,2092,2093,2094,2095,2096,2097,2098,2099,2100
0,Afghanistan,599.0,599.0,599.0,599.0,599.0,599.0,599.0,599.0,599.0,...,4800.0,4910.0,5030.0,5150.0,5270.0,5390.0,5520.0,5650.0,5780.0,5920.0
1,Angola,465.0,466.0,469.0,471.0,472.0,475.0,477.0,479.0,481.0,...,24800.0,25300.0,25900.0,26400.0,26900.0,27400.0,28000.0,28500.0,29100.0,29600.0
2,Albania,585.0,587.0,588.0,590.0,592.0,593.0,595.0,597.0,598.0,...,54000.0,54600.0,55200.0,55800.0,56400.0,56900.0,57500.0,58100.0,58700.0,59200.0
3,Andorra,1710.0,1710.0,1710.0,1720.0,1720.0,1720.0,1730.0,1730.0,1730.0,...,79300.0,79500.0,79800.0,80100.0,80400.0,80700.0,81000.0,81200.0,81500.0,81800.0
4,UAE,1420.0,1430.0,1430.0,1440.0,1450.0,1450.0,1460.0,1460.0,1470.0,...,92500.0,92600.0,92600.0,92700.0,92800.0,92900.0,92900.0,93000.0,93100.0,93100.0
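
As a quick sanity check on the conversion step, the sketch below repeats the `convert` function from the cell above and runs it on a few sample strings. The sample values are made up for illustration and only mimic the formats seen in the file (plain digits and values with a `k` suffix).

```python
def convert(x):
    """Turn strings such as '24.8k' into floats; plain numbers pass through."""
    if isinstance(x, str) and 'k' in x.lower():
        # Drop the trailing 'k' and scale to thousands
        return float(x[:-1]) * 1000
    return float(x)

# Hypothetical sample values, not read from gdp_pcap.csv
samples = ['599', '28k', '24.8k']
converted = [convert(s) for s in samples]
print(converted)
```

Note that the function assumes the `k` is the last character of the string, which matches the values shown in the printed data frame.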


In [17]:
#sorting years to get the min and max years
years = sorted(int(year) for year in df.columns if year.isdigit())

# Callback to update the graph based on dropdown and slider inputs;
# it links the graph to the dropdown and slider components
@app.callback(
    Output('line graph', 'figure'),
    [Input('country dropdown', 'value'),
     Input('year slider', 'value')]
)
def update_graph(selected_countries, selected_years):
    # If no countries are selected default to an empty graph
    if not selected_countries:
        selected_countries = []
    # Default to having the full range selected
    if not selected_years:
        selected_years = [min(years), max(years)]
    # Filter the data frame by which countries are currently selected
    filtered_df = df[df['country'].isin(selected_countries)]
    # Filter the year array for only the range of years selected in the slider
    filtered_years = [str(year) for year in range(selected_years[0], selected_years[1] + 1)]
    # Filter the data frame to keep only the selected years and add country col back
    filtered_df = filtered_df[['country'] + filtered_years]
    # Melt the filtered DataFrame into a plottable form
    df_plottable = pd.melt(filtered_df, id_vars=['country'], value_vars=filtered_years, var_name='year', value_name='gdp')
    # The year values come from column names, so they are strings; cast them to integers for a numeric x-axis
    df_plottable['year'] = df_plottable['year'].astype(int)
    # Create the plot
    fig = px.line(df_plottable, 
                  x='year', 
                  y='gdp', 
                  color='country', 
                  line_group='country',
                  title='GDP Per Capita Over Time',
                  labels={'gdp': 'GDP per Capita', 'year': 'Years'})
    
    
    return fig
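
The reshaping step inside the callback can be tried in isolation. The sketch below uses a tiny made-up data frame (countries `A` and `B` are placeholders, not rows from `gdp_pcap.csv`) to show how `pd.melt` turns one column per year into one row per country and year, which is the long form that `px.line` expects.

```python
import pandas as pd

# Hypothetical wide-format sample shaped like the gapminder file
wide = pd.DataFrame({
    'country': ['A', 'B'],
    '1800': [599.0, 465.0],
    '1801': [600.0, 466.0],
})

# One row per (country, year) pair instead of one column per year
long = pd.melt(wide, id_vars=['country'], value_vars=['1800', '1801'],
               var_name='year', value_name='gdp')

# Year values start out as column-name strings, so cast them to integers
long['year'] = long['year'].astype(int)
print(long)
```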

In [18]:
#sorting years to get the min and max years
years = sorted(int(year) for year in df.columns if year.isdigit())
#App layout to define the structure of the web page
app.layout = html.Div([
    #Container div to store all the components
    html.Div(children=[
        #Title describing the purpose of the web app
        html.H1('UI Components for Gapminder Dataset'),
        #Description of the data and app 
        html.P('''
        This dashboard allows you to explore GDP per capita. You can select multiple countries and a range of years to see trends in the GDP per capita. A visualization will be displayed in the form of a line graph.
        '''),
        #Header for country dropdown with a small margin on the top to separate it from the title
        html.H2('Select Countries and a Range of Years', style={'marginTop': '3%'}),
        
        html.Div(children=[
            html.Div(
                # Country dropdown component
                dcc.Dropdown(
                    # Identifier for the country dropdown component
                    id='country dropdown',  
                    #Using the dataframe to populate the options by creating label/value pairs from the unique country names
                    options=[{'label': country, 'value': country} for country in df['country'].unique()],  
                    #Custom placeholder text to describe what actions to take as a user
                    placeholder = 'Select One or More Countries',
                    # Allows user to select multiple countries at once
                    multi=True,  
                # Styling to fit half the width
                ), className = 'six columns'
            ),

            # Year range slider
            html.Div(
                dcc.RangeSlider(
                    # Identifier for the year range slider component
                    id='year slider',
                    # Lower bound as determined from our sorted year array 
                    min=years[0],
                    # Upper bound as determined from our sorted year array 
                    max=years[-1],
                    # Default to having full range selected
                    value=[years[0], years[-1]],
                    # Put a mark every 50 years for readability
                    marks={str(year): str(year) for year in years[::50]},
                # Styling to fit half the width
                ), className = 'six columns'
                
            ),  
            # Styling to fit entire width
        ], className = 'twelve columns'),

        # Line Graph
        html.Div(
            # Display graph on web app 
            dcc.Graph(id='line graph'),
            # Styling to fit entire width
            className = 'twelve columns'
        ),

    #Styling for the container div that has all components stored in column direction and centers them
    ], style={'display': 'flex', 'alignItems': 'center', 'height': '100vh', 'flexDirection': 'column'}),
])

#Run the app
if __name__ == '__main__':
    app.run(debug=True)
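
The slider marks in the layout come from slicing the sorted year list with a step. The sketch below rebuilds that dictionary on its own, assuming the years run from 1800 to 2100 as in the printed data frame; the `step` variable is introduced here for illustration and does not appear in the app.

```python
# Assumed year range, matching the columns shown in the data frame output
years = list(range(1800, 2101))

# Hypothetical helper variable: spacing between labelled marks, in years
step = 50

# Same comprehension as in the layout, with the step pulled out
marks = {str(year): str(year) for year in years[::step]}
print(marks)
```

A smaller step gives more labels on the slider, but with 301 years a mark every 10 years would likely crowd a half-width slider.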