<h2 style="margin:.5em 0;">Visual Analytics</h2>
<h3 style="margin:.5em 0;">Fakultät für Elektronik und Informatik &ndash; Hochschule Aalen</h3>
<h4 style="margin:.5em 0;">Authors: Stefan Wehrenberg, Martin Heckmann</h4>
<h4 style="margin:.5em 0;">13.01.23</h4>

# Assignment 10 - Global Disasters 3

Create an interactive [choropleth map](https://en.wikipedia.org/wiki/Choropleth_map) using some parts of the previous assignments' implementation.

### Task 1
Instead of displaying each disaster as a separate marker on a scatter plot, we will now use a choropleth map to group disasters by country.
- Use Plotly's [`px.choropleth_mapbox()`](https://plotly.com/python/mapbox-county-choropleth/) to create the choropleth map. As before, this is a Mapbox map that can be used without an access token.
- To draw the colored polygons of each country's borders, we use the GeoJSON file `countries.geo.json`, which contains lists (polygons) of longitude/latitude pairs for nearly all countries included in `assets/global_disasters.csv` (though it does not contain countries that have ceased to exist, such as the USSR).
- As in the previous assignments, add the dropdowns for filtering by disaster type (multiple types must again be selectable) and for choosing the value (total deaths, cost, or disaster count), as well as the slider that limits the timeframe.
- The color of each polygon should be based on the selected value (total deaths, cost, or disaster count) for the currently selected disaster types. If multiple disaster types are selected, sum their values. Choose a colormap that suits the task.
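
The per-country aggregation described in the last bullet can be sketched with pandas on a tiny hand-made frame. The rows below are invented for illustration (only the column names follow `global_disasters.csv`), and `aggregate_by_country` is a hypothetical helper name, not part of the assignment.

```python
import pandas as pd

# Invented sample rows; only the column names mirror the real dataset
sample_df = pd.DataFrame({
    'ISO': ['GTM', 'GTM', 'IND', 'IND', 'BEL'],
    'Disaster Type': ['Earthquake', 'Flood', 'Drought', 'Flood', 'Flood'],
    'Total Deaths': [2000.0, 10.0, 500.0, 30.0, 5.0],
})


def aggregate_by_country(df, value, types):
    """Return one row per ISO code with the selected value summed over `types`."""
    # An empty selection means "no filter", so all disaster types are used
    if types:
        df = df[df['Disaster Type'].isin(types)]
    if value == 'Disaster Count':
        # Every row is one disaster, so the group size is the count
        return df.groupby('ISO').size().reset_index(name='Disaster Count')
    # Otherwise add up the numeric column per country
    return df.groupby('ISO')[value].sum().reset_index()


deaths = aggregate_by_country(sample_df, 'Total Deaths', ['Flood', 'Drought'])
counts = aggregate_by_country(sample_df, 'Disaster Count', [])
print(deaths)
print(counts)
```

The resulting frame has one row per country, which is the shape a choropleth expects: one location column and one color column.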

**Hints**:<br>
- GeoJSON is a JSON format that represents different simple geographical features (points, lines, or polygons consisting of coordinates) and nonspatial attributes (properties like names).
You can use the following code snippet to import the JSON:

```python
# Load the geojson
with open('assets/countries.geo.json') as f:
    geojson = json.load(f)
```

- Choropleth maps need a GeoJSON file and a dataset that is used to color the polygons and create tooltips. The `locations` argument names the column in our data that links each row to the ID of a polygon in the GeoJSON. You can use the following code snippet to create the choropleth map:

```python
fig = px.choropleth_mapbox(filtered_df, geojson=geojson, locations='ISO',
                           mapbox_style='carto-positron', 
                           # Set the column that defines the color of each location
                           color=value,
                           # Set a zoom level and a low opacity, so we can still see the map beneath
                           zoom=.5, opacity=0.6)
```
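
To make the link between the GeoJSON and the `locations` column more concrete, here is a minimal sketch using a hand-written, heavily simplified `FeatureCollection`. It assumes that each feature carries the three-letter ISO code in its top-level `id` field (which, as far as we know, is the default key that `px.choropleth_mapbox()` matches against) and that the dataset uses `SUN` for the former USSR. Both are assumptions about the files, so check them against your copy.

```python
import json

# Hand-written toy GeoJSON; the real file has many more features
# and far longer coordinate lists
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature", "id": "BEL", "properties": {"name": "Belgium"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[3.3, 51.3], [6.2, 50.8], [5.7, 49.5], [3.3, 51.3]]]}},
    {"type": "Feature", "id": "GTM", "properties": {"name": "Guatemala"},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[-92.2, 14.5], [-89.2, 17.8], [-88.2, 15.7], [-92.2, 14.5]]]}}
  ]
}
"""
toy_geojson = json.loads(geojson_text)

# IDs of all polygons; the values in the `locations` column must match these
polygon_ids = {feature['id'] for feature in toy_geojson['features']}

# ISO codes as they might appear in the dataset (assumed values)
dataset_isos = ['BEL', 'GTM', 'SUN']

# Codes without a polygon cannot be drawn on the map
missing = sorted(set(dataset_isos) - polygon_ids)
print(missing)
```

Running the same check against the real file and the `ISO` column shows which countries will silently be absent from the map.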

### Task 2
Create two plots, a line plot and some kind of bar plot, based on the countries clicked on the choropleth map:
- The first can be implemented much like the first plot of the previous assignments. For the selected countries, it should show the time course of the selected value (total deaths, cost, or disaster count), summed over all selected disaster types. Show a separate line for each country if more than one country is selected. If no country is selected, show the sum over all countries. The time range shown should match the range selected on the time slider, so the line plot reflects the filtering by both the dropdowns and the slider.
- Implement the selection so that each click on a country adds it to the list of selected countries. Also add a radio button that resets the selection. When no country is selected, show the value summed over all countries.
- This way we can use the click interaction to select multiple countries instead of the country dropdown that we used in the previous assignments.
- For the second plot show the ratio of the different disaster types for each country separately. Choose an appropriate visualization.
- Try to fit the whole application into a single view, so the contents of the plots can be compared.

**Hints**:<br>
- You can use the following code snippets to:

1. Get the ISO code of a clicked polygon on the choropleth plot (hovered polygons work the same way via `hoverData`). To allow clicking the same country twice in a row (adding and then removing it), we need to manually clear the `clickData` property by setting it to `None`.
```python
@app.callback(
    Output('click_plot', 'figure'),
    Output('mapbox_plot', 'clickData'),
    Input('mapbox_plot', 'clickData'),
    ...
)
def plot_callback(clickData, ...):
    if clickData:
        country = clickData['points'][0]['location']
    ...
    return fig, None
```

2. Stop Plotly from automatically connecting gaps, so that missing data stays visible
```python
# Use markers to display unconnected data points in a line plot
fig = px.line(..., markers=True)
fig.update_traces(connectgaps=False)
```

3. Global variables should not be modified inside callback functions, since such changes are not reliably shared between callbacks. Use the `dcc.Store` component to save the state of your application, e.g. the list of clicked countries.
```python
# In the app.layout
dcc.Store(id='memory'),
```
```python
# In the callback
@app.callback(
      ...
      # Writing stored data using Output
      Output('memory', 'data'),
      # Reading stored data via State does not
      # trigger the callback by itself,
      # but the value is still passed as a
      # parameter to the callback function
      State('memory', 'data'),
      ...
)
def plot_callback(..., storedData):
    # Do something with the data then return it
    return ..., storedData
```
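
Regarding hint 2: as far as we can tell, `connectgaps=False` only breaks a line where the data contains explicit missing values (NaN). Years that are simply absent from the dataframe are still bridged by a straight line. The sketch below shows one way to insert the missing years as NaN rows with pandas; the numbers and the year range are invented for illustration.

```python
import pandas as pd

# Invented yearly counts for one country; 2002, 2003 and 2005 have no entry at all
yearly = pd.DataFrame({'Year': [2000, 2001, 2004], 'Disaster Count': [3, 1, 2]})

year_min, year_max = 2000, 2005
# range() excludes its end value, so add 1 to keep year_max itself
all_years = range(year_min, year_max + 1)

# Reindexing on the full year range inserts the missing years as NaN rows
filled = (yearly.set_index('Year')
                .reindex(all_years)
                .rename_axis('Year')
                .reset_index())
print(filled)
```

Passing `filled` to `px.line(..., markers=True)` should then draw separate segments and isolated markers instead of one continuous line.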

## Solution
### Task 1 & 2 
(Since the app needs to be implemented in a single cell)
#### Import packages

In [3]:
# Plotly and Pandas imports
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
import plotly.io as pio
import numpy as np
import json

# Dash imports
from dash import dcc, html, Input, Output, State, Dash
#from jupyter_dash import JupyterDash

#### Set global plotly fontsize

In [4]:
global_fontsize = 14
# Edit the default template of plotly to set the font size for all figures.
# This won't work if a figure explicitly uses a template other than 'plotly'.
pio.templates["plotly"].layout['font'].size = global_fontsize

#### Import the data
We will get some warnings about the 'Longitude' and 'Latitude' columns having mixed types (they include floats, ints, and strings).<br> Ignore this for now, we'll handle it later on.

In [5]:
disaster_df = pd.read_csv('assets/global_disasters.csv')



In [12]:
disaster_df


Unnamed: 0,Dis No,Year,Seq,Glide,Disaster Group,Disaster Subgroup,Disaster Type,Disaster Subtype,Disaster Subsubtype,Event Name,...,"Reconstruction Costs, Adjusted ('000 US$)",Insured Damages ('000 US$),"Insured Damages, Adjusted ('000 US$)",Total Damages ('000 US$),"Total Damages, Adjusted ('000 US$)",CPI,Adm Level,Admin1 Code,Admin2 Code,Geo Locations
0,1900-9002-CPV,1900,9002,,Natural,Climatological,Drought,Drought,,,...,,,,,,3.077091,,,,
1,1900-9001-IND,1900,9001,,Natural,Climatological,Drought,Drought,,,...,,,,,,3.077091,,,,
2,1901-0003-BEL,1901,3,,Technological,Technological,Industrial accident,Explosion,,Coal mine,...,,,,,,3.077091,,,,
3,1902-0012-GTM,1902,12,,Natural,Geophysical,Earthquake,Ground movement,,,...,,,,25000.0,781207.0,3.200175,,,,
4,1902-0003-GTM,1902,3,,Natural,Geophysical,Volcanic activity,Ash fall,,Santa Maria,...,,,,,,3.200175,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
25733,2022-0711-COD,2022,711,,Technological,Technological,Miscellaneous accident,Other,,Stampede,...,,,,,,,,,,
25734,2022-0569-ZWE,2022,569,EP-2022-000304,Natural,Biological,Epidemic,Viral disease,,Measles,...,,,,,,,,,,
25735,2022-0216-ZWE,2022,216,,Technological,Technological,Transport accident,Road,,,...,,,,,,,,,,
25736,2022-0201-TLS,2022,201,EP-2022-000162,Natural,Biological,Epidemic,Viral disease,,Dengue,...,,,,,,,,,,


Import the country border GeoJSON data (rough country borders for better performance)

In [6]:
# Source: https://www.kaggle.com/datasets/chapagain/country-state-geo-location?resource=download
with open('assets/countries.geo.json') as f:
    geojson = json.load(f)

Reduce the data to only the needed columns for better performance when iterating over the dataframe

In [7]:
cut_disaster_df = disaster_df[['Year', 'Disaster Group', 'Disaster Type', 'Country', 'ISO',
                                'Total Deaths', 'Total Damages (\'000 US$)']]

#### Fill missing values with zeros to avoid a bug when plotting in Dash's callback functions

In [8]:
# Fill NaN values to stop Dash from displaying unintended traces / This is a bugfix
filled_df = cut_disaster_df.fillna({'Total Damages (\'000 US$)':0.0, 'Total Deaths':0})

In [9]:
filled_df.head()

Unnamed: 0,Year,Disaster Group,Disaster Type,Country,ISO,Total Deaths,Total Damages ('000 US$)
0,1900,Natural,Drought,Cabo Verde,CPV,11000.0,0.0
1,1900,Natural,Drought,India,IND,1250000.0,0.0
2,1901,Technological,Industrial accident,Belgium,BEL,18.0,0.0
3,1902,Natural,Earthquake,Guatemala,GTM,2000.0,25000.0
4,1902,Natural,Volcanic activity,Guatemala,GTM,1000.0,0.0


In [10]:
filled_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 25738 entries, 0 to 25737
Data columns (total 7 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   Year                      25738 non-null  int64  
 1   Disaster Group            25738 non-null  object 
 2   Disaster Type             25738 non-null  object 
 3   Country                   25738 non-null  object 
 4   ISO                       25738 non-null  object 
 5   Total Deaths              25738 non-null  float64
 6   Total Damages ('000 US$)  25738 non-null  float64
dtypes: float64(2), int64(1), object(4)
memory usage: 1.4+ MB


#### Implement the application

In [11]:
# Set the global stylesheet
external_stylesheets = ['assets/hs_webpage_bootstrap.css', 'assets/custom.css']
# Path of logo image
logo_img = 'assets/150202_Hochschule_AA_rgb.png'

# Initialize the application
app = Dash(__name__, title='Open Campus Day App', external_stylesheets=external_stylesheets)
#app = JupyterDash(__name__, title='Open Campus Day App', external_stylesheets=external_stylesheets)

# Global font settings
htmlFontSize = global_fontsize
fontfamily = '"Open Sans", verdana, arial, sans-serif'

# Define some variables for more readable code and to avoid repeated dataframe lookups in the callbacks and layout
year_min = filled_df['Year'].min()
year_max = filled_df['Year'].max()

app.layout = html.Div([
    html.Div([
        html.Div([
            html.H1('Open Campus Day 2023', className='page-header',
                    style={'margin': '0', 'padding-bottom':'1%', 'font-size': '2.5vw'}),
            html.H2('Die Welt ändert sich: Interaktive Karte aller Naturkatastrophen der letzten 100 Jahre',
                    style={'margin': '0', 'padding-top': '0', 'font-size': '1.2vw'}),
        ], 
            style={'width': '70%', 'display': 'inline-block'}),
        html.Div([
            html.Img(src=logo_img, style={'float':'right', 'width':'80%'}),
        ], 
            style={'width': '30%', 'display': 'inline-block'}),
    ], className='head-container', style={'padding': '1% 2% 0% 2%',
                                          'display':'flex',
                                          'justify-content':'center',
                                          'align-items':'center'}),
    
    # Storage for clicked countries
    dcc.Store(id='memory'),
    # Graphs
    html.Div([
        # Task 1
        html.Div([
            # Mapbox Graph
            dcc.Graph(id='mapbox_plot', style={'height': '60vh'}),
            html.Div([html.Div([
                    # Range slider for year selection
                    dcc.RangeSlider(min=year_min, max=year_max,
                                    step=1, value=[year_min, year_max],
                                    # Marks every 10 years
                                    marks={i: {'label': f'{i}', 
                                        'style':{'fontSize': '.8vw', 
                                                'fontFamily': fontfamily}} for i in range(year_min, year_max, 10)},
                                    id='year_slider',
                                    # To edit the textsize of the slider handles we would need to create a CSS file 
                                    # and assign it in the following dictionary to className
                                    tooltip={'placement': 'bottom', 'always_visible': True}),
                ]),
                
            
                # Dropdown menus and their labels
                html.Div([
                    html.Div([html.H3('Displayed Value:', style={'font-size': '1.2vw', 'margin': '0'})], 
                             style={'width': '20%', 'display': 'inline-block'}),
                    html.Div([dcc.Dropdown(['Total Damages (\'000 US$)', 'Total Deaths', 'Disaster Count'], style={'font-size': '1vw'},
                                        id='value_dropdown', value='Disaster Count', clearable=False)], 
                                        style={'width': '45%', 'display': 'inline-block'}, className='dropUp'),
                    # Spacer
                    html.Div([], style={'width':'10%'}),
                    # A regular click event for buttons does not exist in Dash;
                    # n_clicks tracks how often the button has been clicked
                    html.Button('Reset Selection', id='reset_button', className='btn btn-primary btn-lg', n_clicks=0, 
                                style={'width':'25%', 'margin':'0'}),
                ], style={'padding': '2% 2% 2% 2%', 
                        'display':'flex', 
                        'justify-content':'center', 
                        'align-items':'center'}),
                html.Div([
                    html.Div([html.H3('Disaster Type:', style={'font-size': '1.2vw', 'margin': '0'})], 
                             style={'width': '20%', 'display': 'inline-block'}),
                    html.Div([dcc.Dropdown(filled_df['Disaster Type'].unique(), multi=True, id='type_dropdown', style={'font-size': '1vw'})], 
                            style={'width': '80%', 'display': 'inline-block'}, className='dropUp'),
                ], style={'height': '20%',
                        'padding': '0px 2% 2% 2%', 
                        'display':'flex', 
                        'justify-content':'center', 
                        'align-items':'center'}),
            ], style={'padding': '2% 2% 0 0', 'height':'28vh'}),
        ], style={'width': '60%', 'display': 'inline-block'}),
        
        # Task 2
        html.Div([
            # Line Graph
            dcc.Graph(id='line_plot',
                      style={'width': '100%', 'height': '44vh', 'margin': '0px', 'padding': '0px'}),
            # Bar Graph
            dcc.Graph(id='bar_plot',
                      style={'width': '100%', 'height': '44vh', 'margin': '0px', 'padding': '0px'}),
        ], style={'width': '40%', 'display': 'inline-block', 'float': 'right'})
    ], style={'background-color': '#f7f9f9', 'padding':'0% 2% 0% 2%'}),
],
    # Style settings of parent-div apply to all elements inside
    style={'fontSize': htmlFontSize, 'fontFamily': fontfamily, 'height':'100vh', 'width':'100vw'},
)


# Callback to create the choropleth graph
@app.callback(
    Output('mapbox_plot', 'figure'),
    Input('year_slider', 'value'),
    Input('value_dropdown', 'value'),
    Input('type_dropdown', 'value')
)
def get_map_plot(years, value, types):
    # Filter year span
    filtered_df = filled_df[filled_df['Year'].between(years[0], years[1])]
    # Filter disaster type
    if types:
        filtered_df = filtered_df[filtered_df['Disaster Type'].isin(types)]
    if value == 'Disaster Count':
        # Count the disasters by ISO values, reset the index and correct the names of the columns
        # filtered_df = filtered_df['ISO'].value_counts().reset_index().rename(columns={'index':'ISO', 'ISO':'Disaster Count'})
        filtered_df = filtered_df['ISO'].value_counts().reset_index()
        filtered_df.columns = ['ISO', 'Disaster Count']
    else:
        # Summarize deaths/cost per ISO
        filtered_df = filtered_df.groupby(['ISO'])[value].sum().reset_index()
        #print(filtered_df)

    fig = px.choropleth_mapbox(filtered_df, geojson=geojson, locations='ISO',
                           mapbox_style='carto-positron', color_continuous_scale='Reds',
                           # Set the column that defines the color of each location
                           color=value,
                           # Set a zoom level and a low opacity, so we can still see the map beneath
                           zoom=.5, opacity=0.6)

    # Hide the legend, shrink the margins and match the page background
    fig.update_layout(showlegend=False, margin={'l': 0, 'b': 0, 't': 35, 'r': 0},
                      paper_bgcolor='#f7f9f9')
    return fig


def get_line_plot(storedData, value, types, years):
    # Initialize result_df used to plot the data
    result_df = pd.DataFrame()
    
    # Filter year span
    filtered_df = filled_df[filled_df['Year'].between(years[0], years[1])]
    # Filter disaster type
    if types:
        filtered_df = filtered_df[filtered_df['Disaster Type'].isin(types)]
    
    # If stored data for clicked countries exists
    if storedData:
        # Keep the filtered df so it can be reused as the starting point for every country
        temp_df = filtered_df
        for country in storedData:
            filtered_df = temp_df[temp_df['ISO'] == country]

            # If 'Disaster Count' is selected, count df entries grouped by year
            if value == 'Disaster Count':
                # Group by year and country
                filtered_df = filtered_df.groupby(['Year', 'ISO'])['Year'].count().reset_index(name='Disaster Count')
            else:
                # Sum the selected value by year and country
                filtered_df = filtered_df.groupby(['Year', 'ISO'])[value].sum().reset_index()
            
            # Include year_max itself, otherwise reindex() drops the data of the last year
            idx = np.arange(year_min, year_max + 1)
            # Set the index to 'Year', to fill all missing years using the idx list.
            # The new entries are by default filled with NaNs and will therefore
            # display as gaps in the plot if 'connectgaps=False' is set.
            filtered_df = filtered_df.set_index('Year').reindex(idx).reset_index()
            # Set the country ISO to the current country for the newly added rows
            filtered_df['ISO'] = country

            # If the result is still empty, the current country's data is the result
            if result_df.empty:
                result_df = filtered_df
            # If result already includes countries we add the current one using concat
            else:
                result_df = pd.concat([result_df, filtered_df])
    else:
        if value == 'Disaster Count':
            filtered_df = filtered_df.groupby(['Year'])['Year'].count().reset_index(name='Disaster Count')
        else:
            filtered_df = filtered_df.groupby(['Year'])[value].sum().reset_index()
        result_df = filtered_df

    # Create the figure using the result_df
    # Use markers to display unconnected data points, else they won't be displayed
    fig = px.line(result_df, x='Year', y=value, color='ISO' if storedData else None, markers=True,)
                    #title='Count for Selected Countries' if storedData else 'Count for all Countries')
    # Update margin & x-axis title and range
    fig.update_layout(margin={'l': 0, 'b': 0, 't': 35, 'r': 0},
                      paper_bgcolor='#f7f9f9',
                      xaxis=dict(title='Year', range=[years[0], years[1]+1]))
    # Show gaps in the traces instead of connecting them
    # Set a smaller marker size than default
    fig.update_traces(connectgaps=False, marker=dict(size=5))
    return fig


# Function to create histogram/bar plot
def get_bar_plot(storedData, value, types, years):
    # Filter year span
    filtered_df = filled_df[filled_df['Year'].between(years[0], years[1])]
    # Filter disaster type
    if types:
        filtered_df = filtered_df[filtered_df['Disaster Type'].isin(types)]
    # Filter countries in stored data
    if storedData:
        filtered_df = filtered_df[filtered_df['ISO'].isin(storedData)]
    # Use a histogram plot, since it sums up by its x value by default if no y value is given
    fig = px.histogram(filtered_df, x='ISO' if storedData else 'Disaster Type', 
                       y=value if value != 'Disaster Count' else None,
                       color='Disaster Type')
    fig.update_layout(xaxis=dict(showticklabels=True if storedData else False, title=''),
                      paper_bgcolor='#f7f9f9',
                      yaxis=dict(title=value))
    return fig


# Callback for graphs based on clicked countries
# Since multiple Inputs affect the same Output, nearly all Inputs are processed in the same callback
@app.callback(
    Output('line_plot', 'figure'),
    Output('bar_plot', 'figure'),
    # Reset the clickData to None after updating the line plot,
    # else a second click on the same country will not be recognized as a click event.
    Output('mapbox_plot', 'clickData'),
    # Reset the click counter of the button after each callback
    Output('reset_button', 'n_clicks'),
    # Update the stored data for each click
    Output('memory', 'data'),
    Input('mapbox_plot', 'clickData'),
    Input('value_dropdown', 'value'),
    Input('type_dropdown', 'value'),
    Input('year_slider', 'value'),
    Input('reset_button', 'n_clicks'),
    # State does not trigger the callback by itself, but its value is passed to the function
    State('memory', 'data'),
)
def plot_callback(clickData, value, types, years, buttonclick, storedData):
    # If the callback was caused by a click, process the 'clickData'
    if clickData:
        country = clickData['points'][0]['location']
        if storedData:
            # Add the clicked country to the stored data
            # If it already was in the stored data, remove it
            if country not in storedData:
                storedData.append(country)
            else:
                storedData.remove(country)
        else:
            # If the stored data was empty before,
            # the clicked country is the stored data
            storedData = [country]
    # Clear the stored data if the button was pressed
    if buttonclick:
        storedData = []
    # Return the figures, the empty 'clickData' and the processed stored data
    return get_line_plot(storedData, value, types, years), \
           get_bar_plot(storedData, value, types, years), None, None, storedData

app.run_server(debug=False)
# app.run_server(mode='external', debug=False, dev_tools_hot_reload=True)
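
The add/remove logic inside `plot_callback` can also be factored into a small pure function, which makes it easy to test without starting the app. This is only a sketch: `toggle_country` is a hypothetical helper name, and it assumes the stored value is `None` until something has been written to the `dcc.Store`.

```python
def toggle_country(selected, country):
    """Return a new selection with `country` added, or removed if already present."""
    # Treat a missing (None) or empty store like an empty selection,
    # and copy the list so the caller's data is not modified
    selected = list(selected or [])
    if country in selected:
        selected.remove(country)
    else:
        selected.append(country)
    return selected


# Simulate four clicks on the map, clicking 'DEU' twice
selection = None
for clicked in ['DEU', 'IND', 'DEU', 'GTM']:
    selection = toggle_country(selection, clicked)
print(selection)
```

Inside the callback, the result of `toggle_country(storedData, country)` would then be returned as the new value for the `memory` store.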