<center><h1>Chilean quality of life visualization</h1></center>
<br>
This notebook visualizes the results of the ENCAVI (national quality of life survey) by region.

The survey includes variables like:
* has_hypertension
* has_diabetes
* had_stroke
* had_thrombosis
* has_asma
* has_renal_insufficiency
* has_arthrosis
* had_cancer
* has_hypothyroidism
* angry
* optimist
* worried
* happiness
* tiredness
* useful

To add more variables, look up the question codes in **data/survey/manual_encavi.pdf** and add them to the **SELECTED_QUESTIONS** dictionary in the format {question_code: question_alias}. The question alias is the name shown in the map tooltips.

In [1]:
SELECTED_QUESTIONS = {'P5_2_A_1': 'has_hypertension',
                      'P5_2_B_1': 'has_diabetes',
                      'P5_2_C_1': 'had_stroke',
                      'P5_2_D_2': 'had_thrombosis',
                      'P5_2_E_1': 'has_asma',
                      'P5_2_K_1': 'has_renal_insufficiency',
                      'P5_2_N_1': 'has_arthrosis',
                      'P5_2_P_1': 'had_cancer',
                      'P5_2_Q_1': 'has_hypothyroidism',
                      'P3_24_A': 'angry',
                      'P3_24_B': 'optimist',
                      'P3_24_C': 'worried',
                      'P3_24_D': 'happiness',
                      'P3_24_G': 'tiredness',
                      'P3_24_H': 'useful',
                       # Add more questions here.
                      }
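
As a rough illustration of how this dictionary is used later in the notebook, the sketch below selects and renames question-code columns on a made-up DataFrame. The code `P9_9_Z`, the alias `example_alias`, and all the values are hypothetical placeholders, not real ENCAVI content.

```python
import pandas as pd

# Hypothetical mapping: 'P9_9_Z' is a placeholder, not a real ENCAVI code.
selected = {'P5_2_A_1': 'has_hypertension', 'P9_9_Z': 'example_alias'}

# Made-up answers standing in for the survey file.
toy = pd.DataFrame({'Region': ['A', 'B'],
                    'P5_2_A_1': [1, 2],
                    'P9_9_Z': [3, 4],
                    'unused_code': [0, 0]})

# Selecting a code that is absent from the data raises a KeyError,
# so it can help to list missing codes first.
missing = [code for code in selected if code not in toy.columns]

# Keep only the selected questions and rename codes to their aliases.
toy_f = toy[['Region', *selected]].rename(columns=selected)
print(toy_f.columns.tolist())
```

Checking for missing codes before loading the full survey is optional, but it tends to surface typos in question codes early.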

### Imports

In [2]:
import pandas as pd
import geopandas as gpd
import pyreadstat
import folium
import folium.plugins
from branca.colormap import LinearColormap

### Reading the survey and computing the average value of each question in each region

In [3]:
df, meta = pyreadstat.read_sav('./data/survey/ENCAVI_2015.sav')
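# read_csv's squeeze argument was removed in pandas 2.0; squeeze the single column into a Series instead
regions_gid = pd.read_csv('./data/survey/region_to_GID.csv', sep=';', index_col='region_name').squeeze('columns')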

df['Region'] = df.Region.map(meta.variable_value_labels['Region'])  # Using region names instead of codes
df_f = df[['Region', *SELECTED_QUESTIONS.keys()]].copy()
df_f['GID_1'] = df_f.Region.map(regions_gid)  # Assigning GID to each region
df_f.rename(columns=SELECTED_QUESTIONS, inplace=True)  # Renaming columns from file
df_f.replace({88: float('nan'), 99: float('nan')}, inplace=True)  # 88, 99 = does not know / does not answer

# Average value per region; numeric_only skips the text 'Region' column
df_reg = df_f.groupby('GID_1').mean(numeric_only=True)
df_reg = df_reg.round(3)
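
The cleaning and averaging step can be illustrated on made-up data. The sketch below assumes a hypothetical answer scale and invented region ids `X` and `Y`. It only shows why the 88/99 "does not know / does not answer" codes should be turned into missing values before averaging.

```python
import pandas as pd

# Made-up answers; 88 and 99 stand for "does not know / does not answer".
toy = pd.DataFrame({
    'GID_1': ['X', 'X', 'X', 'Y', 'Y'],
    'happiness': [4, 6, 99, 5, 88],
})

# Averaging the raw codes lets 88/99 distort the regional mean.
naive = toy.groupby('GID_1')['happiness'].mean()

# Replacing the codes with NaN makes mean() ignore them.
cleaned = (toy.replace({88: float('nan'), 99: float('nan')})
              .groupby('GID_1')['happiness'].mean())

print(naive)
print(cleaned)
```

Note that this replacement is only safe for questions where 88 and 99 are not valid answers.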

### Loading a GeoJSON map and including the survey data

In [4]:
chilean_map = gpd.read_file('./data/map/chilean_map.json', encoding='latin-1')
gid_dict_mapping = {regions_gid['Ñuble']: regions_gid['Biobío']}  # When the survey was conducted Ñuble was part of Biobío
chilean_map['GID_1'] = chilean_map['GID_1'].replace(gid_dict_mapping)  # Assign back; in-place replace on a column may not update the frame

chilean_map = chilean_map.merge(df_reg, on='GID_1')  # Both the map and df_reg use GID_1 (region id) 
chilean_map.rename(columns={'NAME_1': 'Region'}, inplace=True)
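
To see what the Ñuble remapping achieves, here is a minimal sketch with plain pandas DataFrames standing in for the map and the survey averages. The ids `G_BIO`, `G_NUB`, `G_MAU` and the numbers are invented for illustration. The real ids come from region_to_GID.csv.

```python
import pandas as pd

# Hypothetical region ids and values, for illustration only.
toy_map = pd.DataFrame({'NAME_1': ['Biobío', 'Ñuble', 'Maule'],
                        'GID_1': ['G_BIO', 'G_NUB', 'G_MAU']})
toy_survey = pd.DataFrame({'GID_1': ['G_BIO', 'G_MAU'],
                           'has_diabetes': [0.11, 0.09]})

# Without the remapping, the default inner merge drops Ñuble,
# because the survey has no row for its id.
without_remap = toy_map.merge(toy_survey, on='GID_1')

# With the remapping, Ñuble borrows the Biobío averages.
toy_map['GID_1'] = toy_map['GID_1'].replace({'G_NUB': 'G_BIO'})
merged = toy_map.merge(toy_survey, on='GID_1')
print(merged)
```

After the remapping both regions keep their own geometry and name but display the same survey values.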

### Defining functions to create a DualMap

In [17]:
def get_color_scale(min_value, max_value, positive_indicator):
    """Creates a linear color scale.
    
    Returns a LinearColormap spanning the given minimum and maximum
    values. The scale runs from yellow to green for positive indicators
    and from yellow to red otherwise.
    
    Args:
        min_value (float): Minimum value of the variable.
        max_value (float): Maximum value of the variable.
        positive_indicator (bool): True if 'higher is better'.
        
    Returns:
        LinearColormap: A linear color scale.
    """
    colors = ['yellow', 'green'] if positive_indicator else ['yellow', 'red']   
    color_scale = LinearColormap(colors, vmin=min_value, vmax=max_value)
    return color_scale


def get_stylized_geomap(gdf, color_column, positive_indicator, tooltip_columns=None, color_column_tooltip=True):
    """Creates a properly stylized folium.GeoJson from a GeoDataFrame.
    
    Args:
        gdf (GeoDataFrame): A geopandas DataFrame with the information.
        color_column (string): The column used for coloring regions.
        positive_indicator (bool): True if 'higher is better'.
        tooltip_columns (list): List of strings containing variables displayed
            in the tooltip.
        color_column_tooltip (bool): Display the color column on the tooltip.
        
    Returns:
        folium.GeoJson: A GeoJSON map.
    """
    tooltip_columns = tooltip_columns.copy() if tooltip_columns else []
    if color_column_tooltip:
        tooltip_columns.append(color_column)

    # String names avoid the pandas warning about passing the builtins min/max
    min_value, max_value = gdf[color_column].agg(['min', 'max'])
    color_scale = get_color_scale(min_value, max_value, positive_indicator)
    color_scale.caption = 'A colormap caption'

    style_function = lambda feature: {
        'fillColor': color_scale(feature['properties'][color_column]),
        'fillOpacity': 0.65,
        'color': 'black',
        'weight': 1.5,
    }
    tooltip = folium.features.GeoJsonTooltip(fields=tooltip_columns, labels=True, sticky=False)
    highlight_function = lambda x: {'weight': 1, 'fillOpacity': 1}
    
    stylized_geomap = folium.GeoJson(
        data=gdf.to_json(),
        style_function=style_function,
        highlight_function=highlight_function,
        tooltip=tooltip
    )
    return stylized_geomap


def get_dualmap(gdf, variable_tuple_1, variable_tuple_2, tooltip_columns=None):
    """Creates a DualMap from a GeoDataFrame and two variables.
    
    Args:
        gdf (GeoDataFrame): A geopandas DataFrame with the information.
        variable_tuple_1 (tuple): A (variable_name, positive_indicator)
            pair for the left map.
        variable_tuple_2 (tuple): A (variable_name, positive_indicator)
            pair for the right map.
        tooltip_columns (list): List of strings containing variables displayed
            in the tooltip. Defaults to ['Region'].
        
    Returns:
        folium.plugins.DualMap: Two synchronized maps, one per variable.
    """
    if tooltip_columns is None:
        tooltip_columns = ['Region']
        
    variable_1, positive_indicator_1 = variable_tuple_1
    variable_2, positive_indicator_2 = variable_tuple_2
        
    m = folium.plugins.DualMap(location=[-39.5, -70], tiles='cartodbpositron', zoom_start=4)
    map_1 = get_stylized_geomap(gdf, variable_1, tooltip_columns=tooltip_columns, positive_indicator=positive_indicator_1)
    map_2 = get_stylized_geomap(gdf, variable_2, tooltip_columns=tooltip_columns, positive_indicator=positive_indicator_2)
    map_1.add_to(m.m1)
    map_2.add_to(m.m2)
    return m
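
To build intuition for the color scale used above, the following standard-library sketch blends two RGB colors linearly between a minimum and a maximum value. It is a conceptual illustration only and is not branca's implementation. The function `interpolate_color` and the example values are hypothetical.

```python
def interpolate_color(value, vmin, vmax, start_rgb, end_rgb):
    """Linearly blend two RGB colors by the position of value in [vmin, vmax]."""
    # Guard against a zero-width range (all regions share the same value).
    t = 0.0 if vmax == vmin else (value - vmin) / (vmax - vmin)
    t = min(max(t, 0.0), 1.0)  # Clamp values outside the range.
    rgb = [round(s + t * (e - s)) for s, e in zip(start_rgb, end_rgb)]
    return '#{:02x}{:02x}{:02x}'.format(*rgb)


YELLOW, RED = (255, 255, 0), (255, 0, 0)

# Made-up regional prevalences for a negative indicator.
low = interpolate_color(0.05, 0.05, 0.15, YELLOW, RED)
high = interpolate_color(0.15, 0.05, 0.15, YELLOW, RED)
print(low, high)
```

Under this scheme the region with the lowest value is pure yellow and the region with the highest value is pure red, which matches the intent of a "higher is worse" indicator.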

___

In [21]:
# Variables use the format (variable_name, positive_indicator).
# E.g. happiness: positive_indicator=True, worried: positive_indicator=False
variable_tuple_1 = ('has_diabetes', False)
variable_tuple_2 = ('has_hypertension', False)

m = get_dualmap(chilean_map, variable_tuple_1, variable_tuple_2)
m