In [89]:
import pandas as pd
import folium
import json
from folium import plugins

# Background
As COVID-19 continues to spread across the world, one of the greatest challenges to response efforts is a lack of data and evidence about infection and mortality rates. We will learn more about the virus and the populations most at risk as more test kits reach physicians and the total number of cases becomes clearer. Ideally, this information will be used to target response efforts, guide policy decisions, inform the donor community, and support decision-making by businesses and multinational organizations whose work spans multiple countries. As a global development company with staff in over ___ countries, we find this kind of information invaluable as we think about ways to support our staff and prepare for what comes next.

Information from the earliest confirmed cases of COVID-19 indicates that some groups of people are at higher risk of infection than others. Public health organizations like the WHO and US CDC have warned those at higher risk to take extra precautionary measures. To help DAI and other global development organizations better understand vulnerabilities in the communities in which they work, DAI Global Health and the Center for Digital Acceleration are tracking these risk factors, along with other potential correlates of mortality rates.

In this first installment of a multi-part series, we will keep an updated record of demographic, socioeconomic, and environmental factors that are perceived to be correlated with heightened risk of mortality, and identify known country and subnational datasets that can help us identify countries at risk. We will also provide charts and maps to identify potential country-level “hotspots.”


# Theories

### (THIS SECTION WILL BE UPDATED AS WE LEARN MORE ABOUT RISK FACTORS)

Below is a list of factors that have been reported or hypothesized as drivers of increased mortality risk. Some of these theories are based on information released by official public health bodies, while others are factors that could serve as proxies. As we learn more about possible drivers from the medical community, we will update this analysis with new datasets when possible.
1.	People aged 65 and over are at higher risk (CDC)
2.	People who smoke are at higher risk (CDC)
3.	People who have serious chronic medical conditions are at higher risk (CDC). These conditions include:
	a.	Heart disease
	b.	Diabetes
	c.	Lung disease
4.	People who have hypertension are at higher risk (Bloomberg News)
5.	People who work in industry are at higher risk
6.	People living in countries with higher air pollution are at higher risk
7.	People living in countries with higher inward Foreign Direct Investment are at higher risk
8.	People living in countries with lower preparedness scores are at higher risk
9.	People living in countries with lower Global Health Security Index scores are at higher risk
10.	Temperature is correlated with COVID-19 mortality


In [None]:
factor_df = pd.read_csv('https://docs.google.com/spreadsheets/d/e/2PACX-1vQ9Puqir6LrosixgUrjvXW09b58RzIsMOIdU1AmTqdTPM-Uki2nma39SGSN9ZzkqVQid8m6DT7nSHvq/pub?gid=0&single=true&output=csv')
factor_df.head()

In [107]:
covid_df = pd.read_csv('03-11-2020.csv')
covid_df.head()

Unnamed: 0,Province/State,Country/Region,Last Update,Confirmed,Deaths,Recovered,Latitude,Longitude
0,Hubei,China,2020-03-11T10:53:02,67773,3046,49134,30.9756,112.2707
1,,Italy,2020-03-11T21:33:02,12462,827,1045,43.0,12.0
2,,Iran,2020-03-11T18:52:03,9000,354,2959,32.0,53.0
3,,"Korea, South",2020-03-11T21:13:18,7755,60,288,36.0,128.0
4,France,France,2020-03-11T22:53:03,2281,48,12,46.2276,2.2137
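
The daily report is keyed by province or state, so a country may appear on several rows. As a rough sketch of how the rows could be rolled up to country level before comparing them with the country-level factors, the snippet below sums `Confirmed` and `Deaths` per `Country/Region` and derives a crude deaths-to-confirmed ratio. The function name `summarize_by_country` and the `Crude CFR (%)` column are illustrative assumptions, not part of the original notebook, and the ratio is only a naive indicator, not a true mortality rate.

```python
import pandas as pd

# Sample rows copied from the head() output above; the notebook itself
# reads the full '03-11-2020.csv' file.
sample = pd.DataFrame({
    'Province/State': ['Hubei', None, None, None, 'France'],
    'Country/Region': ['China', 'Italy', 'Iran', 'Korea, South', 'France'],
    'Confirmed': [67773, 12462, 9000, 7755, 2281],
    'Deaths': [3046, 827, 354, 60, 48],
})


def summarize_by_country(df):
    """Sum cases and deaths per country and add a crude deaths/confirmed ratio.

    Hypothetical helper: the name and the 'Crude CFR (%)' column are
    assumptions made for this sketch.
    """
    totals = df.groupby('Country/Region', as_index=False)[['Confirmed', 'Deaths']].sum()
    # Deaths divided by confirmed cases, as a percentage rounded to 2 decimals
    totals['Crude CFR (%)'] = (totals['Deaths'] / totals['Confirmed'] * 100).round(2)
    return totals.sort_values('Confirmed', ascending=False).reset_index(drop=True)


country_totals = summarize_by_country(sample)
print(country_totals)
```

With the full file loaded, the same call would be `summarize_by_country(covid_df)`.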


In [108]:
# load world geojson file
with open('countries.geojson') as f:
    world_area = json.load(f)
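
The choropleth defined later joins the factor table to this GeoJSON on `feature.properties.ISO_A3`, and countries whose codes do not match are typically left unshaded without any error. A minimal sketch of a pre-flight check is shown below. The helper names `geojson_codes` and `unmatched_codes` and the toy FeatureCollection are assumptions for illustration only; the property name `ISO_A3` is taken from the `key_on` argument used in `create_map`.

```python
def geojson_codes(geojson, prop='ISO_A3'):
    """Collect the values of one property across all GeoJSON features."""
    codes = set()
    for feature in geojson.get('features', []):
        # 'properties' may legitimately be null in GeoJSON
        code = (feature.get('properties') or {}).get(prop)
        if code is not None:
            codes.add(code)
    return codes


def unmatched_codes(data_codes, geojson, prop='ISO_A3'):
    """Return the data codes that have no matching feature in the GeoJSON."""
    return sorted(set(data_codes) - geojson_codes(geojson, prop))


# Hypothetical two-country FeatureCollection standing in for countries.geojson
toy_world = {
    'type': 'FeatureCollection',
    'features': [
        {'type': 'Feature', 'properties': {'ISO_A3': 'ITA'}, 'geometry': None},
        {'type': 'Feature', 'properties': {'ISO_A3': 'IRN'}, 'geometry': None},
    ],
}

missing = unmatched_codes(['ITA', 'IRN', 'KOR'], toy_world)
print(missing)
```

Against the real data, the equivalent call would be `unmatched_codes(factor_df['ISO3'], world_area)`.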

In [111]:
def create_map(factor_df, covid_df, variable_name, world_area):
    # initialize the map
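    # 'Mapbox Bright' is no longer bundled with recent folium releases,
    # so use a built-in light basemap instead
    world_map = folium.Map(tiles='CartoDB positron')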
    
    # add choropleth base
    folium.Choropleth(
        geo_data=world_area,
        name=variable_name,
        data=factor_df,
        columns=['ISO3', variable_name],
        key_on='feature.properties.ISO_A3',
        fill_color='YlGn',
        fill_opacity=0.7,
        line_opacity=0.2,
        legend_name=variable_name).add_to(world_map)
    
    # add covid data - confirmed cases only
    for _, row in covid_df.iterrows():
        if row['Confirmed'] > 0:
            # Province/State is missing (NaN) for country-level rows
            province = row['Province/State'] if pd.notna(row['Province/State']) else 'N/A'
            tooltip = (
                f"Country/Region: {row['Country/Region']}<br>"
                f"Province/State: {province}<br>"
                f"Confirmed Cases: {row['Confirmed']}"
            )
            folium.CircleMarker(
                location=(row['Latitude'], row['Longitude']),
                radius=row['Confirmed'] / 1500,  # marker radius in pixels
                weight=2,
                color='red',
                fill_color='red',
                fill_opacity=0.5,
                tooltip=tooltip,
            ).add_to(world_map)
    
    # save map as html
    world_map.save(variable_name+'.html')
    
    return None
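
`create_map` sizes each marker linearly (`Confirmed / 1500`), which with the sample figures above gives Hubei a radius of about 45 pixels and France about 1.5. One possible alternative, sketched here under stated assumptions, is square-root scaling, so that marker area rather than radius tracks the case count, with a floor to keep small outbreaks visible. The function `marker_radius` and its default `max_radius` and `min_radius` values are hypothetical choices, not the notebook's method.

```python
import math


def marker_radius(confirmed, max_confirmed, max_radius=40.0, min_radius=2.0):
    """Radius in pixels such that marker area is proportional to case count.

    Hypothetical helper: the defaults are arbitrary illustrative values.
    """
    if confirmed <= 0 or max_confirmed <= 0:
        return 0.0
    scaled = max_radius * math.sqrt(confirmed / max_confirmed)
    # Enforce a floor so that very small counts remain visible on the map
    return max(min_radius, scaled)


# Confirmed counts taken from the sample rows shown earlier
largest = 67773
for count in (67773, 12462, 2281):
    print(count, round(marker_radius(count, largest), 1))
```

To try it in the function above, the `radius` argument could become `marker_radius(row['Confirmed'], covid_df['Confirmed'].max())`.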

In [None]:
for x in factor_df.columns[6:]:
    create_map(factor_df, covid_df, x, world_area)

% Age (65+)
Air Pollution
CCKP Projected Median Temperatures for Jan 2020 (2012)
Percent Asthmatic (2017)
Pecent Smoker (2016)
Percent High Blood Pressure (2015)
Prevalence Diabetes (2019)
Preparedness (2017)
Global Health Security Index (2019)
Percent of Labor Force in Industry (2019)
Percent of Labor Force in Industry (Female) (2019)
Percent of Labor Force in Industry (Male) (2019)
Total Trade (Exp + Imp) (2017)
Total Trade w China (2017)
Total Trade w Italy (2017)
Total Trade w S Korea (2017)
Total Trade w USA(2017)
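
Because each map is saved as `variable_name + '.html'`, the column names above become file names containing spaces, parentheses, and characters such as `%` and `+`. A small sketch of one way to derive tidier file names follows. The helper `slugify` is an assumption introduced for illustration, and distinct column names could in principle collapse to the same slug, so it is worth checking for collisions before relying on it.

```python
import re


def slugify(column_name):
    """Lowercase a column name and collapse non-alphanumeric runs to '_'.

    Hypothetical helper for building file names; not part of the notebook.
    """
    slug = re.sub(r'[^a-z0-9]+', '_', column_name.lower())
    return slug.strip('_')


# Column names taken from the output listed above
for name in ['% Age (65+)', 'Air Pollution', 'Total Trade w USA(2017)']:
    print(slugify(name) + '.html')
```

Inside `create_map`, the save step could then read `world_map.save(slugify(variable_name) + '.html')`.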
