# US_Measles_Risk

### Task 1. Raw measles risk
Calculate raw risk for each county with $$ r_{ij}^{t} = C_{i}^{t} \times V_{ij}^{t} \times NME_{j}^{t} \times P_{j}^{t} $$
where <br/>
$i$ is the origin country, <br/>
$j$ is the US county, <br/>
$t$ is the year, <br/>
$r_{ij}^{t}$ is the measles risk from country $i$ to county $j$ in year $t$, <br/>
$C_{i}^{t}$ is the case incidence (reported cases divided by population) in country $i$ in year $t$, <br/>
$V_{ij}^{t}$ is the travel volume (million) from country $i$ to county $j$ in year $t$, <br/>
$NME_{j}^{t}$ is the NME rate in county $j$ in year $t$, <br/>
$P_{j}^{t}$ is the county $j$ population in year $t$. <br/>
$$ r_{j}^{t} = \sum_{i} r_{ij}^{t} = (\sum_{i} C_{i}^{t} \times V_{ij}^{t}) \times NME_{j}^{t} \times P_{j}^{t}$$
where <br/>
$r_{j}^{t}$ is the measles risk of county $j$ in year $t$. <br/>
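
As a quick numeric illustration of the two formulas, the sketch below computes $r_{ij}^{t}$ for two origin countries and one county, then sums the results into $r_{j}^{t}$. All numbers, the country codes, and the helper name `route_risk` are made up for illustration; incidence is assumed to be cases divided by population.

```python
# Toy illustration of r_ij = C_i * V_ij * NME_j * P_j (all values hypothetical).

def route_risk(cases, origin_pop, travel_volume, nme_rate, county_pop):
    """Risk contributed by one origin country to one county."""
    incidence = cases / origin_pop  # C_i, assumed to be cases per person
    return incidence * travel_volume * nme_rate * county_pop

# One county j with an NME rate of 0.01 and a population of 50,000.
nme_j, pop_j = 0.01, 50_000

# Two origin countries: (cases, population, travel volume to county j).
origins = {
    "AAA": (100, 1_000_000, 2_000),
    "BBB": (0, 5_000_000, 10_000),
}

# r_ij for every origin, then r_j as the sum over origins.
r_ij = {iso: route_risk(c, p, v, nme_j, pop_j) for iso, (c, p, v) in origins.items()}
r_j = sum(r_ij.values())

print(r_ij)
print(r_j)
```

A country with zero reported cases contributes nothing to the county total, no matter how large its travel volume is.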

### Task 2. Rearrange travel volume by population
For counties with no international travel, update $V_{ij}^{t}$. <br/>
Task 2.1: calculate the average of raw risk in neighboring counties <br/>
Task 2.2: distribute travel volume in proportion to population <br/>
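
The population-proportional step can be sketched as follows, assuming a hub county's volume is shared among the hub itself and its neighbors. The county labels, populations, and the helper name `split_by_population` are hypothetical.

```python
# Split a hub county's travel volume across itself and its neighbors by population.
# County labels, populations, and the volume are hypothetical.

def split_by_population(volume, populations):
    """Return each county's share of `volume`, weighted by population."""
    total = sum(populations.values())
    return {fips: volume * pop / total for fips, pop in populations.items()}

group = {"hub": 60_000, "nbr_1": 30_000, "nbr_2": 10_000}
shares = split_by_population(1_000, group)

print(shares)
```

The shares always add up to the original volume, so no travel is created or lost by the redistribution.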

### Task 3. Rearrange travel volume by Voronoi diagram

####  Goal:
Update $V_{ij}^{t}$ for all US counties.
#### Preparation: 
Create Thiessen polygons for all known __675__ airports in the US (in `Voronoi.mxd`).
* Make sure the airports layer contains the IATA code and coordinates.
* `Create Thiessen Polygons` for `US_airports_675` to create `US_airports_Thiessen` (Output Fields: ALL).
* `Dissolve` `us_states` to create `US_Boundary` as the mask.
* `Clip` `US_airports_Thiessen` with `US_Boundary` to make sure all Thiessen polygons are within the US. Output: `US_airports_Thiessen_Clip`.
* Calculate geometry (`ThiessenAreaKM2`) for each Thiessen polygon.
* `Intersect` `US_airports_Thiessen_Clip` and `us_states` to get `Thiessen_County_Intersect`.
* Calculate geometry (`IntersectAreaKM2`) for each polygon in `Thiessen_County_Intersect`.
* Calculate the percentage of each intersected polygon relative to its airport Thiessen polygon (`ThiessenAreaPct = [IntersectAreaKM2] * 100 / [ThiessenAreaKM2]`).
* Export `Thiessen_County_Intersect` as `Thiessen_County_Intersect_Pct.csv`

#### Method:
Diffuse the incoming international travel volume ($V_{ij}^{t}$) of each airport to the counties that its Thiessen polygon intersects, in proportion to `ThiessenAreaPct`.
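
A minimal sketch of the area-weighted diffusion, under the assumption that the intersect pieces of each airport fully cover its clipped Thiessen polygon. The airport codes, county labels, areas, and volumes are hypothetical.

```python
# Area-weighted diffusion of airport travel volume to counties (hypothetical data).
from collections import defaultdict

# Intersect areas (km^2) between each airport's Thiessen polygon and the counties.
intersect_area = {
    ("AP1", "c1"): 300.0,
    ("AP1", "c2"): 100.0,
    ("AP2", "c2"): 50.0,
    ("AP2", "c3"): 150.0,
}
airport_volume = {"AP1": 800.0, "AP2": 400.0}

# Total polygon area per airport (plays the role of ThiessenAreaKM2).
polygon_area = defaultdict(float)
for (airport, _), area in intersect_area.items():
    polygon_area[airport] += area

# Each county receives volume * (intersect area / polygon area), summed over airports.
county_volume = defaultdict(float)
for (airport, county), area in intersect_area.items():
    county_volume[county] += airport_volume[airport] * area / polygon_area[airport]

print(dict(county_volume))
```

County `c2` lies in both polygons, so it collects volume from both airports.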

## Task 1: Calculate measles risk at the county level

In [206]:
# environment setting
import pandas as pd
import datetime
t = datetime.datetime.now()
year = 2016
year_pop = 'pop2016'
year_iata = 2016 # IATA data cover 2007 to 2017; use 2017 IATA data for 2018 and 2019
out_folder = r'C:\Users\Ensheng\Desktop\mapping\scripts\\'
#pd.set_option("display.max_rows", 999)

#### Import world population

In [207]:
# ref: https://population.un.org/wpp/Download/Standard/Population/
in_table = out_folder + r'world_pop.xlsx'
df_pop = pd.read_excel(in_table)
print(len(df_pop))
df_pop.head(5)

235


Unnamed: 0,name,Country code,pop1950,pop1951,pop1952,pop1953,pop1954,pop1955,pop1956,pop1957,...,pop2011,pop2012,pop2013,pop2014,pop2015,pop2016,pop2017,pop2018,pop2019,pop2020
0,Afghanistan,4,7752.118,7840.156,7935.997,8039.694,8151.317,8270.991,8398.875,8535.163,...,30117.413,31161.376,32269.589,33370.794,34413.603,35383.032,36296.113,37171.921,38041.754,38928.346
1,Albania,8,1263.174,1287.5,1316.093,1348.112,1382.898,1419.994,1459.12,1500.181,...,2928.592,2914.096,2903.79,2896.305,2890.513,2886.438,2884.169,2882.74,2880.917,2877.797
2,Algeria,12,8872.247,9023.269,9186.138,9364.371,9560.149,9774.283,10006.147,10253.778,...,36661.445,37383.895,38140.133,38923.692,39728.025,40551.392,41389.189,42228.408,43053.054,43851.044
3,American Samoa,16,18.94,19.293,19.542,19.695,19.753,19.754,19.709,19.667,...,55.759,55.667,55.713,55.791,55.812,55.741,55.62,55.465,55.312,55.191
4,Andorra,20,6.196,6.689,7.247,7.865,8.525,9.232,9.989,10.779,...,83.747,82.427,80.774,79.213,78.011,77.297,77.001,77.006,77.142,77.265


In [208]:
# ref: http://worldpopulationreview.com/country-codes/
# ref: https://www.iban.com/country-codes
# note: add BLM manually
in_table = r'C:\Users\Ensheng\Desktop\mapping\diffusion_model\country_code.csv'
df_code = pd.read_csv(in_table)
print(len(df_code))
df_code.head(5)

238


Unnamed: 0,name,alpha2,alpha3,num3
0,Afghanistan,AF,AFG,4
1,Albania,AL,ALB,8
2,Algeria,DZ,DZA,12
3,American Samoa,AS,ASM,16
4,Andorra,AD,AND,20


In [209]:
df_pop3 = pd.merge(df_pop, df_code, how='left', left_on='Country code',right_on='num3')
print("Info: " + str(len(df_pop3)) + " countries in UN dataset.")
print("Warning: " + str(len(df_pop3.loc[df_pop3['num3'].isnull()])) + " countries mismatched.")
df_pop3.head(3)

Info: 235 countries in UN dataset.


Unnamed: 0,name_x,Country code,pop1950,pop1951,pop1952,pop1953,pop1954,pop1955,pop1956,pop1957,...,pop2015,pop2016,pop2017,pop2018,pop2019,pop2020,name_y,alpha2,alpha3,num3
0,Afghanistan,4,7752.118,7840.156,7935.997,8039.694,8151.317,8270.991,8398.875,8535.163,...,34413.603,35383.032,36296.113,37171.921,38041.754,38928.346,Afghanistan,AF,AFG,4
1,Albania,8,1263.174,1287.5,1316.093,1348.112,1382.898,1419.994,1459.12,1500.181,...,2890.513,2886.438,2884.169,2882.74,2880.917,2877.797,Albania,AL,ALB,8
2,Algeria,12,8872.247,9023.269,9186.138,9364.371,9560.149,9774.283,10006.147,10253.778,...,39728.025,40551.392,41389.189,42228.408,43053.054,43851.044,Algeria,DZ,DZA,12
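
To see which countries failed to match rather than only how many, one option is the `indicator` flag of `pd.merge`. The sketch below uses small hypothetical stand-ins for `df_pop` and `df_code`, not the real tables.

```python
import pandas as pd

# Hypothetical stand-ins for df_pop and df_code.
pop = pd.DataFrame({"name": ["A", "B", "C"], "Country code": [4, 8, 999]})
code = pd.DataFrame({"alpha3": ["AAA", "BBB"], "num3": [4, 8]})

# indicator=True adds a "_merge" column that says where each row came from.
merged = pd.merge(pop, code, how="left", left_on="Country code",
                  right_on="num3", indicator=True)

# Rows found only in the left table have no matching country code.
unmatched = merged.loc[merged["_merge"] == "left_only", "name"].tolist()
print(unmatched)
```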


#### Import WHO data (2019 uses suspected-case data)

In [210]:
if year != 2019:
    print (year)
    # ref: https://www.who.int/immunization/monitoring_surveillance/burden/vpd/surveillance_type/active/measles_monthlydata/en/
    in_table = r'C:\Users\Ensheng\Desktop\mapping\diffusion_model\measlescasesbycountrybymonth.xls'
    df_who = pd.read_excel(in_table,sheet_name='WEB')
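    df_who = df_who.loc[df_who['Year'] == year].copy() # copy so that 'Total' can be added safely
    #print(len(df_who))
    #print(df_who.head(3))
    
    # sum the monthly case counts; non-numeric columns (e.g. ISO3, Country) are skipped
    col_list= list(df_who)
    col_list.remove('Year')
    df_who['Total'] = df_who[col_list].sum(axis=1, numeric_only=True)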
    print(len(df_who))
    df_outbreak_raw = df_who[['ISO3','Country','Total']]
    

if year == 2019:
    print (year)
    # ref: https://www.who.int/immunization/monitoring_surveillance/burden/vpd/surveillance_type/active/measles_monthlydata/en/
    in_table = r'C:\Users\Ensheng\Desktop\mapping\diffusion_model\WHO_2019_Suspected.xlsx'
    df_who = pd.read_excel(in_table)
    print(len(df_who))
    df_outbreak_raw = df_who.fillna(0)
    
df_outbreak_raw.head(3)

2016
194


Unnamed: 0,ISO3,Country,Total
5,AGO,Angola,51.0
14,BDI,Burundi,14.0
23,BEN,Benin,95.0
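
An alternative way to build the yearly total is to name the month columns explicitly, which avoids relying on which columns happen to be numeric. The column names and values below are hypothetical; the real WHO sheet may label its months differently.

```python
import pandas as pd

# Shortened, hypothetical month columns; the real sheet has twelve.
months = ["January", "February", "March"]
who = pd.DataFrame({
    "ISO3": ["AAA", "BBB"],
    "Country": ["Aland", "Bland"],
    "Year": [2016, 2016],
    "January": [1.0, None],
    "February": [2.0, 5.0],
    "March": [3.0, None],
})

# Missing months are skipped by default (skipna=True), so they count as zero.
who["Total"] = who[months].sum(axis=1)
print(who[["ISO3", "Total"]])
```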


In [211]:
df_outbreak = pd.merge(df_outbreak_raw, df_pop3, how='left', left_on='ISO3',right_on='alpha3')
print(len(df_outbreak))
df_outbreak = df_outbreak[['alpha3', 'Country', 'Total', year_pop]]
print(str(len(df_outbreak_raw) - df_outbreak.alpha3.notnull().sum()) + " row(s) have NaN as ISO 3 (alpha3).")
df_outbreak.sort_values(by='alpha3').head(5)

194
0 row(s) have NaN as ISO 3 (alpha3).


Unnamed: 0,alpha3,Country,Total,pop2016
82,AFG,Afghanistan,641.0,35383.032
0,AGO,Angola,51.0,28842.489
103,ALB,Albania,0.0,2886.438
104,AND,Andorra,0.0,77.297
83,ARE,United Arab Emirates,215.0,9360.98


#### Import $V_{ij}^{t}$

In [212]:
# IATA data
in_table = r'C:\Users\Ensheng\Desktop\mapping\IATA\flow_XY.csv'
df_iata = pd.read_csv(in_table)
df_iata = df_iata.loc[df_iata['year'] == year_iata] # slice for certain year
df_iata = df_iata[['FIPS', 'ISO', 'paxVolume']]
print(len(df_iata))
df_iata.head(5)

38937


Unnamed: 0,FIPS,ISO,paxVolume
293925,1045,DEU,730
293926,1045,CAN,416
293927,1045,GBR,356
293928,1045,KOR,310
293929,1045,MEX,302


#### Import $NME_{j}^{t}$ and $P_{j}^{t}$

In [213]:
in_table = r'C:\Users\Ensheng\Desktop\mapping\diffusion_model\ModelInputOutputAll 4_23.csv'
df_nme = pd.read_csv(in_table)
print(len(df_nme))
df_nme.head(5)

3142


Unnamed: 0,County Name,State,FIPS,2015_NME,2016_NME,State_Avg_NME,Population,Static,Year2011,Year2012,Year2013,Year2014,Year2015,Year2016,Year2017,Year2018,Year2019
0,Autauga,Alabama,1001,,,0.006,55504,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,Baldwin,Alabama,1003,,,0.006,212628,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,Barbour,Alabama,1005,,,0.006,25270,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,Bibb,Alabama,1007,,,0.006,22668,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,Blount,Alabama,1009,,,0.006,58013,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [214]:
df_nme['County'] = df_nme['County Name'] + ', ' + df_nme['State']

In [215]:
df_nme.loc[df_nme["2016_NME"].notnull(), 'FIPS_NME'] = df_nme['2016_NME']
df_nme.loc[(df_nme["FIPS_NME"].isnull()) & (df_nme["2015_NME"].notnull()), 'FIPS_NME'] = df_nme['2015_NME']
df_nme.loc[(df_nme["FIPS_NME"].isnull()) & (df_nme["State_Avg_NME"].notnull()), 'FIPS_NME'] = df_nme['State_Avg_NME']
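
The same fallback order (2016 rate, then 2015 rate, then state average) can also be written as a `fillna` chain. This is an equivalent sketch on hypothetical values, not a change to the cell above.

```python
import pandas as pd

# Hypothetical NME rates for four counties with increasing amounts of missing data.
nme = pd.DataFrame({
    "2016_NME": [0.02, None, None, None],
    "2015_NME": [0.01, 0.03, None, None],
    "State_Avg_NME": [0.006, 0.006, 0.006, None],
})

# Take the 2016 value, fall back to 2015, then to the state average.
nme["FIPS_NME"] = (
    nme["2016_NME"].fillna(nme["2015_NME"]).fillna(nme["State_Avg_NME"])
)
print(nme["FIPS_NME"].tolist())
```

Counties with no value in any of the three columns stay `NaN`.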

In [216]:
df_nme = df_nme[['FIPS','County','2016_NME','2015_NME','State_Avg_NME','FIPS_NME','Population']]
print("No NME for the following counties:")
df_nme.loc[df_nme['FIPS_NME'].isnull()]

No NME for the following counties:


Unnamed: 0,FIPS,County,2016_NME,2015_NME,State_Avg_NME,FIPS_NME,Population
3119,56001,"Albany, Wyoming",,,,,38332
3120,56003,"Big Horn, Wyoming",,,,,11906
3121,56005,"Campbell, Wyoming",,,,,46242
3122,56007,"Carbon, Wyoming",,,,,15303
3123,56009,"Converse, Wyoming",,,,,13809
3124,56011,"Crook, Wyoming",,,,,7410
3125,56013,"Fremont, Wyoming",,,,,39803
3126,56015,"Goshen, Wyoming",,,,,13378
3127,56017,"Hot Springs, Wyoming",,,,,4696
3128,56019,"Johnson, Wyoming",,,,,8476


#### Calculate $r_{ij}^{t}$

In [217]:
df_temp = pd.merge(df_iata, df_outbreak, how='left', left_on='ISO',right_on='alpha3')
df_factors = pd.merge(df_temp, df_nme, how='left', left_on='FIPS',right_on='FIPS')
df_factors.head(5)

Unnamed: 0,FIPS,ISO,paxVolume,alpha3,Country,Total,pop2016,County,2016_NME,2015_NME,State_Avg_NME,FIPS_NME,Population
0,1045,DEU,730,DEU,Germany,328.0,82193.768,"Dale, Alabama",,,0.006,0.006,49226
1,1045,CAN,416,CAN,Canada,11.0,36382.944,"Dale, Alabama",,,0.006,0.006,49226
2,1045,GBR,356,GBR,United Kingdom of Great Britain and Northern I...,571.0,66297.944,"Dale, Alabama",,,0.006,0.006,49226
3,1045,KOR,310,KOR,Republic of Korea,18.0,50983.457,"Dale, Alabama",,,0.006,0.006,49226
4,1045,MEX,302,MEX,Mexico,0.0,123333.376,"Dale, Alabama",,,0.006,0.006,49226


In [218]:
# rename and reorder col.
df_factors.loc[:,('FIPS_Pop')] = df_factors['Population']
df_factors.loc[:,('ISO_Case')] = df_factors['Total']
df_factors.loc[:,('ISO_Pop')] = df_factors[year_pop]
df_factors = df_factors[['FIPS','County','FIPS_NME','FIPS_Pop','ISO','Country','ISO_Case','ISO_Pop','paxVolume']]
print(len(df_factors))
df_factors.head(5)

38937


Unnamed: 0,FIPS,County,FIPS_NME,FIPS_Pop,ISO,Country,ISO_Case,ISO_Pop,paxVolume
0,1045,"Dale, Alabama",0.006,49226,DEU,Germany,328.0,82193.768,730
1,1045,"Dale, Alabama",0.006,49226,CAN,Canada,11.0,36382.944,416
2,1045,"Dale, Alabama",0.006,49226,GBR,United Kingdom of Great Britain and Northern I...,571.0,66297.944,356
3,1045,"Dale, Alabama",0.006,49226,KOR,Republic of Korea,18.0,50983.457,310
4,1045,"Dale, Alabama",0.006,49226,MEX,Mexico,0.0,123333.376,302


In [219]:
# slice
df_factors = df_factors.loc[df_factors['ISO_Case'].notnull()]
print(len(df_factors))
df_factors = df_factors.loc[df_factors['paxVolume'].notnull()]
print(len(df_factors))

34244
34244
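
The filter above silently drops routes whose origin has no WHO case data. A small sketch of how one might list those origins and the travel volume they carry, using hypothetical stand-ins for `df_iata` and `df_outbreak`:

```python
import pandas as pd

# Hypothetical stand-ins for the travel and case tables.
routes = pd.DataFrame({"ISO": ["AAA", "BBB", "CCC", "CCC"],
                       "paxVolume": [10, 20, 30, 40]})
cases = pd.DataFrame({"alpha3": ["AAA", "BBB"], "Total": [5.0, 0.0]})

# Routes whose origin code is absent from the case table.
missing = routes.loc[~routes["ISO"].isin(cases["alpha3"])]

# Travel volume that leaves the analysis, per origin.
dropped_volume = missing.groupby("ISO")["paxVolume"].sum()
print(dropped_volume)
```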


#### Calculate $r_{j}^{t}$

In [220]:
df_factors['Route_Risk'] = (df_factors['ISO_Case'] / df_factors['ISO_Pop']) * df_factors['paxVolume'] * df_factors['FIPS_NME'] * df_factors['FIPS_Pop']

In [221]:
df_risk = df_factors.groupby(['FIPS','County'])['Route_Risk'].sum().reset_index()
df_risk.loc[:,('FIPS_RawRisk')] = df_risk['Route_Risk']
df_risk.head(5)

Unnamed: 0,FIPS,County,Route_Risk,FIPS_RawRisk
0,1045,"Dale, Alabama",12063.92,12063.92
1,1073,"Jefferson, Alabama",2229144.0,2229144.0
2,1089,"Madison, Alabama",779748.4,779748.4
3,1097,"Mobile, Alabama",758529.2,758529.2
4,1101,"Montgomery, Alabama",137204.8,137204.8


#### Normalize and list the Top 25

In [222]:
# import county seats
# ref: https://en.wikipedia.org/wiki/List_of_the_most_populous_counties_in_the_United_States
in_table = r'C:\Users\Ensheng\Desktop\mapping\diffusion_model\County_Seat.xlsx'
df_seat = pd.read_excel(in_table)
print(len(df_seat))
df_seat.head(3)

101


Unnamed: 0,County,City
0,"Los Angeles, California",Los Angeles
1,"Cook, Illinois",Chicago
2,"Harris, Texas",Houston


In [267]:
highest_risk = df_risk['FIPS_RawRisk'].max()
df_risk['Risk'] = df_risk['FIPS_RawRisk'] / highest_risk
df_risk['FIPS_Rank'] = df_risk['Risk'].rank(ascending=False)
df_risk = pd.merge(df_risk, df_seat, how='left', left_on='County',right_on='County')
df_risk['Year'] = year
df_risk = df_risk[['FIPS','County','City','FIPS_RawRisk','Risk','FIPS_Rank','Year']]
df_risk = df_risk.sort_values('Risk',ascending = False).reset_index()
df_risk.head(50)

Unnamed: 0,index,FIPS,County,City,FIPS_RawRisk,Risk,FIPS_Rank,Year
0,0,6037,"Los Angeles, California",Los Angeles,5209939000.0,1.0,1.0,2016
1,1,17031,"Cook, Illinois",Chicago,2679492000.0,0.514304,2.0,2016
2,2,4013,"Maricopa, Arizona",Phoenix,1272118000.0,0.244171,3.0,2016
3,3,32003,"Clark, Nevada",Las Vegas,698475700.0,0.134066,4.0,2016
4,4,12086,"Miami-Dade, Florida",Miami,610437000.0,0.117168,5.0,2016
5,5,36081,"Queens, New York","Queens, NYC",602710100.0,0.115685,6.0,2016
6,6,34025,"Monmouth, New Jersey",,545286600.0,0.104663,7.0,2016
7,7,17097,"Lake, Illinois",Waukegan,443080200.0,0.085045,8.0,2016
8,8,15003,"Honolulu, Hawaii",Honolulu,416201100.0,0.079886,9.0,2016
9,9,36047,"Kings, New York","Brooklyn, NYC",400841100.0,0.076938,10.0,2016
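
The normalization divides every county by the highest raw risk, so the top county always scores 1.0. The sketch below, on hypothetical values, also shows how `rank` treats ties: the default method assigns the mean rank, while `method="min"` is one alternative if whole-number ranks are preferred.

```python
import pandas as pd

# Hypothetical raw risks for four counties, two of them tied.
raw = pd.Series([400.0, 100.0, 100.0, 50.0], index=["c1", "c2", "c3", "c4"])

risk = raw / raw.max()                               # highest county becomes 1.0
rank_avg = risk.rank(ascending=False)                # ties share the mean rank
rank_min = risk.rank(ascending=False, method="min")  # ties share the best rank

print(pd.DataFrame({"Risk": risk, "rank_avg": rank_avg, "rank_min": rank_min}))
```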


In [224]:
result = df_risk
output_csv = out_folder + 'MeaslesRisk_US_' +  str(year) + '_raw_' + t.strftime('%m%d%y%H%M') + '.csv'
result.to_csv(output_csv, index=False, encoding='utf-8')

In [268]:
df_complete = pd.merge(df_factors, df_risk , how='left', left_on='FIPS',right_on='FIPS')
df_complete = df_complete.sort_values(by=['Risk','Route_Risk'], ascending=False)
df_complete['Route_Rank'] = df_complete.groupby('FIPS_Rank')['Route_Risk'].rank(ascending=False,method='dense')
df_complete = df_complete.rename(index=str, columns={"County_x": "County"})
df_complete = df_complete.drop(columns=['County_y'])
print(len(df_complete))
df_complete.head(10)

316566


Unnamed: 0,FIPS,County,FIPS_NME,FIPS_Pop,ISO,Country,ISO_Case,ISO_Pop,paxVolume,Route_Risk,index,City,FIPS_RawRisk,Risk,FIPS_Rank,Year,Route_Rank
208712,6037,"Los Angeles, California",0.006,10163507.0,MNG,Mongolia,28710.0,3056.364,3134.0,1795238000.0,0.0,Los Angeles,5209939000.0,1.0,1.0,2016.0,1.0
56292,6037,"Los Angeles, California",0.006,10163507.0,CHN,China,25593.0,1414049.351,757413.602243,835958200.0,0.0,Los Angeles,5209939000.0,1.0,1.0,2016.0,2.0
139131,6037,"Los Angeles, California",0.006,10163507.0,IND,India,70798.0,1324517.249,189526.143799,617770700.0,0.0,Los Angeles,5209939000.0,1.0,1.0,2016.0,3.0
109681,6037,"Los Angeles, California",0.006,10163507.0,GBR,United Kingdom of Great Britain and Northern I...,571.0,66297.944,506594.029023,266067000.0,0.0,Los Angeles,5209939000.0,1.0,1.0,2016.0,4.0
152178,6037,"Los Angeles, California",0.006,10163507.0,ITA,Italy,862.0,60663.06,227532.503298,197161100.0,0.0,Los Angeles,5209939000.0,1.0,1.0,2016.0,5.0
232066,6037,"Los Angeles, California",0.006,10163507.0,NZL,New Zealand,104.0,4659.265,128390.0,174760000.0,0.0,Los Angeles,5209939000.0,1.0,1.0,2016.0,6.0
243311,6037,"Los Angeles, California",0.006,10163507.0,PHL,Philippines,647.0,103663.816,320692.541557,122056500.0,0.0,Los Angeles,5209939000.0,1.0,1.0,2016.0,7.0
136884,6037,"Los Angeles, California",0.006,10163507.0,IDN,Indonesia,7204.0,261556.381,70987.0,119229000.0,0.0,Los Angeles,5209939000.0,1.0,1.0,2016.0,8.0
285014,6037,"Los Angeles, California",0.006,10163507.0,THA,Thailand,1009.0,68971.308,124150.602243,110755800.0,0.0,Los Angeles,5209939000.0,1.0,1.0,2016.0,9.0
253317,6037,"Los Angeles, California",0.006,10163507.0,ROU,Romania,2432.0,19796.285,14704.0719,110157100.0,0.0,Los Angeles,5209939000.0,1.0,1.0,2016.0,10.0


In [226]:
result = df_complete
output_csv = out_folder + 'MeaslesRisk_US_' +  str(year) + '_raw_route_' + t.strftime('%m%d%y%H%M') + '.csv'
result.to_csv(output_csv, index=False, encoding='utf-8')

## Task 2: Travel volume proportional to the population (or pop density)

#### Import neighboring relationship table

In [227]:
in_table = r'C:\Users\Ensheng\Desktop\mapping\diffusion_model\nbr.csv'
df_nbr = pd.read_csv(in_table)
df_nbr = df_nbr[['src_FIPS', 'nbr_FIPS']]
print(len(df_nbr))
df_nbr.head(5)

18680


Unnamed: 0,src_FIPS,nbr_FIPS
0,1001.0,1021.0
1,1001.0,1047.0
2,1001.0,1051.0
3,1001.0,1085.0
4,1001.0,1101.0


In [228]:
# find all counties with IATA data
df_iataCounty = df_iata.groupby(['FIPS'])['paxVolume'].sum().reset_index()
df_iataCounty = df_iataCounty.loc[df_iataCounty['paxVolume'].notnull()]
print(str(len(df_nme)) + " counties in the US.")
print(str(len(df_iataCounty)) + " counties have IATA travel data.")

3142 counties in the US.
392 counties have IATA travel data.


In [229]:
# subset of df_nbr to show only src_FIPS with IATA data
df_temp = pd.merge(df_nbr, df_iataCounty, how='left', left_on='src_FIPS',right_on='FIPS')
df_hub = df_temp.loc[df_temp['paxVolume'].notnull()]
print(str(len(df_hub)) + " neighboring relationships remain.") # we will only work with these counties and their neighbors
print(str(df_hub.src_FIPS.nunique()) + " hub counties.")
df_hub.head(10)

2215 neighboring relationships remain.
388 hub counties.


Unnamed: 0,src_FIPS,nbr_FIPS,FIPS,paxVolume
131,1045.0,1005.0,1045.0,5264.0
132,1045.0,1031.0,1045.0,5264.0
133,1045.0,1061.0,1045.0,5264.0
134,1045.0,1067.0,1045.0,5264.0
135,1045.0,1069.0,1045.0,5264.0
136,1045.0,1109.0,1045.0,5264.0
214,1073.0,1007.0,1073.0,111446.0
215,1073.0,1009.0,1073.0,111446.0
216,1073.0,1115.0,1073.0,111446.0
217,1073.0,1117.0,1073.0,111446.0


In [230]:
print("The following (island) counties have IATA data but no neighboring counties: ")
print(set(df_iataCounty.FIPS.unique()) - set(df_hub.src_FIPS.unique()))

The following (island) counties have IATA data but no neighboring counties: 
{15001, 25019, 15003, 15007}


#### Update hub county list

In [231]:
# src_FIPS is the hub county, nbr_FIPS lists all neighboring counties along with itself, the hub county
# this will also clear out the island county issue
df_iataCounty["src_FIPS"] = df_iataCounty["FIPS"]
df_iataCounty["nbr_FIPS"] = df_iataCounty["FIPS"]
df_iataCounty = df_iataCounty[["src_FIPS","nbr_FIPS"]]
df_hub = df_hub[["src_FIPS","nbr_FIPS"]]
df_hub = pd.concat([df_hub, df_iataCounty]) # DataFrame.append was removed in pandas 2.0
print(str(len(df_hub)) + " neighboring relationships remain.")
print(str(df_hub.src_FIPS.nunique()) + " hub counties.")
df_hub = df_hub.sort_values(["src_FIPS","nbr_FIPS"]).reset_index()
df_hub.head(10)

2607 neighboring relationships remain.
392 hub counties.


Unnamed: 0,index,src_FIPS,nbr_FIPS
0,131,1045.0,1005.0
1,132,1045.0,1031.0
2,0,1045.0,1045.0
3,133,1045.0,1061.0
4,134,1045.0,1067.0
5,135,1045.0,1069.0
6,136,1045.0,1109.0
7,214,1073.0,1007.0
8,215,1073.0,1009.0
9,1,1073.0,1073.0


#### Merge county population

In [232]:
df_pop = pd.merge(df_hub, df_nme , how='left', left_on='nbr_FIPS',right_on='FIPS')
df_pop.head(5)

Unnamed: 0,index,src_FIPS,nbr_FIPS,FIPS,County,2016_NME,2015_NME,State_Avg_NME,FIPS_NME,Population
0,131,1045.0,1005.0,1005,"Barbour, Alabama",,,0.006,0.006,25270
1,132,1045.0,1031.0,1031,"Coffee, Alabama",,,0.006,0.006,51874
2,0,1045.0,1045.0,1045,"Dale, Alabama",,,0.006,0.006,49226
3,133,1045.0,1061.0,1061,"Geneva, Alabama",,,0.006,0.006,26421
4,134,1045.0,1067.0,1067,"Henry, Alabama",,,0.006,0.006,17147


#### Calculate population percentage

In [233]:
df_pop_tmp = df_pop.groupby(['src_FIPS', 'nbr_FIPS']).agg({'Population': 'sum'}).reset_index()
# share of each county in the total population of its hub group (hub + neighbors)
df_pop_tmp['POPPCT'] = 100 * df_pop_tmp['Population'] / df_pop_tmp.groupby('src_FIPS')['Population'].transform('sum')
df_poppct = df_pop_tmp[['src_FIPS', 'nbr_FIPS', 'POPPCT']]

In [234]:
print(len(df_poppct)) # should be the same as len(df_hub), the count of neighboring pairs + the count of hub counties
df_poppct.head(15)

2607


Unnamed: 0,src_FIPS,nbr_FIPS,POPPCT
0,1045.0,1005.0,8.216523
1,1045.0,1031.0,16.866796
2,1045.0,1045.0,16.005801
3,1045.0,1061.0,8.59077
4,1045.0,1067.0,5.575335
5,1045.0,1069.0,33.928031
6,1045.0,1109.0,10.816743
7,1073.0,1007.0,1.725704
8,1073.0,1009.0,4.416502
9,1073.0,1073.0,50.184348


#### Calculate travel volume for each route

In [235]:
df_iata.head(5)

Unnamed: 0,FIPS,ISO,paxVolume
293925,1045,DEU,730
293926,1045,CAN,416
293927,1045,GBR,356
293928,1045,KOR,310
293929,1045,MEX,302


In [236]:
df_route = pd.merge(df_iata, df_poppct, how='left', left_on='FIPS',right_on='src_FIPS')
print(len(df_route))
df_route.head(15)

263003


Unnamed: 0,FIPS,ISO,paxVolume,src_FIPS,nbr_FIPS,POPPCT
0,1045,DEU,730,1045.0,1005.0,8.216523
1,1045,DEU,730,1045.0,1031.0,16.866796
2,1045,DEU,730,1045.0,1045.0,16.005801
3,1045,DEU,730,1045.0,1061.0,8.59077
4,1045,DEU,730,1045.0,1067.0,5.575335
5,1045,DEU,730,1045.0,1069.0,33.928031
6,1045,DEU,730,1045.0,1109.0,10.816743
7,1045,CAN,416,1045.0,1005.0,8.216523
8,1045,CAN,416,1045.0,1031.0,16.866796
9,1045,CAN,416,1045.0,1045.0,16.005801


In [237]:
df_route["IncomingTravel"] = df_route["paxVolume"] * df_route["POPPCT"] / 100
df_route.head(15)

Unnamed: 0,FIPS,ISO,paxVolume,src_FIPS,nbr_FIPS,POPPCT,IncomingTravel
0,1045,DEU,730,1045.0,1005.0,8.216523,59.980621
1,1045,DEU,730,1045.0,1031.0,16.866796,123.127611
2,1045,DEU,730,1045.0,1045.0,16.005801,116.842345
3,1045,DEU,730,1045.0,1061.0,8.59077,62.712623
4,1045,DEU,730,1045.0,1067.0,5.575335,40.699949
5,1045,DEU,730,1045.0,1069.0,33.928031,247.67463
6,1045,DEU,730,1045.0,1109.0,10.816743,78.962221
7,1045,CAN,416,1045.0,1005.0,8.216523,34.180738
8,1045,CAN,416,1045.0,1031.0,16.866796,70.165872
9,1045,CAN,416,1045.0,1045.0,16.005801,66.584131


In [238]:
df_iata_new = df_route.groupby(['nbr_FIPS','ISO'])['IncomingTravel'].sum().reset_index()
print(len(df_iata_new))
df_iata_new.head(5)

212126


Unnamed: 0,nbr_FIPS,ISO,IncomingTravel
0,1001.0,ABW,12.824599
1,1001.0,AFG,2.712896
2,1001.0,AGO,0.616567
3,1001.0,ARE,23.059615
4,1001.0,ARG,8.878568


In [239]:
# update df_iata with travel volume for more counties
df_iata_new["FIPS"] = df_iata_new["nbr_FIPS"]
df_iata_new["paxVolume"] = df_iata_new["IncomingTravel"]
df_iata = df_iata_new[["FIPS","ISO","paxVolume"]]
df_iata.head(5)

Unnamed: 0,FIPS,ISO,paxVolume
0,1001.0,ABW,12.824599
1,1001.0,AFG,2.712896
2,1001.0,AGO,0.616567
3,1001.0,ARE,23.059615
4,1001.0,ARG,8.878568
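
Because the population shares of each hub sum to 100%, the redistribution should preserve the total travel volume per origin country. A sketch of that sanity check, on hypothetical tables that mimic `df_iata` and `df_poppct`:

```python
import pandas as pd

# Hypothetical original volumes per (hub, origin) and shares per (hub, neighbor).
orig = pd.DataFrame({"FIPS": [1, 1, 2], "ISO": ["AAA", "BBB", "AAA"],
                     "paxVolume": [100.0, 50.0, 200.0]})
share = pd.DataFrame({"src_FIPS": [1, 1, 2, 2], "nbr_FIPS": [1, 3, 2, 3],
                      "POPPCT": [75.0, 25.0, 40.0, 60.0]})

# Spread each route over the hub group, then re-aggregate by receiving county.
route = orig.merge(share, left_on="FIPS", right_on="src_FIPS")
route["IncomingTravel"] = route["paxVolume"] * route["POPPCT"] / 100
new = route.groupby(["nbr_FIPS", "ISO"])["IncomingTravel"].sum().reset_index()

# Totals per origin before and after should agree up to rounding.
before = orig.groupby("ISO")["paxVolume"].sum()
after = new.groupby("ISO")["IncomingTravel"].sum()
print((before - after).abs().max())
```

County 3 borders both hubs here, so it receives a share from each of them.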


#### Calculate risk (same as Task 1)

#### Calculate $r_{ij}^{t}$

In [240]:
df_temp = pd.merge(df_iata, df_outbreak, how='left', left_on='ISO',right_on='alpha3')
df_factors = pd.merge(df_temp, df_nme, how='left', left_on='FIPS',right_on='FIPS')
df_factors.head(5)

Unnamed: 0,FIPS,ISO,paxVolume,alpha3,Country,Total,pop2016,County,2016_NME,2015_NME,State_Avg_NME,FIPS_NME,Population
0,1001.0,ABW,12.824599,,,,,"Autauga, Alabama",,,0.006,0.006,55504
1,1001.0,AFG,2.712896,AFG,Afghanistan,641.0,35383.032,"Autauga, Alabama",,,0.006,0.006,55504
2,1001.0,AGO,0.616567,AGO,Angola,51.0,28842.489,"Autauga, Alabama",,,0.006,0.006,55504
3,1001.0,ARE,23.059615,ARE,United Arab Emirates,215.0,9360.98,"Autauga, Alabama",,,0.006,0.006,55504
4,1001.0,ARG,8.878568,ARG,Argentina,0.0,43508.46,"Autauga, Alabama",,,0.006,0.006,55504


In [241]:
# rename and reorder col.
df_factors.loc[:,('FIPS_Pop')] = df_factors['Population']
df_factors.loc[:,('ISO_Case')] = df_factors['Total']
df_factors.loc[:,('ISO_Pop')] = df_factors[year_pop]
df_factors = df_factors[['FIPS','County','FIPS_NME','FIPS_Pop','ISO','Country','ISO_Case','ISO_Pop','paxVolume']]
print(len(df_factors))
df_factors.head(5)

212126


Unnamed: 0,FIPS,County,FIPS_NME,FIPS_Pop,ISO,Country,ISO_Case,ISO_Pop,paxVolume
0,1001.0,"Autauga, Alabama",0.006,55504,ABW,,,,12.824599
1,1001.0,"Autauga, Alabama",0.006,55504,AFG,Afghanistan,641.0,35383.032,2.712896
2,1001.0,"Autauga, Alabama",0.006,55504,AGO,Angola,51.0,28842.489,0.616567
3,1001.0,"Autauga, Alabama",0.006,55504,ARE,United Arab Emirates,215.0,9360.98,23.059615
4,1001.0,"Autauga, Alabama",0.006,55504,ARG,Argentina,0.0,43508.46,8.878568


In [242]:
# slice
df_factors = df_factors.loc[df_factors['ISO_Case'].notnull()]
print(len(df_factors))
df_factors = df_factors.loc[df_factors['paxVolume'].notnull()]
print(len(df_factors))

186650
186650


#### Calculate $r_{j}^{t}$

In [243]:
df_factors['Route_Risk'] = (df_factors['ISO_Case'] / df_factors['ISO_Pop']) * df_factors['paxVolume'] * df_factors['FIPS_NME'] * df_factors['FIPS_Pop']

In [244]:
df_risk = df_factors.groupby(['FIPS','County'])['Route_Risk'].sum().reset_index()
df_risk.loc[:,('FIPS_RawRisk')] = df_risk['Route_Risk']
df_risk.head(5)

Unnamed: 0,FIPS,County,Route_Risk,FIPS_RawRisk
0,1001.0,"Autauga, Alabama",4143.392546,4143.392546
1,1003.0,"Baldwin, Alabama",253087.364211,253087.364211
2,1005.0,"Barbour, Alabama",508.847043,508.847043
3,1007.0,"Bibb, Alabama",1322.824775,1322.824775
4,1009.0,"Blount, Alabama",8664.161944,8664.161944


#### Normalize and list the Top 25

In [245]:
highest_risk = df_risk['FIPS_RawRisk'].max()
df_risk['Risk'] = df_risk['FIPS_RawRisk'] / highest_risk
df_risk['FIPS_Rank'] = df_risk['Risk'].rank(ascending=False)
df_risk = pd.merge(df_risk, df_seat, how='left', left_on='County',right_on='County')
df_risk['Year'] = year
df_risk = df_risk[['FIPS','County','City','FIPS_RawRisk','Risk','FIPS_Rank','Year']]
df_risk = df_risk.sort_values('Risk',ascending = False).reset_index()
df_risk.head(50)

Unnamed: 0,index,FIPS,County,City,FIPS_RawRisk,Risk,FIPS_Rank
0,360,17031.0,"Cook, Illinois",Chicago,8665863000.0,1.0,1.0
1,118,6037.0,"Los Angeles, California",Los Angeles,3084786000.0,0.35597,2.0
2,71,4013.0,"Maricopa, Arizona",Phoenix,1073199000.0,0.123842,3.0
3,1678,53033.0,"King, Washington",Seattle,948078500.0,0.109404,4.0
4,1492,48201.0,"Harris, Texas",Houston,753025500.0,0.086896,5.0
5,1040,36047.0,"Kings, New York","Brooklyn, NYC",717787300.0,0.082829,6.0
6,1054,36081.0,"Queens, New York","Queens, NYC",569126600.0,0.065675,7.0
7,247,12086.0,"Miami-Dade, Florida",Miami,554265000.0,0.06396,8.0
8,957,32003.0,"Clark, Nevada",Las Vegas,510535100.0,0.058913,9.0
9,312,15003.0,"Honolulu, Hawaii",Honolulu,416223100.0,0.04803,10.0


In [246]:
result = df_risk
output_csv = out_folder + 'MeaslesRisk_US_' +  str(year) + '_pop_' + t.strftime('%m%d%y%H%M') + '.csv'
result.to_csv(output_csv, index=False, encoding='utf-8')

In [247]:
df_complete = pd.merge(df_factors, df_risk , how='left', left_on='FIPS',right_on='FIPS')
df_complete = df_complete.sort_values(by=['Risk','Route_Risk'], ascending=False)
df_complete['Route_Rank'] = df_complete.groupby('FIPS_Rank')['Route_Risk'].rank(ascending=False,method='dense')
df_complete = df_complete.rename(index=str, columns={"County_x": "County"})
df_complete = df_complete.drop(columns=['County_y'])
print(len(df_complete))
df_complete.head(10)

186650


Unnamed: 0,FIPS,County,FIPS_NME,FIPS_Pop,ISO,Country,ISO_Case,ISO_Pop,paxVolume,Route_Risk,index,City,FIPS_RawRisk,Risk,FIPS_Rank,Route_Rank
41368,17031.0,"Cook, Illinois",0.051,5211263,MNG,Mongolia,28710.0,3056.364,1241.840248,3100324000.0,360,Chicago,8665863000.0,1.0,1.0,1.0
41331,17031.0,"Cook, Illinois",0.051,5211263,IND,India,70798.0,1324517.249,153636.737594,2182587000.0,360,Chicago,8665863000.0,1.0,1.0,2.0
41287,17031.0,"Cook, Illinois",0.051,5211263,CHN,China,25593.0,1414049.351,139656.190592,671784500.0,360,Chicago,8665863000.0,1.0,1.0,3.0
41316,17031.0,"Cook, Illinois",0.051,5211263,GBR,United Kingdom of Great Britain and Northern I...,571.0,66297.944,164072.192132,375564200.0,360,Chicago,8665863000.0,1.0,1.0,4.0
41337,17031.0,"Cook, Illinois",0.051,5211263,ITA,Italy,862.0,60663.06,90185.342103,340590100.0,360,Chicago,8665863000.0,1.0,1.0,5.0
41391,17031.0,"Cook, Illinois",0.051,5211263,ROU,Romania,2432.0,19796.285,9514.506281,310655700.0,360,Chicago,8665863000.0,1.0,1.0,6.0
41376,17031.0,"Cook, Illinois",0.051,5211263,NGA,Nigeria,16438.0,185960.241,10475.419109,246101000.0,360,Chicago,8665863000.0,1.0,1.0,7.0
41260,17031.0,"Cook, Illinois",0.051,5211263,ARE,United Arab Emirates,215.0,9360.98,35002.498645,213663000.0,360,Chicago,8665863000.0,1.0,1.0,8.0
41332,17031.0,"Cook, Illinois",0.051,5211263,IRL,Ireland,43.0,4695.779,61821.663826,150457800.0,360,Chicago,8665863000.0,1.0,1.0,9.0
41299,17031.0,"Cook, Illinois",0.051,5211263,DEU,Germany,328.0,82193.768,99916.714949,105970800.0,360,Chicago,8665863000.0,1.0,1.0,10.0


In [248]:
result = df_complete
output_csv = out_folder + 'MeaslesRisk_US_' +  str(year) + '_pop_route_' + t.strftime('%m%d%y%H%M') + '.csv'
result.to_csv(output_csv, index=False, encoding='utf-8')

## Task 3: Travel volume proportional to Voronoi diagram

In [249]:
# environment setting
v_folder = r'C:\Users\Ensheng\Desktop\mapping\Voronoi\\'

#### Import original $V_{ij}^{t}$

In [250]:
# IATA data
in_table = r'C:\Users\Ensheng\Desktop\mapping\IATA\flow_XY.csv'
# Note: CSL and SBP are the same airport. CSL -> SBP (Airport count 676 -> 675)
df_iata = pd.read_csv(in_table)
df_iata = df_iata.loc[df_iata['year'] == year_iata] # slice for certain year
# note: FIPS is the county where the airport (Code) is located. Each airport has exactly one associated county (FIPS).
df_iata = df_iata[['ISO', 'Code', 'FIPS', 'paxVolume']]
print(len(df_iata))
df_iata.head(5)

38937


Unnamed: 0,ISO,Code,FIPS,paxVolume
293925,DEU,DHN,1045,730
293926,CAN,DHN,1045,416
293927,GBR,DHN,1045,356
293928,KOR,DHN,1045,310
293929,MEX,DHN,1045,302


In [251]:
print("Warning: " + str(len(df_iata.loc[df_iata['Code'].isnull()])) + " airport(s) missing info.")



#### Update incoming travel volume data

In [252]:
# Thiessen data
in_table = v_folder + 'Thiessen_County_Intersect_Pct.csv'
df_tpct = pd.read_csv(in_table)
# note: FIPS_1 lists every county that intersects an airport's Thiessen polygon. Each airport (Code) has at least one associated county (FIPS_1).
# The sum of ThiessenAreaPct for the same airport should be 100%.
df_tpct = df_tpct[['Code', 'FIPS_1', 'ThiessenAreaPct']]
print(len(df_tpct))
df_tpct.sort_values(by='Code').head(15)

7137


Unnamed: 0,Code,FIPS_1,ThiessenAreaPct
6071,ABE,42095,14.27621
6070,ABE,42089,11.460134
6069,ABE,42011,14.898813
6068,ABE,42077,13.235343
6067,ABE,42017,8.398984
6066,ABE,42091,5.618685
6065,ABE,42103,0.115441
6064,ABE,34041,8.097077
6063,ABE,42107,10.020279
6061,ABE,42025,10.374824
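
The expectation that `ThiessenAreaPct` sums to 100% per airport can be checked directly. The sketch below uses hypothetical rows in the shape of `df_tpct`; the 0.5 tolerance is an arbitrary choice.

```python
import pandas as pd

# Hypothetical rows in the shape of df_tpct; AP2 is deliberately incomplete.
tpct = pd.DataFrame({
    "Code": ["AP1", "AP1", "AP2", "AP2"],
    "FIPS_1": [1, 2, 2, 3],
    "ThiessenAreaPct": [75.0, 25.0, 60.0, 30.0],
})

# Total percentage per airport, and the airports that deviate from 100.
totals = tpct.groupby("Code")["ThiessenAreaPct"].sum()
off = totals[(totals - 100).abs() > 0.5]  # 0.5 is an arbitrary tolerance
print(off)
```

Airports flagged this way would lose part of their travel volume in the diffusion step, for example when part of the polygon falls outside every county.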


In [253]:
# diffuse the travel volume to each county (make sure there are no null values after the left join)
df_temp = pd.merge(df_iata, df_tpct, how='left', on='Code')
df_temp.loc[df_temp['FIPS_1'].isnull()]

Unnamed: 0,ISO,Code,FIPS,paxVolume,FIPS_1,ThiessenAreaPct


In [254]:
df_temp['travelVolume'] = df_temp['paxVolume'] * df_temp['ThiessenAreaPct'] / 100
print(len(df_temp))
df_temp.sort_values(by=['ISO','Code']).head(15)

492700


Unnamed: 0,ISO,Code,FIPS,paxVolume,FIPS_1,ThiessenAreaPct,travelVolume
330853,ABW,ABE,42077,221,42025,10.374824,22.928361
330854,ABW,ABE,42077,221,34019,3.324779,7.347761
330855,ABW,ABE,42077,221,42107,10.020279,22.144817
330856,ABW,ABE,42077,221,34041,8.097077,17.89454
330857,ABW,ABE,42077,221,42103,0.115441,0.255125
330858,ABW,ABE,42077,221,42091,5.618685,12.417295
330859,ABW,ABE,42077,221,42017,8.398984,18.561754
330860,ABW,ABE,42077,221,42077,13.235343,29.250108
330861,ABW,ABE,42077,221,42011,14.898813,32.926377
330862,ABW,ABE,42077,221,42089,11.460134,25.326897


In [255]:
df = df_temp.groupby(['ISO','FIPS_1'])['travelVolume'].sum().reset_index()
# update df_iata with travel volume for more counties
df["FIPS"] = df["FIPS_1"]
df["paxVolume"] = df["travelVolume"]
df = df[["FIPS","ISO","paxVolume"]]
df.head(5)

Unnamed: 0,FIPS,ISO,paxVolume
0,1001,ABW,7.64918
1,1003,ABW,16.051309
2,1005,ABW,0.71803
3,1007,ABW,70.158321
4,1009,ABW,66.91603


In [256]:
df_iata = df
print(len(df_iata))
df_iata.head(5)

359940


Unnamed: 0,FIPS,ISO,paxVolume
0,1001,ABW,7.64918
1,1003,ABW,16.051309
2,1005,ABW,0.71803
3,1007,ABW,70.158321
4,1009,ABW,66.91603


#### Calculate risk (same as Task 1)

#### Calculate $r_{ij}^{t}$

In [257]:
df_temp = pd.merge(df_iata, df_outbreak, how='left', left_on='ISO',right_on='alpha3')
df_factors = pd.merge(df_temp, df_nme, how='left', left_on='FIPS',right_on='FIPS')
df_factors.head(5)

Unnamed: 0,FIPS,ISO,paxVolume,alpha3,Country,Total,pop2016,County,2016_NME,2015_NME,State_Avg_NME,FIPS_NME,Population
0,1001,ABW,7.64918,,,,,"Autauga, Alabama",,,0.006,0.006,55504.0
1,1003,ABW,16.051309,,,,,"Baldwin, Alabama",,,0.006,0.006,212628.0
2,1005,ABW,0.71803,,,,,"Barbour, Alabama",,,0.006,0.006,25270.0
3,1007,ABW,70.158321,,,,,"Bibb, Alabama",,,0.006,0.006,22668.0
4,1009,ABW,66.91603,,,,,"Blount, Alabama",,,0.006,0.006,58013.0


In [258]:
# Rename and reorder the columns
df_factors['FIPS_Pop'] = df_factors['Population']
df_factors['ISO_Case'] = df_factors['Total']
df_factors['ISO_Pop'] = df_factors[year_pop]
df_factors = df_factors[['FIPS','County','FIPS_NME','FIPS_Pop','ISO','Country','ISO_Case','ISO_Pop','paxVolume']]
print(len(df_factors))
df_factors.head(5)

359940


Unnamed: 0,FIPS,County,FIPS_NME,FIPS_Pop,ISO,Country,ISO_Case,ISO_Pop,paxVolume
0,1001,"Autauga, Alabama",0.006,55504.0,ABW,,,,7.64918
1,1003,"Baldwin, Alabama",0.006,212628.0,ABW,,,,16.051309
2,1005,"Barbour, Alabama",0.006,25270.0,ABW,,,,0.71803
3,1007,"Bibb, Alabama",0.006,22668.0,ABW,,,,70.158321
4,1009,"Blount, Alabama",0.006,58013.0,ABW,,,,66.91603


In [259]:
# Keep only the rows with a known case count, then those with a known travel volume
df_factors = df_factors.loc[df_factors['ISO_Case'].notnull()]
print(len(df_factors))
df_factors = df_factors.loc[df_factors['paxVolume'].notnull()]
print(len(df_factors))

316566
316566


#### Calculate $r_{j}^{t}$

In [260]:
# r_ij = incidence (cases / population) x travel volume x NME rate x county population
df_factors['Route_Risk'] = (df_factors['ISO_Case'] / df_factors['ISO_Pop']) * df_factors['paxVolume'] * df_factors['FIPS_NME'] * df_factors['FIPS_Pop']

In [261]:
# r_j = sum of r_ij over all origin countries i
df_risk = df_factors.groupby(['FIPS','County'])['Route_Risk'].sum().reset_index()
df_risk['FIPS_RawRisk'] = df_risk['Route_Risk']
df_risk.head(5)

Unnamed: 0,FIPS,County,Route_Risk,FIPS_RawRisk
0,1001,"Autauga, Alabama",2471.309734,2471.309734
1,1003,"Baldwin, Alabama",61331.953286,61331.953286
2,1005,"Barbour, Alabama",1125.573866,1125.573866
3,1007,"Bibb, Alabama",7310.810865,7310.810865
4,1009,"Blount, Alabama",20190.185699,20190.185699
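
The two cells above follow the Task 1 formulas: `Route_Risk` is $r_{ij}^{t}$ and its per-county sum is $r_{j}^{t}$, with $C_{i}^{t}$ computed as cases divided by population. A minimal sketch with invented numbers (the ISO codes `AAA` and `BBB` and the county with FIPS 1 are hypothetical) shows the arithmetic on two routes into one county.

```python
import pandas as pd

# Invented inputs: two origin countries feeding one hypothetical county (FIPS 1)
toy = pd.DataFrame({
    'FIPS': [1, 1],
    'ISO': ['AAA', 'BBB'],
    'ISO_Case': [100.0, 10.0],     # reported cases in the origin country
    'ISO_Pop': [1000.0, 1000.0],   # origin population, same unit for every country
    'paxVolume': [200.0, 50.0],    # travel volume on the route
    'FIPS_NME': [0.01, 0.01],      # county NME rate
    'FIPS_Pop': [1000.0, 1000.0],  # county population
})

# r_ij: incidence x travel volume x NME rate x county population
toy['Route_Risk'] = (
    (toy['ISO_Case'] / toy['ISO_Pop'])
    * toy['paxVolume']
    * toy['FIPS_NME']
    * toy['FIPS_Pop']
)

# r_j: sum of r_ij over all origin countries i
county_risk = toy.groupby('FIPS')['Route_Risk'].sum()
print(toy['Route_Risk'].tolist(), county_risk[1])
```

Because the NME rate and the county population are constant within a county, summing the route risks gives the same result as summing incidence times travel volume first and multiplying by the two county factors afterwards.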


#### Normalize and list the Top 25

In [262]:
# Scale the raw risk so that the highest-risk county has Risk = 1
highest_risk = df_risk['FIPS_RawRisk'].max()
df_risk['Risk'] = df_risk['FIPS_RawRisk'] / highest_risk
# Rank 1 is the highest risk
df_risk['FIPS_Rank'] = df_risk['Risk'].rank(ascending=False)
# Attach the city name for each county
df_risk = pd.merge(df_risk, df_seat, how='left', left_on='County',right_on='County')
df_risk['Year'] = year
df_risk = df_risk[['FIPS','County','City','FIPS_RawRisk','Risk','FIPS_Rank','Year']]
df_risk = df_risk.sort_values('Risk',ascending = False).reset_index()
df_risk.head(50)

Unnamed: 0,index,FIPS,County,City,FIPS_RawRisk,Risk,FIPS_Rank
0,159,6037,"Los Angeles, California",Los Angeles,5209939000.0,1.0,1.0
1,506,17031,"Cook, Illinois",Chicago,2679492000.0,0.514304,2.0
2,91,4013,"Maricopa, Arizona",Phoenix,1272118000.0,0.244171,3.0
3,1498,32003,"Clark, Nevada",Las Vegas,698475700.0,0.134066,4.0
4,307,12086,"Miami-Dade, Florida",Miami,610437000.0,0.117168,5.0
5,1610,36081,"Queens, New York","Queens, NYC",602710100.0,0.115685,6.0
6,1535,34025,"Monmouth, New Jersey",,545286600.0,0.104663,7.0
7,534,17097,"Lake, Illinois",Waukegan,443080200.0,0.085045,8.0
8,448,15003,"Honolulu, Hawaii",Honolulu,416201100.0,0.079886,9.0
9,1593,36047,"Kings, New York","Brooklyn, NYC",400841100.0,0.076938,10.0
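
The normalization and ranking step can be sketched on three hypothetical counties with invented raw risks. Note that `rank` uses `method='average'` by default, so tied counties would receive fractional ranks.

```python
import pandas as pd

# Three hypothetical counties with invented raw risks
toy = pd.DataFrame({
    'FIPS': [1, 2, 3],
    'FIPS_RawRisk': [50.0, 200.0, 100.0],
})

# Divide by the maximum so that the top county has Risk = 1
toy['Risk'] = toy['FIPS_RawRisk'] / toy['FIPS_RawRisk'].max()

# Rank 1 is the highest risk
toy['FIPS_Rank'] = toy['Risk'].rank(ascending=False)

# drop=True discards the old row labels instead of keeping them as an 'index' column
top = toy.sort_values('Risk', ascending=False).reset_index(drop=True)
print(top)
```

In this sketch `reset_index(drop=True)` is a deliberate difference from the cell above, where `reset_index()` keeps the previous row labels as the `index` column visible in the output.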


In [263]:
result = df_risk
output_csv = out_folder + 'MeaslesRisk_US_' +  str(year) + '_voronoi_' + t.strftime('%m%d%y%H%M') + '.csv'
result.to_csv(output_csv, index=False, encoding='utf-8')

In [264]:
df_complete = pd.merge(df_factors, df_risk , how='left', left_on='FIPS',right_on='FIPS')
df_complete = df_complete.sort_values(by=['Risk','Route_Risk'], ascending=False)
df_complete['Route_Rank'] = df_complete.groupby('FIPS_Rank')['Route_Risk'].rank(ascending=False,method='dense')
df_complete = df_complete.rename(index=str, columns={"County_x": "County"})
df_complete = df_complete.drop(columns=['County_y'])
print(len(df_complete))
df_complete.head(10)

316566


Unnamed: 0,FIPS,County,FIPS_NME,FIPS_Pop,ISO,Country,ISO_Case,ISO_Pop,paxVolume,Route_Risk,index,City,FIPS_RawRisk,Risk,FIPS_Rank,Route_Rank
208712,6037,"Los Angeles, California",0.006,10163507.0,MNG,Mongolia,28710.0,3056.364,3134.0,1795238000.0,159.0,Los Angeles,5209939000.0,1.0,1.0,1.0
56292,6037,"Los Angeles, California",0.006,10163507.0,CHN,China,25593.0,1414049.351,757413.602243,835958200.0,159.0,Los Angeles,5209939000.0,1.0,1.0,2.0
139131,6037,"Los Angeles, California",0.006,10163507.0,IND,India,70798.0,1324517.249,189526.143799,617770700.0,159.0,Los Angeles,5209939000.0,1.0,1.0,3.0
109681,6037,"Los Angeles, California",0.006,10163507.0,GBR,United Kingdom of Great Britain and Northern I...,571.0,66297.944,506594.029023,266067000.0,159.0,Los Angeles,5209939000.0,1.0,1.0,4.0
152178,6037,"Los Angeles, California",0.006,10163507.0,ITA,Italy,862.0,60663.06,227532.503298,197161100.0,159.0,Los Angeles,5209939000.0,1.0,1.0,5.0
232066,6037,"Los Angeles, California",0.006,10163507.0,NZL,New Zealand,104.0,4659.265,128390.0,174760000.0,159.0,Los Angeles,5209939000.0,1.0,1.0,6.0
243311,6037,"Los Angeles, California",0.006,10163507.0,PHL,Philippines,647.0,103663.816,320692.541557,122056500.0,159.0,Los Angeles,5209939000.0,1.0,1.0,7.0
136884,6037,"Los Angeles, California",0.006,10163507.0,IDN,Indonesia,7204.0,261556.381,70987.0,119229000.0,159.0,Los Angeles,5209939000.0,1.0,1.0,8.0
285014,6037,"Los Angeles, California",0.006,10163507.0,THA,Thailand,1009.0,68971.308,124150.602243,110755800.0,159.0,Los Angeles,5209939000.0,1.0,1.0,9.0
253317,6037,"Los Angeles, California",0.006,10163507.0,ROU,Romania,2432.0,19796.285,14704.0719,110157100.0,159.0,Los Angeles,5209939000.0,1.0,1.0,10.0
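
`Route_Rank` orders the origin countries within each county, with `method='dense'` giving tied routes the same rank and leaving no gaps. The sketch below uses invented values and groups by `FIPS` rather than `FIPS_Rank`; that is an assumption of this example, and it yields the same groups as the cell above whenever no two counties tie on risk.

```python
import pandas as pd

# Invented route risks for two hypothetical counties
toy = pd.DataFrame({
    'FIPS': [1, 1, 1, 2, 2],
    'ISO': ['AAA', 'BBB', 'CCC', 'AAA', 'BBB'],
    'Route_Risk': [5.0, 20.0, 5.0, 7.0, 3.0],
})

# Rank routes within each county; ties share a rank and the next rank is not skipped
toy['Route_Rank'] = toy.groupby('FIPS')['Route_Risk'].rank(ascending=False, method='dense')
print(toy)
```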


In [265]:
result = df_complete
output_csv = out_folder + 'MeaslesRisk_US_' +  str(year) + '_voronoi_route_' + t.strftime('%m%d%y%H%M') + '.csv'
result.to_csv(output_csv, index=False, encoding='utf-8')