# CountryByMonth

#### Objective:
Create a time slider that shows the monthly count of reported measles cases for every country in the world from 2011 to 2019.

#### Data Sources:
* [Shapefile - World Country Boundaries (ISO2)](https://hub.arcgis.com/datasets/252471276c9941729543be8789e06e12_0?geometry=-40.166%2C2.702%2C226.846%2C53.892)
* [Table - ISO2 and ISO3](https://www.nationsonline.org/oneworld/country_code_list.htm)
* [Shapefile - World Country (ISO3)](https://www.arcgis.com/home/item.html?id=170b5e6529064b8d9275168687880359)
* Table - measlescasesbycountrybymonth.xls

#### Steps:
1. ArcMap - Dissolve countries by ISO3 into `ISO3_Countries` (244 countries after this step).
2. ArcMap - [Copy countries](http://desktop.arcgis.com/en/arcmap/latest/manage-data/tables/copying-and-pasting-records-in-a-table.htm) once for each of the 9 years × 12 months into `ISO3_Countries_Ext` (244 × 108 = 26352 records after this step).
3. ArcMap - Export as `ISO3_Countries_Ext.csv` and convert to .xlsx.
4. Excel - Add time stamps (year and month) in `ISO3_Countries_Ext.xlsx`.
5. Script - Merge all time information in the `measlescasesbycountrybymonth` table into one field.
6. Script - Merge `ISO3_Countries_Ext.xlsx` and `measlescasesbycountrybymonth` on `TimeID`.
7. ArcMap - Join the merged table back to the `ISO3_Countries_Ext` layer by `OBJECTID` (stored as `GISID` in the merged table). Export as the `Measles_ISO3` layer.
8. ArcMap - Remove all records where the measles value is null. Create a new short int field `mcount`.
9. ArcMap - Create the time animation and publish to ArcGIS Online.
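
Steps 2 to 4 were done by hand in ArcMap and Excel. As a rough sketch only, the same country × year × month expansion could be built in pandas. The three-row `countries` frame below is a hypothetical stand-in for the 244-row `ISO3_Countries` attribute table, and the column names are assumed to match those used later in the notebook.

```python
import pandas as pd

# Hypothetical stand-in for the dissolved ISO3_Countries attribute table
# (the real table has 244 rows).
countries = pd.DataFrame({"ISO_3DIGIT": ["ABW", "AFG", "AGO"]})

years = range(2011, 2020)   # 9 years, 2011 through 2019
months = range(1, 13)       # 12 months

# Cartesian product: one row per (country, year, month).
index = pd.MultiIndex.from_product(
    [countries["ISO_3DIGIT"], years, months],
    names=["ISO_3DIGIT", "Year", "Month"],
)
ext = index.to_frame(index=False)

# Zero-pad the month so that string sorting matches chronological order.
ext["Month"] = ext["Month"].map("{:02d}".format)
ext["TimeStamp"] = ext["Year"].astype(str) + "/" + ext["Month"]
ext["TimeID"] = ext["ISO_3DIGIT"] + ext["Year"].astype(str) + ext["Month"]

print(len(ext))  # 3 countries * 9 years * 12 months = 324
```

With the full 244 countries the same product gives 244 × 9 × 12 = 26352 rows, which matches the record count reported for `ISO3_Countries_Ext`.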

#### Update Info:
* 04/15/2019: first version based on MIA_MRE.

### Step 5 - Data Wrangling

In [1]:
import pandas as pd

In [20]:
in_table = r'C:\Users\Ensheng\Dropbox\measles\measlescasesbycountrybymonth.xls'
raw_df = pd.read_excel(in_table, sheet_name='WEB')
raw_df.head(15)

Unnamed: 0,Region,ISO3,Country,Year,January,February,March,April,May,June,July,August,September,October,November,December
0,AFR,AGO,Angola,2011,17.0,19.0,37.0,41.0,11.0,8.0,5.0,4.0,32.0,10.0,8.0,0.0
1,AFR,AGO,Angola,2012,373.0,289.0,381.0,393.0,546.0,357.0,382.0,553.0,571.0,367.0,216.0,42.0
2,AFR,AGO,Angola,2013,725.0,646.0,734.0,491.0,726.0,695.0,680.0,660.0,563.0,288.0,265.0,91.0
3,AFR,AGO,Angola,2014,1161.0,1101.0,1319.0,1094.0,1754.0,1150.0,1484.0,1429.0,1098.0,373.0,27.0,12.0
4,AFR,AGO,Angola,2015,4.0,15.0,0.0,0.0,3.0,3.0,4.0,13.0,40.0,14.0,5.0,2.0
5,AFR,AGO,Angola,2016,3.0,2.0,0.0,4.0,6.0,2.0,5.0,11.0,5.0,1.0,5.0,7.0
6,AFR,AGO,Angola,2017,1.0,7.0,2.0,0.0,3.0,2.0,1.0,1.0,2.0,0.0,2.0,6.0
7,AFR,AGO,Angola,2018,4.0,5.0,8.0,2.0,3.0,5.0,4.0,6.0,2.0,13.0,13.0,3.0
8,AFR,AGO,Angola,2019,,,,,,,,,,,,
9,AFR,BDI,Burundi,2011,6.0,2.0,8.0,4.0,2.0,4.0,19.0,12.0,2.0,0.0,2.0,2.0


In [21]:
len(raw_df)

1746

In [22]:
raw_df.columns

Index(['Region', 'ISO3', 'Country', 'Year', 'January', 'February', 'March',
       'April', 'May', 'June', 'July', 'August', 'September', 'October',
       'November', 'December'],
      dtype='object')

In [23]:
common_list = ['Region', 'ISO3', 'Country', 'Year']
month_list = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 'August', 'September', 'October', 'November', 'December']

In [24]:
# melt the twelve month columns into long format:
# one row per country, year, and month
df = pd.melt(raw_df, id_vars=common_list, value_vars=month_list,
             var_name='Month', value_name='Measles')
df.head(15)

Unnamed: 0,Region,ISO3,Country,Year,Month,Measles
0,AFR,AGO,Angola,2011,January,17.0
1,AFR,AGO,Angola,2012,January,373.0
2,AFR,AGO,Angola,2013,January,725.0
3,AFR,AGO,Angola,2014,January,1161.0
4,AFR,AGO,Angola,2015,January,4.0
5,AFR,AGO,Angola,2016,January,3.0
6,AFR,AGO,Angola,2017,January,1.0
7,AFR,AGO,Angola,2018,January,4.0
8,AFR,AGO,Angola,2019,January,
9,AFR,BDI,Burundi,2011,January,6.0


In [25]:
# convert month names to zero-padded two-digit strings
df['Month'] = df['Month'].map({
    'January': '01', 
    'February': '02', 
    'March': '03', 
    'April': '04', 
    'May': '05', 
    'June': '06', 
    'July': '07', 
    'August': '08', 
    'September': '09', 
    'October': '10', 
    'November': '11', 
    'December': '12'})
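
The month lookup is typed out by hand here and again in Step 6. One possible alternative, shown only as a sketch, is to generate it from the standard library. The name `month_to_number` is hypothetical, and `calendar.month_name` follows the current locale, so this assumes an English (or default `C`) locale.

```python
import calendar

# calendar.month_name[0] is an empty string, so start at index 1.
# Assumes an English or default "C" locale for the month names.
month_to_number = {
    calendar.month_name[i]: f"{i:02d}" for i in range(1, 13)
}

print(month_to_number["September"])  # 09
```

The dictionary can be passed to `Series.map` exactly like the literal one. Any month name missing from the dictionary maps to `NaN`, so checking the mapped column for nulls afterwards is a cheap way to catch spelling differences.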

In [26]:
df = df.sort_values(by=['Region','ISO3','Year','Month'])
df.head(15)

Unnamed: 0,Region,ISO3,Country,Year,Month,Measles
0,AFR,AGO,Angola,2011,01,17.0
1746,AFR,AGO,Angola,2011,02,19.0
3492,AFR,AGO,Angola,2011,03,37.0
5238,AFR,AGO,Angola,2011,04,41.0
6984,AFR,AGO,Angola,2011,05,11.0
8730,AFR,AGO,Angola,2011,06,8.0
10476,AFR,AGO,Angola,2011,07,5.0
12222,AFR,AGO,Angola,2011,08,4.0
13968,AFR,AGO,Angola,2011,09,32.0
15714,AFR,AGO,Angola,2011,10,10.0


In [27]:
# new number of rows should be 1746*12
len(df)

20952

In [28]:
# merge year and month
df['Time'] = df['Year'].map(str) + '/' + df['Month']
df['TimeID'] = df['ISO3'] + df['Year'].map(str) + df['Month']
df = df[['Region', 'ISO3', 'Country', 'Time', 'Measles', 'TimeID']]
df.head(15)

Unnamed: 0,Region,ISO3,Country,Time,Measles,TimeID
0,AFR,AGO,Angola,2011/01,17.0,AGO201101
1746,AFR,AGO,Angola,2011/02,19.0,AGO201102
3492,AFR,AGO,Angola,2011/03,37.0,AGO201103
5238,AFR,AGO,Angola,2011/04,41.0,AGO201104
6984,AFR,AGO,Angola,2011/05,11.0,AGO201105
8730,AFR,AGO,Angola,2011/06,8.0,AGO201106
10476,AFR,AGO,Angola,2011/07,5.0,AGO201107
12222,AFR,AGO,Angola,2011/08,4.0,AGO201108
13968,AFR,AGO,Angola,2011/09,32.0,AGO201109
15714,AFR,AGO,Angola,2011/10,10.0,AGO201110
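
The `Time` column is a plain `YYYY/MM` string. If a true date value turns out to be more convenient downstream, it can be parsed as shown in this sketch. The `sample` frame and the `Date` column are illustrative names that are not part of the original notebook.

```python
import pandas as pd

# Two rows in the same shape as the Time column built above.
sample = pd.DataFrame({
    "ISO3": ["AGO", "AGO"],
    "Time": ["2011/01", "2011/02"],
})

# With only year and month given, the day defaults to the first of the month.
sample["Date"] = pd.to_datetime(sample["Time"], format="%Y/%m")

print(sample["Date"].dt.strftime("%Y-%m-%d").tolist())
# ['2011-01-01', '2011-02-01']
```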


### Step 6 - Extension

In [58]:
in_table = r'C:\Users\Ensheng\Desktop\mapping\ISO3_Countries_Ext.xlsx'
iso3_df = pd.read_excel(in_table)
iso3_df = iso3_df.sort_values(by=['ISO_3DIGIT','OBJECTID'])

In [59]:
# convert month names to zero-padded two-digit strings
iso3_df['Month'] = iso3_df['Month'].map({
    'January': '01', 
    'February': '02', 
    'March': '03', 
    'April': '04', 
    'May': '05', 
    'June': '06', 
    'July': '07', 
    'August': '08', 
    'September': '09', 
    'October': '10', 
    'November': '11', 
    'December': '12'})
iso3_df['TimeStamp'] = iso3_df['Year'].map(str) + '/' + iso3_df['Month']
iso3_df['TimeID'] = iso3_df['ISO_3DIGIT'] + iso3_df['Year'].map(str) + iso3_df['Month']
iso3_df.head(15)

Unnamed: 0,OBJECTID,ISO_3DIGIT,Shape_Length,Shape_Area,Year,Month,TimeID,TimeStamp
0,1,ABW,0.61034,0.016598,2011,01,ABW201101,2011/01
12,245,ABW,0.61034,0.016598,2012,01,ABW201201,2012/01
24,489,ABW,0.61034,0.016598,2013,01,ABW201301,2013/01
36,733,ABW,0.61034,0.016598,2014,01,ABW201401,2014/01
48,977,ABW,0.61034,0.016598,2015,01,ABW201501,2015/01
60,1221,ABW,0.61034,0.016598,2016,01,ABW201601,2016/01
72,1465,ABW,0.61034,0.016598,2017,01,ABW201701,2017/01
84,1709,ABW,0.61034,0.016598,2018,01,ABW201801,2018/01
96,1953,ABW,0.61034,0.016598,2019,01,ABW201901,2019/01
1,2197,ABW,0.61034,0.016598,2011,02,ABW201102,2011/02


In [60]:
len(iso3_df)

26352

In [73]:
# merge iso3_df and df
result = pd.merge(iso3_df, df, how='outer', on='TimeID')
result['GISID'] = result['OBJECTID']
result = result[['GISID','ISO_3DIGIT','TimeStamp','Measles']]
result[130:140]

Unnamed: 0,GISID,ISO_3DIGIT,TimeStamp,Measles
130,5370,AFG,2015/03,269.0
131,5614,AFG,2016/03,52.0
132,5858,AFG,2017/03,201.0
133,6102,AFG,2018/03,438.0
134,6346,AFG,2019/03,
135,6590,AFG,2011/04,229.0
136,6834,AFG,2012/04,341.0
137,7078,AFG,2013/04,69.0
138,7322,AFG,2014/04,29.0
139,7566,AFG,2015/04,198.0


In [75]:
# output as a csv table
output_csv = r'C:\Users\Ensheng\Desktop\mapping\CountryByMonth.csv'
result.to_csv(output_csv, index=False, encoding='utf-8')
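
Step 8 (dropping null measles values and adding the short integer field `mcount`) was done in ArcMap. As a hedged sketch, the same cleanup could be applied to the table before export. The three rows below are copied from the merge output shown above. The 16-bit range check assumes that the short integer field is a signed 16-bit type, and `clean` is a hypothetical name.

```python
import numpy as np
import pandas as pd

# Three rows taken from the merged result shown earlier.
sample = pd.DataFrame({
    "GISID": [5370, 6346, 6590],
    "ISO_3DIGIT": ["AFG", "AFG", "AFG"],
    "TimeStamp": ["2015/03", "2019/03", "2011/04"],
    "Measles": [269.0, np.nan, 229.0],
})

# Drop the records that have no measles value.
clean = sample.dropna(subset=["Measles"]).copy()

# Assumption: a short integer field is signed 16-bit (maximum 32767).
limit = np.iinfo(np.int16).max
if clean["Measles"].max() > limit:
    raise ValueError("Measles count does not fit in a 16-bit field")

clean["mcount"] = clean["Measles"].astype("int16")

print(clean["mcount"].tolist())  # [269, 229]
```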