## What I was Thinking

`2011-2018_monthly_refugee.csv` records how many refugees took shelter in each country, by month and year. I was thinking of displaying a world map and coloring each country according to the number of refugees it took in during a given year. Somewhere down the line, we should add a slider that lets the user change the year and watch the data change chronologically.

## What I Think Needs to be Done

Here is a tutorial for the plotly library: https://plot.ly/python/choropleth-maps/<br>
Here is a tutorial for creating a slider using plotly: https://plot.ly/python/gapminder-example/

If we do a world map, plotly would be the better choice since it already comes with a world map. However, plotly needs a country code to connect the map to the data in the CSV, and the CSV only has country names, so I'm not sure yet how best to handle that. Maybe GeoPandas will actually be easier...

But if we do decide on plotly, we need to:
1. find the country codes and assign each row the code for its country
    1. https://www.kaggle.com/shep312/plotlycountrycodes
2. do some data wrangling and change some country names
    1. some country names have extra notes at the end, which we should delete
3. handle values shown as <b>*</b>, which means the data is confidential; those should be set to 0

## Relevant Libraries

In [26]:
# import plotly.plotly as py
# from plotly.grid_objs import Grid, Column
# from plotly.tools import FigureFactory as FF 

import pandas as pd
import time
import re
import numpy as np

In [27]:
refugee_df = pd.read_csv('2011-2018_monthly_refugee.csv')
# refugee_df

In [28]:
# displays list of unique country names in the data frame
# refugee_df['Country / territory of asylum/residence'].unique()

Replace:
1. Czech Rep. --> Czech Republic
2. United Kingdom of Great Britain and Northern Ireland --> ??? (currently United Kingdom)
3. Rep. of Korea --> Korea, South
4. The former Yugoslav Rep. of Macedonia --> Macedonia
5. USA (INS/DHS) --> United States
6. USA (EOIR) --> United States
7. Serbia and Kosovo: S/RES/1244 (1999) --> Serbia

In [29]:
code_df = pd.read_csv('plotly_countries_and_codes.csv')
# code_df

In [30]:
# displays list of unique country names in the data frame
# code_df['COUNTRY'].unique()

In [31]:
# replacing values in refugee_df dataframe
new_refugee_df = refugee_df.replace({
    'Czech Rep.': 'Czech Republic',
    'United Kingdom of Great Britain and Northern Ireland': 'United Kingdom',
    'Rep. of Korea': 'Korea, South',
    'The former Yugoslav Rep. of Macedonia': 'Macedonia',
    'USA (INS/DHS)': 'United States',
    'USA (EOIR)': 'United States',
    'Serbia and Kosovo: S/RES/1244 (1999)': 'Serbia'
})
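
For the names that only differ by a trailing note in parentheses, a regex might save some manual entries. This is only a sketch on a hypothetical sample of names from the replacement list above; the pattern is my assumption about what counts as a "note", and it does not rename anything, so `USA` would still need to be mapped to `United States` afterwards.

```python
import pandas as pd

# Hypothetical sample of names taken from the replacement list
names = pd.Series([
    'USA (INS/DHS)',
    'USA (EOIR)',
    'Serbia and Kosovo: S/RES/1244 (1999)',
    'Albania',
])

# Drop a trailing parenthesised note, plus the whitespace in front of it
stripped = names.str.replace(r'\s*\([^)]*\)$', '', regex=True)
print(stripped.tolist())
```

Names without a trailing note pass through unchanged, and a note in the middle of a name (like `S/RES/1244`) is left alone.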

In [38]:
# joining two tables together
merged_df = new_refugee_df.set_index('Country / territory of asylum/residence').join(code_df.set_index('COUNTRY'))
merged_df = merged_df.reset_index()
merged_df

| | index | Origin | Year | Month | Value | GDP (BILLIONS) | CODE |
|---|---|---|---|---|---|---|---|
| 0 | Albania | Syrian Arab Rep. | 2012 | December | 3 | 13.4 | ALB |
| 1 | Albania | Syrian Arab Rep. | 2013 | March | 8 | 13.4 | ALB |
| 2 | Albania | Syrian Arab Rep. | 2013 | August | 4 | 13.4 | ALB |
| 3 | Albania | Syrian Arab Rep. | 2013 | September | 4 | 13.4 | ALB |
| 4 | Albania | Syrian Arab Rep. | 2013 | December | 8 | 13.4 | ALB |
| 5 | Albania | Syrian Arab Rep. | 2014 | May | 1 | 13.4 | ALB |
| 6 | Albania | Syrian Arab Rep. | 2014 | June | 1 | 13.4 | ALB |
| 7 | Albania | Syrian Arab Rep. | 2014 | July | 10 | 13.4 | ALB |
| 8 | Albania | Syrian Arab Rep. | 2014 | August | 6 | 13.4 | ALB |
| 9 | Albania | Syrian Arab Rep. | 2014 | September | 4 | 13.4 | ALB |
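
One thing to watch: `join` is a left join by default, so any country name with no match in `code_df` ends up with a missing `CODE`, and a later `groupby` on `CODE` would drop those rows by default. A quick way to list the names that still need fixing might be `merge` with `indicator=True`. The two small frames below are hypothetical miniatures of the real tables, and the `Czech Rep.` value is made up for illustration.

```python
import pandas as pd

# Hypothetical miniature versions of the two tables
refugees = pd.DataFrame({
    'Country / territory of asylum/residence': ['Albania', 'Czech Rep.', 'Australia'],
    'Value': [3, 5, 93],
})
codes = pd.DataFrame({'COUNTRY': ['Albania', 'Australia'], 'CODE': ['ALB', 'AUS']})

checked = refugees.merge(
    codes,
    how='left',
    left_on='Country / territory of asylum/residence',
    right_on='COUNTRY',
    indicator=True,  # adds a '_merge' column: 'both' or 'left_only'
)

# Names present in the refugee data but missing from the code table
unmatched = checked.loc[
    checked['_merge'] == 'left_only',
    'Country / territory of asylum/residence',
].unique()
print(unmatched)
```

An empty result would suggest the replacement list covers every country in the data.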


In [46]:
# rename index column to 'Country'
merged_df.rename(columns={'index': 'Country'}, inplace=True)

# remove * and convert values to int values
merged_df['Value'] = merged_df['Value'].apply(lambda x : 0 if x == '*' else int(x))

# sum values and group each country by year
total_df = merged_df.groupby(['Country', 'Year', 'CODE']).agg({'Value': 'sum'})
total_df

| Country | Year | CODE | Value |
|---|---|---|---|
| Albania | 2012 | ALB | 3 |
| Albania | 2013 | ALB | 24 |
| Albania | 2014 | ALB | 92 |
| Albania | 2015 | ALB | 71 |
| Albania | 2016 | ALB | 132 |
| Albania | 2017 | ALB | 105 |
| Albania | 2018 | ALB | 1676 |
| Australia | 2011 | AUS | 93 |
| Australia | 2012 | AUS | 143 |
| Australia | 2013 | AUS | 139 |
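
After the `groupby`, `Country`, `Year` and `CODE` live in the index rather than in columns, which is probably not what a plotting call expects. A small sketch of flattening it back, using a hypothetical stand-in for `total_df` built from a few of the totals above:

```python
import pandas as pd

# Hypothetical stand-in for total_df, using a few totals from the output above
total_df = pd.DataFrame(
    {'Value': [3, 24, 93, 143]},
    index=pd.MultiIndex.from_tuples(
        [('Albania', 2012, 'ALB'), ('Albania', 2013, 'ALB'),
         ('Australia', 2011, 'AUS'), ('Australia', 2012, 'AUS')],
        names=['Country', 'Year', 'CODE'],
    ),
)

# Move the index levels back into ordinary columns
flat_df = total_df.reset_index()
print(flat_df.columns.tolist())
```

The flat frame has one row per country and year, with `CODE` and `Year` available as regular columns.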
