# Refugee Migration Predictor

## Intro

In this notebook I experiment with using about 140k historical refugee records to __predict future trends__ in refugee migration and __mitigate__ humanitarian aid resource __bottlenecks__ with machine learning.

## Tools Used

- scikit-learn (sklearn)
- Node.js and React
- Dataset from Kaggle, link:
- Random forest regressor

In [2]:
import pandas as pd
import numpy as np
import sklearn as sk
import os

# columns (training):
#   country of asylum      [v]
#   origin                 [v]
#   RSD (G or U or J)      [v]
#   UN assisted (start)    [v]
#   applied during year    [v]
#   decisions_recognized   [v]
#   decisions_other        [v]
#   rejected               [v]
#   otherwise closed       [v]
#   total decisions        [v]
#   total pending end-year [v]
#   UN assisted (end)      [v]

train_df = pd.read_csv("asylum_seekers.csv", low_memory=False)

# clean up the dataframe: for each count column, drop rows where the value is
# missing, then convert the column to numeric (unparseable values become NaN)
numeric_columns = [
    'Tota pending start-year',  # spelled this way in the dataframe's columns
    'of which UNHCR-assisted(start-year)',
    'Applied during year',
    'decisions_recognized',
    'decisions_other',
    'Rejected',
    'Otherwise closed',
    'Total decisions',
    'Total pending end-year',
    'of which UNHCR-assisted(end-year)',
]

for col in numeric_columns:
    train_df = train_df.dropna(subset=[col])
    train_df[col] = pd.to_numeric(train_df[col], errors='coerce')

train_df.dtypes

Year                                         int64
Country / territory of asylum/residence     object
Origin                                      object
RSD procedure type / level                  object
Tota pending start-year                    float64
of which UNHCR-assisted(start-year)        float64
Applied during year                        float64
decisions_recognized                       float64
decisions_other                            float64
Rejected                                   float64
Otherwise closed                           float64
Total decisions                            float64
Total pending end-year                     float64
of which UNHCR-assisted(end-year)          float64
dtype: object
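
One detail of the clean-up cell is easier to see on a small example: `dropna` runs before `pd.to_numeric(..., errors='coerce')`, so a non-numeric placeholder that is coerced to `NaN` is still in the frame afterwards. The sketch below uses a made-up three-row frame; the `"*"` placeholder and the values are assumptions for illustration, not taken from the dataset.

```python
import pandas as pd

# Hypothetical toy data: one real number, one missing value, one text placeholder.
toy = pd.DataFrame({"Applied during year": ["12", None, "*"]})

# Same two steps as the clean-up cell, in the same order.
toy = toy.dropna(subset=["Applied during year"])
toy["Applied during year"] = pd.to_numeric(toy["Applied during year"], errors="coerce")

# The None row is gone, but "*" has become NaN and is still present.
remaining_nan = int(toy["Applied during year"].isna().sum())
print(len(toy), remaining_nan)

# Dropping again after the conversion removes the coerced NaN as well.
toy_clean = toy.dropna(subset=["Applied during year"])
print(len(toy_clean))
```

If coerced values should not survive, one option is a second `dropna` after the conversion, as in the last two lines of the sketch.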

In [3]:
# one-hot encode the categorical (object-dtype) columns
from sklearn.preprocessing import OneHotEncoder
pd.set_option('display.max_columns', None)
train_df = pd.get_dummies(train_df)
train_df

Unnamed: 0,Year,Tota pending start-year,of which UNHCR-assisted(start-year),Applied during year,decisions_recognized,decisions_other,Rejected,Otherwise closed,Total decisions,Total pending end-year,of which UNHCR-assisted(end-year),Country / territory of asylum/residence_Afghanistan,Country / territory of asylum/residence_Albania,Country / territory of asylum/residence_Algeria,Country / territory of asylum/residence_Angola,Country / territory of asylum/residence_Antigua and Barbuda,Country / territory of asylum/residence_Argentina,Country / territory of asylum/residence_Armenia,Country / territory of asylum/residence_Aruba,Country / territory of asylum/residence_Australia,Country / territory of asylum/residence_Austria,Country / territory of asylum/residence_Azerbaijan,Country / territory of asylum/residence_Bahamas,Country / territory of asylum/residence_Bahrain,Country / territory of asylum/residence_Bangladesh,Country / territory of asylum/residence_Barbados,Country / territory of asylum/residence_Belarus,Country / territory of asylum/residence_Belgium,Country / territory of asylum/residence_Belize,Country / territory of asylum/residence_Benin,Country / territory of asylum/residence_Bolivia (Plurinational State of),Country / territory of asylum/residence_Bosnia and Herzegovina,Country / territory of asylum/residence_Botswana,Country / territory of asylum/residence_Brazil,Country / territory of asylum/residence_British Virgin Islands,Country / territory of asylum/residence_Bulgaria,Country / territory of asylum/residence_Burkina Faso,Country / territory of asylum/residence_Burundi,Country / territory of asylum/residence_Cambodia,Country / territory of asylum/residence_Cameroon,Country / territory of asylum/residence_Canada,Country / territory of asylum/residence_Cayman Islands,Country / territory of asylum/residence_Central African Rep.,Country / territory of asylum/residence_Chad,Country / territory of asylum/residence_Chile,Country / territory of 
asylum/residence_China,"Country / territory of asylum/residence_China, Hong Kong SAR","Country / territory of asylum/residence_China, Macao SAR",Country / territory of asylum/residence_Colombia,Country / territory of asylum/residence_Congo,Country / territory of asylum/residence_Costa Rica,Country / territory of asylum/residence_Croatia,Country / territory of asylum/residence_Cuba,Country / territory of asylum/residence_Curaçao,Country / territory of asylum/residence_Cyprus,Country / territory of asylum/residence_Czech Rep.,Country / territory of asylum/residence_Côte d'Ivoire,Country / territory of asylum/residence_Dem. Rep. of the Congo,Country / territory of asylum/residence_Denmark,Country / territory of asylum/residence_Djibouti,Country / territory of asylum/residence_Dominica,Country / territory of asylum/residence_Dominican Rep.,Country / territory of asylum/residence_Ecuador,Country / territory of asylum/residence_Egypt,Country / territory of asylum/residence_El Salvador,Country / territory of asylum/residence_Eritrea,Country / territory of asylum/residence_Estonia,Country / territory of asylum/residence_Ethiopia,Country / territory of asylum/residence_Fiji,Country / territory of asylum/residence_Finland,Country / territory of asylum/residence_France,Country / territory of asylum/residence_Gabon,Country / territory of asylum/residence_Gambia,Country / territory of asylum/residence_Georgia,Country / territory of asylum/residence_Germany,Country / territory of asylum/residence_Ghana,Country / territory of asylum/residence_Greece,Country / territory of asylum/residence_Grenada,Country / territory of asylum/residence_Guatemala,Country / territory of asylum/residence_Guinea,Country / territory of asylum/residence_Guinea-Bissau,Country / territory of asylum/residence_Guyana,Country / territory of asylum/residence_Haiti,Country / territory of asylum/residence_Honduras,Country / territory of asylum/residence_Hungary,Country / territory of 
asylum/residence_Iceland,Country / territory of asylum/residence_India,Country / territory of asylum/residence_Indonesia,Country / territory of asylum/residence_Iran (Islamic Rep. of),Country / territory of asylum/residence_Iraq,Country / territory of asylum/residence_Ireland,Country / territory of asylum/residence_Israel,Country / territory of asylum/residence_Italy,Country / territory of asylum/residence_Jamaica,Country / territory of asylum/residence_Japan,Country / territory of asylum/residence_Jordan,Country / territory of asylum/residence_Kazakhstan,Country / territory of asylum/residence_Kenya,Country / territory of asylum/residence_Kuwait,Country / territory of asylum/residence_Kyrgyzstan,Country / territory of asylum/residence_Lao People's Dem. Rep.,Country / territory of asylum/residence_Latvia,Country / territory of asylum/residence_Lebanon,Country / territory of asylum/residence_Lesotho,Country / territory of asylum/residence_Liberia,Country / territory of asylum/residence_Libya,Country / territory of asylum/residence_Liechtenstein,Country / territory of asylum/residence_Lithuania,Country / territory of asylum/residence_Luxembourg,Country / territory of asylum/residence_Madagascar,Country / territory of asylum/residence_Malawi,Country / territory of asylum/residence_Malaysia,Country / territory of asylum/residence_Mali,Country / territory of asylum/residence_Malta,Country / territory of asylum/residence_Mauritania,Country / territory of asylum/residence_Mexico,Country / territory of asylum/residence_Monaco,Country / territory of asylum/residence_Mongolia,Country / territory of asylum/residence_Montenegro,Country / territory of asylum/residence_Montserrat,Country / territory of asylum/residence_Morocco,Country / territory of asylum/residence_Mozambique,Country / territory of asylum/residence_Namibia,Country / territory of asylum/residence_Nauru,Country / territory of asylum/residence_Nepal,Country / territory of asylum/residence_Netherlands,Country / 
territory of asylum/residence_New Zealand,Country / territory of asylum/residence_Nicaragua,Country / territory of asylum/residence_Niger,Country / territory of asylum/residence_Nigeria,Country / territory of asylum/residence_Norway,Country / territory of asylum/residence_Oman,Country / territory of asylum/residence_Pakistan,Country / territory of asylum/residence_Palau,Country / territory of asylum/residence_Panama,Country / territory of asylum/residence_Papua New Guinea,Country / territory of asylum/residence_Paraguay,Country / territory of asylum/residence_Peru,Country / territory of asylum/residence_Philippines,Country / territory of asylum/residence_Poland,Country / territory of asylum/residence_Portugal,Country / territory of asylum/residence_Qatar,Country / territory of asylum/residence_Rep. of Korea,Country / territory of asylum/residence_Rep. of Moldova,Country / territory of asylum/residence_Romania,Country / territory of asylum/residence_Russian Federation,Country / territory of asylum/residence_Rwanda,Country / territory of asylum/residence_Saint Kitts and Nevis,Country / territory of asylum/residence_Saint Lucia,Country / territory of asylum/residence_Saint Vincent and the Grenadines,Country / territory of asylum/residence_Samoa,Country / territory of asylum/residence_Saudi Arabia,Country / territory of asylum/residence_Senegal,Country / territory of asylum/residence_Serbia and Kosovo (S/RES/1244 (1999)),Country / territory of asylum/residence_Sierra Leone,Country / territory of asylum/residence_Singapore,Country / territory of asylum/residence_Sint Maarten (Dutch part),Country / territory of asylum/residence_Slovakia,Country / territory of asylum/residence_Slovenia,Country / territory of asylum/residence_Solomon Islands,Country / territory of asylum/residence_Somalia,Country / territory of asylum/residence_South Africa,Country / territory of asylum/residence_South Sudan,Country / territory of asylum/residence_Spain,Country / territory of 
asylum/residence_Sri Lanka,Country / territory of asylum/residence_Sudan,Country / territory of asylum/residence_Suriname,Country / territory of asylum/residence_Swaziland,Country / territory of asylum/residence_Sweden,Country / territory of asylum/residence_Switzerland,Country / territory of asylum/residence_Syrian Arab Rep.,Country / territory of asylum/residence_Tajikistan,Country / territory of asylum/residence_Thailand,Country / territory of asylum/residence_The former Yugoslav Republic of Macedonia,Country / territory of asylum/residence_Timor-Leste,Country / territory of asylum/residence_Togo,Country / territory of asylum/residence_Tonga,Country / territory of asylum/residence_Trinidad and Tobago,Country / territory of asylum/residence_Tunisia,Country / territory of asylum/residence_Turkey,Country / territory of asylum/residence_Turkmenistan,Country / territory of asylum/residence_Turks and Caicos Islands,Country / territory of asylum/residence_Uganda,Country / territory of asylum/residence_Ukraine,Country / territory of asylum/residence_United Arab Emirates,Country / territory of asylum/residence_United Kingdom,Country / territory of asylum/residence_United Rep. 
of Tanzania,Country / territory of asylum/residence_United States of America,Country / territory of asylum/residence_Uruguay,Country / territory of asylum/residence_Uzbekistan,Country / territory of asylum/residence_Vanuatu,Country / territory of asylum/residence_Venezuela (Bolivarian Republic of),Country / territory of asylum/residence_Yemen,Country / territory of asylum/residence_Zambia,Country / territory of asylum/residence_Zimbabwe,Origin_Afghanistan,Origin_Albania,Origin_Algeria,Origin_Andorra,Origin_Angola,Origin_Antigua and Barbuda,Origin_Argentina,Origin_Armenia,Origin_Aruba,Origin_Australia,Origin_Austria,Origin_Azerbaijan,Origin_Bahamas,Origin_Bahrain,Origin_Bangladesh,Origin_Barbados,Origin_Belarus,Origin_Belgium,Origin_Belize,Origin_Benin,Origin_Bermuda,Origin_Bhutan,Origin_Bolivia (Plurinational State of),Origin_Bosnia and Herzegovina,Origin_Botswana,Origin_Brazil,Origin_British Virgin Islands,Origin_Brunei Darussalam,Origin_Bulgaria,Origin_Burkina Faso,Origin_Burundi,Origin_Cabo Verde,Origin_Cambodia,Origin_Cameroon,Origin_Canada,Origin_Cayman Islands,Origin_Central African Rep.,Origin_Chad,Origin_Chile,Origin_China,"Origin_China, Hong Kong SAR","Origin_China, Macao SAR",Origin_Colombia,Origin_Comoros,Origin_Congo,Origin_Cook Islands,Origin_Costa Rica,Origin_Croatia,Origin_Cuba,Origin_Curaçao,Origin_Cyprus,Origin_Czech Rep.,Origin_Côte d'Ivoire,Origin_Dem. People's Rep. of Korea,Origin_Dem. Rep. 
of the Congo,Origin_Denmark,Origin_Djibouti,Origin_Dominica,Origin_Dominican Rep.,Origin_Ecuador,Origin_Egypt,Origin_El Salvador,Origin_Equatorial Guinea,Origin_Eritrea,Origin_Estonia,Origin_Ethiopia,Origin_Fiji,Origin_Finland,Origin_France,Origin_French Guiana,Origin_Gabon,Origin_Gambia,Origin_Georgia,Origin_Germany,Origin_Ghana,Origin_Gibraltar,Origin_Greece,Origin_Grenada,Origin_Guadeloupe,Origin_Guatemala,Origin_Guinea,Origin_Guinea-Bissau,Origin_Guyana,Origin_Haiti,Origin_Holy See (the),Origin_Honduras,Origin_Hungary,Origin_Iceland,Origin_India,Origin_Indonesia,Origin_Iran (Islamic Rep. of),Origin_Iraq,Origin_Ireland,Origin_Israel,Origin_Italy,Origin_Jamaica,Origin_Japan,Origin_Jordan,Origin_Kazakhstan,Origin_Kenya,Origin_Kiribati,Origin_Kuwait,Origin_Kyrgyzstan,Origin_Lao People's Dem. Rep.,Origin_Latvia,Origin_Lebanon,Origin_Lesotho,Origin_Liberia,Origin_Libya,Origin_Liechtenstein,Origin_Lithuania,Origin_Luxembourg,Origin_Madagascar,Origin_Malawi,Origin_Malaysia,Origin_Maldives,Origin_Mali,Origin_Malta,Origin_Marshall Islands,Origin_Mauritania,Origin_Mauritius,Origin_Mexico,Origin_Micronesia (Federated States of),Origin_Monaco,Origin_Mongolia,Origin_Montenegro,Origin_Morocco,Origin_Mozambique,Origin_Myanmar,Origin_Namibia,Origin_Nauru,Origin_Nepal,Origin_Netherlands,Origin_New Caledonia,Origin_New Zealand,Origin_Nicaragua,Origin_Niger,Origin_Nigeria,Origin_Niue,Origin_Norway,Origin_Oman,Origin_Pakistan,Origin_Palau,Origin_Palestinian,Origin_Panama,Origin_Papua New Guinea,Origin_Paraguay,Origin_Peru,Origin_Philippines,Origin_Poland,Origin_Portugal,Origin_Puerto Rico,Origin_Qatar,Origin_Rep. of Korea,Origin_Rep. 
of Moldova,Origin_Romania,Origin_Russian Federation,Origin_Rwanda,Origin_Saint Kitts and Nevis,Origin_Saint Lucia,Origin_Saint Vincent and the Grenadines,Origin_Saint-Pierre-et-Miquelon,Origin_Samoa,Origin_San Marino,Origin_Sao Tome and Principe,Origin_Saudi Arabia,Origin_Senegal,Origin_Serbia and Kosovo (S/RES/1244 (1999)),Origin_Seychelles,Origin_Sierra Leone,Origin_Singapore,Origin_Slovakia,Origin_Slovenia,Origin_Solomon Islands,Origin_Somalia,Origin_South Africa,Origin_South Sudan,Origin_Spain,Origin_Sri Lanka,Origin_Stateless,Origin_Sudan,Origin_Suriname,Origin_Svalbard and Jan Mayen,Origin_Swaziland,Origin_Sweden,Origin_Switzerland,Origin_Syrian Arab Rep.,Origin_Tajikistan,Origin_Thailand,Origin_The former Yugoslav Republic of Macedonia,Origin_Tibetan,Origin_Timor-Leste,Origin_Togo,Origin_Tonga,Origin_Trinidad and Tobago,Origin_Tunisia,Origin_Turkey,Origin_Turkmenistan,Origin_Turks and Caicos Islands,Origin_Tuvalu,Origin_Uganda,Origin_Ukraine,Origin_United Arab Emirates,Origin_United Kingdom,Origin_United Rep. 
of Tanzania,Origin_United States of America,Origin_Uruguay,Origin_Uzbekistan,Origin_Vanuatu,Origin_Various/Unknown,Origin_Venezuela (Bolivarian Republic of),Origin_Viet Nam,Origin_Western Sahara,Origin_Yemen,Origin_Zambia,Origin_Zimbabwe,RSD procedure type / level_G / AR,RSD procedure type / level_G / BL,RSD procedure type / level_G / CA,RSD procedure type / level_G / EO,RSD procedure type / level_G / FA,RSD procedure type / level_G / FI,RSD procedure type / level_G / IN,RSD procedure type / level_G / JR,RSD procedure type / level_G / NA,RSD procedure type / level_G / RA,RSD procedure type / level_G / SP,RSD procedure type / level_G / TA,RSD procedure type / level_G / TP,RSD procedure type / level_G / TR,RSD procedure type / level_G / ar,RSD procedure type / level_G / fi,RSD procedure type / level_J / AR,RSD procedure type / level_J / FA,RSD procedure type / level_J / FI,RSD procedure type / level_J / RA,RSD procedure type / level_U / AR,RSD procedure type / level_U / FA,RSD procedure type / level_U / FI,RSD procedure type / level_U / JR,RSD procedure type / level_U / RA
0,2000,0.0,0.0,5.0,5.0,0.0,0.0,0.0,5.0,0.0,0.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
2,2000,265.0,265.0,2156.0,747.0,0.0,112.0,327.0,1186.0,1235.0,1235.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
3,2000,196.0,0.0,225.0,151.0,0.0,31.0,68.0,250.0,171.0,0.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
4,2000,193.0,0.0,218.0,182.0,0.0,51.0,40.0,273.0,150.0,0.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
5,2000,40.0,0.0,662.0,275.0,0.0,412.0,0.0,687.0,23.0,0.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
6,2000,67.0,67.0,81.0,29.0,0.0,24.0,49.0,102.0,46.0,46.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
7,2000,416.0,416.0,169.0,126.0,0.0,121.0,210.0,457.0,128.0,128.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
8,2000,2172.0,30.0,165.0,112.0,0.0,0.0,1992.0,2104.0,233.0,40.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
9,2000,0.0,0.0,2.0,1.0,0.0,1.0,0.0,2.0,0.0,0.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0
10,2000,5.0,0.0,25.0,14.0,1.0,2.0,4.0,21.0,9.0,5.0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0
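
`pd.get_dummies` replaces each object-dtype column with one indicator column per distinct value, named `<column>_<value>`, which is where names such as `Origin_Afghanistan` in the output above come from. A minimal sketch on made-up rows (the values are illustrative only; `dtype=int` is passed so the indicators come out as 0/1 integers):

```python
import pandas as pd

# Hypothetical toy frame with one numeric and one categorical column.
toy = pd.DataFrame({
    "Year": [2000, 2001, 2001],
    "Origin": ["Iraq", "Somalia", "Iraq"],
})

# Numeric columns pass through unchanged; "Origin" becomes one column per value.
encoded = pd.get_dummies(toy, dtype=int)
print(list(encoded.columns))
print(encoded["Origin_Iraq"].tolist())
```

Each row has exactly one 1 among the `Origin_*` columns, which is why the table above is almost entirely zeros.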


In [4]:
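# feature engineering: "entries" is a composite sum of the count columns.
# Note: "Total pending end-year" is added twice, and "Total decisions" (which,
# in the rows displayed above, equals recognized + other + rejected +
# otherwise closed) is added together with those same components, so this
# value is not a count of distinct people.
train_df["entries"] = train_df["Total pending end-year"] + \
                      train_df["Applied during year"] + \
                      train_df["decisions_recognized"] + \
                      train_df["decisions_other"] + \
                      train_df["Rejected"] + \
                      train_df["Otherwise closed"] + \
                      train_df["Total decisions"] + \
                      train_df["Total pending end-year"] + \
                      train_df["of which UNHCR-assisted(end-year)"]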

In [5]:
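# target column: share of non-rejected decisions relative to "entries"
train_df["acceptance-rate"] = (train_df["Total decisions"] - train_df["Rejected"]) / train_df["entries"]
# drop rows where the ratio is undefined (e.g. 0/0 when "entries" is 0)
train_df = train_df.dropna(subset=['acceptance-rate'])

print(train_df)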

        Year  Tota pending start-year  of which UNHCR-assisted(start-year)  \
0       2000                      0.0                                  0.0   
2       2000                    265.0                                265.0   
3       2000                    196.0                                  0.0   
4       2000                    193.0                                  0.0   
5       2000                     40.0                                  0.0   
6       2000                     67.0                                 67.0   
7       2000                    416.0                                416.0   
8       2000                   2172.0                                 30.0   
9       2000                      0.0                                  0.0   
10      2000                      5.0                                  0.0   
11      2000                    311.0                                  0.0   
12      2000                      0.0                           
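
As a worked check of the two derived columns, the sketch below recomputes `entries` and `acceptance-rate` by hand for the row with index 2, using the values shown in the one-hot encoded table earlier. The dictionary is only a stand-in for that row.

```python
# Values copied from the row with index 2 in the dataframe output above.
row = {
    "Applied during year": 2156.0,
    "decisions_recognized": 747.0,
    "decisions_other": 0.0,
    "Rejected": 112.0,
    "Otherwise closed": 327.0,
    "Total decisions": 1186.0,
    "Total pending end-year": 1235.0,
    "of which UNHCR-assisted(end-year)": 1235.0,
}

# Same sum as the feature-engineering cell, where
# "Total pending end-year" appears twice.
entries = (row["Total pending end-year"] + row["Applied during year"]
           + row["decisions_recognized"] + row["decisions_other"]
           + row["Rejected"] + row["Otherwise closed"]
           + row["Total decisions"] + row["Total pending end-year"]
           + row["of which UNHCR-assisted(end-year)"])

# Same ratio as the acceptance-rate cell.
acceptance_rate = (row["Total decisions"] - row["Rejected"]) / entries

print(entries)                    # 8233.0
print(round(acceptance_rate, 4))  # 0.1305
```

Because the denominator is the composite `entries` sum rather than `Total decisions`, the ratio for this row is about 0.13 even though only 112 of its 1186 decisions were rejections.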

In [6]:
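from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import accuracy_score
# sklearn.cross_validation was removed in scikit-learn 0.20;
# cross_val_score now lives in sklearn.model_selection
from sklearn.model_selection import train_test_split, cross_val_score
# sklearn.preprocessing.Imputer was removed in scikit-learn 0.22;
# SimpleImputer replaces it and is aliased so the Imputer() calls keep working
from sklearn.impute import SimpleImputer as Imputer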

# fit and evaluate the model (target: entries)
seed = 7  # defined here but not passed to the estimator in this cell
rfr_pipeline = make_pipeline(Imputer(),
                             RandomForestRegressor(n_jobs=4))

# mean of the per-fold cross-validation scores
def getmean(scores):
    return sum(scores) / float(len(scores))

# grab features (excluding)
features_excluded = [
    "Tota pending start-year",
    "of which UNHCR-assisted(start-year)",
    "Applied during year",
    "decisions_recognized",
    "decisions_other",
    "Rejected",
    "Otherwise closed",
    "Total decisions",
    "Total pending end-year",
    "of which UNHCR-assisted(end-year)",
    "acceptance-rate",
    "Year"
]
    
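# NOTE: "entries" is not in features_excluded, so it ends up in `features`
# even though it is also the target below; this leak inflates the score.
features = [f for f in train_df.columns if f not in features_excluded]
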
print("features: ")
print(features)

X_train = train_df[features]
y_train = train_df["entries"]

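# 4-fold cross-validation; with no `scoring` argument the estimator's default
# scorer is used, which is R^2 for regressors
scores = cross_val_score(rfr_pipeline, X_train, y_train, cv=4)
print("Random Forest Regressor " + str(getmean(scores)))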

features: 
['Country / territory of asylum/residence_Afghanistan', 'Country / territory of asylum/residence_Albania', 'Country / territory of asylum/residence_Algeria', 'Country / territory of asylum/residence_Angola', 'Country / territory of asylum/residence_Antigua and Barbuda', 'Country / territory of asylum/residence_Argentina', 'Country / territory of asylum/residence_Armenia', 'Country / territory of asylum/residence_Aruba', 'Country / territory of asylum/residence_Australia', 'Country / territory of asylum/residence_Austria', 'Country / territory of asylum/residence_Azerbaijan', 'Country / territory of asylum/residence_Bahamas', 'Country / territory of asylum/residence_Bahrain', 'Country / territory of asylum/residence_Bangladesh', 'Country / territory of asylum/residence_Barbados', 'Country / territory of asylum/residence_Belarus', 'Country / territory of asylum/residence_Belgium', 'Country / territory of asylum/residence_Belize', 'Country / territory of asylum/residence_Benin'



Random Forest Regressor 0.9719564938452818
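
The number above is the mean R² over the four folds, not an accuracy. It should be read with care: `features` is built from every column not listed in `features_excluded`, and `entries` is not on that list, so the target is also one of the inputs. The sketch below illustrates the effect on purely synthetic data (the arrays, sizes, and the `mean_r2` helper are assumptions for illustration, and it says nothing about what the notebook's score would be without the leak). It uses `SimpleImputer` and `sklearn.model_selection`, the current locations of these tools.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.impute import SimpleImputer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n = 200
X_noise = rng.integers(0, 2, size=(n, 5)).astype(float)  # uninformative 0/1 columns
y = rng.uniform(0, 1000, size=n)                          # synthetic target

def mean_r2(X, y):
    """Mean 4-fold R^2 of an imputer + random forest pipeline (hypothetical helper)."""
    model = make_pipeline(
        SimpleImputer(),
        RandomForestRegressor(n_estimators=50, random_state=0),
    )
    return cross_val_score(model, X, y, cv=4).mean()

# Features carry no information about y.
clean_score = mean_r2(X_noise, y)

# Same features plus the target itself as an extra column.
leaky_score = mean_r2(np.column_stack([X_noise, y]), y)

print(clean_score, leaky_score)
```

In this toy setup the only difference between the two runs is the extra target column, so any gap between the scores comes from the leak alone. Adding `"entries"` to `features_excluded` would be one way to get a score that reflects the country, origin, and procedure indicators only.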


In [7]:
countries = ["Afghanistan",
"Angola",
"Albania",
"United Arab Emirates",
"Argentina",
"Armenia",
"Australia",
"Austria",
"Azerbaijan",
"Belgium",
"Bangladesh",
"Bulgaria",
"Bahrain",
"Bahamas",
"Bosnia and Herzegovina",
"Belarus",
"Belize",
"Brazil",
"Barbados",
"Botswana",
"Central African Rep.",
"Canada",
"Switzerland",
"Chile",
"China",
"Côte d'Ivoire",
"Cameroon",
"Dem. Rep. of the Congo",
"Congo",
"Colombia",
"Costa Rica",
"Cuba",
"Cyprus",
"Czech Rep.",
"Germany",
"Djibouti",
"Denmark",
"Algeria",
"Ecuador",
"Egypt",
"Eritrea",
"Spain",
"Estonia",
"Ethiopia",
"Finland",
"France",
"United Kingdom",
"Georgia",
"Ghana",
"Guinea",
"Gambia",
"Greece",
"Croatia",
"Haiti",
"Hungary",
"Indonesia",
"India",
"Ireland",
"Iran (Islamic Rep. of)",
"Iraq",
"Israel",
"Italy",
"Jamaica",
"Jordan",
"Japan",
"Kazakhstan",
"Kenya",
"Cambodia",
"Rep. of Korea",
"Kuwait",
"Lao People's Dem. Rep.",
"Lebanon",
"Liberia",
"Libya",
"Lithuania",
"Luxembourg",
"Latvia",
"Morocco",
"Madagascar",
"Mexico",
"Mali",
"Mongolia",
"Mozambique",
"Malaysia",
"Namibia",
"Niger",
"Nigeria",
"Netherlands",
"Norway",
"Nepal",
"New Zealand",
"Oman",
"Pakistan",
"Panama",
"Peru",
"Philippines",
"Papua New Guinea",
"Poland",
"Portugal",
"Paraguay",
"Qatar",
"Romania",
"Russian Federation",
"Rwanda",
"Saudi Arabia",
"Sudan",
"Senegal",
"Singapore",
"Solomon Islands",
"Sierra Leone",
"El Salvador",
"Somalia",
"Slovakia",
"Slovenia",
"Sweden",
"Swaziland",
"Syrian Arab Rep.",
"Chad",
"Togo",
"Thailand",
"Tajikistan",
"Turkmenistan",
"Tunisia",
"Turkey",
"United Rep. of Tanzania",
"Uganda",
"Ukraine",
"Uruguay",
"United States of America",
"Uzbekistan",
"Venezuela (Bolivarian Republic of)",
"Samoa",
"Yemen",
"South Africa",
"Zambia",
"Zimbabwe"]

print(countries)

['Afghanistan', 'Angola', 'Albania', 'United Arab Emirates', 'Argentina', 'Armenia', 'Australia', 'Austria', 'Azerbaijan', 'Belgium', 'Bangladesh', 'Bulgaria', 'Bahrain', 'Bahamas', 'Bosnia and Herzegovina', 'Belarus', 'Belize', 'Brazil', 'Barbados', 'Botswana', 'Central African Rep.', 'Canada', 'Switzerland', 'Chile', 'China', "Côte d'Ivoire", 'Cameroon', 'Dem. Rep. of the Congo', 'Congo', 'Colombia', 'Costa Rica', 'Cuba', 'Cyprus', 'Czech Rep.', 'Germany', 'Djibouti', 'Denmark', 'Algeria', 'Ecuador', 'Egypt', 'Eritrea', 'Spain', 'Estonia', 'Ethiopia', 'Finland', 'France', 'United Kingdom', 'Georgia', 'Ghana', 'Guinea', 'Gambia', 'Greece', 'Croatia', 'Haiti', 'Hungary', 'Indonesia', 'India', 'Ireland', 'Iran (Islamic Rep. of)', 'Iraq', 'Israel', 'Italy', 'Jamaica', 'Jordan', 'Japan', 'Kazakhstan', 'Kenya', 'Cambodia', 'Rep. of Korea', 'Kuwait', "Lao People's Dem. Rep.", 'Lebanon', 'Liberia', 'Libya', 'Lithuania', 'Luxembourg', 'Latvia', 'Morocco', 'Madagascar', 'Mexico', 'Mali', 'Mong

In [8]:
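# get results: fit one random forest per country of asylum

# columns that are not used as model inputs
features_excluded = [
    "Year",
    "Tota pending start-year",
    "of which UNHCR-assisted(start-year)",
    "Applied during year",
    "decisions_recognized",
    "decisions_other",
    "Rejected",
    "Otherwise closed",
    "Total decisions",
    "Total pending end-year",
    "of which UNHCR-assisted(end-year)",
    "acceptance-rate"
]

# NOTE: Imputer was removed in scikit-learn 0.22; newer versions provide
# sklearn.impute.SimpleImputer instead.
pipelines = []  # pipelines[i] belongs to countries[i]
for c in countries:
    pipeline = make_pipeline(Imputer(), RandomForestRegressor(n_jobs=4))
    pipelines.append(pipeline)

    # rows whose one-hot asylum-country column is set for this country
    seperated = train_df.loc[train_df["Country / territory of asylum/residence_" + c] == 1]

    if seperated.shape[0] <= 0:
        # no rows: leave this pipeline unfitted so the indices still line up
        continue

    # NOTE: "entries" is not in features_excluded, so the target column
    # is also part of X_train.
    features = [f for f in seperated.columns if f not in features_excluded]

    X_train = seperated[features]
    y_train = seperated["entries"]

    pipeline.fit(X_train, y_train)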
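
One way to keep each model attached to its country, even when some countries have no rows, is to store the models in a dict keyed by country name rather than tracking a running index. The following is only a sketch of that bookkeeping, not the notebook's code: `MeanModel` is a hypothetical stand-in for the scikit-learn pipeline, `fit_per_country` is a hypothetical helper, and `toy` is made-up data.

```python
class MeanModel:
    """Hypothetical stand-in for a pipeline: always predicts the training mean."""

    def fit(self, X, y):
        self.mean_ = sum(y) / float(len(y))
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def fit_per_country(rows_by_country, make_model=MeanModel):
    """Return {country: fitted model}; countries without rows get no entry."""
    models = {}
    for country, (X, y) in rows_by_country.items():
        if len(y) == 0:
            continue  # nothing to fit, and no index that can fall out of step
        models[country] = make_model().fit(X, y)
    return models


# toy data (assumption): (feature rows, target values) per destination country
toy = {
    "Kenya": ([[1], [2]], [10.0, 20.0]),
    "Chad": ([], []),  # no rows for this country
    "Sweden": ([[3]], [4.0]),
}
models = fit_per_country(toy)
print(sorted(models))
print(models["Kenya"].predict([[0]]))
```

With this layout, `models.get(dest)` returns `None` for a country that was skipped, which can be checked explicitly at prediction time.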

In [9]:
# save results
import pickle

# the context manager closes the file once the dump has finished
with open("pipelines", "wb") as fh:
    pickle.dump(pipelines, fh)
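
To reuse the fitted models in a later session, the pickled list can be read back with `pickle.load`. A minimal sketch, assuming the same pickle format as the cell above; the helper names `save_object` and `load_object` are hypothetical, and a temporary file with a stand-in list is used so that the example runs on its own.

```python
import os
import pickle
import tempfile


def save_object(obj, path):
    # the context manager closes the file even if dump() raises
    with open(path, "wb") as fh:
        pickle.dump(obj, fh)


def load_object(path):
    # only unpickle files you created yourself: pickle can run code on load
    with open(path, "rb") as fh:
        return pickle.load(fh)


# stand-in (assumption) for the list of fitted pipelines
demo = [{"country": "Kenya", "model": None}]
demo_path = os.path.join(tempfile.mkdtemp(), "pipelines")

save_object(demo, demo_path)
restored = load_object(demo_path)
```

Pickled scikit-learn models are generally only safe to load with the same library version that wrote them, so it helps to record the version next to the file.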

In [73]:
selected_countries = [
    "United Arab Emirates",
    "Argentina",
    "Australia",
    "Austria",
    "Azerbaijan",
    "Belgium",
    "Bangladesh",
    "Bulgaria",
    "Canada",
    "Switzerland",
    "Chile",
    "China",
    "Czech Rep.",
    "Germany",
    "Djibouti",
    "Finland",
    "France",
    "United Kingdom",
    "Greece",
    "India",
    "Ireland",
    "Iran (Islamic Rep. of)",
    "Iraq",
    "Israel",
    "Kuwait",
    "Nigeria",
    "Netherlands",
    "Norway",
    "New Zealand",
    "Pakistan",
    "Qatar",
    "Russian Federation",
    "Rwanda",
    "Saudi Arabia",
    "Sudan",
    "Sweden",
    "Turkey",
    "South Africa",
    "Zimbabwe"
]

# columns excluded from the model inputs (same list as used for training)
features_excluded = [
    "Year",
    "Tota pending start-year",
    "of which UNHCR-assisted(start-year)",
    "Applied during year",
    "decisions_recognized",
    "decisions_other",
    "Rejected",
    "Otherwise closed",
    "Total decisions",
    "Total pending end-year",
    "of which UNHCR-assisted(end-year)",
    "acceptance-rate"
]

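class CountryRoute(object):
    """Predicted number of entries for one destination country."""

    def __init__(self):
        self.country_name = ""
        self.entries = 0
        self.long = 0
        self.lat = 0

    def __lt__(self, other):
        # Python 3 ignores __cmp__; __lt__ keeps the same ordering,
        # with larger predictions sorting first
        return self.entries > other.entries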

def inference(country_origin):
    # model inputs: every training column except the excluded ones
    features = [f for f in train_df.columns if f not in features_excluded]

    matches = []

    # NOTE: starts at 1, so countries[0] is never considered as a destination
    for i in range(1, len(countries)):
        dest = countries[i]
        try:
            # one all-zero row with the training columns, then set the
            # one-hot flags for this origin/destination pair
            query = pd.DataFrame(0, index=[0], columns=train_df.columns)
            query["Origin_" + country_origin] = 1
            query["Country / territory of asylum/residence_" + dest] = 1

            y = pipelines[i].predict(query[features])  # grab results
        except Exception:
            # e.g. the pipeline for this country was never fitted
            continue

        # store the prediction for this destination
        countryInstance = CountryRoute()
        countryInstance.country_name = dest
        countryInstance.entries = y[0]

        matches.append(countryInstance)

    # highest predicted number of entries first
    matches = sorted(matches, key=lambda route: route.entries, reverse=True)
    return matches

country_data = inference("Somalia")
# NOTE: starts at index 1, so the top-ranked destination (country_data[0]) is not printed
for cntr in country_data[1:]:
    print(cntr.country_name + " " + str(cntr.entries))


Passing list-likes to .loc or [] with any missing label will raise
KeyError in the future, you can use .reindex() as an alternative.

See the documentation here:
http://pandas.pydata.org/pandas-docs/stable/indexing.html#deprecate-loc-reindex-listlike


Eritrea 54.1
Haiti 6.8
Solomon Islands 6.0
Madagascar 4.5
Lao People's Dem. Rep. 4.2
Barbados 3.0
Bahamas 2.8
Mongolia 2.7
Tajikistan 2.4
Turkmenistan 2.4
Zambia 2.4
Jamaica 2.3
Senegal 2.3
South Africa 2.3
Bangladesh 2.2
Swaziland 2.2
Uzbekistan 2.2
United Arab Emirates 2.1
Iraq 2.1
Mali 2.1
Singapore 2.1
Chad 2.1
United Rep. of Tanzania 2.1
Angola 2.0
Albania 2.0
Argentina 2.0
Armenia 2.0
Australia 2.0
Austria 2.0
Azerbaijan 2.0
Belgium 2.0
Bulgaria 2.0
Bahrain 2.0
Bosnia and Herzegovina 2.0
Belarus 2.0
Belize 2.0
Brazil 2.0
Botswana 2.0
Central African Rep. 2.0
Canada 2.0
Switzerland 2.0
Chile 2.0
China 2.0
Côte d'Ivoire 2.0
Cameroon 2.0
Dem. Rep. of the Congo 2.0
Congo 2.0
Colombia 2.0
Costa Rica 2.0
Cuba 2.0
Cyprus 2.0
Czech Rep. 2.0
Germany 2.0
Djibouti 2.0
Denmark 2.0
Algeria 2.0
Ecuador 2.0
Egypt 2.0
Spain 2.0
Estonia 2.0
Ethiopia 2.0
Finland 2.0
France 2.0
United Kingdom 2.0
Georgia 2.0
Ghana 2.0
Guinea 2.0
Gambia 2.0
Greece 2.0
Croatia 2.0
Hungary 2.0
Indonesia 2.0
India 2.0


In [75]:
# postprocess: turn the predictions at indices 1-5 into shares that sum to 1
top = country_data[1:6]
summ = sum(cntr.entries for cntr in top)

# the shares are computed without overwriting .entries, so re-running
# this cell does not normalise already-normalised values
for cntr in top:
    print(cntr.country_name + " " + str(cntr.entries / summ))


Eritrea 0.14571213100624864
Haiti 0.018315018315018312
Solomon Islands 0.016160310277957335
Madagascar 0.012120232708468001
Lao People's Dem. Rep. 0.8076923076923077
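
Each share is a predicted count divided by the sum of the selected counts, so the selected shares add up to 1. A minimal sketch of that arithmetic as a pure function, which returns new values and leaves its input untouched; `to_shares` is a hypothetical helper, and the input numbers are the five predictions printed for Somalia earlier in this notebook.

```python
def to_shares(values):
    """Return each value divided by the total, without modifying `values`."""
    total = float(sum(values))
    if total == 0:
        raise ValueError("cannot normalise values that sum to zero")
    return [v / total for v in values]


# predictions printed above for Eritrea, Haiti, Solomon Islands,
# Madagascar and Lao People's Dem. Rep.
predicted = [54.1, 6.8, 6.0, 4.5, 4.2]
shares = to_shares(predicted)
```

Because the function does not write back into `predicted`, calling it twice gives the same shares both times.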
