# Country, continent and ISO code conversion - Global Passport Index

## About
- **Data**: [MakeoverMonday 2023/W28: The Henley Passport Index](https://data.world/makeovermonday/2023w28) + links to country flags from [Worldometer](https://www.worldometers.info/geography/flags-of-the-world/)
- **Purpose**: derive ISO-3 codes and continents from country names using `country_converter`

## Import libraries and read .xlsx file

In [1]:
import pandas as pd
import numpy as np
import country_converter as coco

In [2]:
# read the Excel file into a DataFrame
passport_df = pd.read_excel("MM_Global_Passport_Rankings_1.xlsx")

In [3]:
# check info
passport_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 398 entries, 0 to 397
Data columns (total 4 columns):
 #   Column                  Non-Null Count  Dtype 
---  ------                  --------------  ----- 
 0   Year                    398 non-null    int64 
 1   Country                 398 non-null    object
 2   Visa-free Destinations  398 non-null    int64 
 3   Flag URL                398 non-null    object
dtypes: int64(2), object(2)
memory usage: 12.6+ KB


## Convert country names to ISO and continent

In [4]:
converter = coco.CountryConverter()

In [5]:
# get country names from column
country_names = passport_df.Country

In [6]:
# check country names
country_names

0            Japan
1        Singapore
2      South Korea
3          Germany
4            Spain
          ...     
393          Yemen
394       Pakistan
395           Iraq
396          Syria
397    Afghanistan
Name: Country, Length: 398, dtype: object

In [7]:
# generate ISO-3 codes from country names and add them to a new column
passport_df["ISO-3"] = converter.convert(names = country_names, src = "name_short", to = "ISO3")
passport_df.head()

Brunei not found in name_short
Saint Kitts and Nevis not found in name_short
Vatican City not found in name_short
Saint Vincent and the Grenadines not found in name_short
Saint Lucia not found in name_short
Micronesia not found in name_short
Turkey not found in name_short
East Timor not found in name_short
Cape Verde not found in name_short
Kyrgyzstan not found in name_short
São Tomé and Príncipe not found in name_short
Ivory Coast not found in name_short
Congo not found in name_short
Brunei not found in name_short
Saint Kitts and Nevis not found in name_short
Vatican City not found in name_short
Saint Vincent and the Grenadines not found in name_short
Saint Lucia not found in name_short
Micronesia not found in name_short
Turkey not found in name_short
East Timor not found in name_short
Cape Verde not found in name_short
Kyrgyzstan not found in name_short
São Tomé and Príncipe not found in name_short
Ivory Coast not found in name_short
Congo not found in name_short


| | Year | Country | Visa-free Destinations | Flag URL | ISO-3 |
|---|---|---|---|---|---|
| 0 | 2022 | Japan | 193 | https://www.worldometers.info/img/flags/ja-fla... | JPN |
| 1 | 2022 | Singapore | 192 | https://www.worldometers.info/img/flags/sn-fla... | SGP |
| 2 | 2022 | South Korea | 192 | https://www.worldometers.info/img/flags/ks-fla... | KOR |
| 3 | 2022 | Germany | 190 | https://www.worldometers.info/img/flags/gm-fla... | DEU |
| 4 | 2022 | Spain | 190 | https://www.worldometers.info/img/flags/sp-fla... | ESP |


- Note: some country names in the file differ from the names `country_converter` expects, so the converter does not recognize them.
- Workaround options:
    - change the names in the original file to match coco's naming
    - add the codes manually to the DataFrame, selecting rows by condition with `.loc`
    - add the codes manually to the final file
In [8]:
# generate continents from country names and add them to a new column
# same warnings expected as above, due to name discrepancies
passport_df["Continent"] = converter.convert(names = country_names, src = "name_short", to = "continent")
passport_df.head()

Brunei not found in name_short
Saint Kitts and Nevis not found in name_short
Vatican City not found in name_short
Saint Vincent and the Grenadines not found in name_short
Saint Lucia not found in name_short
Micronesia not found in name_short
Turkey not found in name_short
East Timor not found in name_short
Cape Verde not found in name_short
Kyrgyzstan not found in name_short
São Tomé and Príncipe not found in name_short
Ivory Coast not found in name_short
Congo not found in name_short
Brunei not found in name_short
Saint Kitts and Nevis not found in name_short
Vatican City not found in name_short
Saint Vincent and the Grenadines not found in name_short
Saint Lucia not found in name_short
Micronesia not found in name_short
Turkey not found in name_short
East Timor not found in name_short
Cape Verde not found in name_short
Kyrgyzstan not found in name_short
São Tomé and Príncipe not found in name_short
Ivory Coast not found in name_short
Congo not found in name_short


| | Year | Country | Visa-free Destinations | Flag URL | ISO-3 | Continent |
|---|---|---|---|---|---|---|
| 0 | 2022 | Japan | 193 | https://www.worldometers.info/img/flags/ja-fla... | JPN | Asia |
| 1 | 2022 | Singapore | 192 | https://www.worldometers.info/img/flags/sn-fla... | SGP | Asia |
| 2 | 2022 | South Korea | 192 | https://www.worldometers.info/img/flags/ks-fla... | KOR | Asia |
| 3 | 2022 | Germany | 190 | https://www.worldometers.info/img/flags/gm-fla... | DEU | Europe |
| 4 | 2022 | Spain | 190 | https://www.worldometers.info/img/flags/sp-fla... | ESP | Europe |


In [9]:
# inspect an example where the name was not found
passport_df.loc[passport_df.Country == "Brunei"]

| | Year | Country | Visa-free Destinations | Flag URL | ISO-3 | Continent |
|---|---|---|---|---|---|---|
| 48 | 2022 | Brunei | 166 | https://www.worldometers.info/img/flags/bx-fla... | not found | not found |
| 247 | 2023 | Brunei | 168 | https://www.worldometers.info/img/flags/bx-fla... | not found | not found |
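
Rather than checking one country at a time, it may help to list every distinct name that still carries the `not found` placeholder. The sketch below assumes the placeholder string is exactly `"not found"`, as in the output above, and uses a toy DataFrame in place of `passport_df`.

```python
import pandas as pd

# Toy frame standing in for passport_df after both conversions
df = pd.DataFrame({
    "Year": [2022, 2022, 2023, 2023],
    "Country": ["Japan", "Brunei", "Brunei", "Turkey"],
    "ISO-3": ["JPN", "not found", "not found", "not found"],
    "Continent": ["Asia", "not found", "not found", "not found"],
})

# Rows where either conversion failed
unmatched = df[(df["ISO-3"] == "not found") | (df["Continent"] == "not found")]

# Each country appears once per year, so reduce to distinct names
missing_names = sorted(unmatched["Country"].unique())
print(missing_names)
```

The resulting list gives the exact set of names that need a manual code, whichever workaround is chosen.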


Not the best solution, but it works. I will add the missing ISO-3 codes and continents manually to the Excel file.

## Write Excel

In [10]:
#passport_df.to_excel("MM_Global_Passport_Rankings.xlsx", index = False)