# How to...normalize country information

This notebook shows how to use the entitymatching API to normalize information about countries.

In [1]:
# Set up the location of the API relative to this notebook
import sys
sys.path.append('../../')

In [2]:
# Import the module for normalizing country information
from financialcleaner import country

## 1. Basic usage

The API returns the full information about a country (alpha2 code, alpha3 code and name) when any one of these values is passed to the get_country_info() function. The result is either a dictionary or None (country not found).

In [3]:
# Good examples of country information that can be found in data sources
country_alpha2 = 'PT'
country_alpha3 = 'BRA'
country_name = ' China '

In [4]:
# Get the complete information about a country given an alpha2 code
country_info_dict = country.get_country_info(country_alpha2)
print('Complete country information: {}'.format(country_info_dict))

Complete country information: {'country_name': 'portugal', 'country_alpha2': 'pt', 'country_alpha3': 'prt'}


In [5]:
# Get the complete information about a country given an alpha3 code
country_info_dict = country.get_country_info(country_alpha3)
print('Complete country information: {}'.format(country_info_dict))

Complete country information: {'country_name': 'brazil', 'country_alpha2': 'br', 'country_alpha3': 'bra'}


In [6]:
# Get the complete information about a country given a name
country_info_dict = country.get_country_info(country_name)
print('Complete country information: {}'.format(country_info_dict))

Complete country information: {'country_name': 'china', 'country_alpha2': 'cn', 'country_alpha3': 'chn'}
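
Because the result can be None, code that normalizes many values usually needs a fallback. The following is a minimal sketch of one way to do that. The helper `normalize_alpha2` and the stand-in `fake_lookup` are hypothetical names introduced only for illustration; in a real notebook you would pass `country.get_country_info` as the `lookup` argument.

```python
from typing import Callable, Optional


def normalize_alpha2(raw_value: str,
                     lookup: Callable[[str], Optional[dict]],
                     default: str = 'unknown') -> str:
    """Return the alpha2 code for raw_value, or default when nothing is found."""
    info = lookup(raw_value)
    if info is None:
        return default
    return info['country_alpha2']


# Hypothetical stand-in for country.get_country_info, so that this sketch
# runs without the library installed. It only knows about Portugal.
def fake_lookup(value: str) -> Optional[dict]:
    table = {
        'pt': {'country_name': 'portugal',
               'country_alpha2': 'pt',
               'country_alpha3': 'prt'},
    }
    return table.get(value.strip().lower())


codes = [normalize_alpha2(v, fake_lookup) for v in ['PT', ' pt ', 'Chinatown']]
print(codes)
```

The unmatched value falls back to 'unknown' instead of raising a TypeError when the dictionary key is accessed.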


## 2. Some bad examples (country not found)

The API performs exact matching between the value passed as a parameter and its internal dictionary of country information; as the ' China ' and 'PT' examples above show, surrounding whitespace and letter case do not prevent a match. Therefore, if the value contains unexpected characters or is not a country's name, the API returns None.

In [7]:
# Bad examples of country information that can be found in data sources
country_alpha2 = '123'
country_alpha3 = '%fff'
country_name = ' Chinatown '

In [8]:
# Get the complete information about a country given an alpha2 code
print(country.get_country_info(country_alpha2))

None


In [9]:
# Get the complete information about a country given an alpha3 code
print(country.get_country_info(country_alpha3))

None


In [10]:
# Get the complete information about a country given a name
print(country.get_country_info(country_name))

None
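
To picture why ' China ' is found while ' Chinatown ' is not, the exact matching described in this section can be sketched with a toy lookup table. This is not the library's implementation: the table, the index and the function `toy_get_country_info` are hypothetical, and the trimming and lower-casing step is an assumption inferred from the outputs shown in this notebook.

```python
# Toy internal table; the real dictionary in the library is assumed to be
# much larger and may be organised differently.
_TOY_COUNTRIES = [
    {'country_name': 'china', 'country_alpha2': 'cn', 'country_alpha3': 'chn'},
    {'country_name': 'portugal', 'country_alpha2': 'pt', 'country_alpha3': 'prt'},
]

# Index every name and code so that any of the three values finds the record.
_TOY_INDEX = {}
for record in _TOY_COUNTRIES:
    for key in record.values():
        _TOY_INDEX[key] = record


def toy_get_country_info(value):
    """Exact match after trimming whitespace and lower-casing (assumed steps)."""
    cleaned = value.strip().lower()
    return _TOY_INDEX.get(cleaned)  # None when there is no exact match


print(toy_get_country_info(' China '))      # matches the 'china' record
print(toy_get_country_info(' Chinatown '))  # no partial matching, so None
print(toy_get_country_info('%fff'))         # unexpected characters, so None
```

With exact matching, a value that merely contains a country name is never resolved, which avoids false positives at the cost of requiring clean input.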


## 3. Customizing the output letter case

The output_lettercase parameter controls whether the result is returned in lower, upper or title case. The default is output_lettercase='lower'.

In [11]:
# Good examples of country information that can be found in data sources
country_alpha2 = 'no'
country_alpha3 = 'rus'
country_name = 'england'

In [12]:
# Get the complete information about a country given an alpha2 code
country_info_dict = country.get_country_info(country_alpha2, output_lettercase='upper')
print('Complete country information: {}'.format(country_info_dict))

Complete country information: {'country_name': 'NORWAY', 'country_alpha2': 'NO', 'country_alpha3': 'NOR'}


In [13]:
# Get the complete information about a country given an alpha2 code
country_info_dict = country.get_country_info(country_alpha2, output_lettercase='title')
print('Complete country information: {}'.format(country_info_dict))

Complete country information: {'country_name': 'Norway', 'country_alpha2': 'No', 'country_alpha3': 'Nor'}
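
The title-case output above applies the same case to every field, so the codes become 'No' and 'Nor'. The sketch below illustrates that behaviour with plain Python string methods and shows one way to post-process a result when title-case names and upper-case codes are wanted together. The functions `apply_lettercase` and `display_form` are hypothetical helpers written for this example, not part of the API.

```python
# Map each supported letter case to the corresponding built-in str method.
_CASE_FUNCTIONS = {'lower': str.lower, 'upper': str.upper, 'title': str.title}


def apply_lettercase(info, output_lettercase='lower'):
    """Apply one letter case uniformly to every value of a country dictionary."""
    convert = _CASE_FUNCTIONS[output_lettercase]
    return {field: convert(text) for field, text in info.items()}


def display_form(info):
    """Title-case the name but keep both codes in upper case."""
    return {
        'country_name': info['country_name'].title(),
        'country_alpha2': info['country_alpha2'].upper(),
        'country_alpha3': info['country_alpha3'].upper(),
    }


norway = {'country_name': 'norway', 'country_alpha2': 'no', 'country_alpha3': 'nor'}
title_result = apply_lettercase(norway, 'title')
mixed_result = display_form(norway)
print(title_result)
print(mixed_result)
```

Applying `display_form` to a dictionary returned with the default lower-case output gives 'Norway', 'NO' and 'NOR'.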
