# Findex 2021

Financial inclusion is a cornerstone of development, and since 2011, the Global Findex Database has been the definitive source of data on global access to financial services from payments to savings and borrowing. The 2021 edition, based on nationally representative surveys of about 128,000 adults in 123 economies during the COVID-19 pandemic, contains updated indicators on access to and use of formal and informal financial services and digital payments, and offers insights into the behaviours that enable financial resilience. The data also identify gaps in access to and usage of financial services by women and poor adults.

We will use the country-level data from the 2021 release, which also contains the earlier survey rounds (2011, 2014, 2017, and 2021). The data and codebook are publicly accessible at the link below and are also available on Learn: https://www.worldbank.org/en/publication/globalfindex/Data

The data come as an Excel file: the first sheet contains the country-level data and the second sheet is the codebook describing the variables.


In [None]:
#import required packages
import pandas as pd
import numpy as np

: 

In [2]:
# read in the data: the "Data" sheet holds the observations; the remaining sheets are the codebook
data = pd.read_excel('DatabankWide.xlsx', sheet_name="Data")

: 

In [None]:
#see the first 5 rows
data.head()

: 

In [4]:
# some variables are categorical; convert them to the category dtype
cat_cols = ['Country name', 'Country code', 'Year', 'Region', 'Income group']
data[cat_cols] = data[cat_cols].astype('category')

: 

In [None]:
print(data.dtypes)

: 

In [6]:
data[['Country name', 'Year', 'Region', 'Income group']].describe()

Unnamed: 0,Country name,Year,Region,Income group
count,658,658,595,584
unique,184,5,7,4
top,Afghanistan,2017,High income,High income
freq,4,169,178,182


In [None]:
# check how many observations there are for each year; the 2022 sample is small and mostly NA, so it is better not to use it
data['Year'].value_counts()

: 
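The comment in the cell above recommends leaving out the 2022 observations. A minimal sketch of that filter follows, using a small made-up frame (`toy`, with invented values) in place of `data`; it assumes `Year` has already been converted to a categorical column, as in the earlier cell.

```python
import pandas as pd

# Toy stand-in for `data`; the values are invented for illustration only.
toy = pd.DataFrame({
    "Country name": ["A", "A", "B", "B", "C"],
    "Year": [2017, 2021, 2021, 2022, 2022],
    "Account (% age 15+)": [0.40, 0.50, 0.70, None, None],
})
toy["Year"] = toy["Year"].astype("category")

# Keep every year except 2022, then drop the now-empty category level
# so it no longer appears in value_counts() or groupby() results.
toy_no_2022 = toy.loc[toy["Year"] != 2022].copy()
toy_no_2022["Year"] = toy_no_2022["Year"].cat.remove_unused_categories()

year_counts = toy_no_2022["Year"].value_counts()
print(year_counts)
```

Removing the unused category matters because a categorical column otherwise keeps reporting 2022 with a count of zero.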

In [None]:
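# make a summary table: mean account ownership by year and income group
# observed=True keeps only the year/income-group combinations that occur in the data
# (both grouping columns are categorical, so the default may add empty combinations)
mean_by_year_income = data.groupby(['Year', 'Income group'], observed=True)['Account (% age 15+)'].mean().reset_index()
mean_by_year_income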

: 

In [None]:
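# we can make a subset of the data with all the variables whose name includes a specific word, for example: mobile
# and use this smaller dataset if it makes some analysis easier
# filter(like=...) is case-sensitive, so a case-insensitive regex is used to match both "Mobile" and "mobile"
mobile_columns = data.filter(regex='(?i)mobile')
# drop rows in which every selected column is missing
mobile_columns = mobile_columns.dropna(how="all")
mobile_columns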


: 

In [None]:
#you can choose one year to focus on
data_2021 = data.loc[data['Year'] == 2021]
data_2021

: 

__Note__:

- You can convert a variable to binary (two levels, 0/1) if desired and use it as the response variable in a logistic regression or another classification method.
- You can also take external country-level data for specific years and combine it with the existing data (matching on the name of the country) to answer different questions. Examples include GDP per capita, CO2 emissions, happiness level, education accessibility for women, human development index (HDI), life expectancy, poverty rate, and Gini coefficient. Sources for such data include the UN (https://data.un.org/Default.aspx) and the World Bank (https://data.worldbank.org/?_gl=1*1w8s4l1*_gcl_au*MTQ0MDQ4MzE0LjE3MjQ0MTc5MDk).
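Both points in the note can be sketched on a toy frame. Everything below is illustrative: the values are invented, the median cutoff for the binary variable is just one possible choice, and the `external` table with its `gdp_per_capita` column is hypothetical. The sketch joins on `Country code` rather than the country name, on the assumption that the external source provides the same codes; if it does not, the same call works with `on="Country name"`, although names may be spelled differently across sources.

```python
import pandas as pd

# Toy stand-in for `data_2021`; values are invented for illustration only.
findex = pd.DataFrame({
    "Country name": ["A", "B", "C", "D"],
    "Country code": ["AAA", "BBB", "CCC", "DDD"],
    "Account (% age 15+)": [0.10, 0.45, 0.72, 0.95],
})

# 1) Binary response: 1 if account ownership is above a chosen cutoff.
# The sample median is used here purely as an example cutoff.
cutoff = findex["Account (% age 15+)"].median()
findex["high_account"] = (findex["Account (% age 15+)"] > cutoff).astype(int)

# 2) Hypothetical external country-level table (e.g. GDP per capita).
external = pd.DataFrame({
    "Country code": ["AAA", "BBB", "CCC", "EEE"],
    "gdp_per_capita": [600.0, 4200.0, 9800.0, 30000.0],
})

# A left join keeps every Findex row; indicator=True adds a `_merge` column
# that flags rows with no match in the external table.
merged = findex.merge(external, on="Country code", how="left", indicator=True)
unmatched = merged.loc[merged["_merge"] == "left_only", "Country name"].tolist()

print(merged)
print("No external match:", unmatched)
```

Checking the unmatched rows after a merge is a cheap way to spot countries that were silently dropped or left with missing values because of mismatched keys.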


In [11]:
# rename variables to shorter names so they are easier to refer to, for example
data_2021 = data_2021.rename(columns = {'Financial institution account (% age 15+)' : 'FinanInst-acc',
                              'Used a mobile phone or the internet to access an account, older (% age 25+)' : 'Mob-Int-old'})
data_2021

Unnamed: 0,Country name,Country code,Year,Adult populaiton,Region,Income group,Account (% age 15+),FinanInst-acc,First financial institution account ever was opened to receive a wage payment or money from the government (% age 15+),First financial institution account ever was opened to receive a wage payment (% age 15+),...,"Used a mobile phone or the internet to access an account, young (% ages 15-24)",Mob-Int-old,"Used a mobile phone or the internet to access an account, primary education or less (% ages 15+)","Used a mobile phone or the internet to access an account, secondary education or more (% ages 15+)","Used a mobile phone or the internet to access an account, income, poorest 40% (% ages 15+)","Used a mobile phone or the internet to access an account, income, richest 60% (% ages 15+)","Used a mobile phone or the internet to access an account, rural (% age 15+)","Used a mobile phone or the internet to access an account, urban (% age 15+)","Used a mobile phone or the internet to access an account, out of labor force (% age 15+)","Used a mobile phone or the internet to access an account, in labor force (% age 15+)"
3,Afghanistan,AFG,2021,2.264750e+07,South Asia,Low income,0.096538,0.096538,,,...,,,,,,,,,,
7,Albania,ALB,2021,2.348634e+06,Europe & Central Asia (excluding high income),Upper middle income,0.441742,0.441742,0.282540,0.254148,...,,,,,,,,,,
11,Algeria,DZA,2021,3.035215e+07,Middle East & North Africa (excluding high inc...,Lower middle income,0.440970,0.440970,0.349182,0.260234,...,,,,,,,,,,
17,Argentina,ARG,2021,3.428881e+07,Latin America & Caribbean (excluding high income),Upper middle income,0.716271,0.663254,0.358641,0.331710,...,,,,,,,,,,
21,Armenia,ARM,2021,2.345923e+06,Europe & Central Asia (excluding high income),Upper middle income,0.553477,0.522469,0.220799,0.194768,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
642,Latin America & Caribbean (excluding high income),LAC,2021,4.526771e+08,Latin America & Caribbean (excluding high income),,0.729474,0.710044,0.389696,0.364618,...,,,,,,,,,,
645,Middle East & North Africa (excluding high inc...,MNA,2021,2.731634e+08,Middle East & North Africa (excluding high inc...,,0.480926,0.469150,0.192473,0.141693,...,,,,,,,,,,
649,South Asia,SAS,2021,1.344725e+09,South Asia,,0.678895,0.658343,0.434108,0.306103,...,,,,,,,,,,
653,Sub-Saharan Africa (excluding high income),SSA,2021,6.587700e+08,Sub-Saharan Africa (excluding high income),,0.550697,0.396974,0.183956,0.159863,...,,,,,,,,,,


: 
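The output above lists regional aggregates (for example `South Asia`, code `SAS`) alongside individual economies, and those aggregate rows have no `Income group`. If an analysis should cover individual economies only, one possible filter is sketched below. It assumes that a missing `Income group` reliably marks an aggregate row, which is worth checking against the codebook since an individual economy could also lack a classification. The toy frame reuses three rows from the output above.

```python
import pandas as pd

# Small stand-in for `data_2021`, built from three rows of the output above.
toy_2021 = pd.DataFrame({
    "Country name": ["Afghanistan", "Albania", "South Asia"],
    "Country code": ["AFG", "ALB", "SAS"],
    "Region": [
        "South Asia",
        "Europe & Central Asia (excluding high income)",
        "South Asia",
    ],
    "Income group": ["Low income", "Upper middle income", None],
    "Account (% age 15+)": [0.096538, 0.441742, 0.678895],
})

# Assumption: aggregate rows are the ones without an income group.
is_aggregate = toy_2021["Income group"].isna()

economies = toy_2021.loc[~is_aggregate].copy()   # individual economies
aggregates = toy_2021.loc[is_aggregate].copy()   # regional aggregates

print(economies["Country code"].tolist())
print(aggregates["Country code"].tolist())
```

Keeping the aggregates in a separate frame rather than discarding them leaves them available as regional benchmarks.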