Motivation: The International Monetary Fund (IMF) is one of the largest global institutions, with almost every country in the world as a member. It lends to its members under different programs, both non-concessional and concessional. All member countries have access to the General Resources Account for non-concessional loans (i.e. loans that carry interest), while concessional loans (i.e. zero-interest loans) are available to low-income countries. Assessing which countries should receive concessional loans can be very time consuming because many aspects are taken into account, so we believe it would be beneficial to build a model that predicts whether a country should be approved for such a loan. If the model is accurate enough, it could save the IMF time and money when reviewing the application materials.

In [None]:
import numpy as np
import pandas as pd
import datetime

We started our data collection by searching for records of which countries received IMF funding in each year. These records are the core of the project, and without them we could not build our models. We chose the official IMF website as our primary data source.



# MONA

In [None]:
mona = pd.read_excel("Description.xlsx")
mona

Unnamed: 0,Arrangement Number,Country Name,Country Code,Arrangement Type,Approval Date,Approval Year,Initial End date,Initial End Year,Revised End Date,Duration Of Annual Arrangement From,...,Purchase,Purchaseschedule,Setaside,Publetterofintentcode,Pubstaffreport,Conditionality Text Box included in Staff Report,Delayedby,Cancelled,Comments,Sort
0,570,"AFGHANISTAN,ISLAMIC REPUBLIC OF",512,PRGF,2006-06-26,2006,2009-06-25,2009,NaT,,...,,Original,,Y,Y,,,,The fiscal year in Afghanistan is the solar ye...,0
1,570,"AFGHANISTAN,ISLAMIC REPUBLIC OF",512,PRGF,2006-06-26,2006,2009-06-25,2009,NaT,,...,,Revised,,Y,Y,,,N,The fiscal year in Afghanistan is the solar ye...,1
2,570,"AFGHANISTAN,ISLAMIC REPUBLIC OF",512,PRGF,2006-06-26,2006,2009-06-25,2009,NaT,,...,,Revised,,Y,Y,,,N,The fiscal year in Afghanistan is the solar ye...,2
3,570,"AFGHANISTAN,ISLAMIC REPUBLIC OF",512,PRGF,2006-06-26,2006,2009-06-25,2009,NaT,,...,,Revised,,Y,Y,,,N,The fiscal year in Afghanistan is the solar ye...,3
4,570,"AFGHANISTAN,ISLAMIC REPUBLIC OF",512,PRGF,2006-06-26,2006,2009-06-25,2009,NaT,,...,,Revised,,Y,Y,,,N,The fiscal year in Afghanistan is the solar ye...,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1508,594,ZAMBIA,754,PRGF,2008-06-04,2008,2011-06-03,2011,NaT,,...,,Revised,,Y,Y,No,,N,"At the joint R1 and R2, access was augmented b...",1
1509,594,ZAMBIA,754,PRGF,2008-06-04,2008,2011-06-03,2011,NaT,,...,,Revised,,Y,Y,Yes,,N,"At the joint R1 and R2, access was augmented b...",3
1510,594,ZAMBIA,754,PRGF,2008-06-04,2008,2011-06-03,2011,NaT,,...,,Revised,,Y,Y,Yes,,N,"At the joint R1 and R2, access was augmented b...",4
1511,594,ZAMBIA,754,PRGF,2008-06-04,2008,2011-06-03,2011,NaT,,...,,Revised,,Y,Y,Yes,,N,"At the joint R1 and R2, access was augmented b...",5


In [None]:
mona['Arrangement Type'].value_counts()

SBA         351
PRGF        340
ECF         304
EFF         189
PSI         128
FCL          46
ECF-EFF      40
PCI          30
SBA-SCF      23
PLL          18
PRGF-EFF     17
SCF          16
ESF           6
SBA-ESF       2
PCL           2
SLL           1
Name: Arrangement Type, dtype: int64

In [None]:
len(mona['Country Name'].unique())

105

In [None]:
mona.groupby(['Country Name', 'Approval Year']).first()['Arrangement Type'].value_counts()

SBA         78
ECF         67
PRGF        61
EFF         32
FCL         27
PSI         17
ECF-EFF     10
SBA-SCF      7
PCI          6
PLL          5
SCF          5
PRGF-EFF     3
ESF          3
SLL          1
SBA-ESF      1
PCL          1
Name: Arrangement Type, dtype: int64

For the MONA database, we first explored which arrangement types it contains and noticed that it mixes many different programs (16 arrangement types). We also checked how many countries are in the database (105) to make sure the data is not too imbalanced. Finally, we grouped by country and approval year to count the arrangements of each type.
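The preview above shows the same arrangement (for example number 570) on several rows, so raw row counts per type are inflated by the number of rows per arrangement. A minimal sketch of counting each arrangement once, on toy data, assuming that `Arrangement Number` uniquely identifies an arrangement (an assumption, not something the MONA documentation is quoted on here; number 601 is made up):

```python
import pandas as pd

# Toy stand-in for the MONA table; an arrangement repeats across rows.
# Values are illustrative only (arrangement 601 is hypothetical).
mona_toy = pd.DataFrame({
    "Arrangement Number": [570, 570, 570, 594, 594, 601],
    "Country Name": ["AFGHANISTAN", "AFGHANISTAN", "AFGHANISTAN",
                     "ZAMBIA", "ZAMBIA", "ZAMBIA"],
    "Approval Year": [2006, 2006, 2006, 2008, 2008, 2012],
    "Arrangement Type": ["PRGF", "PRGF", "PRGF", "PRGF", "PRGF", "ECF"],
})

# Raw row counts: inflated by the number of rows per arrangement.
rows_per_type = mona_toy["Arrangement Type"].value_counts()

# Keep one row per arrangement (assumes the number is a unique identifier).
arrangements = mona_toy.drop_duplicates(subset="Arrangement Number")
arrangements_per_type = arrangements["Arrangement Type"].value_counts()

# nunique() is equivalent to len(series.unique()).
n_countries = mona_toy["Country Name"].nunique()

print(rows_per_type)
print(arrangements_per_type)
print(n_countries)
```

Compared with `groupby(['Country Name', 'Approval Year']).first()`, this keeps two arrangements that a country starts in the same year as separate records.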

# Flows

In [None]:
data = pd.read_csv("flows.csv", header = 9)
data.head()

Unnamed: 0,Flow Type,Member,Member Code,Description,Transaction Value Date,Amount,Original Disbursement Date,Original Arrangement Date
0,Net SDR Charges/Interest,"Afghanistan, Islamic Republic of",AFG,Net SDR Charges,2/10/2003,4200000,n.a.,n.a.
1,Net SDR Charges/Interest,"Afghanistan, Islamic Republic of",AFG,Net SDR Charges,2/20/2003,559721,n.a.,n.a.
2,Net SDR Charges/Interest,"Afghanistan, Islamic Republic of",AFG,Net SDR Charges,2/25/2003,3295325,n.a.,n.a.
3,SDR Assessments,"Afghanistan, Islamic Republic of",AFG,SDR Assessments,5/7/2003,1993,n.a.,n.a.
4,Net SDR Charges/Interest,"Afghanistan, Islamic Republic of",AFG,Net SDR Charges,5/7/2003,122132,n.a.,n.a.


In [None]:
data['Flow Type'].unique()

array(['Net SDR Charges/Interest', 'SDR Assessments',
       'PRGT Disbursements', 'PRGT Interest', 'PRGT Repayments',
       'GRA Repurchases', 'GRA Charges', 'GRA Purchases'], dtype=object)

In [None]:
prgt = data[data['Flow Type'].str.contains('PRGT')]
prgt

Unnamed: 0,Flow Type,Member,Member Code,Description,Transaction Value Date,Amount,Original Disbursement Date,Original Arrangement Date
23,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,1/19/2007,13200000,1/19/2007,6/26/2006
25,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,3/29/2007,11300000,3/29/2007,6/26/2006
28,PRGT Interest,"Afghanistan, Islamic Republic of",AFG,PRGT Interest,6/29/2007,43353,n.a.,n.a.
29,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,7/23/2007,11300000,7/23/2007,6/26/2006
32,PRGT Interest,"Afghanistan, Islamic Republic of",AFG,PRGT Interest,12/28/2007,85541,n.a.,n.a.
...,...,...,...,...,...,...,...,...
43433,PRGT Repayments,Zimbabwe,ZWE,Extended Credit Facility,10/3/2016,107321,5/28/1993,9/11/1992
43434,PRGT Repayments,Zimbabwe,ZWE,Extended Credit Facility,10/20/2016,10940000,9/18/1992,9/11/1992
43435,PRGT Repayments,Zimbabwe,ZWE,Extended Credit Facility,10/20/2016,10711926,5/28/1993,9/11/1992
43436,PRGT Repayments,Zimbabwe,ZWE,Extended Credit Facility,10/20/2016,16700000,2/22/1994,9/11/1992


In [None]:
prgt_disbursements = prgt[prgt['Flow Type'] == 'PRGT Disbursements']
prgt_disbursements

Unnamed: 0,Flow Type,Member,Member Code,Description,Transaction Value Date,Amount,Original Disbursement Date,Original Arrangement Date
23,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,1/19/2007,13200000,1/19/2007,6/26/2006
25,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,3/29/2007,11300000,3/29/2007,6/26/2006
29,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,7/23/2007,11300000,7/23/2007,6/26/2006
34,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,2/28/2008,11300000,2/28/2008,6/26/2006
38,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,7/15/2008,11300000,7/15/2008,6/26/2006
...,...,...,...,...,...,...,...,...
42983,PRGT Disbursements,Zambia,ZMB,Extended Credit Facility,12/30/2009,51013000,12/30/2009,6/4/2008
42989,PRGT Disbursements,Zambia,ZMB,Extended Credit Facility,7/15/2010,18395000,7/15/2010,6/4/2008
42993,PRGT Disbursements,Zambia,ZMB,Extended Credit Facility,12/22/2010,18395000,12/22/2010,6/4/2008
42999,PRGT Disbursements,Zambia,ZMB,Extended Credit Facility,6/29/2011,18395000,6/29/2011,6/4/2008


In [None]:
prgt_disbursements['Description'].unique()

array(['Extended Credit Facility', 'Rapid Credit Facility',
       'Exogenous Shocks Facility - RAC',
       'Exogenous Shocks Facility - HAC', 'Standby Credit Facility'],
      dtype=object)

In [None]:
len(prgt_disbursements['Member'].unique())

70

In [None]:
prgt_disbursements['Description'].value_counts()

Extended Credit Facility           801
Rapid Credit Facility               95
Exogenous Shocks Facility - HAC     18
Standby Credit Facility             18
Exogenous Shocks Facility - RAC     10
Name: Description, dtype: int64

In [None]:
prgt_disbursements.groupby(['Member', 'Original Arrangement Date'])['Description'].value_counts()

Member                            Original Arrangement Date  Description             
Afghanistan, Islamic Republic of  11/14/2011                 Extended Credit Facility    2
                                  11/6/2020                  Extended Credit Facility    2
                                  6/26/2006                  Extended Credit Facility    7
                                  7/20/2016                  Extended Credit Facility    7
                                  n.a.                       Rapid Credit Facility       1
                                                                                        ..
Yemen, Republic of                n.a.                       Rapid Credit Facility       2
Zambia                            3/25/1999                  Extended Credit Facility    3
                                  6/16/2004                  Extended Credit Facility    8
                                  6/4/2008                   Extended Credit Facility    6
    

Next we turned to the IMF Financial Data Query Tool. We first tried selecting every entry under “Flows”. We then researched the other programs (SDR, GRA) and confirmed that they are not relevant to our topic, so we filtered the dataset to keep only PRGT disbursement records. We checked the types of disbursements and noticed that, in addition to ECF, SCF and RCF, there are two “Exogenous Shocks Facility” types. We found that the facilities were renamed in 2010 and that the ESFs correspond to the current RCF. After that, we checked the number of countries and the number of records under each type of disbursement. Finally, we checked for duplicate records, i.e. countries whose funds were approved on the same day under the same disbursement type.
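The renaming step described above can be sketched as follows. This is a minimal illustration on toy records, not the code used for the project: the `Facility` column name is hypothetical, and folding both ESF variants into RCF simply follows the reading given in the text.

```python
import pandas as pd

# Toy disbursement records; members and dates are illustrative only.
disb = pd.DataFrame({
    "Member": ["A", "A", "B", "B", "B"],
    "Description": [
        "Extended Credit Facility",
        "Exogenous Shocks Facility - RAC",
        "Exogenous Shocks Facility - HAC",
        "Rapid Credit Facility",
        "Rapid Credit Facility",
    ],
    "Original Arrangement Date": ["6/26/2006", "n.a.", "n.a.", "n.a.", "n.a."],
})

# Fold both ESF variants into RCF (assumption taken from the text above).
is_esf = disb["Description"].str.startswith("Exogenous Shocks Facility")
disb["Facility"] = disb["Description"].mask(is_esf, "Rapid Credit Facility")

# Records per (member, arrangement date, facility); counts above 1 are the
# groups to inspect for duplicates.
counts = disb.groupby(
    ["Member", "Original Arrangement Date", "Facility"]
).size()
print(counts)
```

Note that records with an arrangement date of `n.a.` all fall into the same group per member, so a count above 1 there does not by itself indicate a duplicate.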


# Commitments

In [None]:
commitments = pd.read_csv('commitments.csv', header = 9)
commitments

Unnamed: 0,Member,Member Code,Type,Facility,Date Of Commitment,Expiration Date,Amount Agreed,% Of Quota,Precautionary,NAB Eligible,Expiration Date.1,Amount Agreed.1,% Of Quota.1,Amount Drawn,Amount Outstanding,Status
0,"Afghanistan, Islamic Republic of",AFG,PRGT,Extended Credit Facility,26-Jun-06,25-Jun-09,81000000,50.0,N,N,25-Sep-10,81000000,25.0,75350000,0,Expired
1,"Afghanistan, Islamic Republic of",AFG,PRGT,Extended Credit Facility,14-Nov-11,13-Nov-14,85000000,52.5,N,N,13-Nov-14,85000000,26.3,24000000,0,Expired
2,"Afghanistan, Islamic Republic of",AFG,PRGT,Extended Credit Facility,20-Jul-16,19-Jul-19,32380000,10.0,N,N,31-Dec-19,32380000,10.0,32380000,31480000,Expired
3,"Afghanistan, Islamic Republic of",AFG,PRGT,Extended Credit Facility,6-Nov-20,5-May-24,259040000,80.0,N,N,5-May-24,259040000,80.0,184566000,184566000,Current
4,Albania,ALB,PRGT,Extended Credit Facility,21-Jun-02,20-Jun-05,28000000,57.5,N,N,20-Nov-05,28000000,20.1,28000000,0,Expired
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
158,"Yemen, Republic of",YMN,PRGT,Extended Credit Facility,2-Sep-14,1-Sep-17,365250000,150.0,N,N,1-Mar-16,365250000,75.0,48750000,19500000,Expired
159,Zambia,ZMB,PRGT,Extended Credit Facility,16-Jun-04,15-Jun-07,220095000,45.0,N,N,30-Sep-07,220095000,22.5,220095000,0,Expired
160,Zambia,ZMB,PRGT,Extended Credit Facility,4-Jun-08,3-Jun-11,48910000,10.0,N,N,29-Jun-11,220095000,22.5,220095000,0,Expired
161,Zambia,ZMB,PRGT,Extended Credit Facility,31-Aug-22,30-Oct-25,978200000,100.0,N,N,30-Oct-25,978200000,100.0,139880000,139880000,Current


In [None]:
commitments['Facility'].value_counts()

Extended Credit Facility     143
Standby Credit Facility       12
Exogenous Shocks Facility      7
Name: Facility, dtype: int64

In [None]:
len(commitments['Member'].unique())

56

We then tried the “Commitments” data, again checking the number of records within each facility and the number of countries. Because it contains far fewer records than “Flows” (162 commitments across 56 countries, versus 942 PRGT disbursements across 70 countries), we decided to use the flows dataset.


# FLOWS all

In [None]:
flows = pd.read_csv('FLOWS all.csv')
flows

Unnamed: 0,Flow Type,Member,Member Code,Description,Transaction Value Date,Amount,Original Disbursement Date,Original Arrangement Date
0,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,1/19/2007,13200000,1/19/2007,6/26/2006
1,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,3/29/2007,11300000,3/29/2007,6/26/2006
2,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,7/23/2007,11300000,7/23/2007,6/26/2006
3,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,2/28/2008,11300000,2/28/2008,6/26/2006
4,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,7/15/2008,11300000,7/15/2008,6/26/2006
...,...,...,...,...,...,...,...,...
1543,PRGT Disbursements,Zambia,ZMB,Extended Credit Facility,9/2/2022,139880000,9/2/2022,8/31/2022
1544,PRGT Disbursements,Zimbabwe,ZWE,Extended Credit Facility,9/18/1992,54700000,9/18/1992,9/11/1992
1545,PRGT Disbursements,Zimbabwe,ZWE,Extended Credit Facility,5/28/1993,30400000,5/28/1993,9/11/1992
1546,PRGT Disbursements,Zimbabwe,ZWE,Extended Credit Facility,2/22/1994,33400000,2/22/1994,9/11/1992


In [None]:
date = pd.to_datetime(flows['Transaction Value Date'])
date.sort_values()

228    1986-08-14
490    1986-09-23
988    1986-09-25
1272   1986-11-14
1140   1986-11-21
          ...    
847    2022-08-26
1271   2022-09-02
1543   2022-09-02
887    2022-09-14
1055   2022-09-23
Name: Transaction Value Date, Length: 1548, dtype: datetime64[ns]

In [None]:
len(flows['Member'].unique())

74

In [None]:
flows['Year'] = pd.DatetimeIndex(flows['Transaction Value Date']).year
flows

Unnamed: 0,Flow Type,Member,Member Code,Description,Transaction Value Date,Amount,Original Disbursement Date,Original Arrangement Date,Year
0,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,1/19/2007,13200000,1/19/2007,6/26/2006,2007
1,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,3/29/2007,11300000,3/29/2007,6/26/2006,2007
2,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,7/23/2007,11300000,7/23/2007,6/26/2006,2007
3,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,2/28/2008,11300000,2/28/2008,6/26/2006,2008
4,PRGT Disbursements,"Afghanistan, Islamic Republic of",AFG,Extended Credit Facility,7/15/2008,11300000,7/15/2008,6/26/2006,2008
...,...,...,...,...,...,...,...,...,...
1543,PRGT Disbursements,Zambia,ZMB,Extended Credit Facility,9/2/2022,139880000,9/2/2022,8/31/2022,2022
1544,PRGT Disbursements,Zimbabwe,ZWE,Extended Credit Facility,9/18/1992,54700000,9/18/1992,9/11/1992,1992
1545,PRGT Disbursements,Zimbabwe,ZWE,Extended Credit Facility,5/28/1993,30400000,5/28/1993,9/11/1992,1993
1546,PRGT Disbursements,Zimbabwe,ZWE,Extended Credit Facility,2/22/1994,33400000,2/22/1994,9/11/1992,1994


In [None]:
flows.groupby(['Member', 'Year']).count()

Unnamed: 0_level_0,Unnamed: 1_level_0,Flow Type,Member Code,Description,Transaction Value Date,Amount,Original Disbursement Date,Original Arrangement Date
Member,Year,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1
"Afghanistan, Islamic Republic of",2007,3,3,3,3,3,3,3
"Afghanistan, Islamic Republic of",2008,2,2,2,2,2,2,2
"Afghanistan, Islamic Republic of",2009,1,1,1,1,1,1,1
"Afghanistan, Islamic Republic of",2010,1,1,1,1,1,1,1
"Afghanistan, Islamic Republic of",2011,1,1,1,1,1,1,1
...,...,...,...,...,...,...,...,...
Zambia,2022,1,1,1,1,1,1,1
Zimbabwe,1992,1,1,1,1,1,1,1
Zimbabwe,1993,1,1,1,1,1,1,1
Zimbabwe,1994,1,1,1,1,1,1,1


In [None]:
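# Baseline accuracy: share of the 74 x 37 (country, year) cells with no disbursement; 1008 cells have at least one
(74*37-1008)/74/37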

0.6318480642804968

Our final step was to download the dataset of all PRGT disbursement flows and run a final sanity check. We first checked the time span of the records, which runs from 1986 to 2022 (37 calendar years), and the number of countries, which is 74. This is not too imbalanced given that the IMF has 190 member countries. Finally, we computed the accuracy of the baseline model, i.e. the percentage of country-year pairs in which no funds were received. It turns out to be a little above 63%, which is acceptable.
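The baseline figure can also be derived from the data rather than from hard-coded counts. A minimal sketch on toy data, assuming the label is "at least one disbursement in that country and year" and that the panel covers every year between the first and last record (both are assumptions about how the target is defined):

```python
import pandas as pd

# Toy disbursement log; members and years are illustrative only.
flows_toy = pd.DataFrame({
    "Member": ["A", "A", "A", "B"],
    "Year": [2000, 2000, 2002, 2001],
})

members = sorted(flows_toy["Member"].unique())
first_year = int(flows_toy["Year"].min())
last_year = int(flows_toy["Year"].max())
years = range(first_year, last_year + 1)

# Full grid of (member, year) cells, including years with no disbursement.
grid = pd.MultiIndex.from_product([members, years], names=["Member", "Year"])

# Disbursements per cell; cells absent from the log are filled with 0.
counts = flows_toy.groupby(["Member", "Year"]).size().reindex(grid, fill_value=0)

# Binary label: 1 if the member received at least one disbursement that year.
received = (counts > 0).astype(int)

# Accuracy of a model that always predicts "no disbursement".
baseline_accuracy = 1 - received.mean()
print(baseline_accuracy)
```

The `received` series doubles as the target variable, and building it on the full grid keeps the country-years without funds that the raw flows table never lists.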


# Factors

The Extended Credit Facility (ECF), which falls under the Poverty Reduction and Growth Trust (PRGT), is the IMF’s main tool for supporting country needs. Access to the ECF is determined on a case-by-case basis, taking into account the country's balance of payments need, the strength of its economic program, its capacity to repay the Fund, the amount of outstanding Fund credit, and the member’s record of past use of Fund credit.

Reading the descriptions of IMF’s Standby Credit Facility (SCF) and Rapid Credit Facility (RCF), both available to PRGT-eligible countries, we learned that they are meant to help with short-term financing needs caused by shocks. 

In conclusion, to determine whether or not a country should receive funds, we consider the following factors: balance of payments needs, strength of its economic program, capacity to repay the Fund, the amount of outstanding Fund credit, the member’s record of past use of Fund credit, and shocks.

The Balance of Payments (BOP) refers to the statistical data of transactions made between a specific country and the rest of the world. The current account balance refers to the nation’s trade in goods and services while the capital account balance refers to transactions between financial institutions. By integrating these two parts as well as financial account balances and reserves, we can examine the economic activity of the nation with the rest of the world. 

A country's economic strength is an important factor measured by the IMF. It is typically reflected in a country's economic diversity (industry diversity, knowledge-intensive/high-tech or high-paid jobs), entrepreneurship (income per entrepreneur, entrepreneur density, start-up density), and growth (real income per capita, real income growth, GDP growth). Most of the data on economic diversity and entrepreneurship are hard to find, and what is available is incomplete with respect to time or country. For example, we could only find the number of start-up companies for the most recent 5 years, and most of them are in developed countries. Features like industry diversity are also hard to quantify. After our research, the only comprehensive and reliable dataset we found was GDP growth (annual percent change), so we use this feature to capture each country's economic strength.

Capacity to repay the Fund is hard to evaluate. It is related to a country’s government income (mainly tax), government spending, and political factors that may influence the stability of the government and its willingness to repay the Fund. We also did not find any well-formed data on it. Although the “Flows” section of the data query tool includes the history of repayments and interest, which might represent this capacity, we decided that these are not enough. After all, past repayments are not really representative of a country’s future capacity to repay its loans. Another reason we are not using them is that evaluating historical repayments would need dedicated metrics, such as the number or percentage of repayments. This would require more research and data manipulation than fits the scope and time limit of our project.

For the total PRGT credit outstanding, we again used the IMF Financial Data Query Tool. We selected PRGT Credit Outstanding under position to download the total PRGT credit outstanding for all countries and lenders from 1985 to 2022.

The member’s record of past use of Fund credit is another useful factor used by the IMF. It shows how the country behaved when it was given funds in the past. However, since this record is assessed on a case-by-case basis when determining a country's ability to receive funds, we were not able to find complete data on it.

There are many indicators of shocks, including natural disasters, food supplies, domestic instabilities, exogenous shocks, etc. While searching for data, we found that domestic instabilities and exogenous shocks were not easily quantifiable. We couldn’t find well-established data for our purpose, thus we chose only to use natural disasters and food supplies as features.

For natural disaster data, we used EM-DAT: the international disasters database, which contains essential core data on the occurrence and effects of over 22,000 mass disasters in the world from 1900 to the present day. We used the EM-DAT Query Tool to download natural disaster records for countries from 1985 to 2022.
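Event-level disaster records have to be aggregated to one row per country and year before they can serve as a feature. A minimal sketch on toy data; the column names `ISO`, `Year` and `Disaster Type`, and the feature name `n_disasters`, are assumptions for illustration rather than the exact layout of the EM-DAT export:

```python
import pandas as pd

# Toy event-level records; column names and rows are illustrative only.
events = pd.DataFrame({
    "ISO": ["AFG", "AFG", "AFG", "ZMB"],
    "Year": [2007, 2007, 2008, 2007],
    "Disaster Type": ["Flood", "Earthquake", "Flood", "Drought"],
})

# One row per (country, year) with the number of recorded events.
disasters = (
    events.groupby(["ISO", "Year"])
    .size()
    .rename("n_disasters")
    .reset_index()
)
print(disasters)
```

Country-years with no recorded event are absent from this table, so they should be filled with 0 rather than treated as missing once the feature is joined to the full panel.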

For food supply data, we used FAOSTAT, which is a database created by the Food and Agriculture Organization of the United Nations. While it provides comprehensive data on every single kind of food, we decided to use the grand total Food Supply (kcal/capita/day) per year to represent the general food supply status, and we downloaded the data for countries from 1985 to 2019 from the website.
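The feature tables cover different periods (the food supply data stops in 2019 while the disbursement records run to 2022), so joining them to the country-year panel leaves gaps that need to be checked. A minimal sketch of that join on toy data; every table, column name and value below is hypothetical:

```python
import pandas as pd

keys = ["Member Code", "Year"]

# Toy label panel: one row per (country, year).
panel = pd.DataFrame({
    "Member Code": ["AFG", "AFG", "ZMB", "ZMB"],
    "Year": [2019, 2020, 2019, 2020],
    "received": [0, 1, 0, 0],
})

# Toy feature tables; values are placeholders, not real statistics.
gdp = pd.DataFrame({
    "Member Code": ["AFG", "AFG", "ZMB", "ZMB"],
    "Year": [2019, 2020, 2019, 2020],
    "gdp_growth": [1.0, 2.0, 3.0, 4.0],
})
food = pd.DataFrame({
    "Member Code": ["AFG", "ZMB"],
    "Year": [2019, 2019],  # coverage ends earlier than the panel
    "food_kcal": [2000.0, 2100.0],
})

# Left joins keep every panel row; validate guards against duplicate keys.
features = (
    panel.merge(gdp, on=keys, how="left", validate="one_to_one")
         .merge(food, on=keys, how="left", validate="one_to_one")
)

# Share of missing values per column, to decide on imputation or dropping.
missing_share = features.isna().mean()
print(missing_share)
```

A left join from the panel is used here so that a missing feature never silently removes a labelled country-year.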
