# International-Debt-Statistics-Analysis

## 1. Understand the Dataset

In [14]:
# Import the packages used in this analysis.
import numpy as np
import pandas as pd
import statsmodels.api as sm
import matplotlib.pyplot as plt
import seaborn as sb

# Load the international_debt table from the CSV file.
df = pd.read_csv('International_debt.csv')

# Show the first ten rows to keep the output clean.
# SQL equivalent: SELECT * FROM international_debt LIMIT 10;
df.head(10)

Unnamed: 0,country_name,country_code,indicator_name,indicator_code,debt
0,Afghanistan,AFG,"Disbursements on external debt, long-term (DIS...",DT.DIS.DLXF.CD,72894453.7
1,Afghanistan,AFG,"Interest payments on external debt, long-term ...",DT.INT.DLXF.CD,53239440.1
2,Afghanistan,AFG,"PPG, bilateral (AMT, current US$)",DT.AMT.BLAT.CD,61739336.9
3,Afghanistan,AFG,"PPG, bilateral (DIS, current US$)",DT.DIS.BLAT.CD,49114729.4
4,Afghanistan,AFG,"PPG, bilateral (INT, current US$)",DT.INT.BLAT.CD,39903620.1
5,Afghanistan,AFG,"PPG, multilateral (AMT, current US$)",DT.AMT.MLAT.CD,39107845.0
6,Afghanistan,AFG,"PPG, multilateral (DIS, current US$)",DT.DIS.MLAT.CD,23779724.3
7,Afghanistan,AFG,"PPG, multilateral (INT, current US$)",DT.INT.MLAT.CD,13335820.0
8,Afghanistan,AFG,"PPG, official creditors (AMT, current US$)",DT.AMT.OFFT.CD,100847181.9
9,Afghanistan,AFG,"PPG, official creditors (DIS, current US$)",DT.DIS.OFFT.CD,72894453.7


## 2. Finding the number of distinct countries

In [15]:
# From the first ten rows, we can see the amount of debt owed by Afghanistan under the different debt indicators.
# But we do not know how many different countries the table contains.
# Country names repeat because a country is likely to have debt under more than one debt indicator.

# Without knowing the unique countries, we cannot perform our statistical analyses holistically.
# In this section, we list the unique countries present in the table.

# Find the distinct countries
distinct_countries = df['country_name'].unique()

# Print the distinct countries
for country in distinct_countries:
    print(country)

Afghanistan
Albania
Algeria
Angola
Armenia
Azerbaijan
Bangladesh
Belarus
Belize
Benin
Bhutan
Bolivia
Bosnia and Herzegovina
Botswana
Brazil
Bulgaria
Burkina Faso
Burundi
Cabo Verde
Cambodia
Cameroon
Central African Republic
Chad
China
Colombia
Comoros
Congo, Dem. Rep.
Congo, Rep.
Costa Rica
Cote d'Ivoire
Djibouti
Dominica
Dominican Republic
Georgia
Ecuador
Egypt, Arab Rep.
El Salvador
Eritrea
Eswatini
Ethiopia
Fiji
Gabon
Gambia, The
Ghana
Grenada
Guatemala
Guinea
Guinea-Bissau
Guyana
Haiti
Honduras
IDA only
India
Indonesia
Iran, Islamic Rep.
Jamaica
Jordan
Kazakhstan
Kenya
Kosovo
Kyrgyz Republic
Lao PDR
Lesotho
Least developed countries: UN classification
Lebanon
Liberia
Macedonia, FYR
Madagascar
Malawi
Maldives
Mali
Mauritania
Mauritius
Mexico
Mongolia
Moldova
Montenegro
Morocco
Mozambique
Myanmar
Nepal
Nicaragua
Niger
Nigeria
Pakistan
Papua New Guinea
Paraguay
Peru
Philippines
Romania
Russian Federation
Rwanda
Samoa
Sao Tome and Principe
Senegal
Serbia
Sierra Leone
Solomon Islands
So

## 3. Finding out the distinct debt indicators

We can see there are a total of 124 countries present in the table.
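
A count like the one quoted above can be read directly with `Series.nunique()` instead of counting printed names by hand. The sketch below is a minimal illustration on a hypothetical miniature table (the `sample` frame and its values are made up, only the column names follow the dataset):

```python
import pandas as pd

# Hypothetical miniature of the international_debt table (values are made up).
sample = pd.DataFrame({
    "country_name": ["Afghanistan", "Afghanistan", "Albania", "Algeria", "Albania"],
    "indicator_code": ["DT.AMT.DLXF.CD", "DT.INT.DLXF.CD", "DT.AMT.DLXF.CD",
                       "DT.AMT.DLXF.CD", "DT.INT.DLXF.CD"],
    "debt": [100.0, 50.0, 75.0, 20.0, 10.0],
})

# nunique() counts distinct values.
# SQL equivalent: SELECT COUNT(DISTINCT country_name) FROM international_debt;
total_distinct_countries = sample["country_name"].nunique()
print("Distinct countries:", total_distinct_countries)
```

On the real data, the same call would be `df['country_name'].nunique()`.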

As we saw in the first section, there is a column called indicator_name that briefly specifies the purpose of taking the debt.
Next to that column, there is another column called indicator_code, which symbolises the category of these debts.

Knowing about these debt indicators will help us understand the areas in which a country can be indebted.

In [16]:
# Get the distinct debt indicators
distinct_debt_indicators = df['indicator_code'].unique()

# Sort the distinct debt indicators
distinct_debt_indicators = sorted(distinct_debt_indicators)

# Print the distinct debt indicators
for indicator in distinct_debt_indicators:
    print(indicator)

DT.AMT.BLAT.CD
DT.AMT.DLXF.CD
DT.AMT.DPNG.CD
DT.AMT.MLAT.CD
DT.AMT.OFFT.CD
DT.AMT.PBND.CD
DT.AMT.PCBK.CD
DT.AMT.PROP.CD
DT.AMT.PRVT.CD
DT.DIS.BLAT.CD
DT.DIS.DLXF.CD
DT.DIS.MLAT.CD
DT.DIS.OFFT.CD
DT.DIS.PCBK.CD
DT.DIS.PROP.CD
DT.DIS.PRVT.CD
DT.INT.BLAT.CD
DT.INT.DLXF.CD
DT.INT.DPNG.CD
DT.INT.MLAT.CD
DT.INT.OFFT.CD
DT.INT.PBND.CD
DT.INT.PCBK.CD
DT.INT.PROP.CD
DT.INT.PRVT.CD



## 4. Totaling the amount of debt owed by the countries

As mentioned earlier, the financial debt of a particular country represents its economic state. 
But if we were to project this on an overall global scale, how will we approach it?

Let's switch gears from the debt indicators now and find out the total amount of debt (in USD) that is owed by the different countries. 
This will give us a sense of how the overall economy of the entire world is holding up.

In [17]:
# Calculate the total debt in millions of USD and round it to 2 decimal places
total_debt = round(df['debt'].sum() / 1000000, 2)

# Print the total debt
print("Total Debt:", total_debt)

Total Debt: 3079734.49


## 5. Country with the highest debt

"Blessed are the young for they shall inherit the national debt." -Herbert Hoover

Now that we have the total amount of debt owed by all the countries,
let's find the country that owes the highest amount of debt, along with the amount.
Note that this debt is the sum of the different debts owed by a country across several categories.
This will help us understand more about the country in terms of its socio-economic scenarios.
We can also find out the category in which the country owes its highest debt. But we will leave that for now.

In [19]:
# Group by country_name and calculate the total debt for each country
grouped = df.groupby('country_name')['debt'].sum().reset_index()

# Convert the debt to millions and round to 2 decimal places
grouped['total_debt'] = grouped['debt'] / 1000000
grouped['total_debt'] = grouped['total_debt'].round(2)

# Sort by total_debt in descending order and retrieve the country with the highest debt
result = grouped.sort_values('total_debt', ascending=False).head(1)

# Print the result
print(result[['country_name', 'total_debt']])

   country_name  total_debt
23        China   285793.49
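
The same lookup can also be written without sorting the whole table, by summing per country and asking for the label of the maximum with `idxmax()`. This is a minimal sketch on a hypothetical miniature table (the `sample` frame and its values are made up, only the column names follow the dataset):

```python
import pandas as pd

# Hypothetical miniature of the international_debt table (values are made up).
sample = pd.DataFrame({
    "country_name": ["Afghanistan", "Afghanistan", "Albania", "Algeria", "Albania"],
    "debt": [100.0, 50.0, 75.0, 20.0, 10.0],
})

# Sum the debt per country; the result is a Series indexed by country_name.
totals = sample.groupby("country_name")["debt"].sum()

# idxmax() returns the index label (the country) holding the largest total.
top_country = totals.idxmax()
top_total = totals.max()
print(top_country, top_total)
```

On the real data, `sample` would be replaced by `df`, and the total could be divided by 1000000 to match the millions used above.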


## 6. Average amount of debt across indicators

So, it was China. A more in-depth breakdown of China's debts can be found at: https://datatopics.worldbank.org/debt/ids/country/CHN

We now have a brief overview of the dataset and a few of its summary statistics. 
We already have an idea of the different debt indicators in which the countries owe their debts. 
We can dig even further to find out how much debt a country owes on average.
This will give us a better sense of the distribution of the amount of debt across different indicators.

In [20]:
# Convert the debt to millions of USD and round it to 2 decimal places
df['average_debt'] = df['debt'] / 1000000
df['average_debt'] = df['average_debt'].round(2)

# Group the data by indicator_code and indicator_name, then calculate the mean debt per indicator
grouped_df = df.groupby(['indicator_code', 'indicator_name'])['average_debt'].mean().reset_index()

# Sort the data by average_debt in descending order
sorted_df = grouped_df.sort_values('average_debt', ascending=False)

# Get the top 10 rows
result_df = sorted_df.head(10)

# Print the result
print(result_df)

    indicator_code                                     indicator_name  \
1   DT.AMT.DLXF.CD  Principal repayments on external debt, long-te...   
2   DT.AMT.DPNG.CD  Principal repayments on external debt, private...   
10  DT.DIS.DLXF.CD  Disbursements on external debt, long-term (DIS...   
12  DT.DIS.OFFT.CD         PPG, official creditors (DIS, current US$)   
8   DT.AMT.PRVT.CD          PPG, private creditors (AMT, current US$)   
17  DT.INT.DLXF.CD  Interest payments on external debt, long-term ...   
9   DT.DIS.BLAT.CD                  PPG, bilateral (DIS, current US$)   
18  DT.INT.DPNG.CD  Interest payments on external debt, private no...   
4   DT.AMT.OFFT.CD         PPG, official creditors (AMT, current US$)   
5   DT.AMT.PBND.CD                      PPG, bonds (AMT, current US$)   

    average_debt  
1    5904.867984  
2    5161.194684  
10   2152.041545  
12   1958.983934  
8    1803.694286  
17   1644.024435  
9    1223.139735  
18   1220.410506  
4    1191.188065  
5    1

## 7. The highest amount of principal repayments

We can see that the indicator DT.AMT.DLXF.CD tops the chart of average debt. This category covers principal repayments on long-term debt.
Countries take on long-term debt to acquire immediate capital. More information about this category can be found here.

An interesting observation in the above finding is that there is a huge difference in the amounts of the indicators after the second one. 
This indicates that the first two indicators might be the most severe categories in which the countries owe their debts.

We can investigate this a bit more to find out which country owes the highest amount of debt in the category of long-term debts (DT.AMT.DLXF.CD).
Since not all the countries suffer from the same kind of economic disturbances, 
this finding will allow us to understand that particular country's economic condition a bit more specifically.


In [21]:
# Filter the DataFrame for rows with indicator_code = 'DT.AMT.DLXF.CD'.
# Take an explicit copy so that adding a column below does not raise SettingWithCopyWarning.
filtered_df = df[df['indicator_code'] == 'DT.AMT.DLXF.CD'].copy()

# Convert the debt to millions of USD and round it to 2 decimal places
filtered_df['average_debt'] = filtered_df['debt'] / 1000000
filtered_df['average_debt'] = filtered_df['average_debt'].round(2)

# Group the data by country_name, indicator_name, and indicator_code, and calculate the average_debt
grouped_df = filtered_df.groupby(['country_name', 'indicator_name', 'indicator_code'])['average_debt'].mean().reset_index()

# Sort the data by average_debt in descending order
sorted_df = grouped_df.sort_values('average_debt', ascending=False)

# Get the top 10 rows
result_df = sorted_df.head(10)

# Print the result
print(result_df)

                                     country_name  \
23                                          China   
14                                         Brazil   
90                             Russian Federation   
113                                        Turkey   
100                                    South Asia   
52                                          India   
53                                      Indonesia   
57                                     Kazakhstan   
73                                         Mexico   
62   Least developed countries: UN classification   

                                        indicator_name  indicator_code  \
23   Principal repayments on external debt, long-te...  DT.AMT.DLXF.CD   
14   Principal repayments on external debt, long-te...  DT.AMT.DLXF.CD   
90   Principal repayments on external debt, long-te...  DT.AMT.DLXF.CD   
113  Principal repayments on external debt, long-te...  DT.AMT.DLXF.CD   
100  Principal repayments on external debt, lo


## 8. The most common debt indicator

China has the highest amount of debt in the long-term debt (DT.AMT.DLXF.CD) category. This is verified by The World Bank. 
It is often a good idea to verify our analyses like this since it validates that our investigations are correct.

We saw that long-term debt is the topmost category when it comes to the average amount of debt. 
But is it the most common indicator in which the countries owe their debt? Let's find that out.

In [22]:

# Group the data by indicator_code and indicator_name, calculate the indicator_count
grouped_df = df.groupby(['indicator_code', 'indicator_name']).size().reset_index(name='indicator_count')

# Sort the data by indicator_count in descending order
sorted_df = grouped_df.sort_values('indicator_count', ascending=False)

# Get the top 20 rows
result_df = sorted_df.head(20)

# Print the result
print(result_df)

    indicator_code                                     indicator_name  \
20  DT.INT.OFFT.CD         PPG, official creditors (INT, current US$)   
3   DT.AMT.MLAT.CD               PPG, multilateral (AMT, current US$)   
4   DT.AMT.OFFT.CD         PPG, official creditors (AMT, current US$)   
19  DT.INT.MLAT.CD               PPG, multilateral (INT, current US$)   
17  DT.INT.DLXF.CD  Interest payments on external debt, long-term ...   
1   DT.AMT.DLXF.CD  Principal repayments on external debt, long-te...   
10  DT.DIS.DLXF.CD  Disbursements on external debt, long-term (DIS...   
0   DT.AMT.BLAT.CD                  PPG, bilateral (AMT, current US$)   
16  DT.INT.BLAT.CD                  PPG, bilateral (INT, current US$)   
12  DT.DIS.OFFT.CD         PPG, official creditors (DIS, current US$)   
11  DT.DIS.MLAT.CD               PPG, multilateral (DIS, current US$)   
9   DT.DIS.BLAT.CD                  PPG, bilateral (DIS, current US$)   
8   DT.AMT.PRVT.CD          PPG, private creditors 

## 9. Other viable debt issues and conclusion

There are a total of six debt indicators in which all the countries listed in our dataset have taken debt.
The indicator DT.AMT.DLXF.CD is also in the list. So, this gives us a clue that all these countries are suffering from a common economic issue. But that is not the end of the story, only a part of it.
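
One way to check which indicators are reported by every country is to compare the number of distinct countries per indicator with the overall number of distinct countries. The sketch below runs on a hypothetical miniature table (the `sample` frame and its values are made up, only the column names follow the dataset):

```python
import pandas as pd

# Hypothetical miniature of the international_debt table (values are made up).
sample = pd.DataFrame({
    "country_name": ["Afghanistan", "Afghanistan", "Albania", "Algeria", "Albania"],
    "indicator_code": ["DT.AMT.DLXF.CD", "DT.INT.DLXF.CD", "DT.AMT.DLXF.CD",
                       "DT.AMT.DLXF.CD", "DT.INT.DLXF.CD"],
    "debt": [100.0, 50.0, 75.0, 20.0, 10.0],
})

# Number of distinct countries in the whole table.
n_countries = sample["country_name"].nunique()

# Number of distinct countries that report each indicator.
countries_per_indicator = sample.groupby("indicator_code")["country_name"].nunique()

# Keep only the indicators that every country reports.
shared_indicators = countries_per_indicator[countries_per_indicator == n_countries].index.tolist()
print(shared_indicators)
```

In this made-up sample only DT.AMT.DLXF.CD is reported by all three countries. On the real data, `sample` would be replaced by `df`.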


Let's change tracks from debt_indicators now and focus on the amount of debt again. 
Let's find out the maximum amount of debt across the indicators along with the respective country names. 
With this, we will be in a position to identify the other plausible economic issues a country might be going through. 
By the end of this section, we will have found out the debt indicators in which a country owes its highest debt.
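
To pick out, for each country, the single indicator under which it owes the most, one option is `groupby(...).idxmax()`, which returns the row label of the largest debt per country. This is a minimal sketch on a hypothetical miniature table (the `sample` frame and its values are made up, only the column names follow the dataset):

```python
import pandas as pd

# Hypothetical miniature of the international_debt table (values are made up).
sample = pd.DataFrame({
    "country_name": ["Afghanistan", "Afghanistan", "Albania", "Albania", "Algeria"],
    "indicator_code": ["DT.AMT.DLXF.CD", "DT.INT.DLXF.CD", "DT.AMT.DLXF.CD",
                       "DT.INT.DLXF.CD", "DT.AMT.DLXF.CD"],
    "debt": [100.0, 50.0, 75.0, 90.0, 20.0],
})

# For each country, find the row label of its largest debt value.
max_rows = sample.groupby("country_name")["debt"].idxmax()

# Select those rows and rank the countries by that largest debt.
largest_per_country = (
    sample.loc[max_rows, ["country_name", "indicator_code", "debt"]]
    .sort_values("debt", ascending=False)
    .reset_index(drop=True)
)
print(largest_per_country)
```

Each country appears exactly once in `largest_per_country`, alongside the indicator of its highest debt.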


In this notebook, we took a look at debt owed by countries across the globe.
We extracted a few summary statistics from the data and unravelled some interesting facts and figures. 
We also validated our findings to make sure the investigations are correct.

In [23]:
# Group the data by country_name and indicator_name, get the maximum debt
grouped_df = df.groupby(['country_name', 'indicator_name'])['debt'].max().reset_index()

# Sort the data by maximum_debt in descending order
sorted_df = grouped_df.sort_values('debt', ascending=False)

# Get the top 20 rows
result_df = sorted_df.head(20)

# Print the result
print(result_df)

                                      country_name  \
462                                          China   
304                                         Brazil   
463                                          China   
1785                            Russian Federation   
2179                                        Turkey   
1954                                    South Asia   
301                                         Brazil   
1786                            Russian Federation   
305                                         Brazil   
1220  Least developed countries: UN classification   
2180                                        Turkey   
1235  Least developed countries: UN classification   
299                                         Brazil   
986                                       IDA only   
1001                                      IDA only   
1033                                         India   
1058                                     Indonesia   
1932                        