## Exploring GDP per Capita Based on Purchasing Power Parity (PPP) Across Countries

The dataset used in this analysis was downloaded from Kaggle. It reports GDP per capita for countries and regional aggregates from 1990 to 2019, measured at Purchasing Power Parity (PPP). GDP per capita is a widely used economic indicator that approximates the average income per person in a country. Because PPP adjusts for differences in price levels between countries rather than relying on market exchange rates alone, it allows a more meaningful comparison of living standards across nations. That makes the dataset well suited to comparing economic conditions across countries.

URL: https://www.kaggle.com/datasets/nitishabharathi/gdp-per-capita-all-countries
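To make the PPP adjustment concrete, here is a minimal sketch of how a GDP per capita figure in international dollars could be derived. The function name `gdp_per_capita_ppp` and all input numbers are hypothetical and chosen only for illustration; they do not come from the dataset.

```python
def gdp_per_capita_ppp(gdp_local, ppp_factor, population):
    """Return GDP per capita in international dollars.

    gdp_local  : total GDP in local currency units
    ppp_factor : local currency units per international dollar
    population : number of residents
    """
    return gdp_local / ppp_factor / population


# Hypothetical country: GDP of 2 billion local units, a PPP factor of
# 20 local units per international dollar, and 10,000 residents.
example = gdp_per_capita_ppp(gdp_local=2_000_000_000,
                             ppp_factor=20.0,
                             population=10_000)
print(example)  # 10000.0
```

The values in the dataset are already expressed this way, so no conversion is needed before comparing countries.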

In [1]:
#Import the necessary libraries
import pandas as pd

In [2]:
#Load the CSV file into a pandas DataFrame
df = pd.read_csv('GDP.csv', encoding='latin1')

In [3]:
df.head()

Unnamed: 0,Country,Country Code,1990,1991,1992,1993,1994,1995,1996,1997,...,2010,2011,2012,2013,2014,2015,2016,2017,2018,2019
0,Aruba,ABW,24101.10943,25870.75594,26533.3439,27430.7524,28656.52021,28648.99002,28499.08943,30215.94923,...,33732.84745,35492.61849,35498.98209,37419.89282,38223.37226,38249.05487,38390.27165,39454.62983,,
1,Afghanistan,AFG,,,,,,,,,...,1637.377987,1626.764793,1806.76393,1874.765634,1897.525938,1886.692977,1896.99252,1934.636754,1955.006208,
2,Angola,AGO,3089.683369,3120.356148,2908.160798,2190.76816,2195.532289,2496.199493,2794.896906,2953.342709,...,6230.297028,6346.395122,6772.528333,6980.423038,7199.245478,7096.600615,6756.935074,6650.58494,6452.355165,
3,Albania,ALB,2549.473022,1909.114038,1823.307673,2057.449657,2289.873135,2665.764906,2980.066288,2717.362124,...,9628.025783,10207.75235,10526.23545,10571.01065,11259.22589,11662.03048,11868.17897,12930.14003,13364.1554,
4,Arab World,ARB,6808.206995,6872.273195,7255.328362,7458.647059,7645.682856,7774.20736,8094.149842,8397.515692,...,14127.77802,14518.82745,15423.46539,15824.78011,16153.24486,16501.79259,16935.3833,17099.88939,17570.1376,


In [18]:
# Check how many rows (countries and regions) the DataFrame has
df.shape[0]

260

In [22]:
# Check for duplicates in the Country column (the column name has a trailing space)
df['Country '].duplicated().any()

False

In [23]:
# Another way to confirm that there are no duplicates
df['Country '].is_unique

True

In [4]:
# Get the column names
column_names = df.columns.tolist()
print(column_names)

['Country ', 'Country Code', '1990', '1991', '1992', '1993', '1994', '1995', '1996', '1997', '1998', '1999', '2000', '2001', '2002', '2003', '2004', '2005', '2006', '2007', '2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015', '2016', '2017', '2018', '2019']
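The output above shows that the first column is named `'Country '`, with a trailing space, which is easy to mistype. One possible way to normalize the labels is sketched below. It runs on a small stand-in frame called `sample` (a hypothetical name, built from two rows of the `df.head()` output) so that it does not modify `df`.

```python
import pandas as pd

# Small stand-in for df with the same quirk: 'Country ' has a trailing space.
sample = pd.DataFrame({
    'Country ': ['Aruba', 'Angola'],
    'Country Code': ['ABW', 'AGO'],
    '1990': [24101.10943, 3089.683369],
})

# Strip leading/trailing whitespace from every column label.
cleaned = sample.rename(columns=str.strip)
print(cleaned.columns.tolist())  # ['Country', 'Country Code', '1990']
```

If the same step were applied to `df`, any cell that refers to `df['Country ']` would need to use `df['Country']` instead.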


In [8]:
# Get the names of the columns that have at least one null value
df.columns[df.isnull().any()]

Index(['1990', '1991', '1992', '1993', '1994', '1995', '1996', '1997', '1998',
       '1999', '2000', '2001', '2002', '2003', '2004', '2005', '2006', '2007',
       '2008', '2009', '2010', '2011', '2012', '2013', '2014', '2015', '2016',
       '2017', '2018', '2019'],
      dtype='object')

In [9]:
# Count the null values in each column that has at least one null
df[df.columns[df.isnull().any()]].isnull().sum()

1990     51
1991     49
1992     47
1993     45
1994     42
1995     36
1996     36
1997     35
1998     34
1999     33
2000     24
2001     23
2002     22
2003     22
2004     22
2005     22
2006     22
2007     21
2008     20
2009     19
2010     19
2011     17
2012     19
2013     19
2014     20
2015     22
2016     22
2017     22
2018     29
2019    260
dtype: int64
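The counts above show that `2019` is null in all 260 rows, so the column carries no information. A minimal sketch of how such all-null columns could be dropped follows. The frame `sample` is a hypothetical stand-in built from the Aruba and Afghanistan rows shown earlier, not the full dataset.

```python
import numpy as np
import pandas as pd

# Stand-in with one partially null column (2018) and one fully null column (2019).
sample = pd.DataFrame({
    'Country ': ['Aruba', 'Afghanistan'],
    '2017': [39454.62983, 1934.636754],
    '2018': [np.nan, 1955.006208],
    '2019': [np.nan, np.nan],
})

# how='all' drops a column only when every value in it is null,
# so the partially filled 2018 column is kept.
trimmed = sample.dropna(axis=1, how='all')
print(trimmed.columns.tolist())  # ['Country ', '2017', '2018']
```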

In [10]:
# Express the null counts as a percentage of the total number of rows
df[df.columns[df.isnull().any()]].isnull().sum() * 100 / df.shape[0]

1990     19.615385
1991     18.846154
1992     18.076923
1993     17.307692
1994     16.153846
1995     13.846154
1996     13.846154
1997     13.461538
1998     13.076923
1999     12.692308
2000      9.230769
2001      8.846154
2002      8.461538
2003      8.461538
2004      8.461538
2005      8.461538
2006      8.461538
2007      8.076923
2008      7.692308
2009      7.307692
2010      7.307692
2011      6.538462
2012      7.307692
2013      7.307692
2014      7.692308
2015      8.461538
2016      8.461538
2017      8.461538
2018     11.153846
2019    100.000000
dtype: float64
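The same percentages can be obtained more compactly, because the mean of a boolean mask is the fraction of `True` values. The sketch below shows this on a hypothetical stand-in frame `sample` and also sorts the result so the sparsest columns appear first. It is an alternative formulation, not the cell that produced the output above.

```python
import numpy as np
import pandas as pd

# Stand-in frame: 2018 is half null, 2019 is entirely null.
sample = pd.DataFrame({
    'Country ': ['Aruba', 'Afghanistan'],
    '2017': [39454.62983, 1934.636754],
    '2018': [np.nan, 1955.006208],
    '2019': [np.nan, np.nan],
})

# isnull() gives a boolean mask; its column-wise mean is the null fraction.
null_pct = sample.isnull().mean() * 100

# Keep only columns with at least one null and list the worst first.
null_pct = null_pct[null_pct > 0].sort_values(ascending=False)
print(null_pct)
```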