## DATA SETS

#### AI Global Index

CSV file from:
https://www.kaggle.com/datasets/katerynameleshenko/ai-index/data

#### GDP (WORLD BANK GROUP API)

https://datahelpdesk.worldbank.org/knowledgebase/articles/898599-indicator-api-queries

#### GDP per country

https://www.worldometers.info/gdp/gdp-by-country/

In [1]:
%pip install pandas

Note: you may need to restart the kernel to use updated packages.


In [2]:
import pandas as pd

In [3]:
path1 = "AI_index_db.csv"
df_ai = pd.read_csv(path1)
pd.set_option('display.max_columns', None)
df_ai

| | Country | Talent | Infrastructure | Operating Environment | Research | Development | Government Strategy | Commercial | Total score | Region | Cluster | Income group | Political regime |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | United States of America | 100.00 | 94.02 | 64.56 | 100.00 | 100.00 | 77.39 | 100.00 | 100.00 | Americas | Power players | High | Liberal democracy |
| 1 | China | 16.51 | 100.00 | 91.57 | 71.42 | 79.97 | 94.87 | 44.02 | 62.92 | Asia-Pacific | Power players | Upper middle | Closed autocracy |
| 2 | United Kingdom | 39.65 | 71.43 | 74.65 | 36.50 | 25.03 | 82.82 | 18.91 | 40.93 | Europe | Traditional champions | High | Liberal democracy |
| 3 | Canada | 31.28 | 77.05 | 93.94 | 30.67 | 25.78 | 100.00 | 14.88 | 40.19 | Americas | Traditional champions | High | Liberal democracy |
| 4 | Israel | 35.76 | 67.58 | 82.44 | 32.63 | 27.96 | 43.91 | 27.33 | 39.89 | Middle East | Rising stars | High | Liberal democracy |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 57 | Sri Lanka | 6.27 | 34.64 | 35.79 | 0.12 | 0.95 | 35.57 | 0.09 | 6.62 | Asia-Pacific | Nascent | Lower middle | Electoral democracy |
| 58 | Egypt | 1.11 | 38.84 | 0.00 | 2.08 | 1.54 | 68.72 | 0.31 | 4.83 | Middle East | Nascent | Lower middle | Electoral autocracy |
| 59 | Kenya | 0.75 | 14.11 | 29.84 | 0.07 | 12.15 | 7.75 | 0.31 | 2.30 | Africa | Nascent | Lower middle | Electoral autocracy |
| 60 | Nigeria | 2.74 | 0.00 | 50.10 | 0.45 | 2.06 | 7.75 | 0.33 | 1.38 | Africa | Nascent | Lower middle | Electoral autocracy |


In [4]:
df_ai.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 62 entries, 0 to 61
Data columns (total 13 columns):
 #   Column                 Non-Null Count  Dtype  
---  ------                 --------------  -----  
 0   Country                62 non-null     object 
 1   Talent                 62 non-null     float64
 2   Infrastructure         62 non-null     float64
 3   Operating Environment  62 non-null     float64
 4   Research               62 non-null     float64
 5   Development            62 non-null     float64
 6   Government Strategy    62 non-null     float64
 7   Commercial             62 non-null     float64
 8   Total score            62 non-null     float64
 9   Region                 62 non-null     object 
 10  Cluster                62 non-null     object 
 11  Income group           62 non-null     object 
 12  Political regime       62 non-null     object 
dtypes: float64(8), object(5)
memory usage: 6.4+ KB


In [5]:
df_ai.isnull().any()

Country                  False
Talent                   False
Infrastructure           False
Operating Environment    False
Research                 False
Development              False
Government Strategy      False
Commercial               False
Total score              False
Region                   False
Cluster                  False
Income group             False
Political regime         False
dtype: bool

In [6]:
df_ai.duplicated().sum()

0

In [9]:
df_ai["Country"].unique()

array(['United States of America', 'China', 'United Kingdom', 'Canada',
       'Israel', 'Singapore', 'South Korea', 'The Netherlands', 'Germany',
       'France', 'Australia', 'Ireland', 'Finland', 'Denmark',
       'Luxembourg', 'Japan', 'India', 'Switzerland', 'Sweden',
       'Hong Kong', 'Spain', 'Austria', 'Estonia', 'Taiwan', 'Norway',
       'Saudi Arabia', 'Belgium', 'Poland', 'Slovenia', 'New Zealand',
       'Italy', 'Russia', 'Malta', 'United Arab Emirates', 'Portugal',
       'Czech Republic', 'Iceland', 'Lithuania', 'Brazil', 'Greece',
       'Slovakia', 'Hungary', 'Malaysia', 'Mexico', 'Chile', 'Argentina',
       'Qatar', 'Turkey', 'Colombia', 'Uruguay', 'Bahrain', 'Vietnam',
       'Indonesia', 'Tunisia', 'South Africa', 'Morocco', 'Armenia',
       'Sri Lanka', 'Egypt', 'Kenya', 'Nigeria', 'Pakistan'], dtype=object)
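
Since these country names are the natural key for matching against GDP figures from another source, it can help to list which names have no exact counterpart there. A minimal sketch, assuming a hypothetical helper `unmatched_countries` (not part of the original notebook) and plain exact string comparison:

```python
def unmatched_countries(left, right):
    """Return names found in `left` but not in `right`, sorted.

    Hypothetical helper: values are compared as stripped strings, so
    spelling variants (for example a leading "The") count as mismatches.
    Non-string values such as NaN are ignored.
    """
    left_names = {name.strip() for name in left if isinstance(name, str)}
    right_names = {name.strip() for name in right if isinstance(name, str)}
    return sorted(left_names - right_names)


# Illustrative call; the right-hand list is made up for the example.
# In the notebook the arguments would be df_ai["Country"] and df_gdp["Country"].
example = unmatched_countries(
    ["China", "The Netherlands"],
    ["China", "Netherlands"],
)
print(example)
```

Any name this returns would need a manual mapping before a merge on `Country` can line up.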

## GDP (WORLD BANK GROUP API)

In [None]:
import requests

# Check that the API is reachable. Without format=json the body is XML.
url = "https://api.worldbank.org/v2/indicator"
response = requests.get(url, timeout=30)
response.status_code

In [None]:
import requests

# Request the metadata for the GDP indicator (NY.GDP.MKTP.CD) as JSON.
url2 = "https://api.worldbank.org/v2/indicators/NY.GDP.MKTP.CD?format=json"
response2 = requests.get(url2, timeout=30)
response2.status_code

In [None]:
data_gdp = response2.json()
data_gdp
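
The JSON returned here is easier to work with once split into its two parts. The sketch below assumes the two-element list shape that the later cell relies on when it indexes `data[1]`: paging metadata first, records second. The helper name `split_payload` and the sample payload are illustrative only, not output captured from the API.

```python
def split_payload(payload):
    """Split a World Bank JSON payload into (metadata, records).

    Hypothetical helper. Returns an empty record list when the second
    element is missing or null, which is treated here as "no data".
    """
    if not isinstance(payload, list) or not payload:
        return {}, []
    meta = payload[0]
    records = payload[1] if len(payload) > 1 and payload[1] else []
    return meta, records


# Illustrative payload with the assumed shape (values are placeholders).
sample = [
    {"page": 1, "pages": 1, "per_page": "50", "total": 1},
    [{"id": "NY.GDP.MKTP.CD", "name": "GDP (current US$)"}],
]
meta, records = split_payload(sample)
print(meta["total"], records[0]["id"])
```

In the notebook, `meta, records = split_payload(data_gdp)` would give the same split for the real response.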

In [46]:
import requests
import pandas as pd

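def get_gdp_for_all_countries(year):
    """Return GDP (current US$, indicator NY.GDP.MKTP.CD) records for one year.

    Only the first page of results is read. The API returns 50 records per
    page by default, so the result holds at most 50 entries.
    """
    url = f"https://api.worldbank.org/v2/country/all/indicator/NY.GDP.MKTP.CD?date={year}&format=json"
    response = requests.get(url, timeout=30)

    if response.status_code == 200:
        data = response.json()
        # A successful response is a two-element list: [paging metadata, records].
        # The extra data[1] check guards against a null record list.
        if data and isinstance(data, list) and len(data) > 1 and data[1]:
            gdp_data = []
            for entry in data[1]:
                country_name = entry['country']['value']
                gdp_value = entry['value']  # None when no figure is reported
                gdp_data.append({
                    "Country": country_name,
                    "Year": year,
                    "GDP": gdp_value
                })
            return gdp_data
        else:
            print("No GDP data available.")
            return []
    else:
        print(f"Failed to fetch data: {response.status_code}")
        return []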

year = "2022"
gdp_data = get_gdp_for_all_countries(year)

# Convert to DataFrame
df_gdp = pd.DataFrame(gdp_data)

# Save DataFrame to a CSV file
df_gdp.to_csv(f"gdp_data_{year}.csv", index=False)


In [44]:
df_gdp

| | Country | Year | GDP |
|---|---|---|---|
| 0 | Africa Eastern and Southern | 2022 | 1183962000000.0 |
| 1 | Africa Western and Central | 2022 | 877140800000.0 |
| 2 | Arab World | 2022 | 3613682000000.0 |
| 3 | Caribbean small states | 2022 | 48138050000.0 |
| 4 | Central Europe and the Baltics | 2022 | 1943576000000.0 |
| 5 | Early-demographic dividend | 2022 | 14131150000000.0 |
| 6 | East Asia & Pacific | 2022 | 30663750000000.0 |
| 7 | East Asia & Pacific (excluding high income) | 2022 | 21113810000000.0 |
| 8 | East Asia & Pacific (IDA & IBRD countries) | 2022 | 21087710000000.0 |
| 9 | Euro area | 2022 | 14224350000000.0 |


In [45]:
df_gdp.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 50 entries, 0 to 49
Data columns (total 3 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   Country  50 non-null     object 
 1   Year     50 non-null     object 
 2   GDP      49 non-null     float64
dtypes: float64(1), object(2)
memory usage: 1.3+ KB
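
The 50 rows reported above match the API's default page size, which suggests that only the first page of results was read. The first element of each response carries paging metadata (`page`, `pages`, `per_page`, `total`), and the `page` and `per_page` query parameters select which slice is returned. A minimal sketch of walking every page, assuming hypothetical helpers `get_json` and `fetch_all_pages` that are not part of the original notebook:

```python
BASE_URL = "https://api.worldbank.org/v2/country/all/indicator/NY.GDP.MKTP.CD"


def get_json(url, params):
    """Fetch one page and decode it as JSON."""
    # Imported here so the paging logic below can be exercised offline.
    import requests

    response = requests.get(url, params=params, timeout=30)
    response.raise_for_status()
    return response.json()


def fetch_all_pages(year, fetch=get_json, per_page=100):
    """Collect Country/Year/GDP rows across all result pages.

    `fetch` is injectable so a stub can stand in for the network call.
    """
    rows = []
    page = 1
    while True:
        params = {"date": year, "format": "json",
                  "page": page, "per_page": per_page}
        payload = fetch(BASE_URL, params)
        # Stop on an error payload or an empty record list.
        if not isinstance(payload, list) or len(payload) < 2 or not payload[1]:
            break
        meta, records = payload
        for entry in records:
            rows.append({
                "Country": entry["country"]["value"],
                "Year": year,
                "GDP": entry["value"],  # None when no figure is reported
            })
        if page >= int(meta.get("pages", 1)):
            break
        page += 1
    return rows
```

With this in place, `df_gdp = pd.DataFrame(fetch_all_pages("2022"))` would cover every page. Aggregate entries such as 'Arab World' or 'Euro area' would still be present and would need to be filtered out before matching against individual countries.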


In [47]:
df_gdp["Country"].unique()

array(['Africa Eastern and Southern', 'Africa Western and Central',
       'Arab World', 'Caribbean small states',
       'Central Europe and the Baltics', 'Early-demographic dividend',
       'East Asia & Pacific',
       'East Asia & Pacific (excluding high income)',
       'East Asia & Pacific (IDA & IBRD countries)', 'Euro area',
       'Europe & Central Asia',
       'Europe & Central Asia (excluding high income)',
       'Europe & Central Asia (IDA & IBRD countries)', 'European Union',
       'Fragile and conflict affected situations',
       'Heavily indebted poor countries (HIPC)', 'High income',
       'IBRD only', 'IDA & IBRD total', 'IDA blend', 'IDA only',
       'IDA total', 'Late-demographic dividend',
       'Latin America & Caribbean',
       'Latin America & Caribbean (excluding high income)',
       'Latin America & the Caribbean (IDA & IBRD countries)',
       'Least developed countries: UN classification',
       'Low & middle income', 'Low income', 'Lower middle in