<a href="https://colab.research.google.com/github/Marcos-Sanson/UC3M-Web-Analytics/blob/main/Worldbank_API_Lab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# WEB ANALYTICS – Data Science and Engineering Degree  
## *(1st Semester, 4th-Year-Level Course)*  

### APIs – World Bank  

This lab was part of my **Web Analytics** course at **Universidad Carlos III de Madrid (UC3M)**, where I studied abroad from **September 2024 to December 2024** as part of my Computer Science degree. This specific lab focused on **retrieving, processing, and analyzing economic and environmental indicators** using the **World Bank API**. The lab introduced API-based data retrieval methods to access **real-world country-level indicators**, such as **population, GDP, CO₂ emissions, and income distribution**.  

Working in a group of three students, we explored **REST API requests**, data extraction, and processing JSON responses to gain insights from the World Bank's global economic dataset.  

### **API-Based Data Retrieval and Analysis**  

We used the **World Bank API** to collect and analyze key economic indicators. Our tasks included:  
- Fetching **country-level population data** and identifying **top and bottom 10 countries** by population.  
- Computing **gender population distributions** and identifying **countries with more women or men**.  
- Analyzing **GDP per capita** trends for different income groups over **2000-2022** and **2010-2022**.  
- Identifying the **top 5 countries in GDP growth per income group** from **2010-2022**.  
- Retrieving **CO₂ emissions per capita** and ranking **the top 30 highest-emitting countries**.  

### **Milestones**  

#### **Milestone 1: Global Population Analysis (2022)**  
We retrieved the **total population for all countries in 2022**, filtering out aggregate regions, and ranked them by size. The results included:  
- **Top 10 most populous countries**, led by **India and China**.  
- **Bottom 10 least populous countries**, including **Tuvalu, Nauru, and Palau**.  

#### **Milestone 2: Gender Population Disparities**  
We computed the percentage of **men vs. women in each country** and analyzed global gender imbalances:
- **139 countries had more women than men**, with **Hong Kong, Moldova, and Macao** leading in female-majority populations.  
- **78 countries had more men than women**, with **Qatar, UAE, and Maldives** having the highest male-majority populations.  

#### **Milestone 3: GDP Per Capita Growth Analysis**  
We retrieved **GDP per capita** data from **2000, 2010, and 2022** for different **World Bank income groups** and computed percentage growth rates:  
- **Low-income economies (LIC):** **+140.4%** (2000-2022)  
- **Lower-middle-income economies (LMC):** **+334.5%** (2000-2022)  
- **Upper-middle-income economies (UMC):** **+450.5%** (2000-2022)  
- **High-income economies (HIC):** **+108.1%** (2000-2022)  

We also analyzed GDP growth from **2010-2022**, which showed much smaller cumulative gains over the shorter period: **+6.0%** (LIC), **+63.3%** (LMC), **+79.0%** (UMC), and **+30.1%** (HIC).  

#### **Milestone 4: Top 5 Countries in GDP Growth (2010-2022)**  
For each **income group**, we identified the **top 5 countries with the highest GDP per capita growth** from **2010-2022**:  
- **Low-income economies:** **Ethiopia (+205.8%)** and **Somalia (+161.5%)** had the highest growth.  
- **Lower-middle-income economies:** **Bangladesh (+258.7%)** led among developing nations.  
- **Upper-middle-income economies:** **China (+178.3%)** and **Moldova (+135.5%)** saw significant GDP gains.  
- **High-income economies:** **Guyana (+291.0%)** and **Nauru (+180.6%)** led high-income nations.  

#### **Milestone 5: CO₂ Emissions Per Capita**  
We retrieved the **most recent CO₂ emissions per capita** (metric tons) for all countries and ranked the **top 30 highest emitters**. The results showed that **countries with small populations and high fossil fuel production had the highest per capita emissions:**

- **Qatar (31.7 metric tons)** had the **highest CO₂ emissions per capita**, followed by **Bahrain (22.0), Brunei (21.7), Kuwait (21.2), and UAE (20.3)**.  
- **Other major fossil fuel producers** like **Saudi Arabia (14.3) and Australia (14.8)** ranked among the highest emitters.  
- **Industrialized nations** such as **the United States (13.0), Canada (13.6), and Russia (11.1)** also had significant per capita emissions.  
- **China (7.8 metric tons)** ranked **20th** per capita, despite being the **largest total CO₂ emitter** globally.  
- **European countries like Luxembourg (12.5), Netherlands (7.5), and Germany (7.3)** also appeared in the rankings due to **high energy consumption per capita**.  


### **Outcome**  
Through this lab, we gained experience in **API-based data extraction**, **JSON processing**, and **economic data analysis**. We developed skills in **handling real-world datasets**, **interpreting economic indicators**, and **identifying global economic trends**.  


# 0. LAB PREPARATION

Students have to complete the following tasks before attending the lab:

1. **Read and study the API documentation to get an initial understanding of how the World Bank API works. The following links point to the relevant documentation:**
- https://datahelpdesk.worldbank.org/knowledgebase/articles/898581-api-basic-call-structures
- https://datahelpdesk.worldbank.org/knowledgebase/topics/125589-developer-information
- https://datahelpdesk.worldbank.org/knowledgebase/articles/889392-about-the-indicators-api-documentation

2. **The key elements of the World Bank API are the "indicators". The link below offers a search tool that simplifies finding them. Once you have selected an indicator, its code appears in the URL bar of the browser.**

- https://data.worldbank.org/indicator?tab=featured
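
As a rough illustration of the basic call structure covered in the documentation above, the sketch below assembles an indicator URL from a country code and an indicator code. The helper name `build_indicator_url` is hypothetical (it is not part of any library), and `SP.POP.TOTL` is simply the total-population indicator used later in Milestone 1.

```python
from urllib.parse import urlencode

API_ROOT = "https://api.worldbank.org/v2"  # base address used throughout this lab

def build_indicator_url(country, indicator, **params):
    """Assemble a World Bank indicator URL (hypothetical helper, for illustration).

    `country` is a country code or "all"; `indicator` is the code shown in the
    browser URL bar on data.worldbank.org (e.g. "SP.POP.TOTL").
    """
    query = {"format": "json", **params}  # ask for JSON unless the caller overrides it
    return f"{API_ROOT}/country/{country}/indicator/{indicator}?{urlencode(query)}"

url = build_indicator_url("all", "SP.POP.TOTL", date=2022, per_page=1000)
print(url)
```

Extra keyword arguments such as `date` and `per_page` become query-string parameters, so the same helper covers single years and ranges like `date="2000:2022"`.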

# **1. INTRODUCTION**

* The goal of this lab is to gain experience with a widely used API, the World Bank API, which offers a large amount of information on country indicators in economy, health, education, agriculture, etc.

* The lab includes 5 milestones that guide students through the use of several indicators.  

* The lab will be done in groups of 2-3 students.

* The lab will use two complete consecutive sessions (4 hours). Students are expected to complete the 5 milestones proposed in the lab within these 2 sessions.

* **The final mark will be computed as a function of the number of milestones successfully completed.**

* **Each group should also upload their lab notebook in the corresponding task in Aula Global.**

* Upon completing all the milestones, students should call the professor, who will check the correctness of the solution. Partial milestone checks may be allowed in some cases.

# 2. **MILESTONES**

In this section we describe the milestones one by one and leave space for students to implement the code that completes each requested task.

**NOTE: Unless otherwise stated, all the milestones have to deliver information about countries. Therefore, you should not consider regions or any other aggregated information in your analysis.**
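
One possible way to apply this rule is to drop every entry whose `region` is reported as "Aggregates" in the country list, which is the same check the Milestone 4 code relies on. The sketch below runs on a small hand-written stand-in for the API response (shape only, not real output), and `keep_countries` is a hypothetical helper name.

```python
def keep_countries(country_entries):
    """Return only real countries from a /v2/country-style list of entries."""
    return [
        entry for entry in country_entries
        if entry["region"]["value"] != "Aggregates"
    ]

# Hand-written stand-in for the second element of the API response (shape only)
sample_entries = [
    {"id": "ESP", "name": "Spain", "region": {"value": "Europe & Central Asia"}},
    {"id": "WLD", "name": "World", "region": {"value": "Aggregates"}},
    {"id": "IND", "name": "India", "region": {"value": "South Asia"}},
]

country_ids = [entry["id"] for entry in keep_countries(sample_entries)]
print(country_ids)  # ['ESP', 'IND']
```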

# **2.1. MILESTONE 1: POPULATION**:
Retrieve the 2022 population of every country and show the Top 10 and the Bottom 10 countries within the World Bank database.



In [1]:
# Milestone 1
import requests

# Population indicator for every entity in 2022 (countries and aggregates)
base_url = "https://api.worldbank.org/v2/country/all/indicator/SP.POP.TOTL?format=json&date=2022&per_page=1000"
# Country metadata, used to tell real countries apart from aggregates
countries_url = "https://api.worldbank.org/v2/country?format=json&per_page=1000"

# Fetch the country list once instead of issuing one request per country
countries_response = requests.get(countries_url, timeout=30)
countries_response.raise_for_status()
non_aggregates = {
    country['id']  # ISO3 code
    for country in countries_response.json()[1]
    if country['incomeLevel']['value'] != "Aggregates"
}

# Fetch the population data
response = requests.get(base_url, timeout=30)
response.raise_for_status()
data = response.json()

# Keep (name, population) pairs for real countries that report a value
populations = [
    (item['country']['value'], item['value'])
    for item in data[1]
    if item['countryiso3code'] in non_aggregates and item['value'] is not None
]

# Sort by population, ascending
populations.sort(key=lambda x: x[1])

# Top and bottom 10 countries
top_10 = populations[-10:]
bottom_10 = populations[:10]

print("Top 10 Countries by Population (2022):", top_10)
print("Bottom 10 Countries by Population (2022):", bottom_10)


Top 10 Countries by Population (2022): [('Mexico', 128613117), ('Russian Federation', 144236933), ('Bangladesh', 169384897), ('Brazil', 210306415), ('Nigeria', 223150896), ('Pakistan', 243700667), ('Indonesia', 278830529), ('United States', 333271411), ('China', 1412175000), ('India', 1425423212)]
Bottom 10 Countries by Population (2022): [('Tuvalu', 9992), ('Nauru', 11801), ('Palau', 17759), ('St. Martin (French part)', 28870), ('San Marino', 33755), ('Gibraltar', 37609), ('British Virgin Islands', 38319), ('Monaco', 38931), ('Liechtenstein', 39493), ('Marshall Islands', 40077)]
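
The ranking step can also be expressed with `heapq`, which avoids sorting the full list and returns the largest country first. This is a minimal sketch on a few of the values printed above plus one made-up missing entry; `rank_extremes` is a hypothetical helper name, not part of the original solution.

```python
import heapq

def rank_extremes(pairs, n):
    """Return (largest n, smallest n) from (name, value) pairs, skipping missing values."""
    valid = [(name, value) for name, value in pairs if value is not None]
    top = heapq.nlargest(n, valid, key=lambda pair: pair[1])
    bottom = heapq.nsmallest(n, valid, key=lambda pair: pair[1])
    return top, bottom

# A few 2022 values taken from the output above, plus one made-up missing entry
sample = [
    ("India", 1425423212),
    ("Tuvalu", 9992),
    ("China", 1412175000),
    ("Nauru", 11801),
    ("Placeholder without data", None),
]

top_2, bottom_2 = rank_extremes(sample, 2)
print(top_2)     # [('India', 1425423212), ('China', 1412175000)]
print(bottom_2)  # [('Tuvalu', 9992), ('Nauru', 11801)]
```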


# **2.2. MILESTONE 2: WOMEN Vs. MEN POPULATION**:
Obtain the % of men and women for each country and compute the difference between them using the formula %women - %men. Display:

1- The number of countries with more women than men.

2- The number of countries with more men than women.

3- The 10 countries with more women compared to men (the ten countries with the largest positive value of the previous metric).

4- The 10 countries with more men compared to women (the ten countries with the largest negative value of the previous metric).

**Note**: You can either retrieve the indicators that give the absolute number of men and women from the World Bank API and compute the % and the difference for each country yourself, or use the indicators that give the % directly.
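
If the absolute numbers are used, the metric could be derived as in the sketch below. The head counts are made up to keep the arithmetic easy to follow, and `gender_gap` is a hypothetical helper name.

```python
def gender_gap(women, men):
    """Return (%women, %men, %women - %men) from absolute head counts."""
    total = women + men
    if total == 0:
        raise ValueError("population must be positive")
    pct_women = 100 * women / total
    pct_men = 100 * men / total
    return pct_women, pct_men, pct_women - pct_men

# Made-up head counts, chosen only to keep the arithmetic easy to follow
pct_women, pct_men, difference = gender_gap(women=520, men=480)
print(f"{pct_women:.1f}% women, {pct_men:.1f}% men, difference {difference:+.1f}")
```

A positive difference means more women than men, and a negative one means more men than women.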



In [3]:
# Milestone 2
import requests

# World Bank API endpoints for population percentage data
url_female_percentage = "https://api.worldbank.org/v2/country/all/indicator/SP.POP.TOTL.FE.ZS?date=2022&format=json&per_page=1000"
url_male_percentage = "https://api.worldbank.org/v2/country/all/indicator/SP.POP.TOTL.MA.ZS?date=2022&format=json&per_page=1000"

# Initialize a list to store data
country_data = []
valid_countries = {name for name, _ in populations}  # Country names kept in Milestone 1 (run that cell first)

# Fetch female and male population percentage data
response_female_percentage = requests.get(url_female_percentage)
response_male_percentage = requests.get(url_male_percentage)

# Check if the responses are successful
if response_female_percentage.status_code == 200 and response_male_percentage.status_code == 200:
    data_female_percentage = response_female_percentage.json()
    data_male_percentage = response_male_percentage.json()

    # Create dictionaries for female and male population data
    female_percentage_dict = {item['country']['id']: item['value'] for item in data_female_percentage[1] if item['value'] is not None}
    male_percentage_dict = {item['country']['id']: item['value'] for item in data_male_percentage[1] if item['value'] is not None}

    # Process data for each valid country
    for country_id, female_percentage in female_percentage_dict.items():
        country_name = next(item['country']['value'] for item in data_female_percentage[1] if item['country']['id'] == country_id)

        if country_name in valid_countries:
            # Get male percentage from the dictionary
            male_percentage = male_percentage_dict.get(country_id)

            # Check if male percentage is valid
            if male_percentage is not None:
                # Calculate the difference (women - men)
                difference = female_percentage - male_percentage

                # Store the data
                country_data.append({
                    'country': country_name,
                    'percentage_women': female_percentage,
                    'percentage_men': male_percentage,
                    'difference': difference
                })

    # 1. Countries with more women than men
    more_women = [country for country in country_data if country['difference'] > 0]

    # 2. Countries with more men than women
    more_men = [country for country in country_data if country['difference'] < 0]

    # 3. Top 10 countries with more women compared to men
    top_10_women = sorted(more_women, key=lambda x: x['difference'], reverse=True)[:10]

    # 4. Top 10 countries with more men compared to women
    top_10_men = sorted(more_men, key=lambda x: x['difference'])[:10]

    # Display the results
    print(f"Number of countries with more women than men: {len(more_women)}")
    print(f"Number of countries with more men than women: {len(more_men)}\n")

    print("Top 10 Countries with more women compared to men:")
    for country in top_10_women:
        print(f"{country['country']}: Difference = {country['difference']:.2f}%")

    print("\nTop 10 Countries with more men compared to women:")
    for country in top_10_men:
        print(f"{country['country']}: Difference = {country['difference']:.2f}%")

else:
    print("Failed to fetch data from the World Bank API.")

Number of countries with more women than men: 139
Number of countries with more men than women: 78

Top 10 Countries with more women compared to men:
Hong Kong SAR, China: Difference = 9.78%
Moldova: Difference = 7.92%
Macao SAR, China: Difference = 7.50%
Latvia: Difference = 7.44%
Armenia: Difference = 7.39%
Russian Federation: Difference = 7.09%
Ukraine: Difference = 6.85%
Georgia: Difference = 6.82%
Belarus: Difference = 6.81%
St. Martin (French part): Difference = 6.24%

Top 10 Countries with more men compared to women:
Qatar: Difference = -43.69%
United Arab Emirates: Difference = -28.21%
Maldives: Difference = -24.47%
Bahrain: Difference = -23.95%
Oman: Difference = -23.11%
Kuwait: Difference = -22.11%
Saudi Arabia: Difference = -21.47%
Seychelles: Difference = -10.17%
Palau: Difference = -7.88%
Bhutan: Difference = -7.10%


# **2.3. MILESTONE 3: GDP PER CAPITA BY INCOME-LEVEL GROUP**:

Compute the average increase/decrease in percentage of the GDP per capita in US dollars over the following two periods, 2000-2022 and 2010-2022, for the following income groups: low-income economies, lower-middle-income economies, middle-income economies, upper-middle-income economies and high-income economies. The following link provides information on the different country aggregations defined by the World Bank.  

https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups

You should compute the %GDP increase as follows. Given a country A with a GDP per capita of \$20000 in 2000 and \$30000 in 2022, the increase/decrease is computed as:

%GDP increase = 100*(30000-20000)/20000=50%.
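
The same arithmetic, written as a small function for illustration (the name `gdp_increase_pct` is hypothetical):

```python
def gdp_increase_pct(gdp_initial, gdp_final):
    """Percentage change from gdp_initial to gdp_final; negative means a decrease."""
    if gdp_initial == 0:
        raise ValueError("initial GDP per capita must be non-zero")
    return 100 * (gdp_final - gdp_initial) / gdp_initial

print(gdp_increase_pct(20000, 30000))  # 50.0, the example from the statement
print(gdp_increase_pct(30000, 20000))  # a decrease of roughly 33.3%
```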


In [4]:
# Milestone 3

import requests

def fetch_gdp_data(income_group, start_year, end_year):
    """Return the yearly GDP per capita entries for an income-group aggregate, or None on failure."""
    # URL for World Bank API request
    url = f"https://api.worldbank.org/v2/country/{income_group}/indicator/NY.GDP.PCAP.CD?date={start_year}:{end_year}&format=json"

    # Send GET request to the World Bank API
    response = requests.get(url)
    print(url)

    if response.status_code == 200:
        payload = response.json()
        # A valid response is [metadata, entries]; an API error has a single element
        if len(payload) > 1 and payload[1]:
            return payload[1]
    print(f"Failed to fetch data for {income_group}")
    return None

def extract_years(data, years):
    """Map each requested year to its value (None if the year is missing)."""
    return {year: next((item['value'] for item in data if item['date'] == str(year)), None) for year in years}

def calculate_percentage_increase(gdp_initial, gdp_final):
    """Return 100 * (final - initial) / initial, or None if a value is missing or the base is zero."""
    if gdp_initial and gdp_final is not None:
        return 100 * (gdp_final - gdp_initial) / gdp_initial
    return None

# Run the same computation for each income-group aggregate
for group in ["LIC", "LMC", "UMC", "HIC"]:
    gdp_data = fetch_gdp_data(group, 2000, 2022)
    print(f"{group} GDP Evolution:")
    if gdp_data is None:
        continue  # Nothing to compute if the request failed

    # Extract GDP per capita for 2000, 2010, and 2022
    gdp_years = extract_years(gdp_data, [2000, 2010, 2022])

    # Calculate percentage increase for 2000-2022 and 2010-2022
    increase_2000_2022 = calculate_percentage_increase(gdp_years[2000], gdp_years[2022])
    increase_2010_2022 = calculate_percentage_increase(gdp_years[2010], gdp_years[2022])

    print(f"2000-2022 increase: {increase_2000_2022}%")
    print(f"2010-2022 increase: {increase_2010_2022}%")

https://api.worldbank.org/v2/country/LIC/indicator/NY.GDP.PCAP.CD?date=2000:2022&format=json
LIC GDP Evolution:
2000-2022 increase: 140.35108391364778%
2010-2022 increase: 6.01106908287891%
https://api.worldbank.org/v2/country/LMC/indicator/NY.GDP.PCAP.CD?date=2000:2022&format=json
LMC GDP Evolution:
2000-2022 increase: 334.52469854166384%
2010-2022 increase: 63.325931466471644%
https://api.worldbank.org/v2/country/UMC/indicator/NY.GDP.PCAP.CD?date=2000:2022&format=json
UMC GDP Evolution:
2000-2022 increase: 450.516193920671%
2010-2022 increase: 79.04714860410512%
https://api.worldbank.org/v2/country/HIC/indicator/NY.GDP.PCAP.CD?date=2000:2022&format=json
HIC GDP Evolution:
2000-2022 increase: 108.09038428496429%
2010-2022 increase: 30.128747176603202%


# **2.4. MILESTONE 4: TOP 5 COUNTRIES INCREASE GDP PER INCOME-GROUP**

For each of the income groups included in Milestone 3 and the period 2010-2022, list the Top 5 countries in terms of %GDP per capita increase, along with the value.

**NOTE**: Do not consider countries that lack data for 2010, 2022, or both.
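
The ranking rule, including the exclusion of incomplete records, might be sketched as follows. The figures belong to made-up placeholder countries chosen for easy arithmetic, and `top_growth` is a hypothetical helper that works on already-downloaded values rather than calling the API.

```python
def top_growth(gdp_by_country, start_year, end_year, n=5):
    """Rank countries by % GDP per capita increase, ignoring incomplete records."""
    growth = []
    for name, by_year in gdp_by_country.items():
        initial = by_year.get(start_year)
        final = by_year.get(end_year)
        if initial is None or final is None or initial == 0:
            continue  # missing start and/or end value: leave the country out
        growth.append((name, 100 * (final - initial) / initial))
    return sorted(growth, key=lambda pair: pair[1], reverse=True)[:n]

# Made-up figures for placeholder countries, chosen for easy arithmetic
sample = {
    "Country A": {2010: 1000, 2022: 3000},   # +200%
    "Country B": {2010: 2000, 2022: 3000},   # +50%
    "Country C": {2010: 1500, 2022: None},   # no 2022 value: skipped
    "Country D": {2022: 900},                # no 2010 value: skipped
}

ranking = top_growth(sample, 2010, 2022)
print(ranking)  # [('Country A', 200.0), ('Country B', 50.0)]
```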

In [5]:
# Milestone 4

import requests

def get_countries_by_income_group(income_group_code):
    """
    Retrieve a list of countries within a specific income group.

    Parameters:
        income_group_code (str): The income group code (e.g., 'LIC', 'LMC', 'UMC', 'HIC').

    Returns:
        List of dictionaries containing country 'id' and 'name'.
    """
    countries = []
    base_url = "https://api.worldbank.org/v2/country"
    params = {
        'incomeLevel': income_group_code,
        'format': 'json',
        'per_page': 1000  # To ensure all countries are fetched in one request
    }

    response = requests.get(base_url, params=params)
    if response.status_code != 200:
        print(f"Failed to fetch countries for income group {income_group_code}")
        return countries

    data = response.json()
    if not data or len(data) < 2:
        print(f"No country data found for income group {income_group_code}")
        return countries

    for country in data[1]:
        if country['region']['value'] != "Aggregates":
            countries.append({
                'id': country['id'],
                'name': country['name']
            })

    return countries

def fetch_gdp_per_capita(country_id, start_year, end_year):
    """
    Fetch GDP per capita data for a specific country and year range.

    Parameters:
        country_id (str): The ISO3 country code.
        start_year (int): The starting year.
        end_year (int): The ending year.

    Returns:
        Dictionary with years as keys and GDP per capita as values.
    """
    gdp_data = {}
    base_url = f"https://api.worldbank.org/v2/country/{country_id}/indicator/NY.GDP.PCAP.CD"
    params = {
        'date': f"{start_year}:{end_year}",
        'format': 'json',
        'per_page': 1000
    }

    response = requests.get(base_url, params=params)
    if response.status_code != 200:
        print(f"Failed to fetch GDP data for country {country_id}")
        return gdp_data

    data = response.json()
    if not data or len(data) < 2:
        print(f"No GDP data found for country {country_id}")
        return gdp_data

    for entry in data[1]:
        year = int(entry['date'])
        value = entry['value']
        if year in [start_year, end_year] and value is not None:
            gdp_data[year] = value

    return gdp_data

def calculate_percentage_increase(initial, final):
    """
    Calculate the percentage increase from initial to final values.

    Parameters:
        initial (float): The initial value.
        final (float): The final value.

    Returns:
        float: The percentage increase.
    """
    try:
        return 100 * (final - initial) / initial
    except ZeroDivisionError:
        return None

def get_top_5_countries(income_group_code, income_group_name, start_year, end_year):
    """
    For a given income group, identify the top 5 countries with the highest GDP per capita increase.

    Parameters:
        income_group_code (str): The income group code.
        income_group_name (str): The descriptive name of the income group.
        start_year (int): The starting year.
        end_year (int): The ending year.

    Returns:
        List of tuples containing country name and percentage increase.
    """
    countries = get_countries_by_income_group(income_group_code)
    percentage_increases = []

    for country in countries:
        gdp = fetch_gdp_per_capita(country['id'], start_year, end_year)
        if start_year in gdp and end_year in gdp:
            pct_increase = calculate_percentage_increase(gdp[start_year], gdp[end_year])
            if pct_increase is not None:
                percentage_increases.append((country['name'], pct_increase))

    # Sort countries by percentage increase in descending order
    sorted_countries = sorted(percentage_increases, key=lambda x: x[1], reverse=True)

    # Select top 5
    top_5 = sorted_countries[:5]

    return top_5

def main():
    # Define income groups
    income_groups = {
        'LIC': 'Low-income economies',
        'LMC': 'Lower-middle-income economies',
        'UMC': 'Upper-middle-income economies',
        'HIC': 'High-income economies'
    }

    # Define the period
    start_year = 2010
    end_year = 2022

    # Dictionary to store Top 5 results for each income group
    top_5_results = {}

    for code, name in income_groups.items():
        print(f"\nFetching and processing data for {name} ({code})...")
        top_5 = get_top_5_countries(code, name, start_year, end_year)
        top_5_results[name] = top_5

    # Display the Top 5 countries for each income group
    for group_name, countries in top_5_results.items():
        print(f"\nTop 5 Countries in {group_name} by % GDP Per Capita Increase ({start_year}-{end_year}):")
        if not countries:
            print("No data available.")
            continue
        for rank, (country, pct) in enumerate(countries, start=1):
            print(f"{rank}. {country}: {pct:.2f}%")

if __name__ == "__main__":
  main()


Fetching and processing data for Low-income economies (LIC)...

Fetching and processing data for Lower-middle-income economies (LMC)...

Fetching and processing data for Upper-middle-income economies (UMC)...

Fetching and processing data for High-income economies (HIC)...

Top 5 Countries in Low-income economies by % GDP Per Capita Increase (2010-2022):
1. Ethiopia: 205.81%
2. Somalia: 161.46%
3. Congo, Dem. Rep.: 104.30%
4. Rwanda: 64.32%
5. Liberia: 51.27%

Top 5 Countries in Lower-middle-income economies by % GDP Per Capita Increase (2010-2022):
1. Bangladesh: 258.67%
2. Timor-Leste: 188.20%
3. Djibouti: 166.34%
4. Viet Nam: 144.56%
5. Cambodia: 144.16%

Top 5 Countries in Upper-middle-income economies by % GDP Per Capita Increase (2010-2022):
1. China: 178.27%
2. Moldova: 135.49%
3. Armenia: 116.09%
4. Georgia: 105.07%
5. Marshall Islands: 104.26%

Top 5 Countries in High-income economies by % GDP Per Capita Increase (2010-2022):
1. Guyana: 290.97%
2. Nauru: 180.56%
3. Ireland: 1

# **2.5. MILESTONE 5: CO2 emission per capita**

Retrieve the most recent non-empty value of CO2 emissions per capita (metric tons) for every country. Display the 30 countries with the highest CO2 emissions per capita, along with their value and the year that value refers to.

**NOTE**: You may not look up the year manually and hard-code it in your query for this milestone.
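
One way to satisfy this constraint is to request the full series and pick the newest entry that actually has a value, as in the sketch below. The entries are made up but shaped like the per-year records the API returns, and `latest_value` is a hypothetical helper name. The API documentation also describes an `mrnev` query parameter for most recent non-empty values; whether it fits this milestone is worth checking against the documentation.

```python
def latest_value(entries):
    """Return (year, value) for the most recent entry with a value, or None."""
    dated = [e for e in (entries or []) if e.get("value") is not None]
    if not dated:
        return None
    newest = max(dated, key=lambda e: int(e["date"]))
    return int(newest["date"]), newest["value"]

# Made-up entries shaped like the API's per-year records
sample = [
    {"date": "2023", "value": None},
    {"date": "2022", "value": None},
    {"date": "2021", "value": 4.2},
    {"date": "2020", "value": 4.0},
]

print(latest_value(sample))  # (2021, 4.2)
print(latest_value(None))    # None
```

Selecting by year with `max` rather than taking the first non-null entry means the result does not depend on the order in which the entries arrive.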


In [13]:
import requests

# World Bank API URL template for CO2 emissions per capita
co2_url = "https://api.worldbank.org/v2/country/{}/indicator/EN.ATM.CO2E.PC?format=json&per_page=100"

# Fetch all country details to map names to ISO3 codes
country_response = requests.get("https://api.worldbank.org/v2/country?format=json&per_page=300")
if country_response.status_code != 200:
    # Stop the cell here; exit() can shut down the notebook kernel
    raise SystemExit("Failed to retrieve country list.")

country_data = country_response.json()
# Keep real countries only: aggregates are listed with region "Aggregates"
country_iso_map = {
    country["name"]: country["id"]
    for country in country_data[1]
    if "id" in country and country["region"]["value"] != "Aggregates"
}

# Initialize a list to store the CO2 data
co2_data = []

# Iterate through the country ISO3 codes
for country_name, country_iso3 in country_iso_map.items():
    # Fetch the CO2 emission data for the country
    response = requests.get(co2_url.format(country_iso3))
    if response.status_code != 200:
        continue  # Skip if there's an issue with the request

    data = response.json()
    # An error payload has a single element, and data[1] is None when no rows exist
    if not data or len(data) < 2 or not data[1]:
        continue  # Skip if no valid data is returned

    # Entries are scanned in response order (the API lists the newest year first),
    # so the first non-null value is the most recent one
    for entry in data[1]:
        co2_value = entry.get('value')
        year = entry.get('date')

        if co2_value is not None:  # Ensure the value is valid
            co2_data.append((country_name, co2_value, year))
            break  # Take only the most recent non-null value

# Sort the countries by their CO2 emissions per capita in descending order
sorted_co2_data = sorted(co2_data, key=lambda x: x[1], reverse=True)

# Get the top 30 countries with the highest CO2 emissions per capita
top_30_co2_emitters = sorted_co2_data[:30]

# Display the results
if top_30_co2_emitters:
    print("Top 30 Countries with Highest CO2 Emission Per Capita:")
    for i, (country_name, co2_value, year) in enumerate(top_30_co2_emitters, 1):
        print(f"{i}. {country_name}: {co2_value} metric tons (Year: {year})")
else:
    print("No valid CO2 emission data found.")


No valid CO2 emission data found.
