# Human Development Index (HDI) Dataset by the UN
This is, I think, a fairly complete dataset from the UN on the Human Development Index. You can find out more about it here: https://hdr.undp.org/data-center/human-development-index#/indicies/HDI. \
The repository also includes the metadata file (HDR25_Composite_indices_metadata), which lists all the indicators and their identifiers.\
Some of the indicators are pretty interesting. Gender Inequality, for example, has its own category with these indicators:

| Indicator | Identifier |
| --- | --- |
| GII Rank | `gii_rank` |
| Gender Inequality Index (value) | `gii` |
| Maternal Mortality Ratio (deaths per 100,000 live births) | `mmr` |
| Adolescent Birth Rate (births per 1,000 women ages 15-19) | `abr` |
| Population with at least some secondary education, female (% ages 25 and older) | `se_f` |
| Population with at least some secondary education, male (% ages 25 and older) | `se_m` |
| Share of seats in parliament, female (% held by women) | `pr_f` |
| Share of seats in parliament, male (% held by men) | `pr_m` |
| Labour force participation rate, female (% ages 15 and older) | `lfpr_f` |
| Labour force participation rate, male (% ages 15 and older) | `lfpr_m` |

The short code next to each indicator is the identifier used in data queries.
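As a small illustration of how the identifiers listed above might be used: the loader in this notebook splits each column name at its last underscore into an identifier and a year, which suggests the wide CSV names its columns `<id>_<year>` (for example `gii_2020`). That layout is an assumption here, and `GII_INDICATORS` and `column_for` are hypothetical names made up for this sketch, so check the metadata file for the real column names.

```python
# Identifiers from the Gender Inequality category listed above.
GII_INDICATORS = {
    "gii_rank": "GII Rank",
    "gii": "Gender Inequality Index (value)",
    "mmr": "Maternal Mortality Ratio (deaths per 100,000 live births)",
    "abr": "Adolescent Birth Rate (births per 1,000 women ages 15-19)",
    "se_f": "Population with at least some secondary education, female",
    "se_m": "Population with at least some secondary education, male",
    "pr_f": "Share of seats in parliament, female",
    "pr_m": "Share of seats in parliament, male",
    "lfpr_f": "Labour force participation rate, female",
    "lfpr_m": "Labour force participation rate, male",
}


def column_for(indicator_id: str, year: int) -> str:
    """Build a wide-format column name, ASSUMING the `<id>_<year>` pattern."""
    if indicator_id not in GII_INDICATORS:
        raise KeyError(f"unknown indicator id: {indicator_id!r}")
    return f"{indicator_id}_{year}"


print(column_for("mmr", 2020))  # mmr_2020
```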

In [1]:
import pandas as pd

class HDIData:
    def __init__(self, filepath: str):
        """
        Initialize the HDIData object by loading a CSV file into a DataFrame.
        """
        # Decode as ISO-8859-1 (Latin-1); this codec accepts any byte, so the
        # read cannot fail with a UnicodeDecodeError
        self.df = pd.read_csv(filepath, encoding="ISO-8859-1")
        
        # Standardize column names
        self.df.columns = self.df.columns.str.strip().str.lower().str.replace(' ', '_')
        
        # Convert wide-format year columns into long format
        self.long_df = self._reshape_long(self.df)
    
    def _reshape_long(self, df):
        """
        Converts wide-format year columns into long format for easier filtering.
        """
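        id_vars = ['iso3', 'country', 'region']  # identifier columns to keep
        # Metric prefixes to reshape; indicators outside this list (e.g. mmr, abr) are skipped
        prefixes = ('hdi_', 'le_', 'eys_', 'mys_', 'gnipc_', 'gdi_', 'gii_', 'co2_prod_')
        # Keep only columns ending in a numeric year so the int conversion below cannot fail
        value_vars = [
            col for col in df.columns
            if col.startswith(prefixes) and col.rsplit('_', 1)[-1].isdigit()
        ]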
        
        # Melt the wide columns into long format
        long_df = df.melt(id_vars=id_vars, value_vars=value_vars, var_name='metric_year', value_name='value')
        
        # Split 'metric_year' into 'metric' and 'year'
        long_df[['metric', 'year']] = long_df['metric_year'].str.rsplit('_', n=1, expand=True)
        long_df['year'] = long_df['year'].astype(int)
        long_df.drop(columns='metric_year', inplace=True)
        
        return long_df

    def get_data(self, countries=None, years=None, metric=None):
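        """
        Retrieve rows filtered by country, year, and/or metric.

        countries: a name or a list of names (case-insensitive).
        years: a (start, end) tuple for an inclusive range, a list of
            specific years, or a single int.
        metric: an indicator id such as "hdi" or "gii" (case-insensitive).
        """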
        df = self.long_df.copy()
        
        # Filter by countries
        if countries is not None:
            if isinstance(countries, str):
                countries = [countries]
            df = df[df['country'].str.lower().isin([c.lower() for c in countries])]
        
        # Filter by years
        if years is not None:
            if isinstance(years, tuple):  # range
                df = df[(df['year'] >= years[0]) & (df['year'] <= years[1])]
            elif isinstance(years, list):
                df = df[df['year'].isin(years)]
            else:  # single year
                df = df[df['year'] == years]
        
        # Filter by metric
        if metric is not None:
            metric = metric.lower().replace(' ', '_')
            df = df[df['metric'] == metric]
        
        return df.reset_index(drop=True)

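To make the wide-to-long step easier to follow, here is a minimal sketch of the same melt-and-split idea on a toy frame. The country names and numbers are invented placeholders, not values from the UN file. As a variation on the class above, the sketch selects every column that ends in an underscore plus a four-digit year instead of using a fixed prefix list, which would also pick up identifiers such as `se_f`.

```python
import re

import pandas as pd

# Toy wide-format frame; all values are made up for illustration only.
wide = pd.DataFrame({
    "iso3": ["AAA", "BBB"],
    "country": ["Alpha", "Beta"],
    "region": ["X", "Y"],
    "hdi_2019": [0.70, 0.80],
    "hdi_2020": [0.71, 0.81],
    "se_f_2020": [55.0, 65.0],
})

id_vars = ["iso3", "country", "region"]
# Variation: keep any column shaped like `<id>_<four-digit year>`.
value_vars = [c for c in wide.columns if re.fullmatch(r".+_\d{4}", c)]

# One row per (country, metric_year) pair.
long_df = wide.melt(id_vars=id_vars, value_vars=value_vars,
                    var_name="metric_year", value_name="value")

# Split at the LAST underscore, so "se_f_2020" becomes ("se_f", "2020").
long_df[["metric", "year"]] = long_df["metric_year"].str.rsplit("_", n=1, expand=True)
long_df["year"] = long_df["year"].astype(int)
long_df = long_df.drop(columns="metric_year")

print(long_df)
```

Splitting from the right matters because some identifiers contain underscores themselves; a plain left-to-right split would cut `se_f` in two.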

In [2]:
# Load CSV
hdi = HDIData("HDR25_Composite_indices_complete_time_series.csv")

# Get HDI for Norway, 2010-2020
norway_hdi = hdi.get_data(countries="Norway", years=(2010, 2020), metric="HDI")

# Gender Inequality Index (GII) for Nicaragua and Sweden, 2000-2020
life_exp = hdi.get_data(countries=["Nicaragua", "Sweden"], years=(2000, 2020), metric="gii")

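The frames returned by `get_data` are in long format, with one row per country and year. For a side-by-side comparison it can help to pivot them so that years are rows and countries are columns. The sketch below uses a hand-built stand-in with the same columns as the long frame; the country names and values are placeholders, not real GII figures.

```python
import pandas as pd

# Stand-in for a frame returned by get_data(); values are invented placeholders.
result = pd.DataFrame({
    "iso3": ["AAA", "AAA", "BBB", "BBB"],
    "country": ["Alpha", "Alpha", "Beta", "Beta"],
    "region": ["X", "X", "Y", "Y"],
    "value": [0.1, 0.2, 0.3, 0.4],
    "metric": ["gii", "gii", "gii", "gii"],
    "year": [2000, 2001, 2000, 2001],
})

# One row per year, one column per country.
by_year = result.pivot(index="year", columns="country", values="value")
print(by_year)
```

`pivot` raises an error if a (year, country) pair occurs more than once, so filter to a single metric first.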

# Environmental Performance Index (EPI) - Yale
First, a quick look through the metadata file, which lists basically all the indicators. Alternatively, you can look at the epi2024variables2024-12-11.csv file. The website has a nicer description, but for indicator IDs I think the files here are easier to use.
Website: https://epi.yale.edu/