In [None]:
!pip install eurostat
import eurostat
import numpy as np
import pandas as pd

Collecting eurostat
  Downloading eurostat-1.1.0-py3-none-any.whl (15 kB)
Installing collected packages: eurostat
Successfully installed eurostat-1.1.0


# Territorial GHG

## Information about dataset

Territorial emissions: all emissions from residents and non-residents inside a country.

Dataset: ENV_AIR_GGE
https://ec.europa.eu/eurostat/databrowser/view/ENV_AIR_GGE/default/table?lang=en

Source: EUROSTAT

Frequency of measure: Annual

Unit: Million tonnes (we convert the values to tonnes)

Values: Greenhouse gases (CO2, N2O in CO2 equivalent, CH4 in CO2 equivalent, HFC in CO2 equivalent, PFC in CO2 equivalent, SF6 in CO2 equivalent, NF3 in CO2 equivalent)

Sector: total net emissions per country (excluding memo items)

In [None]:
GHG_territorial = eurostat.get_data_df('env_air_gge')

In [None]:
# We rename the columns to make them more understandable
GHG_territorial = GHG_territorial.rename(columns={'src_crf': 'sector', 'geo\\TIME_PERIOD': 'countries'})

# We focus our analysis on the total net emissions per country: Total (excluding memo items)
GHG_territorial = GHG_territorial.loc[GHG_territorial.sector == 'TOTXMEMO']

# We consider total GHG emissions
GHG_territorial = GHG_territorial.loc[GHG_territorial.airpol == 'GHG']

# We only need one unit, we will use: Million tonnes
GHG_territorial = GHG_territorial.loc[GHG_territorial.unit == 'MIO_T']
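
The three successive filters above can also be written as a single boolean mask. The following is a minimal sketch on a made-up frame: the values are invented, and only the column names and codes mirror the cell above. It also shows that a raw string is one way to spell the backslash in the `geo\TIME_PERIOD` column name without relying on an invalid escape sequence.

```python
import pandas as pd

# Toy frame imitating the layout of the downloaded table (values are invented)
toy = pd.DataFrame({
    "unit": ["MIO_T", "THS_T", "MIO_T"],
    "airpol": ["GHG", "GHG", "CO2"],
    "src_crf": ["TOTXMEMO", "TOTXMEMO", "TOTXMEMO"],
    r"geo\TIME_PERIOD": ["BE", "BE", "BE"],
    "2020": [1.0, 1000.0, 0.8],
})

# A raw string and an escaped backslash spell the same column name
same_name = r"geo\TIME_PERIOD" == "geo\\TIME_PERIOD"

toy = toy.rename(columns={"src_crf": "sector", r"geo\TIME_PERIOD": "countries"})

# One combined mask instead of three successive filters
mask = (
    (toy["sector"] == "TOTXMEMO")
    & (toy["airpol"] == "GHG")
    & (toy["unit"] == "MIO_T")
)
filtered = toy.loc[mask]
print(filtered)
```

Only the row that satisfies all three conditions is kept, which is the same result the step-by-step filtering produces.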

In [None]:
# We put the years in rows and create a new column containing the values
# This is done to facilitate the creation of a merged dataset containing all the data
GHG_territorial = GHG_territorial.melt(id_vars=['freq', 'unit', 'airpol', 'sector', 'countries'], var_name="Year", value_name="GHG")

# We only include in the dataset the important columns
# This will make the merging of all the datasets easier
GHG_territorial = GHG_territorial.loc[:,['countries','Year','GHG']]
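
To illustrate what the reshaping step does, here is a small sketch on an invented wide table (the numbers are placeholders, not Eurostat data). Each year column becomes a set of rows, so that every row holds one (country, year) pair.

```python
import pandas as pd

# Toy wide table: one row per country, one column per year (values are invented)
wide = pd.DataFrame({
    "countries": ["BE", "FR"],
    "2019": [1.0, 2.0],
    "2020": [3.0, 4.0],
})

# melt() turns each year column into rows: one row per (country, year) pair
long = wide.melt(id_vars=["countries"], var_name="Year", value_name="GHG")
print(long)
```

The long layout is what makes the later merges on `['countries','Year']` possible.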

In [None]:
GHG_territorial['GHG']

0            NaN
1            NaN
2            NaN
3            NaN
4            NaN
          ...   
1142    50.50278
1143    66.14473
1144     6.10587
1145    13.00072
1146    33.61233
Name: GHG, Length: 1147, dtype: float64

In [None]:
# We convert the unit from million tonnes to tonnes
GHG_territorial['GHG'] = GHG_territorial['GHG'] * 1e6

# We include the unit in the name of the column:
GHG_territorial = GHG_territorial.rename(columns={'GHG': 'Territorial GHG emissions [t]'})

# Residential GHG

## Information about dataset

Residential emissions: all emissions resulting from the activities of a country’s residents, including those that occur abroad.

Dataset: env_ac_ainah_r2

https://ec.europa.eu/eurostat/databrowser/view/ENV_AC_AINAH_R2/default/table?lang=en

Source: EUROSTAT

Frequency of measure: Annual

Unit: tonnes

Values: Greenhouse gases (CO2, N2O in CO2 equivalent, CH4 in CO2 equivalent, HFC in CO2 equivalent, PFC in CO2 equivalent, SF6 in CO2 equivalent, NF3 in CO2 equivalent)

Sector: All NACE (Nomenclature of Economic Activities) activities plus households

In [None]:
GHG_residential = eurostat.get_data_df('env_ac_ainah_r2')

In [None]:
# We rename the columns to make them more understandable
GHG_residential = GHG_residential.rename(columns={'nace_r2': 'sector', 'geo\\TIME_PERIOD': 'countries'})

# We focus our analysis on the total emissions per country: all NACE activities plus households
GHG_residential = GHG_residential.loc[GHG_residential.sector == 'TOTAL_HH']

# We consider total GHG emissions
GHG_residential = GHG_residential.loc[GHG_residential.airpol == 'GHG']

# We only need one unit, we will use: Tonnes
GHG_residential = GHG_residential.loc[GHG_residential.unit == 'T']

In [None]:
# We put the years in rows and create a new column containing the values
# This is done to facilitate the creation of a merged dataset containing all the data
GHG_residential = GHG_residential.melt(id_vars=['freq', 'unit', 'airpol', 'sector', 'countries'], var_name="Year", value_name="GHG")

# We only include in the dataset the important columns
# This will make the merging of all the datasets easier
GHG_residential = GHG_residential.loc[:,['countries','Year','GHG']]

In [None]:
GHG_residential = GHG_residential.rename(columns={'GHG': 'Residential GHG emissions [t]'})

# Footprint emissions

## Information about dataset

Transfer Emissions: The net difference between territorial and consumption emissions, i.e. the emissions from the production of exports minus the emissions from the production of imports.

Footprint Emissions: Residential Emissions - Transfer Emissions

Source: Global Carbon Atlas & EUROSTAT

https://ec.europa.eu/eurostat/databrowser/view/ENV_AC_AINAH_R2/default/table?lang=en

https://globalcarbonatlas.org/emissions/carbon-emissions/
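
As a worked example of the footprint definition above, the sketch below applies the formula to two invented cases. The function name `footprint` and the numbers are illustrative assumptions, not part of the notebook.

```python
def footprint(residential_t, transfer_t):
    """Footprint emissions as defined above: residential minus transfer (tonnes)."""
    return residential_t - transfer_t

# Positive transfer: export-related emissions exceed import-related emissions
net_exporter = footprint(100.0, 20.0)

# Negative transfer: import-related emissions exceed export-related emissions
net_importer = footprint(100.0, -30.0)

print(net_exporter, net_importer)
```

A country with positive transfer emissions ends up with a footprint below its residential emissions, and a country with negative transfer emissions ends up above them.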

In [None]:
GHG_transfer = pd.read_excel('https://raw.githubusercontent.com/AntoineTrabia/Green-Domestic-Product/main/data/Transfer_emissions.xlsx')

In [None]:
GHG_transfer.rename(columns={'Czech Republic': 'Czechia'}, inplace=True)

GHG_transfer = pd.melt(GHG_transfer, id_vars=['Year'], var_name='countries', value_name='Transfer GHG emissions [Mt]')

# We convert the unit from million tonnes to tonnes
GHG_transfer['Transfer GHG emissions [Mt]'] = GHG_transfer['Transfer GHG emissions [Mt]'] * 1e6

# We update the unit in the name of the column:
GHG_transfer = GHG_transfer.rename(columns={'Transfer GHG emissions [Mt]': 'Transfer GHG emissions [t]'})

# We convert the column with countries to ensure that the format is the same as for the eurostat datasets:
country_mapping = {
    'Austria': 'AT',
    'Luxembourg': 'LU',
    'Switzerland': 'CH',
    'Belgium': 'BE',
    'Bulgaria': 'BG',
    'Cyprus': 'CY',
    'Czechia': 'CZ',
    'Germany': 'DE',
    'Denmark': 'DK',
    'Estonia': 'EE',
    'Greece': 'EL',
    'Spain': 'ES',
    'European Union - 27 countries (from 2020)': 'EU27_2020',
    'Finland': 'FI',
    'France': 'FR',
    'Croatia': 'HR',
    'Hungary': 'HU',
    'Ireland': 'IE',
    'Iceland': 'IS',
    'Italy': 'IT',
    'Lithuania': 'LT',
    'Latvia': 'LV',
    'Malta': 'MT',
    'Netherlands': 'NL',
    'Norway': 'NO',
    'Poland': 'PL',
    'Portugal': 'PT',
    'Romania': 'RO',
    'Sweden': 'SE',
    'Slovenia': 'SI',
    'Slovakia': 'SK'
}
# Replace full country names with Eurostat country codes using the dictionary
GHG_transfer['countries'] = GHG_transfer['countries'].replace(country_mapping)

# We convert the Year column from int to str so we can merge this dataset with eurostat
GHG_transfer['Year'] = GHG_transfer['Year'].astype(str)
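
The reason for the type conversion can be seen on a toy example. In this sketch (invented values), one frame carries years as strings, as they come out of a melt on year-labelled columns, and the other carries them as integers, as they come out of a spreadsheet. Aligning the key type first lets the merge match the rows.

```python
import pandas as pd

# Years that were column labels are strings; years read from a spreadsheet are integers
left = pd.DataFrame({"countries": ["BE"], "Year": ["2020"], "a": [1.0]})
right = pd.DataFrame({"countries": ["BE"], "Year": [2020], "b": [2.0]})

# Align the key type before merging
right["Year"] = right["Year"].astype(str)
merged = pd.merge(left, right, on=["countries", "Year"], how="inner")
print(merged)
```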

In [None]:
# We use residential emissions and transfer emissions to calculate footprint emissions

GHG_footprint = pd.merge(GHG_residential, GHG_transfer, on=['countries','Year'], how='outer').sort_values(['countries','Year'])

GHG_footprint['Footprint GHG emissions [t]'] = GHG_footprint['Residential GHG emissions [t]'] - GHG_footprint['Transfer GHG emissions [t]']

# GDP

## Information about dataset

Dataset: nama_10_gdp https://ec.europa.eu/eurostat/databrowser/view/nama_10_gdp__custom_10341462/default/table?lang=en

Source: EUROSTAT

Frequency of measure: Annual

Unit: Current prices, million euro

National accounts indicator: Gross domestic product at market prices

In [None]:
GDP = eurostat.get_data_df('nama_10_gdp')

In [None]:
# We only need one unit, we will use: Current prices, million euro
GDP = GDP.loc[GDP.unit == 'CP_MEUR']

# As a national accounts indicator we will use: Gross domestic product at market prices
GDP = GDP.loc[GDP.na_item == 'B1GQ']

# We rename the columns to make them more understandable
GDP = GDP.rename(columns={'geo\\TIME_PERIOD': 'countries'})

In [None]:
# We put the years in rows and create a new column containing the values
# This is done to facilitate the creation of a merged dataset containing all the data
GDP = GDP.melt(id_vars=['freq', 'unit', 'na_item', 'countries'], var_name="Year", value_name="GDP")

# We only include in the dataset the important columns
# This will make the merging of all the datasets easier
GDP = GDP.loc[:,['countries','Year','GDP']]

In [None]:
# We include the unit in the name of the column:
GDP = GDP.rename(columns={'GDP': 'GDP [MIO_EURO]'})

# Population

## Information about dataset

Dataset: demo_pjan https://ec.europa.eu/eurostat/databrowser/view/demo_pjan/default/table?lang=en

Source: EUROSTAT

Frequency of measure: Annual

Unit: Number

Values: Total population

In [None]:
POP = eurostat.get_data_df('demo_pjan')

In [None]:
# We rename the columns to make them more understandable
POP = POP.rename(columns={'geo\\TIME_PERIOD': 'countries'})

# We do not need the breakdown by age and sex, so we keep only the totals
POP = POP.loc[POP.age == 'TOTAL']
POP = POP.loc[POP.sex == 'T']

In [None]:
# We put the years in rows and create a new column containing the values
# This is done to facilitate the creation of a merged dataset containing all the data
POP = POP.melt(id_vars=["freq", "unit", "age", "sex", "countries"], var_name="Year", value_name="Population")

# We only include in the dataset the important columns
# This will make the merging of all the datasets easier
POP = POP.loc[:,['countries','Year','Population']]

# Air pollutants

## Information about dataset

Dataset: ENV_AIR_EMIS https://ec.europa.eu/eurostat/databrowser/view/ENV_AIR_EMIS__custom_773267/default/table?lang=en

Source: EUROSTAT

Frequency of measure: Annual

Unit: Tonne

Sector: National total for the entire territory (based on fuel sold)

In [None]:
AirPol = eurostat.get_data_df('env_air_emis')

In [None]:
# We rename the columns to make them more understandable
AirPol = AirPol.rename(columns={'geo\\TIME_PERIOD': 'countries'})

# We focus our analysis on the: National total for the entire territory (based on fuel sold)
AirPol = AirPol.loc[AirPol.src_nfr == 'NFR_TOT_NAT']

In [None]:
# Let's see which air pollutants are included in the dataframe
AirPol.airpol.unique()

array(['AS', 'CD', 'CO', 'CR', 'CU', 'HG', 'NH3', 'NI', 'NMVOC', 'NOX',
       'PB', 'PM10', 'PM2_5', 'SE', 'SOX', 'ZN'], dtype=object)

In [None]:
# We put the years in rows and create a new column containing the values
# This is done to facilitate the creation of a merged dataset containing all the data
AirPol = AirPol.melt(id_vars=["freq", "unit", "airpol", "src_nfr", "countries"], var_name="Year", value_name="Values")

In [None]:
# Here we create a different dataset for each air pollutant in the dataset
# This ensures each air pollutant corresponds to a column in the final dataset
Arsenic = AirPol.loc[AirPol.airpol == 'AS',['countries','Year','Values']]
Carbon_monoxide = AirPol.loc[AirPol.airpol == 'CO',['countries','Year','Values']]
Lead = AirPol.loc[AirPol.airpol == 'PB',['countries','Year','Values']]
Nitrogen_oxides = AirPol.loc[AirPol.airpol == 'NOX',['countries','Year','Values']]
Sulphur_oxides = AirPol.loc[AirPol.airpol == 'SOX',['countries','Year','Values']]
Ammonia = AirPol.loc[AirPol.airpol == 'NH3',['countries','Year','Values']]
Particulates_2_5 = AirPol.loc[AirPol.airpol == 'PM2_5',['countries','Year','Values']]
Particulates_10 = AirPol.loc[AirPol.airpol == 'PM10',['countries','Year','Values']]
Non_methane_volatile_organic_compounds = AirPol.loc[AirPol.airpol == 'NMVOC',['countries','Year','Values']]
Cadmium = AirPol.loc[AirPol.airpol == 'CD',['countries','Year','Values']]
Mercury = AirPol.loc[AirPol.airpol == 'HG',['countries','Year','Values']]
Chromium = AirPol.loc[AirPol.airpol == 'CR',['countries','Year','Values']]
Copper = AirPol.loc[AirPol.airpol == 'CU',['countries','Year','Values']]
Nickel = AirPol.loc[AirPol.airpol == 'NI',['countries','Year','Values']]
Selenium = AirPol.loc[AirPol.airpol == 'SE',['countries','Year','Values']]
Zinc = AirPol.loc[AirPol.airpol == 'ZN',['countries','Year','Values']]

In [None]:
# We rename the columns of these datasets
# This ensures each column will be recognisable in the final dataset
Arsenic.rename(columns={'Values': 'As [t]'}, inplace=True)
Carbon_monoxide.rename(columns={'Values': 'CO [t]'}, inplace=True)
Lead.rename(columns={'Values': 'Pb [t]'}, inplace=True)
Nitrogen_oxides.rename(columns={'Values': 'NOx [t]'}, inplace=True)
Sulphur_oxides.rename(columns={'Values': 'SOx [t]'}, inplace=True)
Ammonia.rename(columns={'Values': 'NH3 [t]'}, inplace=True)
Particulates_2_5.rename(columns={'Values': 'PM2.5 [t]'}, inplace=True)
Particulates_10.rename(columns={'Values': 'PM10 [t]'}, inplace=True)
Non_methane_volatile_organic_compounds.rename(columns={'Values': 'NMVOC [t]'}, inplace=True)
Cadmium.rename(columns={'Values': 'Cd [t]'}, inplace=True)
Mercury.rename(columns={'Values': 'Hg [t]'}, inplace=True)
Chromium.rename(columns={'Values': 'Cr [t]'}, inplace=True)
Copper.rename(columns={'Values': 'Cu [t]'}, inplace=True)
Nickel.rename(columns={'Values': 'Ni [t]'}, inplace=True)
Selenium.rename(columns={'Values': 'Se [t]'}, inplace=True)
Zinc.rename(columns={'Values': 'Zn [t]'}, inplace=True)
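
As an alternative to building one dataset per pollutant, the long table could be reshaped in a single step with `DataFrame.pivot`. This is a sketch of a different technique than the one used in the cells above, run on invented values. Note that a pivot keeps every (country, year) pair and fills gaps with NaN, so it behaves like an outer merge rather than the inner merges used in this notebook.

```python
import pandas as pd

# Toy long table with two pollutants (values are invented)
long = pd.DataFrame({
    "countries": ["BE", "BE", "FR", "FR"],
    "Year": ["2020"] * 4,
    "airpol": ["AS", "PB", "AS", "PB"],
    "Values": [1.0, 2.0, 3.0, 4.0],
})

# One column per pollutant code, one row per (country, year)
wide = long.pivot(index=["countries", "Year"], columns="airpol", values="Values").reset_index()
wide.columns.name = None

# Give the pollutant columns the same labels as in the cell above
wide = wide.rename(columns={"AS": "As [t]", "PB": "Pb [t]"})
print(wide)
```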

In [None]:
# Let's merge all of these datasets into one final AirPol dataset
Airpol_final = pd.merge(Arsenic, Lead, on=['countries','Year'], how='inner')
Airpol_final = pd.merge(Airpol_final, Nitrogen_oxides, on=['countries','Year'], how='inner')
Airpol_final = pd.merge(Airpol_final, Sulphur_oxides, on=['countries','Year'], how='inner')
Airpol_final = pd.merge(Airpol_final, Ammonia, on=['countries','Year'], how='inner')
Airpol_final = pd.merge(Airpol_final, Particulates_2_5, on=['countries','Year'], how='inner')
Airpol_final = pd.merge(Airpol_final, Particulates_10, on=['countries','Year'], how='inner')
Airpol_final = pd.merge(Airpol_final, Non_methane_volatile_organic_compounds, on=['countries','Year'], how='inner')
Airpol_final = pd.merge(Airpol_final, Cadmium, on=['countries','Year'], how='inner')
Airpol_final = pd.merge(Airpol_final, Mercury, on=['countries','Year'], how='inner')
Airpol_final = pd.merge(Airpol_final, Chromium, on=['countries','Year'], how='inner')
Airpol_final = pd.merge(Airpol_final, Nickel, on=['countries','Year'], how='inner')
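
The chain of merges above follows one repeated pattern, which could also be folded with `functools.reduce`. The sketch below uses three invented frames and shows a side effect of inner merges: a country that is missing from any one pollutant dataset disappears from the result.

```python
from functools import reduce

import pandas as pd

keys = ["countries", "Year"]

# Toy per-pollutant frames (values are invented); FR is missing from the last one
frames = [
    pd.DataFrame({"countries": ["BE", "FR"], "Year": ["2020", "2020"], "As [t]": [1.0, 2.0]}),
    pd.DataFrame({"countries": ["BE", "FR"], "Year": ["2020", "2020"], "Pb [t]": [3.0, 4.0]}),
    pd.DataFrame({"countries": ["BE"], "Year": ["2020"], "Ni [t]": [5.0]}),
]

# Fold the list into one frame with successive inner merges
combined = reduce(lambda left, right: pd.merge(left, right, on=keys, how="inner"), frames)
print(combined)
```

Only BE survives here, because FR has no row in the third frame.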

# Merging datasets into two final datasets: one for territorial emissions and another for

In [None]:
df_final = pd.merge(GHG_territorial, GHG_footprint, on=['countries','Year'], how='inner')
df_final = pd.merge(df_final, GDP, on=['countries','Year'], how='inner')
df_final = pd.merge(df_final, POP, on=['countries','Year'], how='inner')
df_final = pd.merge(df_final, Airpol_final, on=['countries','Year'], how='inner')

In [None]:
# Define a dictionary mapping country codes to their full names
country_mapping = {
    'BE': 'Belgium',
    'BG': 'Bulgaria',
    'CY': 'Cyprus',
    'CZ': 'Czechia',
    'DE': 'Germany',
    'DK': 'Denmark',
    'EE': 'Estonia',
    'EL': 'Greece',
    'ES': 'Spain',
    'EU27_2020': 'European Union - 27 countries (from 2020)',
    'FI': 'Finland',
    'FR': 'France',
    'HR': 'Croatia',
    'HU': 'Hungary',
    'IE': 'Ireland',
    'IS': 'Iceland',
    'IT': 'Italy',
    'LT': 'Lithuania',
    'LV': 'Latvia',
    'MT': 'Malta',
    'NL': 'Netherlands',
    'NO': 'Norway',
    'PL': 'Poland',
    'PT': 'Portugal',
    'RO': 'Romania',
    'SE': 'Sweden',
    'SI': 'Slovenia',
    'SK': 'Slovakia'
}

# Replace country codes with full names using the dictionary
df_final['countries'] = df_final['countries'].replace(country_mapping)

## Treatment of countries with missing values

Countries: Austria, Luxembourg, Switzerland, Turkey, United Kingdom

In [None]:
datasets = [GDP, POP, GHG_territorial, GHG_footprint, Arsenic, Lead, Nitrogen_oxides, Sulphur_oxides, Ammonia,
            Particulates_2_5, Particulates_10, Non_methane_volatile_organic_compounds,
            Cadmium, Mercury, Chromium, Nickel]

In [None]:
def test(country_name, dataset):
  """Print whether a country code appears in a dataset, labelled by the dataset's value column."""
  if country_name in list(dataset.countries.unique()):
    print(f'{country_name} in {dataset.columns[2]}')
  else:
    print(f'{country_name} not in {dataset.columns[2]}')
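
The same check can be collected into one table instead of printed line by line. The helper `coverage` below is a hypothetical name introduced for this sketch, and the two toy datasets contain invented values. It returns a boolean table with one row per country code and one column per dataset.

```python
import pandas as pd

def coverage(country_codes, named_datasets):
    """Return a boolean table: rows are country codes, columns are dataset labels."""
    table = {}
    for label, df in named_datasets.items():
        present = set(df["countries"])
        table[label] = [code in present for code in country_codes]
    return pd.DataFrame(table, index=country_codes)

# Toy stand-ins for two of the datasets (values are invented)
toy_sets = {
    "GDP": pd.DataFrame({"countries": ["AT", "BE"], "Year": ["2020", "2020"], "GDP": [1.0, 2.0]}),
    "As [t]": pd.DataFrame({"countries": ["BE"], "Year": ["2020"], "As [t]": [0.1]}),
}

table = coverage(["AT", "BE"], toy_sets)
print(table)
```

A table like this makes it easier to see at a glance which pollutants are missing for which country.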

### Austria

In [None]:
country_name = 'AT'
for dataset in datasets:
  test(country_name,dataset)

AT in GDP [MIO_EURO]
AT in Population
AT in Territorial GHG emissions [t]
AT in Residential GHG emissions [t]
AT not in As [t]
AT in Pb [t]
AT in NOx [t]
AT in SOx [t]
AT in NH3 [t]
AT in PM2.5 [t]
AT in PM10 [t]
AT in NMVOC [t]
AT in Cd [t]
AT in Hg [t]
AT not in Cr [t]
AT not in Ni [t]


### Luxembourg

In [None]:
country_name = 'LU'
for dataset in datasets:
  test(country_name,dataset)

LU in GDP [MIO_EURO]
LU in Population
LU in Territorial GHG emissions [t]
LU in Residential GHG emissions [t]
LU not in As [t]
LU in Pb [t]
LU in NOx [t]
LU in SOx [t]
LU in NH3 [t]
LU in PM2.5 [t]
LU in PM10 [t]
LU in NMVOC [t]
LU in Cd [t]
LU in Hg [t]
LU not in Cr [t]
LU not in Ni [t]


### Switzerland

In [None]:
country_name = 'CH'
for dataset in datasets:
  test(country_name,dataset)

CH in GDP [MIO_EURO]
CH in Population
CH in Territorial GHG emissions [t]
CH in Residential GHG emissions [t]
CH not in As [t]
CH in Pb [t]
CH in NOx [t]
CH in SOx [t]
CH in NH3 [t]
CH in PM2.5 [t]
CH in PM10 [t]
CH in NMVOC [t]
CH in Cd [t]
CH in Hg [t]
CH not in Cr [t]
CH not in Ni [t]


### United Kingdom

In [None]:
country_name = 'UK'
for dataset in datasets:
  test(country_name,dataset)

UK in GDP [MIO_EURO]
UK in Population
UK not in Territorial GHG emissions [t]
UK not in Residential GHG emissions [t]
UK not in As [t]
UK not in Pb [t]
UK not in NOx [t]
UK not in SOx [t]
UK not in NH3 [t]
UK not in PM2.5 [t]
UK not in PM10 [t]
UK not in NMVOC [t]
UK not in Cd [t]
UK not in Hg [t]
UK not in Cr [t]
UK not in Ni [t]


### Turkey

In [None]:
country_name = 'TR'
for dataset in datasets:
  test(country_name,dataset)

TR in GDP [MIO_EURO]
TR in Population
TR not in Territorial GHG emissions [t]
TR not in Residential GHG emissions [t]
TR in As [t]
TR in Pb [t]
TR in NOx [t]
TR in SOx [t]
TR in NH3 [t]
TR in PM2.5 [t]
TR in PM10 [t]
TR in NMVOC [t]
TR in Cd [t]
TR in Hg [t]
TR in Cr [t]
TR in Ni [t]


## We can create a dataset that contains AT, CH & LU, by excluding As, Cr & Ni

In [None]:
# Let's merge the pollutant datasets that include AT, CH & LU (all except As, Cr & Ni) into one AirPol dataset
Airpol_with_CH = pd.merge(Nitrogen_oxides, Lead, on=['countries','Year'], how='inner')
Airpol_with_CH = pd.merge(Airpol_with_CH, Sulphur_oxides, on=['countries','Year'], how='inner')
Airpol_with_CH = pd.merge(Airpol_with_CH, Ammonia, on=['countries','Year'], how='inner')
Airpol_with_CH = pd.merge(Airpol_with_CH, Particulates_2_5, on=['countries','Year'], how='inner')
Airpol_with_CH = pd.merge(Airpol_with_CH, Particulates_10, on=['countries','Year'], how='inner')
Airpol_with_CH = pd.merge(Airpol_with_CH, Non_methane_volatile_organic_compounds, on=['countries','Year'], how='inner')
Airpol_with_CH = pd.merge(Airpol_with_CH, Cadmium, on=['countries','Year'], how='inner')
Airpol_with_CH = pd.merge(Airpol_with_CH, Mercury, on=['countries','Year'], how='inner')

In [None]:
df_final_with_CH = pd.merge(GHG_territorial, GHG_footprint, on=['countries','Year'], how='inner')
df_final_with_CH = pd.merge(df_final_with_CH, GDP, on=['countries','Year'], how='inner')
df_final_with_CH = pd.merge(df_final_with_CH, POP, on=['countries','Year'], how='inner')
df_final_with_CH = pd.merge(df_final_with_CH, Airpol_with_CH, on=['countries','Year'], how='inner')

In [None]:
# Define a dictionary mapping country codes to their full names
country_mapping = {
    'AT': 'Austria',
    'LU': 'Luxembourg',
    'CH': 'Switzerland',
    'BE': 'Belgium',
    'BG': 'Bulgaria',
    'CY': 'Cyprus',
    'CZ': 'Czechia',
    'DE': 'Germany',
    'DK': 'Denmark',
    'EE': 'Estonia',
    'EL': 'Greece',
    'ES': 'Spain',
    'EU27_2020': 'European Union - 27 countries (from 2020)',
    'FI': 'Finland',
    'FR': 'France',
    'HR': 'Croatia',
    'HU': 'Hungary',
    'IE': 'Ireland',
    'IS': 'Iceland',
    'IT': 'Italy',
    'LT': 'Lithuania',
    'LV': 'Latvia',
    'MT': 'Malta',
    'NL': 'Netherlands',
    'NO': 'Norway',
    'PL': 'Poland',
    'PT': 'Portugal',
    'RO': 'Romania',
    'SE': 'Sweden',
    'SI': 'Slovenia',
    'SK': 'Slovakia'
}

# Replace country codes with full names using the dictionary
df_final_with_CH['countries'] = df_final_with_CH['countries'].replace(country_mapping)

In [None]:
# We create a dataset that contains all the data for AT, LU & CH, with the As, Ni & Cr columns filled with 0
# Note: these zeros are placeholders for missing data, not measured values
# .copy() gives an independent DataFrame, so adding columns does not raise SettingWithCopyWarning
df_CHATLU = df_final_with_CH.loc[df_final_with_CH.countries.isin(['Austria','Luxembourg','Switzerland'])].copy()
df_CHATLU['As [t]'] = 0
df_CHATLU['Ni [t]'] = 0
df_CHATLU['Cr [t]'] = 0


In [None]:
df_AllIncluded = pd.concat([df_final, df_CHATLU], ignore_index=True)
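
The concatenation works even though the two frames list their columns in a different order, because `pd.concat` aligns columns by name. A minimal sketch with invented values:

```python
import pandas as pd

a = pd.DataFrame({"countries": ["Belgium"], "As [t]": [1.0], "Pb [t]": [2.0]})

# Different column order, with "As [t]" set to the placeholder value 0
b = pd.DataFrame({"countries": ["Austria"], "Pb [t]": [3.0], "As [t]": [0]})

# Columns are matched by name; the result follows the column order of the first frame
both = pd.concat([a, b], ignore_index=True)
print(both)
```

Anyone analysing the combined file should keep in mind that the zeros for Austria, Luxembourg and Switzerland stand for missing measurements.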

# Downloading the Data

In [None]:
# We save the final dataset as a CSV file
df_AllIncluded.to_csv('Data_for_GrDP.csv', index=False)