## Choosing EU ETS vs Carbon TAX
EU ETS start only at 2005 and Carbon Tax depends on countries that have already implemented it but could go as early as 1990.

In [100]:
import numpy as np
import pandas as pd
import statsmodels.api as sm

carbon_pricing_data = pd.read_excel('./ggdp_data/world_bank_carbon_pricing.xlsx', header=1,
                                    sheet_name='Compliance_Price', na_values='-')
carbon_pricing_data = carbon_pricing_data.set_index('Name of the initiative').T
carbon_pricing_data.rename(columns=lambda x: str.lower(x.replace(' ', '_')), inplace=True)
#dropping all metadata rows
carbon_pricing_data = carbon_pricing_data.iloc[7:]
carbon_pricing_data = carbon_pricing_data.apply(pd.to_numeric)
carbon_pricing_data.index = [str(x) for x in carbon_pricing_data.index]

eu_ets = carbon_pricing_data['eu_ets']
finland_carbon_tax = carbon_pricing_data['finland_carbon_tax']
#Finland has two types of carbon tax: Transport fuels, heating fuels which are the same across the whole series
finland_carbon_tax = finland_carbon_tax.iloc[:, 0]

In [101]:
#checking to see if EU ETS can proxy for carbon tax or the other way round. Does any one of them hold and additional information?
ets_joined_carbon_tax = pd.DataFrame(eu_ets).join(finland_carbon_tax)
ets_joined_carbon_tax = ets_joined_carbon_tax.dropna()
ets_joined_carbon_tax.head()

model = sm.OLS.from_formula('eu_ets ~ finland_carbon_tax', data=ets_joined_carbon_tax)
results = model.fit()
results.summary()


  res = hypotest_fun_out(*samples, **kwds)


0,1,2,3
Dep. Variable:,eu_ets,R-squared:,0.165
Model:,OLS,Adj. R-squared:,0.116
Method:,Least Squares,F-statistic:,3.365
Date:,"Sat, 11 May 2024",Prob (F-statistic):,0.0842
Time:,22:05:44,Log-Likelihood:,-86.898
No. Observations:,19,AIC:,177.8
Df Residuals:,17,BIC:,179.7
Df Model:,1,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,7.4432,11.154,0.667,0.514,-16.090,30.976
finland_carbon_tax,0.3522,0.192,1.834,0.084,-0.053,0.757

0,1,2,3
Omnibus:,6.489,Durbin-Watson:,0.533
Prob(Omnibus):,0.039,Jarque-Bera (JB):,4.314
Skew:,1.139,Prob(JB):,0.116
Kurtosis:,3.508,Cond. No.,114.0


### Analysing Results
I was looking to see if ets and carbon tax are very correlated, if they are then I could probably assume that one of them proxies for the other. We can see that linear relationship between the variables is insignificant. Meaning that they don't proxy linearly for each other. Thus I should probably use one of them or a combination of both when calculating the GGDP results.

## Calculating GGDP
The base formula which I'm going to use is this: $ GGDP = GDP - KtCO_{2} \cdot P_{tCO_2} - Twaste \cdot 74 kWh \cdot P_{1 kWh elec} - GNI \cdot \% NRD/100 $

First term - air pollution
Second Term - waste pollution translated into electricity cost
Third Term - Natural resource depletion

The 74 figure is based on different papers: check this

Managed to find all the data for this: arises the question of what data to include and how.

The problems with this base formula are:
1. doesn't account for defensive costs - the cost of restoring and protecting the environment. Need to find figures for this.
2. the figure 74 isn't really based on anything concrete (the citing is quite unclear)
3. no accounting for green innovation - not even sure if I want to add this, but perhaps I should

The main problem with actually deriving GGDP is data availability and lack of accounting standard that can be implemented using available data.


In [102]:
#Loading Data for ggdp
finland_ggdp = pd.DataFrame()

wdi_data = pd.read_excel('./ggdp_data/P_Data_Extract_From_World_Development_Indicators.xlsx', header=0,
                         sheet_name='Data', na_values='..')
wdi_data = wdi_data.set_index('Country Name').T
wdi_data = wdi_data.drop(['Series Code'])
wdi_data = wdi_data.rename({"Series Name": "variable_name"})
wdi_data = wdi_data.rename(index=lambda x: x[:y] if (y := x.find(' ')) != -1 else x)
wdi_data = wdi_data.dropna(axis='columns', how='all')


def replace_values_from_dict(curr_value, value_dict: dict):
    try:
        return value_dict[curr_value]
    except KeyError:
        return curr_value


wdi_variables_to_replace = {'GNI, PPP (current international $)': 'gni_ppp',
                            'Adjusted savings: carbon dioxide damage (% of GNI)': 'as_co2_damage_gni',
                            'Adjusted savings: natural resources depletion (% of GNI)': 'as_nrd_gni',
                            'Adjusted savings: particulate emission damage (% of GNI)': 'as_ped_gbi',
                            'GNI (current US$)': 'gni_dollar',
                            'CO2 emissions (kt)': 'co2_kt',
                            'Total greenhouse gas emissions (kt of CO2 equivalent)': 'total_gge_kt',
                            'GDP, PPP (current international $)': 'gdp_ppp',
                            'GDP (current US$)': 'gdp_dollar'}

#def lower_only_letters(curr_str:str):


#renaming variables
wdi_data.loc['variable_name'] = (wdi_data.loc['variable_name']).apply(replace_values_from_dict,
                                                                      args=(wdi_variables_to_replace,))


def change_col_names_to_country(df: pd.DataFrame):
    column_name_list = []
    curr_col_name: str
    for curr_col_name, curr_series in df.items():
        #'country-name'_'variable-name'
        new_column_name = f'{curr_col_name.lower()}_{curr_series["variable_name"]}'
        column_name_list.append(new_column_name)
    df.columns = column_name_list


#combine all finland data into one dataframe to avoid problems with dates
finland_data = wdi_data['Finland']
change_col_names_to_country(finland_data)
finland_data = finland_data.drop(index=['Country', 'variable_name'])
finland_data = finland_data.join(finland_carbon_tax, how='left')

finland_data = finland_data.apply(pd.to_numeric)

There's a problem with calculating waste because not enough waste data is available in order to constrcut the waste cost term. So there's a need to do imputation.
I thought it would make sense to do the imputation by regressing waste on gdp, since as the country has more economic output the more waste it's going to have.
We can see that when we run the regression we get a very high R-squared and that the model seems to be $Net Waste = 0.0005 * GDP[PPP] $, the slope coefficient is significant. We can use this result to fill in the missing values all the way to 1990.

In [103]:
#Waste calculation
# Waste generation data preparation and projections
waste_gen_data = pd.read_csv('./ggdp_data/eurostat_generation_of_waste_by_category.csv')
waste_treatment_data = pd.read_csv('./ggdp_data/eurostat_waste_treatment_by_waste_category.csv')

#getting Finland's total waste generation
finland_gen_total = waste_gen_data[(waste_gen_data['geo'] == 'FI') & (waste_gen_data['waste'] == 'TOTAL')]
finland_gen_total.index = finland_gen_total['TIME_PERIOD']
#getting Finland's total energy recovery from waste, since we are pricing waste through energy prices.
finland_energy_recovery = waste_treatment_data[
    (waste_treatment_data['geo'] == 'FI') & (waste_treatment_data['wst_oper'] == 'RCV_E')]
finland_energy_recovery.index = finland_energy_recovery['TIME_PERIOD']

net_waste_data = pd.DataFrame()
#calculating net waste: net_waste = generated_waste - energy_recovered_from_waste
net_waste_data['finland_net_waste'] = finland_gen_total['OBS_VALUE'] - finland_energy_recovery['OBS_VALUE']

net_waste_data.index = [str(x) for x in net_waste_data.index]

#is GDP correlated with waste

gdp_waste_data = net_waste_data.join(finland_data['finland_gdp_ppp'])

model = sm.OLS.from_formula('finland_net_waste ~ finland_gdp_ppp', data=gdp_waste_data)
results = model.fit()
results.summary()

  res = hypotest_fun_out(*samples, **kwds)


0,1,2,3
Dep. Variable:,finland_net_waste,R-squared:,0.774
Model:,OLS,Adj. R-squared:,0.742
Method:,Least Squares,F-statistic:,23.97
Date:,"Sat, 11 May 2024",Prob (F-statistic):,0.00176
Time:,22:05:48,Log-Likelihood:,-158.04
No. Observations:,9,AIC:,320.1
Df Residuals:,7,BIC:,320.5
Df Model:,1,,
Covariance Type:,nonrobust,,

0,1,2,3,4,5,6
,coef,std err,t,P>|t|,[0.025,0.975]
Intercept,-1.773e+07,2.24e+07,-0.792,0.454,-7.07e+07,3.52e+07
finland_gdp_ppp,0.0005,9.76e-05,4.896,0.002,0.000,0.001

0,1,2,3
Omnibus:,0.956,Durbin-Watson:,2.067
Prob(Omnibus):,0.62,Jarque-Bera (JB):,0.645
Skew:,0.244,Prob(JB):,0.724
Kurtosis:,1.783,Cond. No.,1330000000000.0


In [104]:
#adding the model from the regression to the data
finland_data['net_waste'] = finland_data['finland_gdp_ppp'] * results.params['finland_gdp_ppp']

### Electricity Price
We use electricity price to be able to price solid waste. The problem is electricity price is tiered so the calculation has to be done accordingly. What I plan on doing is taking the fractional part of the total waste converted to electricity respective to house-hold use and industrial use in Finland and then look at each one of them as if it were a single consumer. This is an obvious over-simplification but otherwise there is no way of calculating it because of a lack of data.

In [105]:
from math import inf

#code to consumption tier. consumption tier in kWh
tax_tier_map_pre_2007_hh = {'4161050': 600, '4161100': 1200, '4161150': 3500, '4161200': 7500, '4161250': inf}

tax_tier_map_post_2007_hh = {'KWH_LT1000': 1000, 'KWH1000-2499': 2500, 'KWH2500-4999': 5000, 'KWH5000-14999': 15000,
                             'KWH_GE15000': inf}

tax_tier_map_post_2007_nh = {'MWH_LT20': 20000, 'MWH20-499': 500000, 'MWH500-1999': 2000000, 'MWH2000-19999': 20000000,
                             'MWH20000-69999': 70000000, 'MWH70000-149999': 150000000, 'MWH_GE150000': inf}

tax_tier_map_pre_2007_nh = {'4162050': 30000, '4162100': 50000, '4162150': 160000, '4162200': 1250000,
                            '4162250': 2000000, '4162300': 10000000, '4162350': 24000000, '4162400': 50000000,
                            '4162450': inf}
tax_tier_map = {**tax_tier_map_pre_2007_hh, **tax_tier_map_post_2007_hh, **tax_tier_map_post_2007_nh,
                **tax_tier_map_pre_2007_nh}


#Need to take care of units (industrial and household and to take care of row with value KT_TOT)
def process_electricity_df(country_code: str, elec_df: pd.DataFrame, with_tax=True) -> pd.DataFrame:
    tax_filter = 'I_TAX' if with_tax else 'X_TAX'
    filtered_df: pd.DataFrame = elec_df[(elec_df['geo'] == country_code) & (elec_df['tax'] == tax_filter)]
    filtered_df.loc[:, 'year'] = filtered_df['TIME_PERIOD'].apply(lambda x: x[:x.find('-')])
    if 'consom' in filtered_df:
        filtered_df = filtered_df.rename(columns={'consom': 'nrg_cons'})
    filtered_df = filtered_df.assign(tax_tier=filtered_df.loc[:, 'nrg_cons'].apply(str))
    filtered_df = filtered_df.drop(columns=['nrg_cons', 'OBS_FLAG', 'DATAFLOW', 'LAST UPDATE', 'freq', 'product'])
    filtered_df = filtered_df.rename(columns={'OBS_VALUE': 'price'})
    filtered_df = filtered_df.sort_values(by=['currency', 'tax_tier', 'year'])
    #need to finish fillna
    filtered_df['price'] = filtered_df.loc[:, 'price'].fillna(method='ffill')
    annual_data = filtered_df.groupby(['currency', 'tax_tier', 'year'], as_index=False)['price'].mean()
    #annual_data = annual_data.rename(columns={'OBS_VALUE': 'price', 'TIME_PERIOD': 'year'})
    tot_kwh_indices = annual_data.index[annual_data['tax_tier'] == 'TOT_KWH'].tolist()
    annual_data = annual_data.drop(index=tot_kwh_indices)
    return annual_data


def process_electricity_consumption_df(country_code: str, df: pd.DataFrame) -> pd.DataFrame:
    filtered_df: pd.DataFrame = df[(df['geo'] == country_code)]
    filtered_df = filtered_df.drop(columns=['OBS_FLAG', 'DATAFLOW', 'LAST UPDATE', 'freq'])
    annual_data = {}
    for index, row in filtered_df.iterrows():
        current_year = row['TIME_PERIOD']
        if current_year not in annual_data:
            annual_data[current_year] = {'year': current_year, 'hh_consumption': 0, 'nh_consumption': 0}
        current_year_item = annual_data[current_year]
        if row['nrg_bal'] == 'FC_OTH_HH_E':
            current_year_item['year'] = row['TIME_PERIOD']
            current_year_item['hh_consumption'] = row['OBS_VALUE']
            current_year_item['nh_consumption'] -= row['OBS_VALUE']
        elif row['nrg_bal'] == 'FC':
            current_year_item['nh_consumption'] += row['OBS_VALUE']
    #list of dictionaries of years, each item contains year, hh and nh consumption
    annual_data = [annual_data[key] for key in annual_data]
    return pd.DataFrame(annual_data)


elec_hh_pre_2007 = pd.read_csv('./ggdp_data/eurostat_electricity/eurostat_electricity_price_hh_pre_2007.csv', header=0)
elec_hh_pre_2007_processed = process_electricity_df(country_code='FI', elec_df=elec_hh_pre_2007)
elec_hh_post_2007 = pd.read_csv('./ggdp_data/eurostat_electricity/eurostat_electricity_price_hh_post_2007.csv',
                                header=0)
elec_hh_post_2007_processed = process_electricity_df(country_code='FI', elec_df=elec_hh_post_2007)
elec_hh = pd.concat([elec_hh_post_2007_processed, elec_hh_pre_2007_processed])
elec_hh = elec_hh.replace({'tax_tier': tax_tier_map})
elec_hh['tax_tier'] = pd.to_numeric(elec_hh['tax_tier'])
elec_hh['year'] = pd.to_numeric(elec_hh['year'])

elec_nh_pre_2007 = pd.read_csv('./ggdp_data/eurostat_electricity/eurostat_electricity_price_nh_pre_2007.csv')
elec_nh_pre_2007_processed = process_electricity_df(country_code='FI', elec_df=elec_nh_pre_2007)
elec_nh_post_2007 = pd.read_csv('./ggdp_data/eurostat_electricity/eurostat_electricity_price_nh_post_2007.csv')
elec_nh_post_2007_processed = process_electricity_df(country_code='FI', elec_df=elec_nh_post_2007)
elec_nh = pd.concat([elec_nh_post_2007_processed, elec_nh_pre_2007_processed])
elec_nh = elec_nh.replace({'tax_tier': tax_tier_map})
elec_nh['tax_tier'] = pd.to_numeric(elec_nh['tax_tier'])
elec_nh['year'] = pd.to_numeric(elec_nh['year'])

elec_consumption = pd.read_csv('./ggdp_data/eurostat_electricity/eurostat_electricity_consumption.csv', header=0)
elec_consump_processed = process_electricity_consumption_df(country_code='FI', df=elec_consumption)


A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_df.loc[:, 'year'] = filtered_df['TIME_PERIOD'].apply(lambda x: x[:x.find('-')])
  filtered_df['price'] = filtered_df.loc[:, 'price'].fillna(method='ffill')
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  filtered_df.loc[:, 'year'] = filtered_df['TIME_PERIOD'].apply(lambda x: x[:x.find('-')])
  filtered_df['price'] = filtered_df.loc[:, 'price'].fillna(method='ffill')
  elec_hh = elec_hh.replace({'tax_tier': tax_tier_map})
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[ro

When looking at electricity prices I only decided to look at electricity prices for household and non-household consumptio because the cost for household vs non-hosehold electricty prices are different. This presents a bit of a challange in pricing electricity becasue it raises the question of which rate to use. I decided to take the relative part of household and non-household from the WTE output. For example if HH is 2/3 of the total consumption and NH is 1/3 so we will price 2/3 of WTE using HH pricess and a 1/3 of WTE using NH prices.

In [107]:
#some electricity prices for post 2007 data are missing for finland data so I took official data from Statistics Finland’s free-of-charge statistical databases
# I plan on merging the data together to get as much data as possible
def process_fin_stat_df(df: pd.DataFrame) -> pd.DataFrame:
    df['year'] = df['Month'].apply(lambda x: x[:x.find('M')])
    df = df.replace({'..': nan})
    df = df.dropna(axis='columns', how='all')
    df = df.drop(columns=['Month'])
    df = df.apply(pd.to_numeric)
    average_yearly_price_df = df.groupby(['year'], as_index=False).mean()
    average_yearly_with_tax_tier_rows = []

    for index, row in average_yearly_price_df.iterrows():
        tax_tiers = list(row.index.values)
        tax_tiers.remove('year')
        for tier in tax_tiers:
            average_yearly_with_tax_tier_rows.append(
                {'year': row['year'],
                 'tax_tier': tier,
                 'price': row[tier] / 100,
                 'currency': 'EUR'}
            )
    return pd.DataFrame(average_yearly_with_tax_tier_rows)

    #for each year we have a price for each consumption range


fin_stat_elec_price_hh = pd.read_csv('ggdp_data/finalnd_statistics_db/finalnd_statistics_hh_elec_prices.csv', header=0)
fin_stat_elec_price_hh = process_fin_stat_df(fin_stat_elec_price_hh)

fin_stat_elec_price_nh = pd.read_csv('ggdp_data/finalnd_statistics_db/finalnd_statistics_nh_elec_prices.csv', header=0)
fin_stat_elec_price_nh = process_fin_stat_df(fin_stat_elec_price_nh)

#combining both datasets
#for hh we start at 2009 for official dataset, so anything prior to that will be taken from processed dataset


hh_elec_filtered_to_combine_eurostat = elec_hh[(elec_hh['currency'] == 'EUR') & (elec_hh['year'] < 2009)]
hh_elec_combined = pd.concat([hh_elec_filtered_to_combine_eurostat, fin_stat_elec_price_hh])
hh_elec_combined = hh_elec_combined.sort_values(by=['year', 'tax_tier'])

#


nh_elec_filtered_to_combine_eurostat = elec_nh[(elec_nh['currency'] == 'EUR') & (elec_nh['year'] < 2008)]

index_of_missing_val = fin_stat_elec_price_nh[(fin_stat_elec_price_nh['year'] == 2008) & (fin_stat_elec_price_nh['tax_tier'] == '20000')].index
price_for_filling = elec_nh[(elec_nh['year'] == 2008) & (elec_nh['currency'] == 'EUR') & (elec_nh['tax_tier'] == 20000)].squeeze(axis=0)['price']
fin_stat_elec_price_nh.loc[index_of_missing_val, 'price'] = price_for_filling

nh_elec_combined = pd.concat([nh_elec_filtered_to_combine_eurostat, fin_stat_elec_price_nh])
nh_elec_combined = nh_elec_combined.sort_values(by=['year', 'tax_tier'])
print('1234')

KeyboardInterrupt: 

In [82]:
from numpy import nan


def get_consumption_cost(elec_consumption: float, currency: str, year: int, elec_price: pd.DataFrame) -> float:
    if year == 2017:
        print('1234')
    currency_year_df = elec_price[
        (elec_price['currency'] == currency)
        & (elec_price['year'] == year)
        ]
    #looking for greatest tax_tier s.t. tax_tier>consumption && tax_tier exists in data (not all tax tiers exist for all countries)
    if currency_year_df.empty:
        return nan
    existing_tax_tiers = list(currency_year_df['tax_tier'].unique())
    existing_tax_tiers.sort()
    fitting_tax_tier = None
    for curr_tax_tier in existing_tax_tiers:
        if elec_consumption < curr_tax_tier:
            fitting_tax_tier = curr_tax_tier
    if fitting_tax_tier is None:
        #we couldn't find a tax tier that was greater, so we take the biggest one we have to complete the data
        fitting_tax_tier = existing_tax_tiers[-1]

    tax_tier = currency_year_df[currency_year_df['tax_tier'] == fitting_tax_tier]
    tax_tier = tax_tier.squeeze(axis=0)
    return elec_consumption * tax_tier['price']


def calculate_energy_cost(wte: float, year: int, currency: str, elec_consumption: pd.DataFrame,
                          hh_elec: pd.DataFrame, nh_elec: pd.DataFrame) -> float:
    """
    :param wte: waste to energy
    :param elec_consumption: 
    :param hh_elec_price: 
    :param nh_elec_price: 
    :return: 
    """
    #need to check that we have all the data to actually calculate this
    current_consumption = elec_consumption[elec_consumption['year'] == year]
    if current_consumption.empty:
        return nan
    hh_consumption = current_consumption['hh_consumption'].iloc[0]
    nh_consumption = current_consumption['nh_consumption'].iloc[0]
    hh_to_total = hh_consumption / (hh_consumption + nh_consumption)
    #the absolute part of the total wte. we do this to calculate separately the price of hh and nh elec prices
    current_hh_consumption = wte * hh_to_total
    current_nh_consumption = wte - current_hh_consumption

    hh_consumption_cost = get_consumption_cost(current_hh_consumption, currency=currency, year=year, elec_price=hh_elec)
    nh_consumption_cost = get_consumption_cost(current_nh_consumption, currency=currency, year=year, elec_price=nh_elec)
    return hh_consumption_cost + nh_consumption_cost


cost = calculate_energy_cost(wte=7.533340e+09,
                             year=2017,
                             currency='EUR',
                             elec_consumption=elec_consump_processed,
                             hh_elec=elec_hh,
                             nh_elec=elec_nh)


KeyboardInterrupt: 

In [36]:
#unittests to electricity prices
#we want to make sure all tax tiers are calculated correctly
# unittests could be more robust but this isn't production code
import unittest
from unittest import TestCase

hh_data = {
    'currency': ['EUR', 'EUR', 'EUR', 'EUR'],
    'tax_tier': [1000, 1000, 2000, 2000],
    'year': [2003, 2004, 2003, 2004],
    'price': [50, 55, 60, 65]
}

nh_data = {
    'currency': ['EUR', 'EUR', 'EUR', 'EUR'],
    'tax_tier': [1000, 1000, 2000, 2000],
    'year': [2003, 2004, 2003, 2004],
    'price': [25, 30, 35, 40]
}

consumption_data = {
    'year': [2003, 2004],
    'hh_consumption': [1, 2],
    'nh_consumption': [2, 1]
}


class ElectricityTotalCostTest(TestCase):

    @classmethod
    def setUpClass(cls):
        cls.hh_elec_df = pd.DataFrame(hh_data)
        cls.nh_elec_df = pd.DataFrame(nh_data)
        cls.consumption_df = pd.DataFrame(consumption_data)

    def test_get_consumption_cost(self):
        hh_cost_1 = get_consumption_cost(elec_consumption=500, currency='EUR', year=2003, elec_price=self.hh_elec_df)
        self.assertEquals(hh_cost_1, 500 * 50)
        hh_cost_2 = get_consumption_cost(elec_consumption=1500, currency='EUR', year=2003, elec_price=self.hh_elec_df)
        self.assertEquals(hh_cost_2, 1500 * 60)

    def test_calculate_energy_cost(self):
        cost_2003 = calculate_energy_cost(wte=2700,
                                          year=2003,
                                          currency='EUR',
                                          elec_consumption=self.consumption_df,
                                          hh_elec=self.hh_elec_df,
                                          nh_elec=self.nh_elec_df)
        #hh tax tier up to 1000 is 50
        #nh tax tier from 1000 to 2000 is 25
        self.assertEquals(cost_2003, 900 * 50 + 1800 * 35)
        cost_2004 = calculate_energy_cost(wte=2700,
                                          year=2004,
                                          currency='EUR',
                                          elec_consumption=self.consumption_df,
                                          hh_elec=self.hh_elec_df,
                                          nh_elec=self.nh_elec_df)
        #hh tax tier in 2004  from 1000 to 20000 is 65
        #nh tax tier up in 2004 to 1000 is 30
        self.assertEquals(cost_2004, 1800 * 65 + 900 * 30)


unittest.main(argv=[''], verbosity=2, exit=False)


  self.assertEquals(cost_2003, 900 * 50 + 1800 * 35)
FAIL
test_get_consumption_cost (__main__.ElectricityTotalCostTest.test_get_consumption_cost) ... FAIL

FAIL: test_calculate_energy_cost (__main__.ElectricityTotalCostTest.test_calculate_energy_cost)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\yarde\AppData\Local\Temp\ipykernel_43912\2904036468.py", line 51, in test_calculate_energy_cost
    self.assertEquals(cost_2003, 900 * 50 + 1800 * 35)
AssertionError: 117000.0 != 108000

FAIL: test_get_consumption_cost (__main__.ElectricityTotalCostTest.test_get_consumption_cost)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\Users\yarde\AppData\Local\Temp\ipykernel_43912\2904036468.py", line 38, in test_get_consumption_cost
    self.assertEquals(hh_cost_1, 500 * 50)
AssertionError: 30000 != 25000

---------------------------------------------------

<unittest.main.TestProgram at 0x2a32b635050>

In [38]:
#calculating GGDP with current data
#Calculating GGDP using base formula

#emission cost
#carbon tax is us$/ton of CO2
finland_ggdp['co2_emission_cost'] = finland_data['finland_co2_kt'] * 1000 * finland_data['finland_carbon_tax']

# cost = calculate_energy_cost(wte=1250,
#                              year=2003,
#                              currency='EUR',
#                              elec_consumption=elec_consump_processed,
#                              hh_elec=elec_hh,
#                              nh_elec=elec_nh)


#waste cost
#need to add electricity price
wte_conversion_constant = 74  # kWh/ton of waste
total_wte: pd.Series = finland_data['net_waste'] * wte_conversion_constant
years_index = list(total_wte.index)
years_series = pd.Series({year: int(year) for year in years_index}, index=years_index)
years_series.name = 'year'
wte_df = pd.concat([total_wte, years_series], axis=1)


def test_func(x):
    print(f"year:{x['year']},net_waste:{x['net_waste']}")


finland_ggdp['waste_cost'] = wte_df.apply(lambda x: calculate_energy_cost(wte=x['net_waste'],
                                                                          year=x['year'], currency='EUR',
                                                                          elec_consumption=elec_consump_processed,
                                                                          hh_elec=elec_hh, nh_elec=elec_nh)
                                          , axis=1)

#finland_ggdp['waste_cost'] = wte_df.apply(lambda x: test_func(x), axis=1)

print('1234')
#finland_ggdp['waste_cost'] = finland_data['net_waste'] * 74 * finland_data['electricity_price']


#natural resource depletion
#finland_ggdp['nrd_cost'] = finland_data['finland_gni_ppp'] * (finland_data['as_nrd_gni']/100)


#finland_data.head()
# finland_ggdp['Finland','env_cost'] = 123 
# finland_ggdp['Finland', 'depletion_cost'] = 123
# finland_ggdp['Finland'] = finland_data['GDP'] - finland_data['env_cost'] - finland_data['']
# 
# 
# 
# finland_data.head()
# 
# #carbon_pricing_data.rename(columns = lambda x: str.lower(x.replace(' ', '_')), inplace=True)
# #dropping all metadata rows
# #carbon_pricing_data = carbon_pricing_data.iloc[7:]
# #carbon_pricing_data = carbon_pricing_data.apply(pd.to_numeric)
# 
# #eu_ets = carbon_pricing_data['eu_ets']
# #finland_carbon_tax = carbon_pricing_data['finland_carbon_tax']
# #Finland has two types of carbon tax: Transport fuels, heating fuels which are the same across the whole series
# #finland_carbon_tax = finland_carbon_tax.iloc[:,0]


KeyboardInterrupt: 