## Choosing EU ETS vs Carbon Tax
The EU ETS only starts in 2005, whereas carbon tax coverage depends on which countries had already implemented one and can go back as early as 1990.
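
A quick way to confirm where each series starts is to look up its first non-missing year. The sketch below uses a small made-up table (the `toy_prices` name and all values are illustrative, not taken from the World Bank file) and relies on pandas' `first_valid_index`.

```python
import numpy as np
import pandas as pd

# Toy price table indexed by year; values are made up for illustration only.
toy_prices = pd.DataFrame(
    {
        "eu_ets": [np.nan, np.nan, 20.0, 25.0],
        "finland_carbon_tax": [1.5, 2.0, 20.0, 22.0],
    },
    index=["1990", "2004", "2005", "2006"],
)

# First year with a non-missing price for each initiative.
first_years = {col: toy_prices[col].first_valid_index() for col in toy_prices.columns}
print(first_years)
```

The same dictionary comprehension should work on `carbon_pricing_data` once it has been loaded and indexed by year.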

In [1]:
import pandas as pd
import statsmodels.api as sm

carbon_pricing_data = pd.read_excel('./ggdp_data/world_bank_carbon_pricing.xlsx', header=1, sheet_name='Compliance_Price', na_values='-')
carbon_pricing_data = carbon_pricing_data.set_index('Name of the initiative').T
carbon_pricing_data.rename(columns = lambda x: str.lower(x.replace(' ', '_')), inplace=True)
#dropping all metadata rows
carbon_pricing_data = carbon_pricing_data.iloc[7:]
carbon_pricing_data = carbon_pricing_data.apply(pd.to_numeric)
carbon_pricing_data.index = [str(x) for x in carbon_pricing_data.index]

eu_ets = carbon_pricing_data['eu_ets']
finland_carbon_tax = carbon_pricing_data['finland_carbon_tax']
#Finland has two carbon tax series (transport fuels and heating fuels) that are identical across the whole period, so keep the first column
finland_carbon_tax = finland_carbon_tax.iloc[:,0]

In [2]:
#checking whether EU ETS can proxy for the carbon tax or the other way round. Does either of them hold any additional information?
ets_joined_carbon_tax = pd.DataFrame(eu_ets).join(finland_carbon_tax)
ets_joined_carbon_tax = ets_joined_carbon_tax.dropna()
ets_joined_carbon_tax.head()

model = sm.OLS.from_formula('eu_ets ~ finland_carbon_tax', data=ets_joined_carbon_tax)
results = model.fit()
results.summary()




| | | | |
|---|---|---|---|
| Dep. Variable: | eu_ets | R-squared: | 0.165 |
| Model: | OLS | Adj. R-squared: | 0.116 |
| Method: | Least Squares | F-statistic: | 3.365 |
| Date: | Fri, 03 May 2024 | Prob (F-statistic): | 0.0842 |
| Time: | 16:06:21 | Log-Likelihood: | -86.898 |
| No. Observations: | 19 | AIC: | 177.8 |
| Df Residuals: | 17 | BIC: | 179.7 |
| Df Model: | 1 | | |
| Covariance Type: | nonrobust | | |

| | coef | std err | t | P>\|t\| | [0.025 | 0.975] |
|---|---|---|---|---|---|---|
| Intercept | 7.4432 | 11.154 | 0.667 | 0.514 | -16.090 | 30.976 |
| finland_carbon_tax | 0.3522 | 0.192 | 1.834 | 0.084 | -0.053 | 0.757 |

| | | | |
|---|---|---|---|
| Omnibus: | 6.489 | Durbin-Watson: | 0.533 |
| Prob(Omnibus): | 0.039 | Jarque-Bera (JB): | 4.314 |
| Skew: | 1.139 | Prob(JB): | 0.116 |
| Kurtosis: | 3.508 | Cond. No. | 114.0 |


### Analysing Results
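I wanted to see whether the ETS price and the carbon tax are strongly correlated; if they are, I could probably assume that one of them proxies for the other. The linear relationship between the two is weak and not significant at the 5% level (slope p-value 0.084, R-squared 0.165, 19 observations), meaning that they don't proxy linearly for each other. The Durbin-Watson statistic of 0.533 also points to positively autocorrelated residuals, so the reported standard errors should be read with caution. Thus I should probably use one of them, or a combination of both, when calculating the GGDP results.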
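
Since both price series are yearly and the Durbin-Watson statistic in the summary is far below 2, it may be worth comparing the correlation in levels with the correlation of year-on-year changes. The sketch below does this on synthetic random-walk data (the `toy` frame and its values are stand-ins, not the World Bank series), so the printed numbers say nothing about the real relationship.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
years = [str(y) for y in range(2005, 2024)]  # 19 yearly observations

# Synthetic stand-ins for the two price series (NOT the World Bank data).
toy = pd.DataFrame(
    {
        "eu_ets": np.cumsum(rng.normal(1.0, 5.0, len(years))) + 20,
        "finland_carbon_tax": np.cumsum(rng.normal(2.0, 3.0, len(years))) + 20,
    },
    index=years,
)

# Pearson correlation in levels; for a one-regressor OLS, R-squared equals r**2.
r_levels = toy["eu_ets"].corr(toy["finland_carbon_tax"])

# Correlation of year-on-year changes, which removes a shared trend.
r_diffs = toy["eu_ets"].diff().corr(toy["finland_carbon_tax"].diff())

print(f"levels: r = {r_levels:.3f}, r^2 = {r_levels**2:.3f}")
print(f"first differences: r = {r_diffs:.3f}")
```

Swapping `toy` for `ets_joined_carbon_tax` would give the corresponding figures for the actual data.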

## Calculating GGDP
The base formula which I'm going to use is this: $ GGDP = GDP - KtCO_{2} \cdot P_{tCO_2} - T_{waste} \cdot 74\,\text{kWh} \cdot P_{1\,\text{kWh elec}} - GNI \cdot \%NRD / 100 $

The three subtracted terms are:
1. Air pollution: CO2 emissions priced at the carbon price
2. Waste pollution, translated into an electricity cost
3. Natural resource depletion
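
The arithmetic of the base formula can be sketched as a small function. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, emissions are assumed to be in kilotonnes and the carbon price per tonne (hence the factor of 1000, which the formula as written leaves implicit), and 74 is read as kWh per tonne of waste. The input numbers are made up.

```python
def ggdp_base(gdp, co2_kt, price_per_tco2, waste_tonnes,
              price_per_kwh, gni, nrd_pct_gni, kwh_per_tonne_waste=74.0):
    """Base GGDP formula; all money amounts must be in the same currency."""
    # Air pollution: emissions in kt converted to tonnes (assumption),
    # then priced per tonne of CO2.
    emission_cost = co2_kt * 1000.0 * price_per_tco2
    # Waste: tonnes of waste translated into kWh and priced per kWh.
    waste_cost = waste_tonnes * kwh_per_tonne_waste * price_per_kwh
    # Natural resource depletion, given as a percentage of GNI.
    depletion_cost = gni * nrd_pct_gni / 100.0
    return gdp - emission_cost - waste_cost - depletion_cost


# Made-up round numbers, purely to show the arithmetic (not Finnish data).
example = ggdp_base(
    gdp=1_000_000_000.0,
    co2_kt=100.0,
    price_per_tco2=50.0,
    waste_tonnes=10_000.0,
    price_per_kwh=0.5,
    gni=1_000_000_000.0,
    nrd_pct_gni=1.0,
)
print(example)  # 984630000.0
```

Here the three deductions are 5,000,000 (emissions), 370,000 (waste) and 10,000,000 (depletion).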

The figure of 74 kWh comes from several papers; this still needs checking.

I managed to find all the data for this, which raises the question of what data to include and how.

The problems with this base formula are:
1. It doesn't account for defensive costs, i.e. the cost of restoring and protecting the environment. I need to find figures for this.
2. The figure 74 isn't really based on anything concrete (the citation is quite unclear).
3. It doesn't account for green innovation. I'm not even sure I want to add this, but perhaps I should.

The main problem with actually deriving GGDP is data availability and the lack of an accounting standard that can be implemented using available data.


In [None]:
#Loading Data for ggdp
finland_ggdp = pd.DataFrame()

wdi_data = pd.read_excel('./ggdp_data/P_Data_Extract_From_World_Development_Indicators.xlsx', header=0, sheet_name='Data', na_values='..')
wdi_data = wdi_data.set_index('Country Name').T
wdi_data = wdi_data.drop(['Series Code'])
wdi_data = wdi_data.rename({"Series Name":"variable_name"})
#trim each index label to the text before its first space
wdi_data = wdi_data.rename(index= lambda x: x[:y]  if (y:=x.find(' ')) != -1 else x)

def replace_values_from_dict(curr_value, value_dict: dict):
    #return the mapped value, or the original value when there is no mapping
    return value_dict.get(curr_value, curr_value)

wdi_variables_to_replace = {'GNI, PPP (current international $)':'gni_ppp',
                            'Adjusted savings: carbon dioxide damage (% of GNI)':'as_co2_damage_gni',
                            'Adjusted savings: natural resources depletion (% of GNI)': 'as_nrd_gni',
                            'Adjusted savings: particulate emission damage (% of GNI)': 'as_ped_gni',
                            'GNI (current US$)' : 'gni_dollar',
                            'CO2 emissions (kt)' : 'co2_kt',
                            'Total greenhouse gas emissions (kt of CO2 equivalent)':'total_gge_kt',
                            'GDP, PPP (current international $)':'gdp_ppp',
                            'GDP (current US$)':'gdp_dollar'}

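#renaming variables
wdi_data.loc['variable_name'] = (wdi_data.loc['variable_name']).apply(replace_values_from_dict, args=(wdi_variables_to_replace,))

#combine all finland data into one dataframe to avoid problems with dates
finland_data = wdi_data['Finland'].copy()
#use the short variable names as column labels, then drop the metadata rows
finland_data.columns = finland_data.loc['variable_name']
finland_data = finland_data.drop(['Country', 'variable_name'], errors='ignore')
finland_data = finland_data.apply(pd.to_numeric, errors='coerce')
finland_data = finland_data.join(finland_carbon_tax, how = 'left')
finland_data.head()


#Calculating GGDP using base formula

#emission cost: co2_kt is in kilotonnes, the carbon price is assumed to be per tonne
finland_ggdp['co2_emission_cost'] = finland_data['co2_kt'] * 1000 * finland_data['finland_carbon_tax']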

# finland_ggdp['Finland','env_cost'] = 123 
# finland_ggdp['Finland', 'depletion_cost'] = 123
# finland_ggdp['Finland'] = finland_data['GDP'] - finland_data['env_cost'] - finland_data['']