# Total Emissions
In this notebook we compute the total emissions produced under each scenario. The idea is to create two dataframes: one with the overall emissions per time step, and one with the emissions per region.

After this, the main goal will be to find 1) the best scenario for the entire country and 2) the best scenario per region.

Recall that the usable data starts in 2018 (step 13 onwards), since that is the actual output of the agent-based model.

In [1]:
import numpy as np
import pandas as pd
import warnings

import datetime as dt
import time

import os
from os import listdir
from os.path import isfile,join
import matplotlib.pyplot as plt

### Defining path variables
We read and write files using these variables. Make sure you modify them according to where your data is located! Note that `PATH2` holds the data transformed in the Jupyter notebook named `dataset_exclud_hausholdType`: .csv files that exclude the household type.

In [2]:
PATH = '/home/moni/Documents/motmo/timeSeries_files/' # original data
PATH2 = '/home/moni/Documents/motmo/data_without_hhID/' # folder in which we will store transformed data

### List with the codes of the scenarios
The following retrieves a list of strings with the names of the scenario files, removing the `.csv` extension and the leading `timeSeries_`.

In [3]:
def list_file_names():
    file_names = [f for f in listdir(PATH2) if isfile(join(PATH2, f))]
    # file_names.remove('.~lock.timeSeries_CH0SP0SE0WE0BP0RE0CO0DI0WO0CS0.csv#')
    return file_names
def list_str_scenarios(file_names): # strips the ".csv" extension and the "timeSeries_" prefix
    sc_str = [name.replace(".csv", "") for name in file_names]
    sc_str = [name.replace("timeSeries_", "") for name in sc_str]
    return sc_str

The following functions add a new column called `total_emissions` to each of the files located in `PATH2`.

If time permits: additionally, map each step number to its date and add that list at the beginning.

In [4]:
def read_csv_clean(file_name):
    df = pd.read_csv(PATH2 + file_name)
    df = df[df['step'] >= 13].copy() # we start in 2018 (step 13); copy avoids SettingWithCopyWarning
    # dates_list = pd.date_range(start = "2018-01-01",periods = 181,freq="2M").strftime("%b-%Y").tolist()
    # ind_name = dict(zip(list(range(0,163)),dates_list))
    df = df.drop(columns=df.columns[0]) # drop the first column (the index saved with the file)
    return df
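
The step-to-date correspondence mentioned above could be sketched roughly as follows. This is only an illustration: the helper name `step_to_date_labels` is hypothetical, and the length of one step is an assumption (one month, `freq="MS"`), since the notebook does not state it. Only the starting point (step 13 = start of 2018) and the last step shown in the outputs (180) come from the notebook.

```python
import pandas as pd

def step_to_date_labels(first_step=13, last_step=180, start="2018-01-01", freq="MS"):
    """Map step numbers to date labels.

    ASSUMPTION: one step corresponds to one period of `freq` (monthly by default).
    Adjust `freq` to the real step length of the model.
    """
    steps = range(first_step, last_step + 1)
    dates = pd.date_range(start=start, periods=len(steps), freq=freq)
    return dict(zip(steps, dates.strftime("%b-%Y")))

labels = step_to_date_labels()
print(labels[13], labels[14])
```

The resulting dictionary could then be applied with `df['step'].map(labels)` to add a date column.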

In [5]:
def add_emissions_col(df):
    df['total_emissions'] = df.iloc[:,2:7].sum(axis=1)
    return df
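
Selecting the emission columns by position (`iloc[:,2:7]`) silently breaks if the column order changes. A possible alternative, shown here on a small invented dataframe, is to select them by name. The helper name `add_emissions_col_by_name` and the toy values are made up for illustration; the `emissions_` prefix matches the column names shown in the outputs of this notebook.

```python
import pandas as pd

def add_emissions_col_by_name(df, prefix="emissions_"):
    """Return a copy of df with 'total_emissions' = row-wise sum of all '<prefix>*' columns."""
    emission_cols = [c for c in df.columns if c.startswith(prefix)]
    out = df.copy()
    out["total_emissions"] = out[emission_cols].sum(axis=1)
    return out

# Toy data (invented values) with the same kind of columns as the scenario files
toy = pd.DataFrame({
    "step": [13, 13],
    "reID": [942, 1515],
    "emissions_C": [10.0, 20.0],
    "emissions_E": [1.0, 2.0],
    "stock_C": [5, 6],
})
result = add_emissions_col_by_name(toy)
print(result[["reID", "total_emissions"]])
```

Because `total_emissions` does not start with `emissions_`, running the helper twice on the same dataframe does not double count.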

In [7]:
def save_csv_emissions(df, file_name):
    if 'total_emissions' not in df:
        print(f"WARNING: '{file_name}' has no column 'total_emissions'. Make sure you have added it!!!")
        #warnings.warn('You have not added the total emissions column to this dataframe!!!')
    else:
        df.to_csv(PATH2 + file_name)

In [8]:
def save_emi_all_files(file_names):
    scenarios = list_str_scenarios(file_names)
    emis_df = pd.DataFrame(scenarios, columns=['scenario']) # new dataframe that stores the total emissions of each scenario
    t_emissions = []
    for file_name in file_names:
        df = read_csv_clean(file_name)
        df = add_emissions_col(df) # sums the emission columns into a new 'total_emissions' column
        t_emissions.append(df['total_emissions'].sum()) # total over all steps and regions
        save_csv_emissions(df, file_name) # saves (replaces) this dataframe
    emis_df['total_emissions'] = t_emissions # same order as file_names
    return emis_df

In [5]:
f_names = list_file_names().copy()

### Saving all files
Careful! Only uncomment these lines if necessary: they overwrite the files and take some time to run!

In [101]:
# e_df = save_emi_all_files(f_names)
# e_df.to_csv('total_emissions.csv',index=False) # to_csv returns None, so there is nothing to assign

### Getting the best scenario
Here we find the best scenario, that is, the one with the lowest total emissions.

In [98]:
e_df = pd.read_csv("total_emissions.csv")
e_df[ e_df['total_emissions'] == e_df['total_emissions'].min() ]

Unnamed: 0,scenario,total_emissions
185,CH1SP1SE0WE1BP0RE1CO1DI1WO0CS0,2049150000000.0
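
As a small illustration of the lookup above, the same minimum can be found with `idxmin`, and `nsmallest` gives a ranking of the runners-up. The scenario labels and values below are invented toy data, not results from the model.

```python
import pandas as pd

# Toy totals (invented): three hypothetical scenarios "A", "B", "C"
toy_totals = pd.DataFrame({
    "scenario": ["A", "B", "C"],
    "total_emissions": [3.0e12, 2.0e12, 2.5e12],
})

# Row with the lowest total emissions (first one if there are ties)
best_row = toy_totals.loc[toy_totals["total_emissions"].idxmin()]

# The two scenarios with the lowest emissions, best first
top_two = toy_totals.nsmallest(2, "total_emissions")

print(best_row["scenario"])
print(top_two)
```

Unlike the boolean mask, `idxmin` returns a single row even when several scenarios tie for the minimum.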


# Best Scenario

The scenario with the lowest total emissions is
<center>CH1SP1SE0WE1BP0RE1CO1DI1WO0CS0</center>

It corresponds to the following active measures:
- Charging infrastructure (CH)
- Public Transport Subsidy (SP)
- Car Weight regulation (WE)
- Urban Combustion Restrictions (RE)
- Higher Gas Price (CO)
- Intermodal Digitalisation (DI)
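
The list above can be read directly off the scenario code. A minimal sketch, assuming every code is a sequence of two-letter abbreviations each followed by `0` (off) or `1` (on), as the file names suggest; the helper name `decode_scenario` is hypothetical.

```python
import re

def decode_scenario(code):
    """Split a scenario code such as 'CH1SP0...' into {abbreviation: is_active} pairs.

    ASSUMPTION: the code consists of two uppercase letters followed by 0 or 1, repeated.
    """
    pairs = re.findall(r"([A-Z]{2})([01])", code)
    return {measure: flag == "1" for measure, flag in pairs}

flags = decode_scenario("CH1SP1SE0WE1BP0RE1CO1DI1WO0CS0")
active = [measure for measure, is_on in flags.items() if is_on]
print(active)
```

For the best scenario this yields the six abbreviations listed above: CH, SP, WE, RE, CO and DI.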

## Finding best scenario per region
Now we want to find the best possible scenario for each region. For that, we create a new dataframe that holds, for each scenario, the total emissions summed per region.

In [None]:
# df_test = read_csv_clean('timeSeries_CH0SP0SE1WE0BP0RE0CO0DI0WO0CS0.csv')
# df_test

Unnamed: 0,step,reID,emissions_C,emissions_E,emissions_N,emissions_P,emissions_S,stock_C,stock_E,stock_N,stock_P,stock_S,total_emissions
0,13,942,4.711625e+08,1.008459e+05,265.228918,2.632302e+07,3.280411e+05,38009,6,3334,5303,23,4.979146e+08
1,13,1515,2.666598e+09,1.547238e+06,1789.451017,2.318856e+08,2.181351e+06,222710,93,22928,47385,161,2.902214e+09
2,13,1516,1.665830e+09,4.216348e+05,1160.045302,1.017940e+08,6.906481e+05,137794,20,15195,20713,56,1.768737e+09
3,13,1517,9.394458e+08,5.126730e+05,612.850403,6.383884e+07,7.770476e+05,78415,25,7709,13203,53,1.004575e+09
4,13,1518,8.234238e+07,8.551655e+03,37.387628,6.216551e+06,6.334081e+03,7352,1,499,1396,1,8.857385e+07
...,...,...,...,...,...,...,...,...,...,...,...,...,...
2683,180,2335,4.019870e+08,8.707797e+06,92.905262,1.153707e+07,6.411415e+05,35303,1558,1068,3331,46,4.228731e+08
2684,180,2336,6.846575e+08,2.028399e+07,174.683810,1.705727e+07,1.001548e+06,57802,3871,2165,5039,56,7.230005e+08
2685,180,3312,2.989731e+08,6.056278e+06,74.912309,7.128699e+06,2.995310e+05,24716,1057,804,2039,19,3.124577e+08
2686,180,3562,5.165399e+08,1.792033e+07,188.542934,1.672108e+07,1.573477e+07,38042,2940,2152,4203,1436,5.669163e+08


In [8]:
# df_sum = df_test.groupby(by=["reID"]).sum()
# df_sum2 = df_sum.iloc[:,1:6].sum(axis=1)
# df_sum2

reID
942     8.321237e+10
1515    4.835675e+11
1516    2.936653e+11
1517    1.670008e+11
1518    1.480150e+10
1519    7.828534e+10
1520    3.525042e+10
2331    1.111796e+11
2332    3.131817e+10
2333    3.350340e+11
2334    4.653450e+10
2335    6.931632e+10
2336    1.185366e+11
3312    5.101551e+10
3562    9.282005e+10
6321    2.520081e+11
dtype: float64

### Getting emissions per scenario
This function takes as input the file name and returns a list with the total emissions per region (the first entry is the file name).

In [21]:
def get_emis_region(file_name):
    in_df = read_csv_clean(file_name)
    # relies on the 'total_emissions' column already saved in the file (see save_emi_all_files)
    per_region = in_df.groupby(by=["reID"])['total_emissions'].sum() # sorted by reID
    e_list = [file_name] + per_region.tolist()
    return e_list

In [48]:
#  TESTING THE FUNCTION
get_emis_region('timeSeries_CH0SP0SE1WE0BP0RE0CO0DI0WO0CS0.csv')

['timeSeries_CH0SP0SE1WE0BP0RE0CO0DI0WO0CS0.csv',
 83212373678.34177,
 483567485787.0221,
 293665276927.7161,
 167000849346.54013,
 14801501193.497364,
 78285344986.8881,
 35250421473.40773,
 111179587097.51816,
 31318168037.307335,
 335034024244.6618,
 46534498136.36508,
 69316321821.74915,
 118536573118.93947,
 51015513367.2347,
 92820054628.933,
 252008101884.3359]

In [27]:
def df_emissions_per_region(file_names):
    cols = ["Scenario", 942, 1515, 1516, 1517, 1518, 1519, 1520, 2331, 2332, 2333, 2334, 2335, 2336, 3312, 3562, 6321]
    data = []
    for file_name in file_names:
        values = get_emis_region(file_name)
        data.append(dict(zip(cols, values))) # one row per scenario
    # DataFrame.append was removed in pandas 2.0, so build the dataframe in one go
    df = pd.DataFrame(data, columns=cols)
    return df
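
The function above builds the wide table row by row from a fixed list of region IDs. If the per-region totals were instead collected in a long table (one row per scenario and region), the same layout could be obtained with `pivot_table`. The sketch below uses invented values and hypothetical scenario labels "A" and "B".

```python
import pandas as pd

# Toy long-format data (invented): one row per (scenario, region)
long_df = pd.DataFrame({
    "scenario": ["A", "A", "B", "B"],
    "reID": [942, 1515, 942, 1515],
    "total_emissions": [1.0, 4.0, 2.0, 3.0],
})

# One row per scenario, one column per region
wide_df = long_df.pivot_table(index="scenario", columns="reID",
                              values="total_emissions", aggfunc="sum")
print(wide_df)
```

With this approach the region columns come from the data itself, so the list of IDs does not have to be hard-coded.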

In [40]:
f_names = list_file_names().copy()
df2 = df_emissions_per_region(f_names)
df2

### Save emissions per region into .csv

In [30]:
df2.to_csv("t_emissions_per_region.csv") 

### Best Scenario per region

In [43]:
df2
# df2[ df2[942] == e_df['total_emissions'].min() ]

Unnamed: 0,Scenario,942,1515,1516,1517,1518,1519,1520,2331,2332,2333,2334,2335,2336,3312,3562,6321
0,timeSeries_CH1SP0SE1WE0BP1RE0CO0DI0WO0CS0.csv,8.287715e+10,4.791555e+11,2.923511e+11,1.663667e+11,1.481406e+10,7.807361e+10,3.453715e+10,1.110764e+11,3.099496e+10,3.321040e+11,3.931118e+10,6.906529e+10,1.183519e+11,5.072070e+10,9.219284e+10,2.512204e+11
1,timeSeries_CH1SP0SE1WE1BP1RE0CO1DI1WO0CS0.csv,7.703438e+10,4.409830e+11,2.714543e+11,1.540859e+11,1.367005e+10,7.268995e+10,3.166130e+10,1.036319e+11,2.887594e+10,3.101978e+11,3.819820e+10,6.378421e+10,1.090693e+11,4.705976e+10,8.576036e+10,2.340310e+11
2,timeSeries_CH0SP1SE0WE0BP0RE1CO0DI0WO1CS0.csv,8.233464e+10,4.688859e+11,2.905846e+11,1.644619e+11,1.461058e+10,7.728690e+10,3.380410e+10,1.108396e+11,3.105716e+10,3.318644e+11,4.356125e+10,6.814129e+10,1.167577e+11,5.022205e+10,9.030387e+10,2.499328e+11
3,timeSeries_CH1SP1SE0WE0BP1RE0CO0DI1WO1CS0.csv,8.189221e+10,4.658389e+11,2.883727e+11,1.634860e+11,1.436261e+10,7.660255e+10,3.348400e+10,1.099273e+11,3.068086e+10,3.293291e+11,4.251722e+10,6.736669e+10,1.154093e+11,4.958018e+10,8.938101e+10,2.482349e+11
4,timeSeries_CH1SP0SE1WE0BP0RE1CO0DI1WO0CS1.csv,8.173671e+10,4.571841e+11,2.870266e+11,1.624281e+11,1.433210e+10,7.732242e+10,3.228298e+10,1.100302e+11,3.082687e+10,3.287710e+11,4.423124e+10,6.784161e+10,1.155854e+11,5.032042e+10,8.997832e+10,2.486447e+11
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
534,timeSeries_CH0SP0SE0WE0BP1RE1CO0DI0WO1CS0.csv,8.249523e+10,4.681835e+11,2.900356e+11,1.644987e+11,1.459115e+10,7.797563e+10,3.344495e+10,1.108274e+11,3.107796e+10,3.289404e+11,3.677672e+10,6.884330e+10,1.174786e+11,5.067626e+10,9.118336e+10,2.501713e+11
535,timeSeries_CH0SP0SE0WE0BP1RE1CO1DI1WO0CS0.csv,8.079713e+10,4.506976e+11,2.832715e+11,1.599014e+11,1.408672e+10,7.680109e+10,3.131145e+10,1.085380e+11,3.029495e+10,3.235586e+11,3.978083e+10,6.693206e+10,1.140477e+11,4.970661e+10,8.916636e+10,2.456081e+11
536,timeSeries_CH0SP0SE1WE0BP1RE1CO1DI0WO1CS0.csv,8.190671e+10,4.650225e+11,2.879337e+11,1.634908e+11,1.452290e+10,7.730332e+10,3.333802e+10,1.098973e+11,3.080688e+10,3.277271e+11,3.756919e+10,6.808287e+10,1.162939e+11,5.032396e+10,9.089331e+10,2.484566e+11
537,timeSeries_CH0SP0SE0WE1BP0RE0CO0DI1WO0CS0.csv,7.706907e+10,4.443658e+11,2.717414e+11,1.549125e+11,1.374850e+10,7.246567e+10,3.247419e+10,1.034488e+11,2.895406e+10,3.115655e+11,4.313042e+10,6.398096e+10,1.095489e+11,4.704566e+10,8.568518e+10,2.341472e+11


In [72]:
def df_best_scen_per_region(df = df2): # takes as default value the dataframe we found earlier
    regions = [942, 1515, 1516, 1517, 1518, 1519, 1520, 2331, 2332, 2333, 2334, 2335, 2336, 3312, 3562, 6321]
    ls_scenario = []
    for region in regions:
        # use the argument df (not the global df2); idxmin picks the first row on ties
        best_row = df[region].astype(float).idxmin()
        ls_scenario.append(df.loc[best_row, 'Scenario'])
    best_df = pd.DataFrame({"reID": regions, "best_scenario": list_str_scenarios(ls_scenario)})
    return best_df

In [73]:
df_best = df_best_scen_per_region(df2)
df_best

Unnamed: 0,reID,best_scenario
0,942,CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0
1,1515,CH1SP1SE0WE1BP0RE1CO1DI1WO0CS0
2,1516,CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0
3,1517,CH0SP1SE0WE1BP0RE1CO1DI1WO0CS0
4,1518,CH1SP1SE0WE1BP0RE1CO0DI1WO1CS0
5,1519,CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0
6,1520,CH1SP1SE0WE1BP0RE1CO0DI1WO1CS0
7,2331,CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0
8,2332,CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0
9,2333,CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0
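
The per-region minimum can also be computed for all regions at once: with the scenario as index, `DataFrame.idxmin` returns, for every region column, the label of the row with the lowest value. A small sketch on invented data (scenario labels "A", "B", "C" are hypothetical):

```python
import pandas as pd

# Toy wide table (invented values): one row per scenario, one column per region
toy_regions = pd.DataFrame({
    "Scenario": ["A", "B", "C"],
    942: [3.0, 1.0, 2.0],
    1515: [5.0, 6.0, 4.0],
})

# For each region column, the scenario with the lowest emissions
best_per_region = toy_regions.set_index("Scenario").idxmin()
print(best_per_region)
```

The result is a Series indexed by region ID, which can be turned into a two-column dataframe with `reset_index()` if needed.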


In [74]:
print(df_best['best_scenario'].unique())

['CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0' 'CH1SP1SE0WE1BP0RE1CO1DI1WO0CS0'
 'CH0SP1SE0WE1BP0RE1CO1DI1WO0CS0' 'CH1SP1SE0WE1BP0RE1CO0DI1WO1CS0'
 'CH0SP1SE0WE0BP1RE1CO1DI0WO0CS0']


## There are 5 unique best scenarios

**Scenario 1**: CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0

In [49]:
region_names = ["Schleswig-Holstein","Nordrhein-Westfalen","Baden-Wurttemberg","Hessen","Bremen","Thuringen",
"Hamburg","Rheinland-Pfalz","Saarland","Bayern","Berlin","Sachsen-Anhalt","Sachsen","Mecklenburg-Vorpommern",
"Brandenburg","Niedersachsen"]
region_ids = [942, 1515, 1516, 1517, 1518, 1519, 1520, 2331, 2332, 2333, 2334, 2335, 2336, 3312, 3562, 6321]
r_dict = dict(zip(region_ids,region_names))
r_dict

{942: 'Schleswig-Holstein',
 1515: 'Nordrhein-Westfalen',
 1516: 'Baden-Wurttemberg',
 1517: 'Hessen',
 1518: 'Bremen',
 1519: 'Thuringen',
 1520: 'Hamburg',
 2331: 'Rheinland-Pfalz',
 2332: 'Saarland',
 2333: 'Bayern',
 2334: 'Berlin',
 2335: 'Sachsen-Anhalt',
 2336: 'Sachsen',
 3312: 'Mecklenburg-Vorpommern',
 3562: 'Brandenburg',
 6321: 'Niedersachsen'}

In [83]:
df_best["reID"] = df_best["reID"].map(r_dict) # translate region IDs into names via the dictionary above
df_best

Unnamed: 0,reID,best_scenario
0,Schleswig-Holstein,CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0
1,Nordrhein-Westfalen,CH1SP1SE0WE1BP0RE1CO1DI1WO0CS0
2,Baden-Wurttemberg,CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0
3,Hessen,CH0SP1SE0WE1BP0RE1CO1DI1WO0CS0
4,Bremen,CH1SP1SE0WE1BP0RE1CO0DI1WO1CS0
5,Thuringen,CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0
6,Hamburg,CH1SP1SE0WE1BP0RE1CO0DI1WO1CS0
7,Rheinland-Pfalz,CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0
8,Saarland,CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0
9,Bayern,CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0


In [96]:
def get_unique_scenarios(best_df):
    # returns a list with one {scenario: [regions]} dictionary per unique best scenario
    best_scenarios = best_df['best_scenario'].unique()
    data = []
    for scenario in best_scenarios:
        regions = best_df.loc[best_df['best_scenario'] == scenario, 'reID'].tolist()
        data.append({scenario : regions})
    return data

In [97]:
dd = get_unique_scenarios(df_best)
dd

[{'CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0': ['Schleswig-Holstein',
   'Baden-Wurttemberg',
   'Thuringen',
   'Rheinland-Pfalz',
   'Saarland',
   'Bayern',
   'Mecklenburg-Vorpommern',
   'Niedersachsen']},
 {'CH1SP1SE0WE1BP0RE1CO1DI1WO0CS0': ['Nordrhein-Westfalen', 'Brandenburg']},
 {'CH0SP1SE0WE1BP0RE1CO1DI1WO0CS0': ['Hessen', 'Sachsen-Anhalt', 'Sachsen']},
 {'CH1SP1SE0WE1BP0RE1CO0DI1WO1CS0': ['Bremen', 'Hamburg']},
 {'CH0SP1SE0WE0BP1RE1CO1DI0WO0CS0': ['Berlin']}]
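
The same grouping can be expressed with a `groupby`, which returns a single dictionary instead of a list of one-entry dictionaries. The data below is invented (hypothetical region and scenario labels) and only illustrates the pattern.

```python
import pandas as pd

# Toy table (invented): best scenario per region
toy_best = pd.DataFrame({
    "reID": ["Region 1", "Region 2", "Region 3"],
    "best_scenario": ["A", "B", "A"],
})

# {scenario: [regions for which it is the best scenario]}
regions_by_scenario = toy_best.groupby("best_scenario")["reID"].agg(list).to_dict()
print(regions_by_scenario)
```

Note that `groupby` sorts the scenarios alphabetically, whereas `unique()` keeps the order of first appearance.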

Unnamed: 0,reID,best_scenario
0,Schleswig-Holstein,timeSeries_CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0.csv
1,Nordrhein-Westfalen,timeSeries_CH1SP1SE0WE1BP0RE1CO1DI1WO0CS0.csv
2,Baden-Wurttemberg,timeSeries_CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0.csv
3,Hessen,timeSeries_CH0SP1SE0WE1BP0RE1CO1DI1WO0CS0.csv
4,Bremen,timeSeries_CH1SP1SE0WE1BP0RE1CO0DI1WO1CS0.csv
5,Thuringen,timeSeries_CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0.csv
6,Hamburg,timeSeries_CH1SP1SE0WE1BP0RE1CO0DI1WO1CS0.csv
7,Rheinland-Pfalz,timeSeries_CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0.csv
8,Saarland,timeSeries_CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0.csv
9,Bayern,timeSeries_CH1SP0SE0WE1BP0RE0CO0DI0WO0CS0.csv
