<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#TODOs" data-toc-modified-id="TODOs-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>TODOs</a></span></li><li><span><a href="#Questions" data-toc-modified-id="Questions-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Questions</a></span></li><li><span><a href="#Links" data-toc-modified-id="Links-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Links</a></span></li><li><span><a href="#Interpreting-the-data" data-toc-modified-id="Interpreting-the-data-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Interpreting the data</a></span><ul class="toc-item"><li><span><a href="#EBA-data-columns" data-toc-modified-id="EBA-data-columns-4.1"><span class="toc-item-num">4.1&nbsp;&nbsp;</span>EBA data columns</a></span></li></ul></li><li><span><a href="#Load" data-toc-modified-id="Load-5"><span class="toc-item-num">5&nbsp;&nbsp;</span>Load</a></span></li><li><span><a href="#Loading-bulk-datasets" data-toc-modified-id="Loading-bulk-datasets-6"><span class="toc-item-num">6&nbsp;&nbsp;</span>Loading bulk datasets</a></span><ul class="toc-item"><li><span><a href="#Co2-dataset" data-toc-modified-id="Co2-dataset-6.1"><span class="toc-item-num">6.1&nbsp;&nbsp;</span>Co2 dataset</a></span></li><li><span><a href="#Elec-dataset" data-toc-modified-id="Elec-dataset-6.2"><span class="toc-item-num">6.2&nbsp;&nbsp;</span>Elec dataset</a></span></li><li><span><a href="#Playing-with-the-data" data-toc-modified-id="Playing-with-the-data-6.3"><span class="toc-item-num">6.3&nbsp;&nbsp;</span>Playing with the data</a></span><ul class="toc-item"><li><span><a href="#Residuals-of-this-equation-per-BA" data-toc-modified-id="Residuals-of-this-equation-per-BA-6.3.1"><span class="toc-item-num">6.3.1&nbsp;&nbsp;</span>Residuals of this equation per BA</a></span></li><li><span><a href="#Visualize-some-BAs" data-toc-modified-id="Visualize-some-BAs-6.3.2"><span class="toc-item-num">6.3.2&nbsp;&nbsp;</span>Visualize some 
BAs</a></span></li></ul></li></ul></li><li><span><a href="#Visualizing-MEFs" data-toc-modified-id="Visualizing-MEFs-7"><span class="toc-item-num">7&nbsp;&nbsp;</span>Visualizing MEFs</a></span></li><li><span><a href="#Demand-MEF" data-toc-modified-id="Demand-MEF-8"><span class="toc-item-num">8&nbsp;&nbsp;</span>Demand MEF</a></span></li><li><span><a href="#Plotting-some-time-series" data-toc-modified-id="Plotting-some-time-series-9"><span class="toc-item-num">9&nbsp;&nbsp;</span>Plotting some time series</a></span></li><li><span><a href="#Identifying-good-use-cases" data-toc-modified-id="Identifying-good-use-cases-10"><span class="toc-item-num">10&nbsp;&nbsp;</span>Identifying good use cases</a></span><ul class="toc-item"><li><span><a href="#high-low-availability-of-sun/wind" data-toc-modified-id="high-low-availability-of-sun/wind-10.1"><span class="toc-item-num">10.1&nbsp;&nbsp;</span>high-low availability of sun/wind</a></span></li></ul></li></ul></div>

# TODOs

- How do you compute MEFs overall and for each fuel type? (See the literature.)
- How do those MEFs vary over time? Can you have an estimate for each hour of the day, for instance?
- How do you compute the cost functions?
- How do you retrieve the **relative** carbon intensity instead of the total carbon consumption? (I guess carbon_total/demand?)

Why is the generation MEF the same as the demand MEF? Does that make sense?

# Questions

@Anthony: those are notes to self. 

- Are we looking for long range interactions? 
- Automated discovery of production mix at the node level (based on energy trade)? 
- What fraction of the cleanliness is attributable to your own production vs. imports? What is the biggest contributor to the metric that you care about? How do you steer imports to optimize your own carbon benefit?
- How to scale local energy mix to benefit globally? 
- What is the correlation with weather data? 

- Do we want total carbon consumption or only carbon intensity? 

# Links

web.stanford.edu/~jdechale/emissions_app/

# Interpreting the data

@Anthony: 
You will see that the columns of both datasets have specific names. 

They are usually a combination of:
- The Balancing Authority ID (e.g. CISO).
- A tag explaining what we are looking at. See below for details on those tags.
- For electricity, the type of generation.


EBA data columns
----------------
- D: Demand
- NG: Net Generation
- TI: Total Interchange (positive if exports)
- ID: Interchange with directly connected balancing authorities (positive if exports)

# Load

In [None]:
%load_ext autoreload
%autoreload 2

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import os

from plotting_utils import plot_mef

In [None]:
from sklearn.linear_model import LinearRegression

In [None]:
DATA_PATH = os.getenv('CARBON_NETWORKS_DATA')

# Loading bulk datasets

In [None]:
fnm_co2 = os.path.join(DATA_PATH, 'EBA_co2.csv')
fnm_elec = os.path.join(DATA_PATH, 'EBA_elec.csv')

In [None]:
df_co2 = pd.read_csv(fnm_co2, index_col=0, parse_dates=True)

df_elec = pd.read_csv(fnm_elec, index_col=0, parse_dates=True)

## Co2 dataset

In [None]:
df_co2.head()

@Anthony: for instance, the first column above gives the CO2-emissions exchanged between AEC and MISO. 

In [None]:
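# Extract the names of the BAs present in the dataset.
nms = []
fields = []

for c in df_co2.columns:
    parts = c.split('_')
    nms.append(parts[1])      # BA name, or a 'BA1-BA2' pair
    fields.append(parts[-1])  # trailing tag of the column name

# Keep single BAs only: names containing '-' refer to a pair of BAs
# (presumably an exchange, as in the first column shown above).
BAs = [nm for nm in nms if '-' not in nm]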

In [None]:
BAs = np.unique(BAs)

In [None]:
print("These are the names of the Balancing Authorities present in the dataset")
print(BAs)
print(f"\nThere are {len(BAs)} BAs.")

## Elec dataset

In [None]:
df_elec.head()

@Anthony: the format of this electricity dataset is a bit different. The format of the column name is 
- **EBA.ba_name-ALL.[D,NG,TI].H** for the [demand, net_generation, total_interchange] hourly for BA ba_name
- **EBA.ba_name-other_ba_name.ID.H** for the interchange between ba_name and other_ba_name

There are also columns named **EBA.ba_name-ALL.NG.SOURCE.H**, where **SOURCE** is the type of generation (water, wind, coal, ...).

For instance, **MISO** exchanges with a bunch of BAs, and it produces nuclear, oil, sun, hydro, wind, and "others".

In [None]:
for c in df_elec.columns:
    if 'MISO-' in c:
        print(c)
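
The naming convention described above can be unpacked mechanically. The following is a minimal sketch, assuming every column follows one of the three patterns listed; `parse_eba_column` is a hypothetical helper name, not something defined elsewhere in this notebook, and the two column names passed to it are illustrative.

```python
def parse_eba_column(col):
    """Split an EBA column name into its parts (hypothetical helper)."""
    parts = col.split('.')
    # Expected shape: ['EBA', '<ba>-<other>', '<tag>', ['<source>'], 'H']
    if len(parts) < 4 or parts[0] != 'EBA' or parts[-1] != 'H':
        raise ValueError(f"Unexpected column name: {col}")
    ba, _, other = parts[1].partition('-')
    tag = parts[2]
    # Only per-source generation columns carry a fifth component.
    source = parts[3] if len(parts) == 5 else None
    return {'ba': ba, 'other': other, 'tag': tag, 'source': source}


print(parse_eba_column('EBA.MISO-ALL.NG.WND.H'))
print(parse_eba_column('EBA.MISO-PJM.ID.H'))
```

Returning a plain dict keeps the sketch easy to turn into a DataFrame of column metadata if that becomes useful.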

We can check that the column **NG.H** is the sum of the columns **NG.SOURCE.H**.
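
One way to run that check is sketched below. It assumes the per-source columns are exactly those sharing the `EBA.<ba>-ALL.NG.` prefix; `ng_source_gap` is a hypothetical helper, and the toy frame with the made-up BA `XX` only stands in for `df_elec` so that the cell runs on its own.

```python
import numpy as np
import pandas as pd


def ng_source_gap(df, ba):
    """Return NG.H minus the sum of all NG.<SOURCE>.H columns for one BA."""
    total_col = f'EBA.{ba}-ALL.NG.H'
    prefix = f'EBA.{ba}-ALL.NG.'
    # Per-source columns share the prefix but are not the total itself.
    source_cols = [c for c in df.columns
                   if c.startswith(prefix) and c != total_col]
    return df[total_col] - df[source_cols].sum(axis=1)


# Toy data (made up) standing in for df_elec.
toy = pd.DataFrame({
    'EBA.XX-ALL.NG.H': [10.0, 12.0],
    'EBA.XX-ALL.NG.WND.H': [4.0, 5.0],
    'EBA.XX-ALL.NG.COL.H': [6.0, 7.0],
    'EBA.XX-ALL.D.H': [9.0, 11.0],
})
gap = ng_source_gap(toy, 'XX')
print(np.allclose(gap, 0.0))
```

On the real data, `ng_source_gap(df_elec, 'MISO').describe()` would summarize how far the total is from the sum of its sources. Note that `sum(axis=1)` skips missing values, so gaps in a source column show up as a nonzero difference.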


## Playing with the data

Here we check that the conservation equation holds in the data.

In [None]:
eba = 'AECI'

In [None]:
D_col = f'EBA.{eba}-ALL.D.H'
TI_col = f'EBA.{eba}-ALL.TI.H'
NG_col = f'EBA.{eba}-ALL.NG.H'

In [None]:
df_elec[NG_col].head()

In [None]:
(df_elec[D_col] + df_elec[TI_col]).head()

The equation at each node is simply:
$$
D + TI = NG
$$


### Residuals of this equation per BA

In [None]:
stats_res = dict()

for ba in BAs:

    D_col = f'EBA.{ba}-ALL.D.H'
    TI_col = f'EBA.{ba}-ALL.TI.H'
    NG_col = f'EBA.{ba}-ALL.NG.H'
    res = df_elec[D_col] + df_elec[TI_col] - df_elec[NG_col]
    mean_res = np.mean(res)
    std_res = np.std(res)
    
    stats_res[ba] = (mean_res, std_res)

We see that the above equation is satisfied for all BAs in the dataset.

For each BA, the output below shows the (mean residual, std of residuals).

In [None]:
stats_res

### Visualize some BAs

In [None]:
ba = 'CISO'

In [None]:
D_col = f'EBA.{ba}-ALL.D.H'
TI_col = f'EBA.{ba}-ALL.TI.H'
NG_col = f'EBA.{ba}-ALL.NG.H'

In [None]:
plt.plot(df_elec[D_col].values[-100:], label='Demand')
plt.plot(df_elec[NG_col].values[-100:], label='NetGeneration')
plt.xlabel("Time")
plt.ylabel("Energy [MWh]")
plt.legend()  # display the labels set above
plt.grid()


# Visualizing MEFs

I think the equation for the MEF is usually:
$$
 \Delta E = \mathrm{MEF} \times \Delta G
$$
(I would have to check), where $\Delta G$ is the change in generation and $\Delta E$ is the change in emissions.

Let us try to visualize the MEF for one BA in particular.

In [None]:
ba = 'MISO'

_ = plot_mef(ba, df_elec, df_co2, which='generation')

The slope of the linear regression to the above point cloud is the MEF. 
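
The slope itself can be estimated outside the plot. The following is a minimal sketch, assuming the regression is done on hour-to-hour differences as in the equation above; `plot_mef` is not shown here, so it may do something different. `estimate_mef` is a hypothetical name, `np.polyfit` is used instead of `LinearRegression` only to keep the cell dependency-light, and the series are synthetic.

```python
import numpy as np
import pandas as pd


def estimate_mef(generation, emissions):
    """Slope of hourly emission changes against hourly generation changes."""
    d_gen = generation.diff()
    d_emi = emissions.diff()
    # Keep only the hours where both differences are defined.
    mask = d_gen.notna() & d_emi.notna()
    slope, _intercept = np.polyfit(d_gen[mask], d_emi[mask], deg=1)
    return slope


# Synthetic series: emissions are exactly 500 kg per MWh generated.
rng = np.random.default_rng(0)
toy_gen = pd.Series(1000 + rng.normal(0, 50, size=200))
toy_emi = 500.0 * toy_gen
print(estimate_mef(toy_gen, toy_emi))
```

With generation in MWh and emissions in kg, the slope is in kg/MWh.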

However, for some BAs with a lot of renewable generation (e.g. CISO), the MEF is not as well defined. We would have to see how the value of the MEF changes with the time of day, for instance (I am guessing that the share of renewables differs a lot from hour to hour).

In [None]:
ba = 'CISO'

_ = plot_mef(ba, df_elec, df_co2, which='generation')

In [None]:
ba = 'CISO'

_ = plot_mef(ba, df_elec, df_co2, which='net_generation')
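
The time-of-day dependence raised above can be probed by fitting one slope per hour of day. This is only a sketch under the same assumption of a regression on hourly differences; `mef_by_hour` is a hypothetical helper, and the synthetic data is built so that the true rate is 300 kg/MWh from 08:00 to 17:59 and 600 kg/MWh otherwise.

```python
import numpy as np
import pandas as pd


def mef_by_hour(generation, emissions):
    """Fit one regression slope per hour of day on hourly differences."""
    diffs = pd.DataFrame({'d_gen': generation.diff(),
                          'd_emi': emissions.diff()}).dropna()
    slopes = {}
    for hour, grp in diffs.groupby(diffs.index.hour):
        # A slope needs at least two distinct x values.
        if grp['d_gen'].nunique() < 2:
            continue
        slopes[hour] = np.polyfit(grp['d_gen'], grp['d_emi'], deg=1)[0]
    return pd.Series(slopes).sort_index()


# Synthetic hourly data covering 60 days.
n_hours = 24 * 60
toy_idx = pd.Timestamp('2020-01-01') + pd.to_timedelta(np.arange(n_hours), unit='h')
rng = np.random.default_rng(1)
toy_gen = pd.Series(1000 + rng.normal(0, 50, size=n_hours), index=toy_idx)
daytime = (toy_idx.hour >= 8) & (toy_idx.hour <= 17)
toy_rate = np.where(daytime, 300.0, 600.0)
# Build emissions so that each hourly change equals rate * change in generation.
toy_emi = (toy_rate * toy_gen.diff().fillna(0.0)).cumsum() + 1e5

hourly_mef = mef_by_hour(toy_gen, toy_emi)
print(hourly_mef)
```

On real data each hourly fit uses only 1/24 of the points, so the per-hour slopes will be noisier than the overall one.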

In [None]:
NG_col = f'EBA.{ba}-ALL.NG.H'

In [None]:
WND_col = f'EBA.{ba}-ALL.NG.WND.H'

SUN_col = f'EBA.{ba}-ALL.NG.SUN.H'

In [None]:
df_elec[NG_col] - df_elec[[WND_col, SUN_col]].sum(axis=1)

In [None]:
df_elec.columns

# Demand MEF

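Here we do the same as above for demand.

We define the demand MEF through
$$
 \Delta E = \mathrm{MEF}_{D} \times \Delta D
$$
where $\Delta D$ is the change in demand and $\Delta E$ is the change in emissions.

Let us try to visualize the demand MEF for one BA in particular.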

In [None]:
ba = 'MISO'

In [None]:
ba_, ba_co2 = plot_mef(ba, df_elec, df_co2, which='demand')
plt.savefig(os.path.join(DATA_PATH, "Figures", "ba_demand.png"), dpi=200)

In [None]:
ba_, ba_co2 = plot_mef(ba, df_elec, df_co2, which='net_demand')

In [None]:
ba = 'CISO'

In [None]:
ba_, ba_co2 = plot_mef(ba, df_elec, df_co2, which='demand')

In [None]:
ba_, ba_co2 = plot_mef(ba, df_elec, df_co2, which='net_demand')

# Plotting some time series

Units are:
- elec: MWh
- carbon: kg


In [None]:
col = 'EBA.CISO-ALL.D.H'

In [None]:
plt.plot(df_elec[col])

# Identifying good use cases

## high-low availability of sun/wind

In [None]:
def build_resource_all(df_elec, df_co2):
    resources = extract_resources(df_elec)
    BAs = extract_BAs(df_co2)
    
    df_resources = pd.DataFrame(columns=resources, index = df_elec.index)
    
    for res in resources:
        cols_res = []
        for ba in BAs:
            col_res = f'EBA.{ba}-ALL.NG.{res}.H'
            if col_res in df_elec.columns:
                cols_res.append(col_res)
                
        df_resources[res] = df_elec[cols_res].sum(axis=1)
    return df_resources
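
`build_resource_all` relies on `extract_resources` and `extract_BAs`, which are not defined or imported anywhere in this notebook. The versions below are hypothetical stand-ins, written to mirror the BA-extraction cell earlier and the column naming convention. The toy column names are made up and only need to be consistent with the `split('_')` logic used above.

```python
import pandas as pd


def extract_BAs(df_co2):
    """Hypothetical helper: unique single-BA names from the CO2 columns."""
    names = {c.split('_')[1] for c in df_co2.columns}
    # Names containing '-' refer to a pair of BAs, not a single BA.
    return sorted(n for n in names if '-' not in n)


def extract_resources(df_elec):
    """Hypothetical helper: sources found in EBA.<ba>-ALL.NG.<SOURCE>.H."""
    resources = set()
    for c in df_elec.columns:
        parts = c.split('.')
        if len(parts) == 5 and parts[2] == 'NG' and parts[1].endswith('-ALL'):
            resources.add(parts[3])
    return sorted(resources)


# Toy frames with made-up column names, only to exercise the helpers.
toy_co2 = pd.DataFrame(columns=['CO2_AEC_D', 'CO2_MISO_D', 'CO2_AEC-MISO_ID'])
toy_elec = pd.DataFrame(columns=['EBA.MISO-ALL.NG.H',
                                 'EBA.MISO-ALL.NG.WND.H',
                                 'EBA.MISO-ALL.NG.SUN.H',
                                 'EBA.MISO-ALL.D.H'])
print(extract_BAs(toy_co2))
print(extract_resources(toy_elec))
```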

In [None]:
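# Build the per-resource totals first (df_resources is not defined before this cell).
df_resources = build_resource_all(df_elec, df_co2).loc['2020-07-01':'2020-09-30']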

In [None]:
_, ax = plt.subplots(figsize = (15, 10))
for c in df_resources.columns:
    ax.plot(df_resources[c], label=c)
    
plt.legend()

In [None]:
plt.plot(df_resources.sum(axis=1))

In [None]:
df_resources.index[df_resources.sum(axis=1).argmax()]

In [None]:
df_ren = df_resources[['SUN', 'WND']].sum(axis=1)

In [None]:
plt.figure(figsize = (15, 10))
plt.plot(df_ren)

In [None]:
df_ren.index[df_ren.argmax()]
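
The cells above locate the hour with the largest absolute sun and wind output. A complementary view, sketched here as one possible variation rather than the method used in this notebook, is the renewable share of total generation, which also gives the low-availability hours. `renewable_extremes` is a hypothetical helper and the toy frame is made up; on the real data it would be called on `df_resources`.

```python
import numpy as np
import pandas as pd


def renewable_extremes(df_res, renewables=('SUN', 'WND')):
    """Timestamps of the highest and lowest renewable share of generation."""
    total = df_res.sum(axis=1)
    # Hours with zero or negative total generation yield NaN and are skipped.
    share = df_res[list(renewables)].sum(axis=1) / total.where(total > 0)
    return share.idxmax(), share.idxmin(), share


# Toy data (made up): four hours, three resources.
toy_index = pd.Timestamp('2020-07-01') + pd.to_timedelta(np.arange(4), unit='h')
toy_res = pd.DataFrame({'SUN': [0.0, 50.0, 10.0, 0.0],
                        'WND': [10.0, 10.0, 30.0, 5.0],
                        'COL': [90.0, 40.0, 60.0, 95.0]}, index=toy_index)
best, worst, share = renewable_extremes(toy_res)
print(best, worst)
```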