# M5 Battery Background and Notebook Goals

This notebook was produced by Adam Morse on 1-5-25 and continues the exploration of the M5 Battery Dataset.

Refer to https://publications.rwth-aachen.de/record/985923/files/Report_04-2023.pdf for more background.

The purpose of the notebook is three-fold:
- Establish a more optimized method for storing the M5 data on disk and in memory
- Provide a high level summary of the battery system operation 
- Decode the primary control signals the BESS sends to the batteries. 
  - Frequency Containment Reserve (FCR) - a frequency support service the plant provides the grid
  - Set Point Adjustment (SPA) - a state of charge management algorithm 

# Optimizing Battery Data and Storage

## Loading Notebook Data

The M5 dataset is 1-second timeseries data that covers April 2023. It is divided between the two main system components:
- BESS (battery energy storage system)
- Data for 10 individually monitored batteries. 

### BESS Data Dictionary
| Variable | Description | Unit |
| ---- | ---- | ---- |
| DateAndTime | Date and Time | UTC Timezone ('yyyy-MM-dd HH:mm:ss')|
| M5BAT_P | Active power of M5BAT measured at the network node | kW (- = charging; + = discharging) |
| M5BAT_Q | Reactive power of M5BAT measured at the network node | kVAr |
| Grid_frequency | Grid frequency measured at the network node | mHz |
| Temperature | Ambient temperature at M5BAT site | 0.1°C |
| FCR_activated | Activation signal for FCR | True = FCR activated |
| FCR_P | Active Power for FCR (calculation) | kW (- = charging; + = discharging) | 
| FCR_control | Control band for FCR | kW |
| SPA_ask_P | Request for active power for setpoint adjustment | kW (- = charging; + = discharging) |
| SPA_exec_P | Active power for setpoint adjustment | kW (- = charging; + = discharging) |
| SOC | State of Charge for M5BAT (calculated) | %|
| Interpolated | Interpolation signal for data evaluation | True = Value linear interpolated|

### Battery Data Dictionary
| Variable | Description | Unit |
| --- | --- | ---- |
| DateAndTime | Date and Time | UTC Timezone ('yyyy-MM-dd HH:mm:ss')|
| P_AC_Set | Setpoint for active power of battery unit after the inverter | kW (- = charging; + = discharging) |
| Q_AC_Set | Setpoint for reactive power of battery unit after the inverter | kVAr |
| P_AC | Active power of battery unit after the inverter | kW (- = charging; + = discharging) |
| Q_AC | Reactive power of battery unit after the inverter | kVAr |
| SOC | State of Charge (BMS value) | 0.1% |
| I_DC_Batt | Current measured at battery unit | 0.1A |
| U_DC_Batt | Voltage measured at battery unit | 0.1V |
| Mode_PQ | Inverter Mode (Power output) | True / False |
| Mode_Stop | Inverter Mode (Stop) | True / False |
| Mode_Silent | Inverter Mode (Silent) | True / False |
| Mode_Wait | Inverter Mode (Switching) | True / False |
| Interpolated | Interpolation signal for data evaluation | True = Value linear interpolated |

**Before you run this notebook you must:**
- Download the 10 battery CSV files and the single BESS file to your local machine from this location: [M5 Data](https://publications.rwth-aachen.de/record/985923/files/M5BAT_04-2023_RAW.zip)
- Unzip the file
- Leave the `M5BAT_04-2023_RAW` folder in the current working directory alongside this notebook

In [None]:
from datetime import datetime
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots

The next couple of cells read the downloaded CSV files and save them as parquet files for future use.

Running the cell creates one parquet file for the battery data and (further down) one for the BESS data. As long as those files remain in the `M5BAT_04-2023_RAW` folder, they will be reused on subsequent runs of the notebook, which improves loading speed.

The `batt_utils.py` file in the `src` folder performs the loading and the necessary manipulations.

### Loading Battery Data

In [None]:
from src.batt_utils import load_csvs_to_compact, compact_to_long, load_bess_data

In [None]:
%%time
all_batts_compact = load_csvs_to_compact()
all_batts_compact.head(2)

The battery parquet file is saved in what [Manu Joseph](https://github.com/PacktPublishing/Modern-Time-Series-Forecasting-with-Python) calls a *compact* format. It is more readable and compresses to a smaller file size, which makes it better suited to storage. It also makes it easier to include metadata such as the battery power, capacity and chemistry.

However, the data is more easily manipulated and analyzed in long, time-series format. The function below expands `all_batts_compact` into a long-form timeseries dataframe, `all_batts`.
- There are 12 columns for each of the 10 batteries.
- Metadata is dropped.

In [None]:
all_batts = compact_to_long(all_batts_compact)
all_batts.head(2)
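To make the compact idea concrete, here is a minimal sketch on toy data. It is not the implementation in `batt_utils.py`: the column names (`start`, `P_AC`, `batt_chem`) and the helper `toy_compact_to_wide` are assumptions chosen for illustration. Each row holds one battery, with its whole time series stored as an array in a single cell next to the metadata.

```python
import numpy as np
import pandas as pd

# Toy "compact" frame: one row per battery, the series lives in one cell.
# Column names are illustrative only, not those used by batt_utils.py.
compact = pd.DataFrame(
    {
        "start": pd.to_datetime(["2023-04-01", "2023-04-01"]),
        "P_AC": [np.array([1.0, 2.0, 3.0]), np.array([-1.0, 0.0, 1.0])],
        "batt_chem": ["chem_a", "chem_b"],  # placeholder metadata values
    },
    index=["batt_1", "batt_2"],
)

def toy_compact_to_wide(df, value_col="P_AC", step=pd.Timedelta(seconds=1)):
    """Expand each row's array into a timestamped column; metadata is dropped."""
    columns = {}
    for name, row in df.iterrows():
        values = row[value_col]
        idx = pd.date_range(row["start"], periods=len(values), freq=step)
        columns[f"{name}_{value_col}"] = pd.Series(values, index=idx)
    return pd.DataFrame(columns)

wide = toy_compact_to_wide(compact)
print(wide)
```

The expanded frame has one timestamped row per second and one column per battery, which is the shape the rest of the notebook works with.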

In [None]:
print(f"Size of all_batts_compact: {all_batts_compact.memory_usage(deep=True).sum() / 1000000} MB")
print(f"Shape of all_batts_compact: {all_batts_compact.shape}")
print(f"Size of all_batts: {all_batts.memory_usage(deep=True).sum() / 1000000} MB")
print(f"Shape of all_batts: {all_batts.shape}")
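One common way to reduce the in-memory footprint reported above is to downcast numeric columns to the smallest dtype that can hold their values. The sketch below shows the idea on toy data; whether `batt_utils.py` does anything like this is not stated here, and `downcast_numeric` is a hypothetical helper.

```python
import numpy as np
import pandas as pd

def downcast_numeric(df):
    """Return a copy with integer and float columns downcast where possible."""
    out = df.copy()
    for col in out.select_dtypes(include=[np.integer]).columns:
        out[col] = pd.to_numeric(out[col], downcast="integer")
    for col in out.select_dtypes(include=[np.floating]).columns:
        out[col] = pd.to_numeric(out[col], downcast="float")
    return out

# Toy frame: SOC in 0.1% units fits easily in a 16-bit integer.
toy = pd.DataFrame(
    {
        "SOC": np.arange(0, 1000, dtype="int64"),
        "P_AC": np.linspace(-500, 500, 1000),
    }
)
small = downcast_numeric(toy)

before_mb = toy.memory_usage(deep=True).sum() / 1_000_000
after_mb = small.memory_usage(deep=True).sum() / 1_000_000
print(f"before: {before_mb:.4f} MB, after: {after_mb:.4f} MB")
```

Downcasting floats trades precision for space, so it is worth checking that the reduced precision is acceptable for the signal in question.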

### Loading BESS Data
For the BESS data we won't derive the compact format. The cell below creates a parquet file for the BESS data while loading the data to memory.

In [None]:
%%time
bess_raw = load_bess_data()
bess_raw.head(2)

In [None]:
print(f"Size of bess_raw: {bess_raw.memory_usage(deep=True).sum() / 1000000} MB")
print(f"Shape of bess_raw: {bess_raw.shape}")

# Battery Summary
Before getting started on the battery signals, let's review some high-level details of the system's operation.

As the documentation mentions, there was a transformer outage on Apr 3 that may affect analysis related to grid frequency (and maybe SOC). The interval was not removed from the dataset in this notebook, in order to preserve a continuous index.

In [None]:
xfmr_outage = bess_raw[bess_raw['Grid_frequency'] == 0]
print(f"Transformer Outage time: {list(xfmr_outage.index[[0,-1]])}")                                     # identify first and last moments when grid frequency is zero
print(f"Outage as a Percent of Overall Run-time {len(xfmr_outage)/len(bess_raw) * 100: .2f}%")       # divide that interval by the length of the full dataset

In [None]:
# cell provides a summary of battery operation
P_filter = [col for col in all_batts.columns if 'P_AC_Set' in col]        # isolate columns with power setpoints
batt_meta_data = all_batts_compact.loc[:,'batt_chem':'batt_cell_volts']   # isolate metadata 
batt_idx = list(all_batts_compact.index)

def batt_indexes(df, new_names):        # func will be used to rename index
    df.index= new_names
    # df.columns= ['monthly_energy_kwh']
    return df

def gross_energy_kwh(df):
    return df.sum() /60 /60         # aggregation transform and unit change from kw-s to kwh  

def abs_energy_kwh(df):              # absolute value is used to avoid charging from cancelling discharging
    return df.abs().sum() /60 /60         # aggregation transform and unit change from kw-s to kwh

def derive_batt_cycles(df):
    df = df.join(batt_meta_data['batt_capacity_kwh'])
    df['cycles_performed'] = df.abs_energy_kwh / (df.batt_capacity_kwh*2)    # divide total energy exchanged by battery capacity *2
    return df

def peak_operating_kw(df):
    return df.abs().max()    

(all_batts
  .loc[:, P_filter]
  .agg([abs_energy_kwh])
  .pipe(lambda df: pd.concat([df, all_batts.loc[:, P_filter].agg([peak_operating_kw])]))  
  .pipe(lambda df: pd.concat([df, all_batts.loc[:, P_filter].agg([gross_energy_kwh])]))  
  .T
  .assign(rough_efficiency= lambda df: (1 - (abs(df.gross_energy_kwh) / df.abs_energy_kwh))*100)  # rough battery efficiency calc: assumes all batteries started/stopped at same SOC
  .pipe(batt_indexes, batt_idx)
  .pipe(derive_batt_cycles)
)
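The energy and cycle arithmetic used in the summary can be checked on a toy series. This is a sketch under stated assumptions: the 100 kWh capacity and the two-hour power profile are made up, and one full cycle is taken to mean one full discharge plus one full charge.

```python
import numpy as np
import pandas as pd

# Two hours of 1-second samples: one hour discharging, one hour charging.
idx = pd.date_range("2023-04-01", periods=7200, freq=pd.Timedelta(seconds=1))
p_kw = pd.Series(np.r_[np.full(3600, 100.0), np.full(3600, -100.0)], index=idx)

capacity_kwh = 100.0  # hypothetical battery capacity

# Summing 1-second kW samples gives kW-seconds; dividing by 3600 gives kWh.
abs_kwh = p_kw.abs().sum() / 3600   # total energy exchanged in both directions
net_kwh = p_kw.sum() / 3600         # charging and discharging cancel out
cycles = abs_kwh / (2 * capacity_kwh)

print(abs_kwh, net_kwh, cycles)
```

With 100 kWh out and 100 kWh back in, the absolute energy is 200 kWh, the net energy is zero, and the battery has performed one cycle.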

Exploring the State of Charge range the batteries occupy

In [None]:
batt_SOC_cols = [col for col in all_batts.columns if 'SOC' in col] 

fig = px.violin(all_batts.loc[:,batt_SOC_cols].resample('1h').mean().div(10))
fig.update_layout(xaxis_title=None, yaxis_title="State of Charge")
fig.show()

Exploring the average set points being sent to the batteries.

In [None]:
P_filter = [col for col in all_batts.columns if 'P_AC_Set' in col]        # isolate columns with power setpoints
batt_pow = (all_batts
            .loc[:,P_filter]
            .pipe(lambda df: df[df.ne(0)])                                # remove zero power entries
            .resample('1h').mean()                                        # preferred resampling interval
           )

fig = px.box(batt_pow)
fig.update_layout(xaxis_title=None, yaxis_title="Power Output kW")
fig.show()

# BESS Signal Analysis
The 'BESS' (battery energy storage system) is the primary node of control for power signals broadcast to the battery inverters. The primary signals are:
- Frequency Containment Reserve (FCR)
- Set Point Adjustment (SPA)

In this notebook, we explore how the FCR and SPA signals are derived and explore their operational meaning. This question is different from how setpoints are allocated to the individual batteries (a very compelling question for a future notebook). Instead we're interested in understanding the plant-level setpoints. 

The cell below compares the global FCR and SPA signals (from the BESS) to the sum total of the power signals sent to the batteries. A new set point column `combined_batt_AC_Set` is added to the `all_setpoints` DataFrame to sum the 10 power signals.

This signal is visualized alongside the BESS' FCR and SPA execution signals. Use the zoom function in Plotly to see that the combined battery signal is equal to the sum of the FCR and SPA signals.

In [None]:
batt_setpoint_cols = [col for col in all_batts.columns if 'P_AC_Set' in col]                   # isolate the power setpoint columns for each battery
all_setpoints = (all_batts
                        .loc[:, batt_setpoint_cols]
                        .iloc[0:50000]                                                         # limiting the range due to computational time needed
                        .join(bess_raw.loc[:,['FCR_P','SPA_ask_P','SPA_exec_P']])              # merge battery data with bess data
                        .assign(combined_batt_AC_Set= lambda df:df.loc[:, batt_setpoint_cols]  # isolate power signal columns in battery dataframe
                                                    .agg(func='sum', axis=1))                  # sum the battery power signals in a new column 
                        .drop(batt_setpoint_cols, axis=1)                                      # remove the clutter of the individual battery signals
                )

px.line(all_setpoints,                                                                         # explore other times, zoom in and explore data in plotly                                                                      
        y=['FCR_P', 'SPA_exec_P','combined_batt_AC_Set'])

By zooming in we confirm that a combined FCR + SPA signal equals the combined signals to all the batteries.
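The same comparison could also be made numerically rather than by eye. The sketch below runs on toy data; `residual_summary` and the 1 kW tolerance are assumptions for illustration, and the column names follow the `all_setpoints` frame built above.

```python
import pandas as pd

def residual_summary(df, tol_kw=1.0):
    """Compare the summed battery setpoints against FCR_P + SPA_exec_P."""
    residual = df["combined_batt_AC_Set"] - (df["FCR_P"] + df["SPA_exec_P"])
    return pd.Series(
        {
            "max_abs_residual_kw": residual.abs().max(),
            "share_within_tol": (residual.abs() <= tol_kw).mean(),
        }
    )

# Toy rows: the last one is off by 1 kW, e.g. from rounding.
toy = pd.DataFrame(
    {
        "FCR_P": [100, -50, 0],
        "SPA_exec_P": [0, -250, -250],
        "combined_batt_AC_Set": [100, -300, -249],
    }
)
summary = residual_summary(toy)
print(summary)
```

On the real data, `residual_summary(all_setpoints)` would report how closely the two sides agree over the selected window.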

It is notable that the SPA signal operates:
- infrequently
- mainly to charge and not discharge the batteries (i.e. it's negative)
- in consistent charge blocks
  - according to the documentation, SOC-management power is purchased in 15-minute blocks, which explains why the amounts are static and not proportional to SOC
  - note that `SPA_ask_P` is the algorithm registering the need, and `SPA_exec_P` is the system responding to the ask with the transaction of power

We could be more precise about how *often* these modes (FCR and SPA) are actively controlling the batteries:

In [None]:
(bess_raw
  .loc[:,['FCR_activated']]                  # isolate the FCR activated column
  .pipe(lambda x: x[x.FCR_activated != 0])   # use index masking to remove inactive records
  .agg('sum').div(len(bess_raw)).mul(100)    # determine the percent of all operating time where FCR is activated
)

The result above is misleading. 'Activated' does not mean the FCR signal is non-zero. It means the system is 'ready' and not in maintenance.

If you were curious about the `FCR_control` variable, per the documentation, this is simply the maximum allowed system export to the grid. It's 3000kW, presumably +/-...

In [None]:
print(bess_raw[bess_raw.FCR_control != 0].FCR_control.agg(['max','min']))
print(f"FCR operational max {bess_raw.FCR_P.max()}kW and min {bess_raw.FCR_P.min()}kW ") 

In practice the FCR signal may be active but sitting in the dead-band region (i.e. `FCR_P` is zero). The cell below isolates the *non-zero* 'active' region for both the FCR and SPA signals.

In [None]:
print(f"Either FCR or SPA active:  {round(100*len(bess_raw[(bess_raw['FCR_P'] != 0 ) | (bess_raw['SPA_exec_P'] != 0 )]) / len(bess_raw),2)} %")
print(f"Both FCR and SPA active:   {round(100*len(bess_raw[(bess_raw['FCR_P'] != 0 ) & (bess_raw['SPA_exec_P'] != 0 )]) / len(bess_raw),2)} %")
print(f"FCR active:                {round(100*len(bess_raw[(bess_raw['FCR_P'] != 0 ) ]) / len(bess_raw),2)} %")
print(f"FCR active alone:          {round(100*len(bess_raw[(bess_raw['FCR_P'] != 0 ) & (bess_raw['SPA_exec_P'] == 0 )]) / len(bess_raw),2)} %")
print(f"SPA active:                {round(100*len(bess_raw[(bess_raw['SPA_exec_P'] != 0 ) ]) / len(bess_raw),2)} %")
print(f"SPA active alone:          {round(100*len(bess_raw[(bess_raw['FCR_P'] == 0 ) & (bess_raw['SPA_exec_P'] != 0 )]) / len(bess_raw),2)} %")
print(f"Both FCR and SPA inactive: {round(100*len(bess_raw[(bess_raw['FCR_P'] == 0 ) & (bess_raw['SPA_exec_P'] == 0 )]) / len(bess_raw),2)} %")
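The same shares can be computed more compactly with boolean masks, since the mean of a boolean Series is the fraction of `True` values. This is a sketch on toy data, and `activity_shares` is a hypothetical helper rather than part of the notebook's utilities.

```python
import pandas as pd

def activity_shares(df, fcr_col="FCR_P", spa_col="SPA_exec_P"):
    """Percent of rows in which each combination of signals is non-zero."""
    fcr = df[fcr_col].ne(0)
    spa = df[spa_col].ne(0)
    shares = pd.Series(
        {
            "either": (fcr | spa).mean(),
            "both": (fcr & spa).mean(),
            "fcr_only": (fcr & ~spa).mean(),
            "spa_only": (~fcr & spa).mean(),
            "neither": (~fcr & ~spa).mean(),
        }
    )
    return shares.mul(100).round(2)

# Toy data: four seconds covering each combination once.
toy = pd.DataFrame({"FCR_P": [0, 10, 10, 0], "SPA_exec_P": [0, 0, -250, -250]})
shares = activity_shares(toy)
print(shares)
```

Because the four exclusive categories (`both`, `fcr_only`, `spa_only`, `neither`) partition the rows, they should always sum to 100%.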

## Frequency Containment Reserve (FCR)

Clearly the plant's dominant operating algorithm is Frequency Containment Reserve (FCR). This entails the battery charging and discharging on the basis of the grid frequency. The theory is that when frequency is high, there is a relative shortage of load, and when frequency is low, there is a shortage of generation on the grid. Because bad things can happen when the grid deviates from the nominal frequency, the battery is used as a stabilizing force. This is a common mode of operation for utility-scale batteries.

The function below seeks to derive the FCR signal (`FCR_P`), the signal sent to the batteries, *based on the observed grid frequency (`Grid_frequency`)*. Basically it seeks to transform the `Grid_frequency` column into something close to the `FCR_P` column.

To extrapolate the FCR signal it is necessary to identify the system's deadband region around nominal frequency (50Hz or 50,000mHz). This is a design feature to ensure the battery does not oscillate on small changes in frequency.

Note that because FCR is negative for charging and positive for discharging, there is an inverse relationship between the FCR signal and grid frequency.

In [None]:
def freq_convert(freq: pd.Series,                                # function to be used with pd.transform method below
                 freq_nom=50000,                                 # nominal European grid frequency in mHz
                 dead_band=9,                                    # the deadband value established via trial/error
                 multiplier= -15) -> pd.Series:                  # the multiplier parameter value established via trial/error
    
    freq_high = freq[freq > (freq_nom + dead_band)]
    freq_db = freq[(freq <= (freq_nom + dead_band)) & (freq >= (freq_nom - dead_band))]
    freq_low = freq[freq < (freq_nom - dead_band)]

    freq_high = freq_high - freq_nom                              # normalize frequency around the x-axis
    freq_db = freq_db * 0                                         # batteries do not operate in dead band
    freq_low = freq_low - freq_nom                                # normalize frequency around the x-axis
    freq = pd.concat([freq_high, freq_db, freq_low]) * multiplier   # a multiplier is used to derive battery power (kW) from frequency (mHz)
    return freq.sort_index()

freq_sig = (bess_raw
             .loc[:,['Grid_frequency','FCR_P']]                                               # isolate frequency and FCR signal columns
             .assign(FCR_extrapolated=lambda df:df['Grid_frequency'].transform(freq_convert)) # transform grid frequency to approximate FCR signal
           )

px.line(freq_sig.iloc[1:2000],  # adjust range to see various areas of FCR signal
        y=['FCR_P', 'FCR_extrapolated'])

The analysis above derives an 'extrapolated' FCR signal from the observed grid frequency. We can thus be pretty confident in how the FCR signal is being derived from grid frequency.
- A `9mHz` deadband around nominal 50Hz frequency will drop the FCR signal to zero
- A roughly `-15` multiplier is used to translate grid frequency in `mHz` to a power signal in `kW` after the nominal 50Hz offset is removed
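The two findings above can be restated as a single vectorized expression. The sketch below uses the same trial-and-error parameters as `freq_convert` (a 9 mHz deadband and a multiplier of -15); `fcr_from_frequency` is a hypothetical name, and the values are fitted estimates, not documented plant settings.

```python
import numpy as np
import pandas as pd

def fcr_from_frequency(freq_mhz, freq_nom=50_000, dead_band=9, multiplier=-15):
    """Approximate FCR power (kW) from grid frequency (mHz)."""
    deviation = freq_mhz - freq_nom
    # Inside the deadband the signal is zero; outside it scales with deviation.
    power = np.where(deviation.abs() <= dead_band, 0.0, deviation * multiplier)
    return pd.Series(power, index=freq_mhz.index)

# Toy frequencies: nominal, edge of deadband, just outside, and well below.
freq = pd.Series([50_000, 50_009, 50_010, 49_980])
fcr = fcr_from_frequency(freq)
print(fcr.tolist())
```

A high frequency (50,010 mHz) yields a negative, charging signal and a low frequency (49,980 mHz) yields a positive, discharging signal, which is the inverse relationship described earlier.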

## SPA Preliminaries

The other variable influencing battery operation is Set Point Adjustment (SPA). Per the documentation the signal does 'SOC management'. This is a signal that keeps the batteries in a middle-of-the-road state of charge (SOC) so they can continue to provide FCR services. Beyond that it is unclear exactly how management is defined.

The goal of this notebook was to establish the correlation between the batteries' SOC and the SPA signal. The exercise was more difficult than deriving the FCR...

As an immediate sanity check, let's confirm that the BESS SOC reflects the *average* SOC for the 10 batteries.

### Deriving the BESS SOC from Battery SOC

In [None]:
soc_filter = [col for col in all_batts.columns if 'SOC' in col]  
soc = (all_batts
       .loc[:, soc_filter]                              # isolate only the SOC column for each battery
       .join(bess_raw.SOC)                              # join the bess SOC signal alongside the battery SOC signals
       .rename(columns={'SOC':'BESS_calculated_SOC'})   # rename the bess signal for clarity
       .resample("4h").mean()                           # hourly is my preferred resampling to visualize the full month (but it takes too much space)
      )

In [None]:
soc['batt_avg_SOC'] = (soc                             # create a column to average the battery SOCs
                        .drop('BESS_calculated_SOC', axis=1)
                        .agg(func='mean', axis=1)     # aggregations across columns are often slow for pandas, another reason to resample
                        .div(10)                      # divide by 10 as the battery SOC are in 0.1% units where the BESS SOC is in % units
                     )

px.line(soc.iloc[:,:],                                # adjust range to see various areas of the SOC signals
        y=['BESS_calculated_SOC', 'batt_avg_SOC'])

### SPA Signal Distribution

There is some error between the two SOC curves in the plot above, but the match is about as close as expected.

To begin the SPA signal analysis, we explore first the distribution of the SPA signals registered (the 'ask' in kilowatts). Following that we look at the execution of the SPA signal performed by the BESS inverters. SPA is executed in 15-minute intervals. 

In [None]:
spa_ask_mask = bess_raw.SPA_ask_P != 0
px.histogram(bess_raw.SPA_ask_P[spa_ask_mask], nbins=50)

In [None]:
spa_exec_mask = bess_raw.SPA_exec_P != 0
px.histogram(bess_raw.SPA_exec_P[spa_exec_mask], nbins=50)

Not surprisingly, there is a clear correspondence between what is asked for and what is received. They aren't exact because the ask comes first and is only executed at a 15-minute interval. Similarly, the ask may cease, but the system will execute the signal to the end of the 15-minute period.
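A very simplified model of this behavior is to latch the ask at the start of each 15-minute block and hold it for the whole block. The sketch below is an assumption for illustration only, not the plant's documented logic, and `latch_to_blocks` is a hypothetical helper.

```python
import pandas as pd

def latch_to_blocks(ask, block="15min"):
    """Hold the ask value seen at the start of each block for the whole block."""
    block_start = ask.index.floor(block)
    return ask.groupby(block_start).transform("first")

# Toy ask at 1-minute resolution: raised at minute 4, dropped at minute 21.
idx = pd.date_range("2023-04-02 04:00", periods=30, freq=pd.Timedelta(minutes=1))
ask = pd.Series(0.0, index=idx)
ask.iloc[4:21] = -250.0

executed = latch_to_blocks(ask)
print(pd.DataFrame({"ask": ask, "executed": executed}))
```

In this toy case the ask raised at 04:04 is not executed until the 04:15 block begins, and execution then continues to 04:29 even though the ask ceased at 04:21.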

This looks like a relatively crude signal. It is clearly not going to be proportional to SOC per se in a trivial sense. The SPA signal is--for the most part-- either a 500kW or 250kW charging signal or (more infrequently) a 500kW or 250kW discharging signal.

Let's look at the averages:

In [None]:
print(f"Average BESS SOC: {bess_raw.SOC.mean(): .2f} in %")
print(f"Average SPA Signal: {bess_raw.SPA_exec_P.mean(): .2f} in kW (- = charging; + = discharging)")

That SPA is biased toward charging makes sense.

If we assume that the FCR signal is evenly distributed around the nominal grid frequency of 50Hz, we need a compensation signal to keep the SOC from progressively dropping *due to the round trip efficiency loss of the battery.* More energy needs to go into the battery than is taken out to maintain a steady SOC.
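A small worked example makes the efficiency argument concrete. The 95% one-way efficiency is an assumed, illustrative figure and not a measured value for the M5BAT batteries; `stored_energy_change_kwh` is a hypothetical helper.

```python
import numpy as np

def stored_energy_change_kwh(p_kw, dt_s=1.0, eta_one_way=0.95):
    """Net change in stored energy for an AC power series (+ = discharging)."""
    p = np.asarray(p_kw, dtype=float)
    discharge = np.clip(p, 0, None)   # kW delivered to the grid
    charge = np.clip(-p, 0, None)     # kW drawn from the grid
    # Charging stores less than is drawn; discharging removes more than is delivered.
    delta_kw = charge * eta_one_way - discharge / eta_one_way
    return float((delta_kw * dt_s / 3600).sum())

# Perfectly symmetric hour of discharging and hour of charging at 100 kW.
p_kw = np.r_[np.full(3600, 100.0), np.full(3600, -100.0)]
drift = stored_energy_change_kwh(p_kw)
print(f"{drift:.2f} kWh")
```

Even though the AC energy in and out is identical, the stored energy falls by roughly 10 kWh, which is the gap a charging-biased compensation signal has to fill.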

## SPA Signal Periodicity

The plot below reveals the daily cyclical behavior seen in the FCR signal, the SPA signal and the SOC. There is a clear daily cycle at work. 

The SPA signal is not as clearly aligned with the SOC as one might expect. This previews a conclusion demonstrated below: the SPA signal is not *exclusively* bound to the task of charging a low battery and discharging a full battery.

In [None]:
sigs_by_hour =  (bess_raw
                    .loc[:,['FCR_P', 'SPA_ask_P']]
                    .assign(hour=lambda df: df.index.hour).groupby('hour').mean())
soc_by_hour =   (bess_raw
                    .loc[:,['SOC']]
                    .assign(hour=lambda df: df.index.hour).groupby('hour').mean())

fig = make_subplots(specs=[[{"secondary_y": True}]])

fig.add_trace(
    go.Scatter(x=sigs_by_hour.index, y=sigs_by_hour.FCR_P, name="FCR Signal"),
    secondary_y=False,)
fig.add_trace(
    go.Scatter(x=sigs_by_hour.index, y=sigs_by_hour.SPA_ask_P, name="SPA Ask Signal "),
    secondary_y=False,)
fig.add_trace(
    go.Scatter(x=soc_by_hour.index, y=soc_by_hour.SOC, name="BESS SOC"),
    secondary_y=True,)

fig.update_layout(title_text="BESS Signals and State of Charge (SOC) Averaged by Hour of Day")
fig.update_xaxes(title_text="Hour of Day")
fig.update_yaxes(title_text="kW (- = chg; + = dischg)", secondary_y=False)
fig.update_yaxes(title_text="SOC (%)", secondary_y=True, tickmode = 'array', tickvals = np.arange(40, 55, 2))

fig.show()

In [None]:
# another view
px.bar(
(bess_raw
  .loc[:,['FCR_P','SPA_ask_P']]
  .assign(hour=lambda x: x.index.hour)
  .groupby('hour').mean()   
),
)

## SPA Rescue Operation 

The most concrete function of the SPA signal *is* to charge a low battery and discharge a high battery. We'll call that *rescue* operation and start there.

The threshold window for rescue operations is found outside 35-70% SOC.

The cells below 'normalize' the SOC to the x-axis and apply scaling factors to the SPA and FCR signals. This is done only for visualization purposes.

In [None]:
# feel free to play with these datetime controls; memory becomes an issue when plotting 1-second data over 1 day
start = datetime(year=2023, month=4, day=13, hour=0, minute=0, second=0)
end = datetime(year=2023, month=4, day=13, hour=6, minute=0, second=0)

px.line(
(bess_raw
  .loc[:,['SPA_ask_P','SOC','SPA_exec_P', 'FCR_P']]
  .assign(normed_SOC= lambda df: df.SOC.sub(50))                              # center the SOC by subtracting the nominal 50%
  .assign(normed_SPA_ask= lambda df: df.SPA_ask_P.div(250))                   # divide SPA signal by 250 to visualize alongside SOC (ie 1unit=250kW)
  .assign(normed_SPA_exec= lambda df: df.SPA_exec_P.div(250))                 # divide SPA signal by 250 to visualize alongside SOC
  .assign(normed_FCR= lambda df: df.FCR_P.div(500))
  .loc[start:end]
),
y=['normed_SPA_ask','normed_SPA_exec','normed_SOC', 'normed_FCR'], 
)

To get explicit about how the Rescue SPA signal performs, let's look at an interesting window of time.

If we assume 50% is the nominal SOC and 35% is the lower limit, we might subtract 35 from all SOC values to set 0 as the lower SOC threshold.

- At 4:04:03, the BESS SOC drops below 35%
- At 4:04:04, the SPA immediately registers the 'ask' (zoom in to see this more clearly)
- At 4:15:01, the SPA is executed and 250kW of charge power begins to flow to the batteries
- At 4:36:31, the SOC recovers to 35% SOC
- At 4:36:31, the SPA ask signal is removed
- Starting 4:36:30, the SOC oscillates between 34-35%, leading to oscillations in the SPA ask (likely because of the way SOC is being calculated)
- Just 2 seconds before the 15-minute SPA execution interval (at 4:45:00) the SPA ask returns to zero, thus the SPA execution stops there
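Timestamps like those in the list above can be located programmatically instead of by zooming. The sketch below finds each moment a series drops below a threshold; `crossings_below` is a hypothetical helper and the SOC values are toy data.

```python
import numpy as np
import pandas as pd

def crossings_below(soc, threshold):
    """Timestamps where the series moves from at/above to below the threshold."""
    below = (soc < threshold).to_numpy()
    previous = np.r_[False, below[:-1]]   # state one sample earlier
    return soc.index[below & ~previous]

# Toy SOC hovering around a 35% threshold.
idx = pd.date_range("2023-04-02 04:04:00", periods=6, freq=pd.Timedelta(seconds=1))
soc = pd.Series([36.0, 35.5, 34.9, 34.8, 35.2, 34.7], index=idx)

drops = crossings_below(soc, 35)
print(list(drops))
```

Applied to `bess_raw.SOC` with a threshold of 35, the same function would list every candidate start of a rescue event, which could then be compared against the first non-zero `SPA_ask_P` that follows.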

In [None]:
start = datetime(year=2023, month=4, day=2, hour=1, minute=30, second=0) # the BESS SOC is reducing to the 35% threshold at the start of this window
end = datetime(year=2023, month=4, day=2, hour=3, minute=0, second=0)

px.line(
(bess_raw
  .loc[:,['SPA_ask_P','SOC','SPA_exec_P', 'FCR_P']]
  .assign(normed_SOC= lambda df: df.SOC.sub(35))                            # set the lower SOC threshold at zero 
  .assign(normed_SPA_ask= lambda df: df.SPA_ask_P.div(250))                 # divide SPA signal by 250 to visualize alongside SOC (ie 1unit=250kW)
  .assign(normed_SPA_exec= lambda df: df.SPA_exec_P.div(250))               # divide SPA signal by 250 to visualize alongside SOC
  .assign(normed_FCR= lambda df: df.FCR_P.div(500))
  .loc[start:end]
),
y=['normed_SPA_ask','normed_SPA_exec', 'normed_SOC', 'normed_FCR'])

To find an interval where SOC goes high and SPA engages a discharge signal, I went for the dataset's largest registered SOC event and found that this was also the largest discharge event executed by the SPA signal.

The behavior indicates something like emergency operation at 75% SOC. Unlike the static 250 or 500kW blocks (that we see starting at 70%), the SPA ask begins climbing proportionally when the SOC hits 75%. At 12:00 a massive 1.5MW discharge command is executed to get the SOC below 70%.

In [None]:
print(f"Max SOC: {bess_raw.SOC.max()}%")
print(f"Max SOC datetime: {bess_raw.SOC.idxmax()}")
print(f"Max SPA exec: {bess_raw.SPA_exec_P.max()}kW")
print(f"Max SPA exec datetime: {bess_raw.SPA_exec_P.idxmax()}")

In [None]:
start = datetime(year=2023, month=4, day=23, hour=10, minute=0, second=0)
end = datetime(year=2023, month=4, day=23, hour=12, minute=30, second=0)

px.line(
(bess_raw
  .loc[:,['SPA_ask_P','SOC','SPA_exec_P', 'FCR_P']]
  .assign(normed_SOC= lambda df: df.SOC.sub(75))                             # normalize the high SOC threshold to the zero axis
  .assign(normed_SPA_ask= lambda df: df.SPA_ask_P.div(250))                  
  .assign(normed_SPA_exec= lambda df: df.SPA_exec_P.div(250))                 
  .assign(normed_FCR= lambda df: df.FCR_P.div(500))
  .loc[start:end]
),
y=['normed_SPA_ask','normed_SPA_exec', 'normed_SOC', 'normed_FCR'])

## Top Off Operation
The SPA Rescue Algorithm does not fully explain how the SPA signal functions all the time.

To describe the remainder of SPA operation *within the SOC window* I think of this behavior as top off operation. Beyond seeing patterns, I was unable to determine exactly what is motivating the SPA signal in this range. I suspect it might have something to do with the cost of electricity at certain hours. Because the batteries will need SPA charging eventually, might as well do it when electricity is cheap instead of waiting until 35% is reached?

The documentation mentions that this energy must be purchased, as separate from the FCR services provided.

The cells below provide a number of plots that show the top off operation. I leave it to you to explain what is going on.

In [None]:
px.line(
    (bess_raw[(bess_raw['SOC'] <= 35) ]   # filter for SPA charging signals (in kWh) due to low SOC (ie <35%) 
    .loc[:,['SPA_exec_P']]
    .assign(hour=lambda df: df.index.hour)
    .groupby('hour').sum().div(3600)
    ),
labels=dict(index='hours', value='kWh Total'), title='Total SPA signal executed by hour when SOC is at or below 35%'
)

In [None]:
px.line(
    (bess_raw[(bess_raw['SOC'] >= 70)]   # filter for SPA discharge signals due to high SOC (ie >70%) 
    .loc[:,['SPA_exec_P']]
    .assign(hour=lambda df: df.index.hour)
    .groupby('hour').sum().div(3600)
    ),
labels=dict(index='hours', value='kWh Total'), title='Total SPA signal executed by hour when SOC is at or over 70%'
)

In [None]:
# create new columns holding the second-to-second change (step size) in the SPA ask and exec signals
spa_diff = bess_raw.assign(spa_ask_diff= lambda df: df.SPA_ask_P.diff(),
                           spa_exec_diff= lambda df: df.SPA_exec_P.diff())
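The step sizes produced by `diff()` can also be labeled in one pass, which keeps the 250kW, 500kW and 'other' categories mutually exclusive. This is a sketch on toy data; `classify_steps` and its label names are assumptions, not part of the notebook.

```python
import numpy as np
import pandas as pd

def classify_steps(ask):
    """Label each sample by the absolute size of the change in the ask signal."""
    step = ask.diff().abs()
    labels = np.select(
        [step.eq(250), step.eq(500), step.gt(0)],
        ["250", "500", "other"],
        default="none",   # no change, or the first sample where diff is NaN
    )
    return pd.Series(labels, index=ask.index)

# Toy ask signal with 250kW, 500kW and irregular steps.
ask = pd.Series([0, -250, -250, 0, -500, -500, -350, 0])
labels = classify_steps(ask)
counts = labels.value_counts()
print(counts)
```

Taking the absolute value first means that a step of -250 and a step of +250 land in the same category, regardless of whether the ask is being raised or released.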

In [None]:
px.bar(
(spa_diff
  .loc[abs(spa_diff['spa_ask_diff']) == 250][['SOC']]
  .assign(hour=lambda df: df.index.hour).groupby('hour').count()
),
labels=dict(index='hours', value='count'), title='Total Count of 250kW SPA signals by hour of day'
)

In [None]:
px.bar(
(spa_diff
  .loc[abs(spa_diff['spa_ask_diff']) == 500][['SOC']]
  .assign(hour=lambda df: df.index.hour).groupby('hour').count()
),
labels=dict(index='hours', value='count'), title='Total Count of 500kW SPA signals by hour of day'
)

In [None]:
px.histogram(
    (spa_diff
     .loc[abs(spa_diff['spa_ask_diff']) == 250]
     ['SOC']
    ),
labels=dict(index='hours', value='SOC'), title='SOC Level for 250kW signal', 
nbins=50)  

In [None]:
px.histogram(
    (spa_diff
     .loc[abs(spa_diff['spa_ask_diff']) == 500]
     ['SOC']
    ),
labels=dict(index='hours', value='SOC'), title='SOC Level for 500kW signal', 
nbins=50)
# interesting the rise up to 50%...

In [None]:
px.histogram(
    (spa_diff
     .loc[(abs(spa_diff['spa_ask_diff']) != 250) & (abs(spa_diff['spa_ask_diff']) != 500) & (spa_diff['spa_ask_diff'] != 0)]
     ['SOC']
    ),
labels=dict(index='hours', value='SOC'), title='SOC Level for non 250/500kW signals', 
nbins=50)  

In [None]:
px.bar(
(spa_diff
  .loc[(abs(spa_diff['spa_ask_diff']) != 250) & (abs(spa_diff['spa_ask_diff']) != 500) & (spa_diff['spa_ask_diff'] != 0)][['SOC']]
  .assign(hour=lambda df: df.index.hour).groupby('hour')[['SOC']].count()
),
labels=dict(index='hours', value='count'), title='Hour of day for non 250/500kW signals'
)

In [None]:
# if 250kW/500kW signals are mainly for low SOC, what is their proportion relative to the other SPA signals (much of which was 'emergency SPA')?
(spa_diff
  .assign(
          spa_250= lambda df: df.spa_ask_diff.abs().where(df.spa_ask_diff.abs() == 250, 0),     # use absolute step size, as in the cells above
          spa_500= lambda df: df.spa_ask_diff.abs().where(df.spa_ask_diff.abs() == 500, 0),
          spa_other= lambda df: df.spa_ask_diff.abs().where((df.spa_ask_diff.abs() != 250) & (df.spa_ask_diff.abs() != 500), 0)
         )          
  .loc[:, ['spa_250','spa_500','spa_other']]
  .agg('sum')
)