## Getting started 1: the battery system and the grid
This notebook looks at the BESS (battery energy storage system) data from M5BAT. This part of the data describes how the battery system as a whole behaves while interacting with the grid.

The data is available in a [zip file here](https://publications.rwth-aachen.de/record/985923/files/M5BAT_04-2023_RAW.zip). There's an [overview of the data here](https://publications.rwth-aachen.de/record/985923/files/Report_04-2023.pdf).

After you download the data, unzip it in your working directory. You might need to adjust the paths below.

You'll need Python (3.9 or newer should be fine), plus the numpy, pandas, and plotly packages, which can be installed with `pip install`. Reading the pre-sampled parquet file below also needs a parquet engine such as pyarrow.

In [None]:
from datetime import datetime
import pandas as pd
import plotly.express as px
import plotly.io as pio

pd.options.display.float_format = '{:.2f}'.format
pio.templates.default = 'seaborn'

## BESS data basics
This is the data captured where the battery system connects to the grid.

Refer to https://publications.rwth-aachen.de/record/985923/files/Report_04-2023.pdf for details.

Description of the data:

| Variable | Description | Unit |
| ---- | ---- | ---- |
| DateAndTime | Date and Time | UTC Timezone ('yyyy-MM-dd HH:mm:ss')|
| M5BAT_P | Active power of M5BAT measured at the network node | kW (- = charging; + = discharging) |
| M5BAT_Q | Reactive power of M5BAT measured at the network node | kVAr |
| Grid_frequency | Grid frequency measured at the network node | mHz |
| Temperature | Ambient temperature at M5BAT site | 0.1°C |
| FCR_activated | Activation signal for FCR | True = FCR activated |
| FCR_P | Active Power for FCR (calculation) | kW (- = charging; + = discharging) | 
| FCR_control | Control band for FCR | kW |
| SPA_ask_P | Request for active power for setpoint adjustment | kW (- = charging; + = discharging) |
| SPA_exec_P | Active power for setpoint adjustment | kW (- = charging; + = discharging) |
| SOC | State of Charge for M5BAT (calculated) | %|
| Interpolated | Interpolation signal for data evaluation | True = value linearly interpolated |

## Loading the data
The following cell shows how to read the raw data. Uncomment the code to run it.

**If you don't want to bother with the raw data, skip to the next cell.**

In [None]:
# How to read the raw data. Again, you can skip to the next step if you don't want to bother with it.

# This is the grid connection data.
# bess = pd.read_csv('BESS.csv', sep=';', parse_dates=['DateAndTime'], index_col='DateAndTime')

# We'll downsample this massively as we do an initial exploration.
# bess = bess.sample(frac=.001, random_state=13).sort_index()  # data gets scrambled when sampling, so we re-sort
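
If you'd like to see the load-and-sample pattern run without downloading anything, here is a minimal sketch on a tiny in-memory table. The rows are made up for illustration and only mimic the layout of `BESS.csv`; the names `raw`, `demo`, and `demo_sample` are not part of the notebook.

```python
import io
import pandas as pd

# Tiny made-up sample shaped like BESS.csv (values are illustrative only).
raw = io.StringIO(
    "DateAndTime;M5BAT_P;Grid_frequency\n"
    "2023-04-01 00:00:00;-12.5;50010\n"
    "2023-04-01 00:00:01;8.0;49990\n"
    "2023-04-01 00:00:02;0.0;50000\n"
    "2023-04-01 00:00:03;3.5;49995\n"
)

# Same arguments as the commented-out read_csv call above.
demo = pd.read_csv(raw, sep=';', parse_dates=['DateAndTime'], index_col='DateAndTime')

# Random sampling returns rows in shuffled order, so re-sort by timestamp afterwards.
demo_sample = demo.sample(frac=0.5, random_state=13).sort_index()
print(demo_sample)
```

Fixing `random_state` makes the sample reproducible, which helps when comparing plots between runs.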

In [35]:
# Pre-sampled data (< 1 MB) stored on GitHub.
remote_path = 'https://github.com/cedargrid/grid-battery-data/raw/e4beb41c933ee8825349315bee74a26209c99588/notebooks/bess_sample.parquet'
bess = pd.read_parquet(remote_path)

In [None]:
# Summarize the data
bess.describe()

## Grid frequency

The basic job of the battery system is to maintain grid frequency, so let's start with that column.
We can see it's generally around 50,000 mHz (50 Hz), but it looks like there are some 0s that are probably
junk. They stretch the y-axis, so if we plot the column as is, it'll be hard to read.

In [None]:
px.line(bess, x=bess.index, y='Grid_frequency')

In [None]:
# You can zoom in manually, or just drop the 0s for now
tp = bess.loc[bess.Grid_frequency > 0]
px.line(tp, x=tp.index, y='Grid_frequency')  # note we can omit the x here; it will default to the index

So generally the frequency is 50 Hz ± 0.1 Hz, though keep in mind we heavily downsampled, so we might be
missing larger excursions.
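
To see why random downsampling can hide excursions, here is a rough sketch on synthetic data (not M5BAT measurements). It plants a single spike in a made-up one-hour series, then compares a 0.1% random sample with a per-window min/max summary. The series, the spike size, and the 10-minute window are all assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic 1-second frequency series in mHz (made-up values, not M5BAT data).
rng = np.random.default_rng(0)
idx = pd.date_range('2023-04-01', periods=3600, freq='s')
freq_mhz = pd.Series(50_000 + rng.normal(0, 20, size=len(idx)), index=idx)
freq_mhz.iloc[1800] = 50_400  # one short, large excursion

# Random downsampling keeps about 0.1% of the rows and will usually miss the spike.
sampled = freq_mhz.sample(frac=0.001, random_state=13).sort_index()

# Per-window min/max keeps the extremes of every window.
envelope = freq_mhz.resample('10min').agg(['min', 'max']) / 1000  # mHz -> Hz

print('max in random sample (Hz):', sampled.max() / 1000)
print('max in min/max envelope (Hz):', envelope['max'].max())
```

If you work with the raw file, an aggregate like this is one way to shrink the data while still keeping the extremes.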

## Temperature

In [None]:
bess['temperature_celsius'] = bess['Temperature'] / 10  # Raw data is in tenths of a degree Celsius
px.line(bess, y='temperature_celsius', labels={'temperature_celsius': '°C'}, title='System temperature')

In [None]:
# We see a daily rhythm to the temperature. In Fahrenheit, the range is roughly 30-70°F.

# Let's also look at how these temperatures are distributed.
px.histogram(bess, x='temperature_celsius', title='Distribution of temperature', labels={'temperature_celsius': '°C'})

The mode is in the 8°-10°C range. Temperature has an impact on battery degradation, so it'll be useful to keep this range in mind.
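
If you prefer Fahrenheit, the conversion from the raw column is a two-step calculation: divide by 10 to get °C, then apply the usual °C to °F formula. A small sketch follows; the helper name `deci_celsius_to_fahrenheit` and the sample readings are made up for illustration.

```python
import pandas as pd

def deci_celsius_to_fahrenheit(raw):
    """Convert raw readings in tenths of a degree Celsius to degrees Fahrenheit."""
    celsius = raw / 10
    return celsius * 9 / 5 + 32

# Illustrative raw readings (made up), in 0.1 °C like the Temperature column.
raw = pd.Series([-10, 90, 210])
print(deci_celsius_to_fahrenheit(raw).round(1).tolist())
```

The same function works on a whole column, for example `deci_celsius_to_fahrenheit(bess['Temperature'])`.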

## System power
Finally, let's look at the power of the system.

In [None]:
px.line(bess, y='M5BAT_P', labels={'M5BAT_P': 'kW'}, title='System power')

This feels familiar... 

At a high level, the system should be absorbing energy when the frequency is too high, and releasing energy when it is too low.
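
One simple way to picture this is a proportional rule: the further the frequency is from nominal, the more power the battery exchanges, up to some limit. The sketch below is an idealized illustration, not the actual M5BAT control logic. The constants `FULL_ACTIVATION_MHZ` and `MAX_POWER_KW` and the function `droop_power_kw` are hypothetical.

```python
import numpy as np

NOMINAL_MHZ = 50_000        # nominal grid frequency, in mHz like the data
FULL_ACTIVATION_MHZ = 200   # hypothetical deviation at which full power is reached
MAX_POWER_KW = 1_000        # hypothetical power limit

def droop_power_kw(freq_mhz):
    """Proportional response: discharge (+) below nominal, charge (-) above."""
    deviation = np.asarray(freq_mhz, dtype=float) - NOMINAL_MHZ
    power = -MAX_POWER_KW * deviation / FULL_ACTIVATION_MHZ
    # Never request more than the power limit in either direction.
    return np.clip(power, -MAX_POWER_KW, MAX_POWER_KW)

print(droop_power_kw([49_900, 50_000, 50_100]))
```

The sign convention matches the data description: positive power means discharging, negative means charging.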

Can we see that behavior? Let's compare the active power to the grid frequency we looked at earlier.

In [None]:
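# Drop the junk 0 readings again; filtering from bess keeps this cell self-contained
tp = bess.loc[bess.Grid_frequency > 0]
# Grid_frequency is recorded in mHz, so label the axis accordingly
px.scatter(tp, x='Grid_frequency', y='M5BAT_P', labels={'M5BAT_P': 'Power (kW)', 'Grid_frequency': 'Frequency (mHz)'})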


Indeed, they are closely related: when the frequency is low, the power is positive (battery discharging);
when it's high, the power is negative (battery charging).

We can also compute the correlation:

In [None]:
# Pearson correlation between frequency and active power, ignoring the junk 0 readings
tp = bess.loc[bess.Grid_frequency > 0]
tp.Grid_frequency.corr(tp.M5BAT_P)

The result shows a strong inverse correlation.
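
The correlation tells us how tightly the two columns move together, but not how steep the relationship is. A least-squares line gives a slope in kW per mHz. Here is a minimal sketch on synthetic data; the slope of -5 kW/mHz and the noise levels are made up, and the frame `demo` only borrows the column names from the real data.

```python
import numpy as np
import pandas as pd

# Synthetic data (made up): power falls linearly with frequency, plus noise.
rng = np.random.default_rng(1)
freq_mhz = 50_000 + rng.normal(0, 30, size=500)
power_kw = -5.0 * (freq_mhz - 50_000) + rng.normal(0, 40, size=500)
demo = pd.DataFrame({'Grid_frequency': freq_mhz, 'M5BAT_P': power_kw})

# Pearson correlation, computed the same way as in the cell above.
r = demo.Grid_frequency.corr(demo.M5BAT_P)

# Least-squares line on the deviation from 50,000 mHz:
# slope is in kW per mHz, intercept is the power at nominal frequency.
slope, intercept = np.polyfit(demo.Grid_frequency - 50_000, demo.M5BAT_P, deg=1)
print(f'r = {r:.2f}, slope = {slope:.2f} kW/mHz, intercept = {intercept:.1f} kW')
```

To try this on the real data, swap `demo` for the filtered frame `tp`.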

## Exercises

1. Look at the State of Charge (SOC) of the batteries. How does it vary? How is it related to the grid frequency?
2. We looked at the *active* power above. Does the *reactive* power look similar?
