## Overview

This notebook demonstrates how to access time series data for meters and SCADAs using Awesense's Energy Data Model (EDM) SQL API.

Please refer to the [main_concepts.ipynb](../2_main_concepts/main_concepts.ipynb) notebook for a high-level introduction to and simpler examples of the core views and functions available in Awesense's Energy Data Model (EDM).

---

## Set up

In [1]:
import getpass
import plotly.express as px
import plotly.io as pio
import time
import urllib.parse
import pandas as pd

# Allow charts to persist between notebook sessions.
pio.renderers.default='notebook'

**Connection**

Enter the EDM server address and the login credentials provided by Awesense. If you do not have the credentials, or have any trouble connecting, please contact api@awesense.com.
<span style='color:red'> **Please do NOT store the credentials in the notebook, nor share them with anyone.** </span>

In [2]:
edm_address = getpass.getpass(prompt='EDM server address: ')

print('\nEDM login information')
edm_name = getpass.getpass(prompt='Username: ')
edm_password = getpass.getpass(prompt='Password: ')
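# Percent-encode the password so reserved characters (including '/') do not break the URL.
edm_password = urllib.parse.quote(edm_password, safe='')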

%load_ext sql
%sql postgresql://$edm_name:$edm_password@$edm_address/edm
%config SqlMagic.displaycon = False
%config SqlMagic.feedback = False

# Delete the credential variables for security purposes.
del edm_name, edm_password

EDM server address: ········

EDM login information
Username: ········
Password: ········
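
Because the credentials are embedded in the connection URL, reserved characters such as `@`, `:` or `/` need to be percent-encoded. The sketch below shows one way to do that. `build_edm_url` is a hypothetical helper, not part of this notebook or of the EDM API, and the sample values are made up. It assumes the same `postgresql://user:password@address/edm` layout as the cell above.

```python
from urllib.parse import quote


def build_edm_url(user: str, password: str, address: str, database: str = "edm") -> str:
    """Hypothetical helper: build a PostgreSQL URL with percent-encoded credentials."""
    # safe='' also encodes '/', which quote() leaves untouched by default.
    return (
        f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{address}/{database}"
    )


# Made-up values, for illustration only.
url = build_edm_url("analyst", "p@ss/w:rd", "edm.example.com:5432")
print(url)
```

Encoding the username as well as the password keeps the URL valid even if either one contains reserved characters.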


---

## Time Series
Various temporary views are created below to keep the queries in this notebook short and reusable.

### Meter Time Series

*High level approach*
- Create a temporary view `grid_element_metric` that combines the data sources of each grid element with their metrics and the actual time series readings.
- Use the `grid_get_downstream()` function to gather information about the meters that are downstream of a specified grid element.
- Retrieve the time series using the `ts_data_source_select()` function.

**Grid Element Data**
* Create a temporary view `grid_element_metric`. 

In [3]:
%%sql
    
CREATE OR REPLACE TEMPORARY VIEW grid_element_metric AS
    SELECT grid_id,
            grid_element_id,
            phases,
            type,
            provider,
            direction,
            friendly_id,
            metric_key AS metric,
            valid,
            timestamp,
            value
    FROM grid_element_data_source geds
    JOIN UNNEST(geds.metrics::TEXT[]) AS metric_key
        ON true
    LEFT JOIN ts_data_source_select(grid_element_data_source_id, metric_key) AS ts
        ON true;

[]
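
The `JOIN UNNEST(geds.metrics::TEXT[]) AS metric_key ON true` clause in the view turns each data source row into one row per metric before the time series are looked up. The plain-Python sketch below mimics that step on made-up rows. `unnest_metrics` is a hypothetical name, and the real `grid_element_data_source` view has more columns than shown here.

```python
from typing import Dict, Iterator, List


def unnest_metrics(data_sources: List[Dict]) -> Iterator[Dict]:
    """Yield one row per (data source, metric), like JOIN UNNEST(metrics) ON true."""
    for source in data_sources:
        for metric in source["metrics"]:
            row = {key: value for key, value in source.items() if key != "metrics"}
            row["metric"] = metric
            yield row


# Made-up rows, for illustration only.
sources = [
    {"grid_element_id": "meter_1", "type": "CONSUMER", "metrics": ["kWh", "V"]},
    {"grid_element_id": "meter_2", "type": "CONSUMER", "metrics": ["kWh"]},
    {"grid_element_id": "meter_3", "type": "CONSUMER", "metrics": []},
]

rows = list(unnest_metrics(sources))
for row in rows:
    print(row)
```

As with the inner join in SQL, a data source whose metric list is empty (`meter_3` here) produces no rows at all.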

**Downstream of a Grid** 
- Specify the `grid_id` and `grid_element_id` of the element whose downstream meter data should be fetched.

In [4]:
grid_id = input('Grid Id: ')  # e.g. awefice
grid_element_id = input('Grid Element Id: ')  # e.g. line_segment_57

Grid Id: North Central Zone
Grid Element Id: 12373_hvmv


Check when this grid was last updated.

In [5]:
%%sql

SELECT last_updated
FROM grid
WHERE grid_id = '{grid_id}';

last_updated
2021-01-01 20:00:00+00:00


- Create a temporary view `meter_data_source2` to make it more convenient to access the data sources of the meters found in the downstream trace of the specified element.

In [6]:
%%sql

CREATE OR REPLACE TEMPORARY VIEW meter_data_source2 AS
    SELECT meter.grid_id,
            meter.grid_element_id,
            geds.grid_element_data_source_id,
            geds.friendly_id,
            geds.phases,
            geds.provider,
            metric_key as metric,
            lower(geds.valid) as start_time,
            upper(geds.valid) as end_time
    FROM grid_get_downstream('{grid_id}', '{grid_element_id}') AS meter
    LEFT JOIN grid_element_data_source geds
        ON meter.grid_element_id = geds.grid_element_id
        AND meter.grid_id = geds.grid_id
        AND geds.type = 'CONSUMER'
    JOIN UNNEST(geds.metrics::TEXT[]) AS metric_key
        ON true
    WHERE meter.type = 'Meter';

[]
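
To make the idea of a downstream trace concrete, the sketch below walks a made-up toy topology breadth-first and then keeps only the meters, mirroring the `WHERE meter.type = 'Meter'` filter. It is an illustration of the concept only: the actual implementation of `grid_get_downstream()` is not shown in this notebook, and the element names, the element types other than `Meter`, and the assumption that the starting element itself is excluded are all hypothetical.

```python
from collections import deque
from typing import Dict, List

# Hypothetical toy topology: each element maps to the elements directly below it.
DOWNSTREAM: Dict[str, List[str]] = {
    "breaker_1": ["line_1"],
    "line_1": ["transformer_1", "line_2"],
    "transformer_1": ["meter_1", "meter_2"],
    "line_2": ["meter_3"],
}
# Made-up type labels; only 'Meter' appears in the notebook's query.
ELEMENT_TYPE: Dict[str, str] = {
    "breaker_1": "CircuitBreaker",
    "line_1": "Line",
    "line_2": "Line",
    "transformer_1": "Transformer",
    "meter_1": "Meter",
    "meter_2": "Meter",
    "meter_3": "Meter",
}


def get_downstream(start: str) -> List[str]:
    """Breadth-first walk returning every element below `start` (excluding `start`)."""
    seen = {start}
    queue = deque([start])
    found: List[str] = []
    while queue:
        current = queue.popleft()
        for child in DOWNSTREAM.get(current, []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found


meters = [e for e in get_downstream("line_1") if ELEMENT_TYPE[e] == "Meter"]
print(meters)
```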

**Consumption Data**
- Create a temporary view `meter_consumption2` with the meter readings.

In [7]:
%%sql

CREATE OR REPLACE TEMPORARY VIEW meter_consumption2 AS
SELECT meter.grid_id,
        meter.grid_element_id,
        meter.friendly_id,
        timestamp,
        value AS kWh,
        meter.phases
FROM meter_data_source2 meter
LEFT JOIN grid_element_metric gem
    ON gem.grid_id = meter.grid_id
    AND gem.grid_element_id = meter.grid_element_id
WHERE gem.metric = 'kWh'
   AND gem.type = 'CONSUMER';

[]

**Summary**
- Return a high-level summary of the time series data.

In [8]:
%%sql

WITH ts_stats2 AS (
    SELECT SUM(kWh) AS kWh, MIN(timestamp) AS start_timerange, MAX(timestamp) AS end_timerange
    FROM meter_consumption2
)
SELECT name, value FROM (
    SELECT 1 AS idx, 'Meters Found' AS name, (SELECT COUNT(DISTINCT grid_element_id) FROM meter_data_source2)::text AS value
    UNION
    SELECT 2, 'Meters w/ Datasources', (SELECT COUNT(DISTINCT grid_element_id) FROM meter_data_source2 WHERE grid_element_data_source_id IS NOT NULL)::text
    UNION
    SELECT 3, 'Common DS Timerange', (SELECT CONCAT(MAX(start_time), ' - ',  MIN(end_time)) FROM meter_data_source2)::text
    UNION
    SELECT 4, 'Common Timeseries Timerange', (SELECT CONCAT(start_timerange, ' - ', end_timerange) FROM ts_stats2)::text
    UNION
    SELECT 5, 'Total Consumption', (SELECT kwh FROM ts_stats2)::text
) x ORDER BY idx
;

name,value
Meters Found,316
Meters w/ Datasources,316
Common DS Timerange,2021-01-01 08:00:00+00 -
Common Timeseries Timerange,2021-01-01 05:00:00+00 - 2024-06-14 15:00:00+00
Total Consumption,33101258.99746662
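
The `Common DS Timerange` row combines the latest start (`MAX(start_time)`) with the earliest end (`MIN(end_time)`) of all data source validity ranges. The sketch below reproduces that logic in plain Python on made-up ranges. `common_range` is a hypothetical helper, and unlike the SQL above it also returns `None` when the ranges do not overlap. An open-ended range is modelled with `end=None`, much as `upper()` returns NULL for a range without an upper bound.

```python
from datetime import datetime, timezone
from typing import List, Optional, Tuple

# (start, end) pairs; end=None stands for an open-ended range (NULL upper bound).
Range = Tuple[datetime, Optional[datetime]]


def common_range(ranges: List[Range]) -> Optional[Range]:
    """Return the overlap shared by all ranges, or None if there is none."""
    if not ranges:
        return None
    start = max(r[0] for r in ranges)  # like MAX(start_time)
    ends = [r[1] for r in ranges if r[1] is not None]
    end = min(ends) if ends else None  # like MIN(end_time), which ignores NULLs
    if end is not None and end <= start:
        return None
    return start, end


utc = timezone.utc
# Made-up validity ranges for three data sources.
ranges = [
    (datetime(2021, 1, 1, 5, tzinfo=utc), None),
    (datetime(2021, 1, 1, 8, tzinfo=utc), None),
    (datetime(2020, 6, 1, 0, tzinfo=utc), datetime(2024, 6, 14, 15, tzinfo=utc)),
]
print(common_range(ranges))
```

When every range is open-ended there is no earliest end, which would explain a summary row whose right-hand side is empty.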


**Monthly Time Series**
- Aggregate the `meter_consumption2` data by month and phase, and create a data frame `df_meter2` holding the average kWh per reading for each month and phase.

In [9]:
# Save to a Python variable first.
monthly_meter = %sql SELECT date_trunc('month', timestamp)::date AS month, \
                            AVG(kWh) AS kwh, \
                            phases \
                        FROM meter_consumption2 \
                        GROUP BY month, phases;
                    
# Sort the data by date saved as `month`.
df_meter2 = monthly_meter.DataFrame().sort_values('month')
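
The aggregation above groups readings by calendar month and phase and averages them, so each value is a mean per reading rather than a monthly total. The sketch below shows the same grouping in plain Python on made-up readings, so it runs without a database connection. The variable names are illustrative only.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Made-up (timestamp, phases, kWh) readings, for illustration only.
readings = [
    (datetime(2021, 1, 1, 5), "A", 1.0),
    (datetime(2021, 1, 1, 6), "A", 3.0),
    (datetime(2021, 1, 2, 5), "B", 2.0),
    (datetime(2021, 2, 1, 5), "A", 4.0),
]

groups = defaultdict(list)
for ts, phases, kwh in readings:
    month = ts.date().replace(day=1)  # same idea as date_trunc('month', ts)::date
    groups[(month, phases)].append(kwh)

# One (month, phases, average kWh) row per group, sorted by month and then phase.
monthly = sorted((month, phases, mean(values)) for (month, phases), values in groups.items())
for row in monthly:
    print(row)
```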

In [26]:
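# Count the rows of df_meter2 per calendar month (across years and phases) as a quick completeness check.
pd.to_datetime(df_meter2['month']).dt.month.value_counts()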

month
1     16
2     16
3     16
4     16
5     16
6     16
7     12
8     12
9     12
10    12
11    12
12    12
Name: count, dtype: int64

**Visualization**
- Visualize the average consumption by month for each phase, excluding rows where `phases` is `ABC`.

In [11]:
# Plot the monthly averages, one line per phase, excluding rows where phases is 'ABC'.
px.line(df_meter2[df_meter2['phases'] != 'ABC'], x='month', y='kwh',
        title='Average Hourly Consumption by Month and Phase (excluding ABC)',
        color='phases')

In [74]:
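# Sanity check: pull the raw January 2021 readings behind the monthly averages.
check = %sql SELECT date_trunc('month', timestamp)::date AS month, timestamp, kWh, phases \
             FROM meter_consumption2 \
             WHERE (date(timestamp) >= '2021-01-01') AND (date(timestamp) < '2021-02-01');
check2 = check.DataFrame()

In [75]:
# Recompute the phase C average by hand to compare with the January 2021 value in df_meter2.
phase_c = check2.loc[check2['phases'] == 'C', 'kwh']
phase_c.sum() / phase_c.count()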

2.392630979415936

---

### SCADA Time Series

*High level approach*
- Create a temporary view `grid_element_metric` that combines the data sources of each grid element with their metrics and the actual time series readings.
- Use the `grid_get_sources()` function to find the source switches (top feeders) of a specified grid element.
- Time series are retrieved using the `ts_data_source_select()` function.

**Grid Element Data**
* Create a temporary view `grid_element_metric`.

*Please note that the view below is the same as the one from the **Meter Time Series** section. There is no need to re-run this `Grid Element Data` section if it was already run above.*

In [11]:
%%sql

CREATE OR REPLACE TEMPORARY VIEW grid_element_metric AS
    SELECT grid_id,
            grid_element_id,
            phases,
            type,
            provider,
            direction,
            friendly_id,
            metric_key AS metric,
            valid,
            timestamp,
            value
    FROM grid_element_data_source geds
    JOIN UNNEST(geds.metrics::TEXT[]) AS metric_key
        ON true
    LEFT JOIN ts_data_source_select(grid_element_data_source_id, metric_key) AS ts
        ON true;

[]

**Sources of a Grid** 
- Fetch the SCADA data for the specified `grid_id` and `grid_element_id`.

In [12]:
grid_id = input('Grid Id: ')  # e.g. awefice
grid_element_id = input('Grid Element Id: ')  # e.g. line_segment_57

Grid Id: awefice
Grid Element Id: line_segment_57


- Create a temporary view `scada_data_source` to make it more convenient to access the data sources for the source grid elements in the trace for the specified element.

In [13]:
%%sql

CREATE OR REPLACE TEMPORARY VIEW scada_data_source AS
    SELECT scada.grid_element_id,
            scada.grid_id,
            geds.friendly_id,
            geds.provider,
            metric_key as metric,
            geds.grid_element_data_source_id,
            lower(geds.valid) as start_time,
            upper(geds.valid) as end_time
    FROM grid_get_sources('{grid_id}', '{grid_element_id}', 'true') AS scada
        LEFT JOIN grid_element_data_source geds
            ON scada.grid_element_id = geds.grid_element_id
            AND scada.grid_id = geds.grid_id
            AND geds.type = 'SENSOR'
    JOIN UNNEST(geds.metrics::TEXT[]) AS metric_key
        ON true
    WHERE scada.type = 'CircuitBreaker';

[]
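
Conceptually, finding the sources of an element means walking the grid upstream until the top of the feeder is reached, then keeping the elements of interest (circuit breakers in the query above). The sketch below illustrates this on a made-up radial topology in which every element has at most one upstream neighbour. It is not the implementation of `grid_get_sources()`, whose internals and third argument are not described in this notebook. All names and the type labels other than `CircuitBreaker` are hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical radial topology: each element maps to its single upstream neighbour.
UPSTREAM: Dict[str, Optional[str]] = {
    "line_segment_57": "switch_3",
    "switch_3": "line_segment_12",
    "line_segment_12": "breaker_1",
    "breaker_1": None,
}
# Made-up type labels; only 'CircuitBreaker' appears in the notebook's query.
ELEMENT_TYPE: Dict[str, str] = {
    "line_segment_57": "Line",
    "switch_3": "Switch",
    "line_segment_12": "Line",
    "breaker_1": "CircuitBreaker",
}


def get_upstream_path(start: str) -> List[str]:
    """Follow upstream links from `start` to the top of the feeder (excluding `start`)."""
    path: List[str] = []
    current = UPSTREAM.get(start)
    while current is not None:
        path.append(current)
        current = UPSTREAM.get(current)
    return path


breakers = [e for e in get_upstream_path("line_segment_57") if ELEMENT_TYPE[e] == "CircuitBreaker"]
print(breakers)
```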

**SCADA Time Series**
- Create a temporary view `scada_time_series` with the SCADA readings.

In [14]:
%%sql

CREATE OR REPLACE TEMPORARY VIEW scada_time_series AS
SELECT gem.type,
        gem.grid_id,
        gem.grid_element_id,
        scada.friendly_id,
        timestamp,
        value AS kWh
FROM scada_data_source scada
LEFT JOIN grid_element_metric gem
    ON gem.grid_id = scada.grid_id
    AND gem.grid_element_id = scada.grid_element_id
WHERE gem.metric = 'kWh'
   AND gem.type = 'SENSOR';

[]

**Summary**
- Return a high-level summary of the time series data.

In [15]:
%%sql

WITH ts_stats AS (
    SELECT SUM(kWh) AS kWh, MIN(timestamp) AS start_timerange, MAX(timestamp) AS end_timerange
    FROM scada_time_series
)
SELECT name, value FROM (
    SELECT 1 AS idx, 'SCADAs Found' AS name, (SELECT COUNT(DISTINCT grid_element_id) FROM scada_data_source)::text AS value
    UNION
    SELECT 2, 'SCADAs w/ Datasources', (SELECT COUNT(DISTINCT grid_element_id) FROM scada_data_source WHERE grid_element_data_source_id IS NOT NULL)::text
    UNION
    SELECT 3, 'Common DS Timerange', (SELECT CONCAT(MAX(start_time), ' - ', MIN(end_time)) FROM scada_data_source)::text
    UNION
    SELECT 4, 'Common Timeseries Timerange', (SELECT CONCAT(start_timerange, ' - ', end_timerange) FROM ts_stats)::text
    UNION
    SELECT 5, 'Total Distribution', (SELECT kWh FROM ts_stats)::text
) x
ORDER BY idx

name,value
SCADAs Found,1
SCADAs w/ Datasources,1
Common DS Timerange,2021-01-01 08:01:00+00 -
Common Timeseries Timerange,2021-01-01 08:00:00+00 - 2024-02-07 17:00:00+00
Total Distribution,425158.6579999982


**Monthly Time Series**
- Aggregate the `scada_time_series` data by month and create a data frame `df_scada` holding the average kWh per reading for each SCADA and month.

In [16]:
# Save to a Python variable first.
monthly_scada = %sql SELECT friendly_id, \
                            date_trunc('month', timestamp)::date AS month, \
                            AVG(kWh) AS kwh \
                        FROM scada_time_series \
                        GROUP BY friendly_id, month;
                    
# Sort the data by date saved as `month`.
df_scada = monthly_scada.DataFrame().sort_values('month')

**Visualization**
- Visualize the monthly average distribution for each SCADA.

In [17]:
# Plot graph
px.line(df_scada, x='month', y='kwh', 
        title='Average Hourly Distribution by SCADA',
        color='friendly_id')

---