Ireland Smart Meter Trial Experiment - Revisited with ML

In 2011, Ireland's Commission for Energy Regulation (CER) published the results of a trial on whether consumption elasticity could be exploited to control electricity demand. The results showed that raising prices during peak hours reduced demand by 8% relative to the control group.

This project aims to better understand what drives that response. Using K-means clustering on temporal data together with statistical inference, it identifies the subgroups of users that respond to the tariff, which could help service providers formulate demand management strategies.

Context

The integration of renewable energy generation and significant changes in demand (e.g. electric cars, storage, CHP) challenge the grid's overall balance.

At the household level, smart-metering technology collects high-resolution temporal data, enabling service providers to evaluate demand management strategies.

A commonly discussed strategy is controlling demand through consumption elasticity.

Data Source

The dataset derives from the CER Smart Metering experiment in Ireland, where users were allocated either to a flat (control) or variable (treatment) electricity tariff.

The CER's goal was to assess the overall household response to variable tariffs, whereas this project attempts to identify the subgroups of users that drive that response.

Irish Social Science Data Archive: Smart Meter
- Data from roughly 4,000 anonymized households
- Fields: household id, timestamp, consumption (kWh)
- 15-min time resolution
- 6 CSV files, 3+ GB in total
- http://www.ucd.ie/issda/data/commissionforenergyregulationcer/

Household allocation
- CSV file mapping each household to its time-of-use tariff and stimulus

General approach and challenges

Feature construction and clustering

The consumption of each user in any given time period can be treated as an independent feature. At 15-min granularity over 1.5 years, for roughly 4,000 households, this yields a very high-dimensional matrix.

The first challenge is to reduce dimensionality and to cluster users with similar consumption profiles. This was done through temporal aggregation and feature construction.
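As an illustration of the temporal aggregation step, the sketch below collapses raw readings into one mean daily profile per household (24 hourly features instead of tens of thousands of raw time steps). The column names `household_id`, `timestamp`, and `kwh`, the helper `daily_profiles`, and the synthetic data are assumptions made for this example; the features actually built in src/Pipeline may differ.

```python
import numpy as np
import pandas as pd


def daily_profiles(readings: pd.DataFrame) -> pd.DataFrame:
    """Return one row per household and one column per hour of day (mean kWh)."""
    with_hour = readings.assign(hour=readings["timestamp"].dt.hour)
    return with_hour.pivot_table(
        index="household_id", columns="hour", values="kwh", aggfunc="mean"
    )


# Synthetic stand-in for the raw readings: 3 households, 2 days, 15-min steps.
rng = np.random.default_rng(0)
stamps = pd.date_range("2010-01-01", periods=2 * 96, freq="15min")
readings = pd.DataFrame({
    "household_id": np.repeat([1, 2, 3], len(stamps)),
    "timestamp": np.tile(stamps, 3),
    "kwh": rng.random(3 * len(stamps)),
})

profiles = daily_profiles(readings)
print(profiles.shape)  # (3, 24): households x hours of the day
```

Averaging by hour of day is only one possible aggregation; weekday/weekend splits or peak-to-mean ratios could be added as further columns in the same way.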

The value of this step also lies in defining a working assumption about the clusters:

The baseline comparison for variable tariffs can be estimated by the actual loads of the corresponding control group (clustered).

The dataset includes a 6-month period in which all users were exposed to the same conditions, which provides an unbiased timespan for the clustering. Furthermore, in an actual demand response (DR) application the benchmark does not need to cover an extended period of time; it could be built from non-event DR days.

The following image shows plots for every cluster where each curve represents a user. It also shows how the clusters capture users' variability and magnitude of consumption.

Note that the number of clusters was determined heuristically; a service provider's input would be ideal here.
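A minimal sketch of clustering daily profiles with scikit-learn's `KMeans`, including an inertia sweep as one common heuristic for choosing the number of clusters. The two synthetic archetypes and every parameter value here are assumptions for illustration, not the settings used in this project.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
hours = np.arange(24)

# Two synthetic archetypes: an evening-peaking profile and a flat, low one.
evening = 0.3 + 1.0 * np.exp(-0.5 * ((hours - 19) / 2.0) ** 2)
flat = np.full(24, 0.2)
X = np.vstack([
    evening + rng.normal(0, 0.05, size=(50, 24)),
    flat + rng.normal(0, 0.05, size=(50, 24)),
])

# Within-cluster sum of squares (inertia) for several values of k.
inertias = {}
for k in range(1, 6):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = model.inertia_

# Fit the chosen k and assign each user (row) to a cluster.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
```

Plotting `inertias` against k and looking for the point where the curve flattens gives a rough starting value, which domain knowledge can then refine.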

Comparative baseline

The comparative baseline is calculated as a function of the mean of the control users in the same cluster. Other models (e.g. a regression-based load-temperature model) may improve the estimation accuracy, but such variations were not explored because privacy concerns limit the demographic information in this dataset.
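The baseline idea can be sketched as follows: for each cluster, the mean profile of the control users serves as the baseline, and the treatment mean is compared against it. The frame layout (hour columns `0..23` plus `cluster` and `group`) and the random data are assumptions for this example only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_users = 40
hour_cols = list(range(24))

# Synthetic per-user daily profiles with a cluster label and a tariff group.
frame = pd.DataFrame(rng.normal(1.0, 0.1, size=(n_users, 24)), columns=hour_cols)
frame["cluster"] = np.repeat([0, 1], n_users // 2)
frame["group"] = np.tile(["control", "treatment"], n_users // 2)

# Mean hourly profile per (cluster, group).
means = frame.groupby(["cluster", "group"])[hour_cols].mean()

# Baseline = control mean of the same cluster; compare the treatment against it.
baseline = means.xs("control", level="group")
treated = means.xs("treatment", level="group")
relative_change = (treated - baseline) / baseline
```

Each row of `relative_change` is a cluster and each column an hour of the day, so negative values during peak hours would indicate reduced consumption relative to the baseline.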

The following figure summarizes the mean daily user profile along with the relative price change between both groups.

Quantify response

At this point, visual inspection already suggests whether a cluster is responsive. For objectivity, assuming the underlying distributions are Gaussian, a hypothesis can be formulated and tested with a typical Type I error rate of 5%.

H0: (Time-of-use tariffs cluster)mean >= Baseload
or: Increasing price does not induce a significant decrease in consumption.

H1: (Time-of-use tariffs cluster)mean < Baseload

Because the number of users varies across clusters, it is also helpful to compute the statistical power of the test.
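A rough sketch of the one-sided test and its power, using a normal approximation (a z-test with unpooled variances) that follows from the Gaussian assumption above. The function `one_sided_test` and the synthetic samples are hypothetical; for small clusters a t-test would be the more careful choice, and the power computed here is the post-hoc power at the observed effect size.

```python
from statistics import NormalDist

import numpy as np


def one_sided_test(treatment, control, alpha=0.05):
    """Test H1: mean(treatment) < mean(control); return (p_value, power)."""
    treatment = np.asarray(treatment, dtype=float)
    control = np.asarray(control, dtype=float)
    diff = treatment.mean() - control.mean()
    # Standard error of the difference in means (unpooled variances).
    se = np.sqrt(
        treatment.var(ddof=1) / treatment.size + control.var(ddof=1) / control.size
    )
    z = float(diff / se)
    norm = NormalDist()
    p_value = norm.cdf(z)         # lower-tail probability under H0
    z_crit = norm.inv_cdf(alpha)  # rejection threshold (negative)
    # Probability of rejecting H0 if the true effect equals the observed one.
    power = norm.cdf(z_crit - z)
    return p_value, power


rng = np.random.default_rng(7)
control = rng.normal(1.0, 0.2, size=200)    # synthetic baseline loads
treatment = rng.normal(0.8, 0.2, size=200)  # synthetic reduced loads
p_value, power = one_sided_test(treatment, control)
```

With fewer users in a cluster the standard error grows, so the same mean difference yields a larger p-value and lower power.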

Insights

In the following figure, clusters marked with a dashed square are identified as responsive.

Note that the sample size per cluster shrinks as the number of clusters increases. For clusters 3 and 5, although the test result is significant, ideally the sample size would be larger to reduce the probability of a Type II error (that is, to increase statistical power).

Final thoughts

Code

The code is object-oriented to accommodate future complexity and scale; see src/Pipeline for documentation.

Libraries:

sklearn
pickle
numpy
pandas
datetime
matplotlib
seaborn

This project was presented at the Galvanize Immersive Data Science Showcase event in San Francisco on October 20th, 2016.
