Irish Smart Meter Trial Experiment - Revisited with ML
In 2011, Ireland's energy regulator released the results of an experiment on the viability of exploiting consumption elasticity to control electrical demand. The results showed that surge pricing during peak hours reduced demand by 8% relative to the control group.
This project aims to further understand the drivers of that response. Using K-means clustering on temporal data and statistical inference, it identifies the responsive subgroups of users, which could help service providers formulate demand-management strategies.
The integration of renewable energy generation and significant changes in demand (e.g., electric cars, storage, CHP) challenge the grid's overall balance.
At the household level, smart-metering technology is a means to collect high-resolution temporal data, enabling service providers to evaluate demand-management strategies.
A commonly discussed strategy is controlling demand through consumption elasticity.
The dataset derives from the CER Smart Metering experiment in Ireland, where users were allocated either to a flat (control) or variable (treatment) electricity tariff.
The CER's goal was to address the overall household response to variable tariffs while this project attempts to identify the subgroups of users that drive such response.
Irish Social Science Data Archive: Smart Meter
- Data for 4,000 anonymized households
- household id, timestamp, consumption (kWh)
- 15-min time-resolution
- 6 CSV files totaling 3+ GB
- A CSV file mapping households to their Time-of-Use tariff and stimulus
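Since the raw files run to several gigabytes, reading them in chunks keeps memory bounded. A minimal sketch, assuming a headerless file with the column order above (the actual file names and layout in the archive may differ):

```python
# Sketch: chunked loading of one large CER readings file.
# Column names/order are assumptions, not the archive's documented schema.
import pandas as pd

def load_readings(path):
    """Stream one large CSV in chunks to keep peak memory bounded."""
    chunks = []
    for chunk in pd.read_csv(
        path,
        names=["household_id", "timestamp", "kwh"],  # assumed column order
        chunksize=1_000_000,
    ):
        chunks.append(chunk)
    return pd.concat(chunks, ignore_index=True)
```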
General approach and challenges
- Feature construction and clustering
The consumption of each user at any given time period can be thought of as an independent feature. At 15-min granularity, spanning 1.5 years and roughly 4,000 households, this implies a very high-dimensional matrix.
The first challenge is to reduce dimensionality and to cluster users with similar consumption profiles. This was performed via temporal aggregation and feature construction.
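The aggregation step can be sketched as follows. This is one plausible construction (mean daily profile per user); the project's exact feature set is not specified, and the column names are assumptions:

```python
# Sketch: temporal aggregation as feature construction. Each user's
# ~50,000 raw 15-min readings collapse to a 24-dimensional mean daily
# profile (hour-of-day averages). Column names are assumptions.
import pandas as pd

def daily_profile_features(df):
    """df columns: household_id, timestamp (datetime64), kwh."""
    df = df.assign(hour=df["timestamp"].dt.hour)
    # Mean consumption per hour of day -> one row per user, 24 columns.
    profile = df.pivot_table(index="household_id", columns="hour",
                             values="kwh", aggfunc="mean")
    return profile.fillna(0.0)
```

The resulting matrix (users x 24 features) is small enough to cluster directly.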
The value of this step also lies in defining a working assumption about the clusters:
The baseline comparison for variable tariffs can be estimated by the actual loads of the corresponding control group (clustered).
The dataset includes a 6-month period where all users were exposed to the same conditions, providing an unbiased timespan for the clustering. Furthermore, considering real demand response (DR) applications, the benchmark need not cover an extended period; it could be built from non-event DR days.
The following figure shows a plot for every cluster, where each curve represents a user. It also shows how the clusters capture users' variability and magnitude of consumption.
Note that the number of clusters was determined heuristically; a service provider's input would be ideal here.
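One common heuristic for picking the number of clusters is the silhouette score; the project does not state its exact criterion, so the sketch below is illustrative:

```python
# Sketch: K-means over the per-user profile features, with k chosen
# heuristically via silhouette score (an assumption; the project's
# actual selection criterion is not documented).
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

def cluster_profiles(features, k_range=range(2, 9), random_state=0):
    X = StandardScaler().fit_transform(features)
    best_k, best_score, best_labels = None, -1.0, None
    for k in k_range:
        labels = KMeans(n_clusters=k, n_init=10,
                        random_state=random_state).fit_predict(X)
        score = silhouette_score(X, labels)
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return best_k, best_labels
```

In practice a provider might override the heuristic k with an operationally meaningful number of segments.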
- Comparative baseline
The comparative baseline is calculated as a function of the mean of the (clustered) control group; other models (e.g., a regression-based load-temperature model) might improve estimation accuracy. Such variations were not explored because the dataset is limited in demographic information due to privacy concerns.
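A minimal sketch of this baseline, assuming each reading has already been tagged with its cluster and experimental group (column names are assumptions):

```python
# Sketch: the per-cluster baseline is the mean load of the control
# group at each timestamp. Column names are assumptions.
import pandas as pd

def cluster_baseline(df):
    """df columns: household_id, cluster, group ('control'/'treatment'),
    timestamp, kwh. Returns the mean control load per (cluster, timestamp)."""
    control = df[df["group"] == "control"]
    return control.groupby(["cluster", "timestamp"])["kwh"].mean()
```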
The following figure summarizes the mean daily user profile along with the relative price change between both groups.
- Quantify response
At this point, visual inspection already suggests whether a cluster is responsive. For objectivity, assuming the underlying distributions are Gaussian, a hypothesis can be formulated and tested at the typical Type I error rate of 5%.
H0: mean(Time-of-Use tariff cluster) >= baseline
(i.e., increasing the price does not induce a significant decrease in consumption)
H1: mean(Time-of-Use tariff cluster) < baseline
Given that the density within clusters varies, it is also helpful to compute the statistical power of the test.
In the following figure, clusters with a dashed-square are identified as responsive.
Note that the sample size shrinks as the number of clusters increases. For clusters 3 and 5, although the hypothesis test is significant, ideally we would increase the sample size to reduce the probability of a Type II error (i.e., to increase statistical power).
- Final thoughts
The code follows an object-oriented design to accommodate future complexity and scalability; see src/Pipeline for documentation.