[CLOSED] Plan for balance point optimization #62
On the phone call today we agreed to work on balance point optimization in parallel over the next sprint. Please chime in on this issue soon (let's shoot for responses here before next Wed, 4/12) to share what your plan is for exploring different aspects of the balance point optimization so we don't duplicate work.
Also see Issue impactlab/caltrack#57 where folks already did a little thinking about this.
Comment by houghb
We've done a little bit of exploring with the 1000 home dataset to see whether using variable vs. fixed balance points makes a difference in aggregate. We found no significant difference in median savings values between any [reasonable] fixed balance point and exploring the variable balance point range in the current spec.
What I'm planning to do next is break the dataset into climate zones and explore the impact of balance point selection in each climate zone individually. We expect that inland premises may be more sensitive to balance point choice than those in the Bay Area. If that's the case, we'll start looking into the impact of different methods of variable balance point selection.
Comment by matthewgee
During this sprint, the OEE team is going to explore using gradient descent vs. the current grid search method for arriving at optimal balance point temps. Since one of the main advantages of gradient descent should be improved computational efficiency of the spec, we'll be adding runtime to our standard outputs.
Comment by matthewgee
Based on the discussion today, we decided to focus on three (and a half) main areas where we need to make decisions for balance point temp optimization:
Comment by tplagge
Loss function choice
In schematic form, our HDD/CDD balance point determination algorithm is:
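The schematic itself isn't reproduced here, but the basic shape is a grid search over candidate balance points, fitting a degree-day model at each candidate and keeping the one with the lowest loss. A minimal sketch, assuming a least-squares fit and a 55-65 degree candidate range (function and variable names here are illustrative, not the spec's exact implementation):

```python
import numpy as np

def degree_days(temps, balance_point, kind="heating"):
    # Daily heating or cooling degree days relative to a balance point
    if kind == "heating":
        return np.maximum(balance_point - temps, 0.0)
    return np.maximum(temps - balance_point, 0.0)

def fit_balance_point(temps, usage, candidates=np.arange(55, 66), kind="heating"):
    # Grid search: for each candidate balance point, fit
    # usage ~ intercept + degree_days by least squares,
    # and keep the balance point with the smallest SSE.
    best_bp, best_sse = None, np.inf
    for bp in candidates:
        dd = degree_days(temps, bp, kind)
        X = np.column_stack([np.ones_like(dd), dd])
        coef, *_ = np.linalg.lstsq(X, usage, rcond=None)
        sse = np.sum((usage - X @ coef) ** 2)
        if sse < best_sse:
            best_bp, best_sse = bp, sse
    return best_bp
```

The loss computed inside the loop is the piece the candidate loss functions below would swap out.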
Presently, we minimize the sum of squares (i.e., a quadratic or least squares loss function). However, this loss function is quite sensitive to outliers; more robust alternatives are available. I'll consider four candidate loss functions, where y_i is the observed usage on the i-th day and ŷ_i is the modeled usage on the i-th day:
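The formulas for the four candidates didn't carry over, but these are the standard textbook forms, written as functions of the residual r_i = y_i - ŷ_i (the Huber and Tukey tuning constants below are the conventional defaults, which assume residuals scaled to unit variance -- an assumption, not something fixed by the spec):

```python
import numpy as np

def quadratic_loss(resid):
    # L(r) = r^2: large residuals dominate the fit
    return resid ** 2

def absolute_loss(resid):
    # L(r) = |r|: grows linearly, so less outlier-sensitive
    return np.abs(resid)

def huber_loss(resid, delta=1.345):
    # Quadratic near zero, linear in the tails
    r = np.abs(resid)
    return np.where(r <= delta, 0.5 * r ** 2, delta * (r - 0.5 * delta))

def tukey_bisquare_loss(resid, c=4.685):
    # Bounded: residuals beyond c all contribute the same constant c^2/6
    r = np.abs(resid)
    return np.where(r <= c, (c ** 2 / 6) * (1 - (1 - (r / c) ** 2) ** 3), c ** 2 / 6)
```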
Large residuals matter most for a quadratic loss function, and least for the Tukey bisquare loss function, with absolute value and Huber somewhere in between. We expect the quadratic loss function to give the best constraints in the case where outliers are relatively uncommon and/or small in magnitude, and the others to be more robust to outliers but to yield looser parameter constraints when outliers are uncommon and/or small.
I went through the 1000-home electricity data set and selected the 263 projects which were best fit by a CDD + HDD model (the remainder either had no significant heating or cooling component, or fell back to an intercept-only model). The 1-year baseline periods (prior to work start) for this subset of the 1000-home sample are what I'll use to assess these alternatives.
As a first cut, we can just take a look at the skewness and kurtosis of the residuals. If we see evidence for non-normality, then it makes sense to at least consider using the robust loss functions. And indeed, over 80% of the homes show evidence (p<0.05) for non-normality in skewness and/or kurtosis. This is perhaps not surprising given that we’re not including relevant fixed effects.
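As a sketch of this first-cut check (scipy's `skewtest` and `kurtosistest` are one way to run it; the function name and alpha threshold here are illustrative):

```python
import numpy as np
from scipy import stats

def residual_normality_flags(residuals, alpha=0.05):
    # Test model residuals for non-normal skewness and excess kurtosis.
    # Both tests assume a reasonable sample size (a year of daily data is fine).
    _, p_skew = stats.skewtest(residuals)
    _, p_kurt = stats.kurtosistest(residuals)
    return {
        "skew_p": p_skew,
        "kurt_p": p_kurt,
        "non_normal": bool(p_skew < alpha or p_kurt < alpha),
    }
```

Running this per home and counting the `non_normal` flags would reproduce the "over 80%" figure quoted above.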
So let’s take a look at whether this non-normality has a strong influence on the best-fit balance points. If we repeatedly remove a random 10% of the days from the dataset and determine new balance points, the results shouldn’t change much. If they do, then it’s plausible that outliers are driving the fit.
I ran the fitting routine for each of the 263 baseline periods 25 times, each time throwing out a random 10% of the days in the baseline period, which still leaves over 300 days of usage and temperature data in each sample. Then I recorded the mean and the standard deviation of balance point temperatures for each home across these 25 runs. Here’s the histogram of the HDD balance point standard deviation:
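The subsampling procedure can be sketched like this, with the fitting routine passed in as a function so the sketch stays agnostic about which loss is used (names and defaults are illustrative):

```python
import numpy as np

def balance_point_stability(temps, usage, fit_fn, n_runs=25, drop_frac=0.10, seed=0):
    # Repeatedly drop a random fraction of days, refit the balance point,
    # and summarize the spread of estimates across runs.
    rng = np.random.default_rng(seed)
    n = len(temps)
    keep = int(round(n * (1 - drop_frac)))
    bps = []
    for _ in range(n_runs):
        idx = rng.choice(n, size=keep, replace=False)
        bps.append(fit_fn(temps[idx], usage[idx]))
    bps = np.asarray(bps, dtype=float)
    return bps.mean(), bps.std()
```

The standard deviation returned here is the per-home quantity histogrammed below.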
For about 20% of the homes, we see deviations in HDD balance point greater than a degree Fahrenheit when we randomly censor 10% of the data. This is good enough motivation to say that this exploration is worthwhile.
I repeated this procedure for the other three candidate loss functions, and compared the results. If it's outliers driving the instability in these balance point estimates, then more robust loss functions should show more stable estimates.
However, I found that the usual quadratic loss function performed the best on the above metric: it yielded the lowest average standard deviation of balance point temperatures across the 263 sites, i.e. the most stable fits. The mean/median standard deviations were lowest for quadratic loss functions, highest for Tukey bisquare loss functions (by about a factor of two), and in the middle for the linear and Huber loss functions. (The medians were 0.39, 0.49, 0.47, and 0.79 for quadratic, linear, Huber, and Tukey, respectively.)
Below are histograms of the HDD balance point temperature mean and standard deviation across the 25 subsamples of each of the 263 homes using the four different loss functions. The Tukey biweight loss function produces the flattest distribution of balance point temperatures, as well as the loosest constraints; the quadratic loss function produces the tightest constraints (as well as, interestingly, the most estimates which are pegged to the upper edge of the 55-65 degree range).
Here’s the quadratic versus absolute value loss function estimates plotted against one another:
For the most part, the results are fairly consistent within the error bars--there doesn't seem to be a large bias. Since the scatter is smaller in the aggregate for the quadratic loss function, one might simply stop here and say it is the best choice.
An interesting thing to note here is that there are a few homes in the lower right hand corner where the results are quite different for the two loss functions shown above. Here’s one of them:
Indeed, there look to be some outliers, particularly in May 2013. Here’s temperature plotted versus usage:
The quadratic loss function gives a balance point of 65, which does indeed look to be driven by the outlier values (low usage at 65-75 degrees). The absolute value loss function, by contrast, yields 55 degrees--mask out those outliers, and that’s probably what you’d estimate by eye.
Comment by houghb
Choice of boundary conditions (and model selection criteria)
We set out to look at whether our current practice of allowing models to be selected even if their heating or cooling balance point is an extreme value (either the max or min of the explored range) should be changed. We planned to investigate the outcomes of three scenarios described in the UMP:
While doing this we also wanted to see if the choice of model selection criteria impacted this analysis, so we explored the two options (1 and 2 in the list above) for three different model selection criteria:
Here are some plots showing the CVRMSE and NMBE for the different options and climate zones.
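For reference, CVRMSE and NMBE in the ASHRAE Guideline 14 style can be written as below (conventions vary on the sign of NMBE and on whether the parameter count p appears in the denominator; this is one common form, not necessarily the exact one used for the plots):

```python
import numpy as np

def cvrmse(y, y_hat, p=1):
    # Coefficient of variation of the RMSE, in percent;
    # p is the number of model parameters (conventions vary).
    n = len(y)
    return 100.0 * np.sqrt(np.sum((y - y_hat) ** 2) / (n - p)) / np.mean(y)

def nmbe(y, y_hat, p=1):
    # Normalized mean bias error, in percent; sign conventions differ by source.
    n = len(y)
    return 100.0 * np.sum(y - y_hat) / ((n - p) * np.mean(y))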
When there is a difference in CVRMSE or NMBE scores, the following conclusions hold:
Comment by houghb
After all this good exploration, and discussions over the phone and in Google Docs over several weeks, it sounds like the conclusion is to stick with the current method in the spec: use adjusted R-squared as the loss function, keep the existing balance point range and step size (1 degree steps across a 10 degree range), and don't adjust balance points that pile up at the ends of the explored range.