
Accuracy #73

Closed
steevschmidt opened this issue Feb 1, 2018 · 8 comments

Comments

@steevschmidt

HEA believes the accuracy of CalTRACK results is critical to the long-term success of P4P programs; it should be a top priority alongside the other three, and should inform the prioritization of other tasks.

Background --

We have been analyzing residential smart meter data since 2008, and we deployed our first customer-facing disaggregation tool in 2010. We learned quickly that some customers are more energy-savvy than others (e.g. Art Rosenfeld, Gil Masters, CEC and CPUC staffers) and that getting the analysis right for each home was crucial to providing the right recommendations. So we too have been humbled by the challenges in this space. And since very little "ground truth" data exists, we had to come up with other methods to test the accuracy of our system.

We have pursued three primary approaches:

  1. Use Artificial Data Sets: We created our first simplistic "artificial home" in 2013 in order to test various disaggregation methods. This approach to testing energy-tool accuracy was also the conclusion of a CEC research project into a possible "AMI Data Analytics Testbed" led by Martha Brook in 2016. See the final report here, and try the prototype online service (using CBECC-Res) at www.VirtualHomeData.com to create your own artificial home data (current configurations can produce energy data for 400,000 unique artificial homes).
  2. Analyze the Homes of Energy Experts: As mentioned above, there are many industry experts who know their homes well enough to assess results of detailed analysis. Their feedback has been incredibly helpful to HEA to improve our algorithms.
  3. Use Ground-Truth Sub-Metered Data from actual homes: When available this is the preferred approach, but it is plagued with problems. Efforts to date have failed to provide complete and reliable data sets. For example, capturing 17 individual loads doesn't help if the 18th (unmetered) load is a space heater or an attic fan on a thermostat. And when the sub-metered data doesn't add up to the whole-house meter readings, all readings come into question.

HEA has been a long-time champion of P4P because we believe it will drive EE to become more cost-effective. With one P4P contract in place and a second in final negotiations, HEA is highly motivated to make CalTRACK successful.

Can we discuss/develop a strategy for accuracy?

@mcgeeyoung
Contributor

For CalTRACK, we decided to use out-of-sample testing to gauge the uncertainty associated with estimating counterfactual usage. Accuracy is probably not the best way to describe the nature of the uncertainty we're dealing with, but out-of-sample testing proved a reliable way to evaluate methods choices. For CalTRACK 2.0 we would like to keep the same testing regime in place: we can look at specific steps in the methodology and evaluate whether a revised approach yields a better out-of-sample result.
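To make the idea concrete, here is a minimal sketch of out-of-sample testing on synthetic data. The simple HDD regression and the CV(RMSE) threshold are illustrative assumptions, not the CalTRACK specification:

```python
import numpy as np

def fit_hdd_model(hdd, usage):
    # Ordinary least squares: usage ≈ intercept + hdd_coefficient * HDD
    A = np.column_stack([np.ones_like(hdd), hdd])
    coeffs, *_ = np.linalg.lstsq(A, usage, rcond=None)
    return coeffs  # [intercept, hdd_coefficient]

def cvrmse(actual, predicted):
    # Coefficient of variation of RMSE, a common out-of-sample error metric
    return np.sqrt(np.mean((actual - predicted) ** 2)) / np.mean(actual)

# Synthetic daily data: usage driven by heating degree days plus noise
rng = np.random.default_rng(0)
hdd = rng.uniform(0, 30, 365)
usage = 10 + 1.5 * hdd + rng.normal(0, 2, 365)

# Hold out the last ~3 months, fit on the rest, score on the holdout
train, test = slice(0, 270), slice(270, 365)
intercept, coef = fit_hdd_model(hdd[train], usage[train])
predicted = intercept + coef * hdd[test]
print(cvrmse(usage[test], predicted) < 0.2)  # True for this synthetic data
```

A candidate methods change would be accepted if it consistently lowers the holdout error across a portfolio of buildings, not just on the training period.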

@steevschmidt
Author

In addition to validating CalTRACK regression results as suggested above, another bulk approach may be much easier: compare the calculated heating intensity (BTU/sf/HDD) of homes to expected norms.

Possible approach:

  • Collect the total HDDs for the period (either baseline or reporting);
  • Use the regression results (hdd_coefficient) with the total HDDs to estimate heating kWh and heating therms during the period;
  • Use site-energy conversions of kWh to BTU (3,412) and therms to BTU (100,000) to get total heating BTUs for the period;
  • Divide by the size of the home to get BTU/sf/HDD, and compare to the EIA data referenced above.

Note this would only be possible for homes where we have data on all primary heating fuels (e.g. electricity and natural gas).
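The steps above can be sketched as follows; the coefficient names and sample values are hypothetical, not CalTRACK outputs:

```python
KWH_TO_BTU = 3412        # site-energy conversion, BTU per kWh
THERM_TO_BTU = 100_000   # BTU per therm

def heating_intensity(kwh_per_hdd, therms_per_hdd, total_hdd, sqft):
    """Heating intensity in BTU per square foot per HDD."""
    heating_btu = (kwh_per_hdd * KWH_TO_BTU
                   + therms_per_hdd * THERM_TO_BTU) * total_hdd
    return heating_btu / (sqft * total_hdd)

# Illustrative inputs: 0.5 kWh/HDD electric + 0.02 therms/HDD gas, 1,800 sf home
print(round(heating_intensity(0.5, 0.02, total_hdd=1500, sqft=1800), 2))  # 2.06
```

Note that total_hdd cancels algebraically, so the intensity depends only on the per-HDD coefficients and floor area; it is kept as a parameter here only to mirror the steps as listed.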

Any reason this wouldn't work? If it does, it may provide a useful metric for #71.

@steevschmidt
Author

Related to NMEC accuracy, adding a reference to an excellent paper by Sam Borgeson for PG&E on targeting EE programs for SMBs. Snippet from page 54:

Potential sources of NMEC savings bias:

  1. In large samples, mean-zero fluctuations and site-specific changes in consumption are often assumed to cancel out across premises (for every site with an increase, there is a corresponding site with a decrease). However, shared factors like droughts, prevailing economic conditions, etc. can cause shifts in consumption that do not cancel out. Further, these exogenous factors can impact certain customer segments more than others.
  2. Similarly, a weather normalization model that is overly temperature sensitive or was trained using relatively cool (or hot) weather data, could create systematic biases when trying to normalize consumption for a relatively hot (or cold) year.
  3. Trends in energy consumption (e.g. organic LED adoption or plug-load growth) can also undermine the assumption that models trained on pre-period data provide unbiased estimates of counterfactual conditions for the post-period.

@hshaban
Collaborator

hshaban commented Jul 26, 2018

Closing this issue, as out-of-sample testing was used for CalTRACK 2.

@hshaban hshaban closed this as completed Jul 26, 2018
@steevschmidt
Author

All of these issues apply to future CalTRACK improvements; I'd like to request this ticket not be closed, but instead be moved into the "future requests" category.

@hshaban
Collaborator

hshaban commented Jul 26, 2018

Ok, will bring this to discussion with the next working group

@steevschmidt
Author

Recently McGee posted to the Recurve blog an internal discussion titled Accuracy: Why I Hate That Term, which helped me understand his prior comments (and our differing views) on this topic. I realize now that we may have been talking about two different types of accuracy.

A slide from the presentation, with three bullets on accuracy, is shown here:
[screenshot omitted]

From HEA's perspective, the answer to the third bullet is a resounding Yes: NMEC accuracy for residential homes should include identification of ALL non-weather-related changes in energy consumption, no matter what the cause. We are normalizing [residential] building energy use for weather, and nothing else. So it's critical that we identify HVAC loads accurately.

On the other hand, the first two bullets -- and much of the related discussion in the video -- concern the accuracy of attribution (i.e. "explaining"), not the accuracy of NMEC. We agree with McGee that the former is unknowable, and we agree with his analysis of that issue. However, accuracy in NMEC, the intended focus of this Issue, is a different beast altogether: it can be known and measured.

For example, the "True Value" (i.e. Ground Truth) measurement of how much of a building's energy went toward heating in a given period can be measured (not modeled): every year, Gil Masters at Stanford has his building science students do this in a small mobile home with a single resistance heater, and their grade depends on the accuracy of their analysis.

Likewise, when we use CalTRACK to identify heating and cooling loads in a baseline period (in order to normalize them for weather), we could measure the accuracy of the resulting model against ground truth during that same baseline period: did the model produce the same heating load as was measured? One simple example of such a test would be to run CalTRACK on Gil's mobile home and confirm that the heating coefficients fitted over the baseline period imply a heating load similar to what the students measured. But there are other ways as well; I proposed some in #122.
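The baseline ground-truth check described above amounts to a single comparison. A minimal sketch, where the function name, the 5% tolerance, and the sample numbers are all hypothetical:

```python
def heating_load_fractional_error(hdd_coefficient, total_hdd, metered_heating_kwh):
    """Fractional error of the model-implied heating load vs a sub-metered value."""
    modeled_kwh = hdd_coefficient * total_hdd   # heating load implied by the fit
    return (modeled_kwh - metered_heating_kwh) / metered_heating_kwh

# e.g. a fitted coefficient of 1.2 kWh/HDD over 2,000 HDD,
# vs 2,500 kWh of heating recorded by a sub-meter in the same period
err = heating_load_fractional_error(1.2, 2000, 2500)
print(abs(err) <= 0.05)  # True: within a hypothetical 5% tolerance
```

Unlike out-of-sample testing, this compares the model to a measurement taken during the very period the model was trained on, so it tests load identification rather than predictive stability.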

McGee wrote above that "we decided to use out-of-sample testing to gauge the uncertainty associated with estimating counterfactual usage". As described in #123, this works only for buildings with predictable energy use: if the energy use patterns during the period used to build the model differ from the energy use patterns in the out-of-sample period, all bets are off. We need to develop other methods to assess and improve the accuracy of the CalTRACK model against the ground truth during that same baseline period.

@arstein

arstein commented Aug 30, 2019

We concur with Steve about the importance of model accuracy to NMEC. See a related discussion here: https://gridium.com/evo-measurement-verification-accuracy/

Projects
CalTRACK Future Improvements Roadmap
Existing Daily Methods Improvements