
CalTRACK Issue: Calculate Degree Days using Degree Hours when available #120

steevschmidt opened this issue May 8, 2019 · 20 comments

steevschmidt commented May 8, 2019

Prerequisites

  • Are you opening this issue in the correct repository (caltrack/grid/seat)?
  • Did you perform a search of previous Github issues to check if this issue has been previously considered?

Article reference number in CalTRACK documentation (optional): Unsure; maybe 3.2.1?
(Also, see original issue #94, closed during CalTRACK transition from OEE to EM2.)

Description

Background: In regions with large temperature fluctuations during the day (like the SF Bay Area), residential buildings can have significant HVAC energy use on days when the average daily temperature suggests none should be required. For example, evening and nighttime temperatures can drop well below 60F on days with average temperatures of 65F or higher, and many homes will run their furnace on nights like these. For this reason it is more accurate to calculate HDDs on an hourly basis whenever possible, instead of from average daily temperatures: the hourly method captures real heating degree days that the daily average would otherwise miss.

Issue: Current CalTRACK methods calculate degree days (DDs) using the same periodicity as the energy data. In the case of natural gas, this means daily gas data is matched with daily average temperatures for PG&E homes in the SF Bay Area, whereas hourly electric data is matched with more accurate hourly temperatures. Because of the nonlinearity of DDs, it is more accurate to use hourly temperature data whenever it is available; otherwise DDs in regions with high daily temperature variation will be systematically underreported, resulting in poor-fit models.

Validation: A reasonable heating balance point temperature is 65F. The attached spreadsheet, Hourly vs Daily HDD.xlsx, shows hourly temperature variation for a day with an average temperature of 65F and demonstrates how using daily average temperatures underestimates degree days: the day's average temperature yields 0 HDD65s, whereas the hourly data yields 2.4 HDD65s, a meaningful difference even for one day. This type of model error is indicated in the chart below as type 1: an energy data point used in the regression analysis shifts along the temperature axis as the degree-day count is made more accurate.
[Chart: regression model illustrating the type 1 error described above]
The non-linear and non-negative nature of degree days makes this a systematic one-way error; i.e. data point shifts in one direction are not offset by a similar number of shifts in the other direction. And if this type of hourly weather variation occurs year-round, the impact on the regression results can be significant.
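To make the arithmetic concrete, here is a small illustrative sketch. The hourly temperatures below are made up for illustration and are not the values in the attached spreadsheet; the 65F balance point follows the example above:

```python
# Illustrative only: a day whose mean temperature is exactly 65F but whose
# overnight hours dip well below the 65F balance point.
hourly_temps = [55, 54, 53, 53, 54, 56, 59, 62, 66, 70, 73, 75,
                78, 79, 79, 78, 76, 71, 68, 65, 62, 60, 58, 56]  # degF, 24 hours

balance_point = 65.0

# Daily-average method: one truncation applied to the day's mean temperature.
daily_mean = sum(hourly_temps) / len(hourly_temps)          # 65.0F
hdd_daily = max(balance_point - daily_mean, 0.0)            # 0 HDD65

# Degree-hour method: truncate each hour, then divide the daily sum by 24.
hdd_hourly = sum(max(balance_point - t, 0.0) for t in hourly_temps) / 24.0

print(f"daily-average HDD65 = {hdd_daily:.2f}, degree-hour HDD65 = {hdd_hourly:.2f}")
# The daily-average method reports zero heating degree days; the degree-hour
# method reports a positive value, because truncation is applied before averaging.
```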

Requested CalTRACK change: For Daily methods, calculate DDs from hourly weather data (degree hours, DHs) whenever it is available. Implementation should be trivial, since the code and data for handling hourly temperature data already exist and are in use for hourly smart meter data.

Proposed test methodology

[I could use some help here...]

  1. Identify a test sample of appropriate homes in a region like the SF Bay Area where temperatures vary within each 24-hour period.
  2. Run current methods on these homes and record "before" results.
  3. Adjust the method to calculate DDs as suggested above.
  4. Rerun with the updated data, and compare "after" results.

Acceptance Criteria

Monthly bias should be reduced in most cases, and never increase.

@steevschmidt steevschmidt added the Phase 1: Pre-Draft (Ideas for CalTRACK updates proposed to the working group) label May 8, 2019
@jkoliner jkoliner added this to Phase 1: Pre-Draft in CalTRACK May 28, 2019
@jkoliner jkoliner moved this from Phase 1: Pre-Draft to Phase 2: Draft in CalTRACK Jun 12, 2019
@jkoliner jkoliner added Phase 2: Draft (Approved by working group as a deliverable. Basis of ongoing work.) and removed Phase 1: Pre-Draft (Ideas for CalTRACK updates proposed to the working group) labels Jun 12, 2019
@james-russell

We might want to quantify the impact on weather station data sufficiency as another acceptance criterion.

@steevschmidt
Author

Attempting to capture some discussion on this issue during the working group meeting:

  • Jon noted that the different counting of degree days demonstrated in the spreadsheet (0 using average daily temperature, 2.4 using the degree-hours method) would be taken into account by the model via selection of a different (lower) balance point temperature if the heater was actually used on the day in question.
  • Ethan agreed, but noted this would cause inaccurate results for other days throughout the year that did not demonstrate the same type of temperature swings.

Perhaps confirming Jon's point: as noted in #121 we have seen abnormally low balance point temperatures from CalTRACK; perhaps this issue is a cause. Hopefully test results will show improved CVRMSE values if a more accurate balance point temperature is selected.

@hshaban
Collaborator

hshaban commented Jul 23, 2019

Sharing some preliminary testing results and recommendations:

Dataset: Two years of AMI data from a sample of 650 residential buildings in northern California that participated in a Home Performance program.

Test procedure: Consumption data was aggregated to daily frequency. The default CalTRACK 2.0 daily methods were then applied to the dataset. A second set of CalTRACK models was fit to the same data, but with sections 3.3.4 and 3.3.5 modified to use the average degree-hours over the course of a day rather than degree days based on average daily temperature.
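
For reference, a minimal sketch of the modified degree-day calculation, assuming a pandas Series of hourly temperatures with a DatetimeIndex (illustrative only; this is not the eemeter code used in the test):

```python
import pandas as pd

def daily_degree_days_from_hourly(hourly_temps: pd.Series, balance_point: float,
                                  kind: str = "hdd") -> pd.Series:
    """Truncate hourly temperatures at the balance point, then average the
    resulting degree-hours over each day, per the modification to 3.3.4/3.3.5.
    `hourly_temps` is assumed to be hourly, datetime-indexed, in degF."""
    if kind == "hdd":
        degree_hours = (balance_point - hourly_temps).clip(lower=0)
    else:  # "cdd"
        degree_hours = (hourly_temps - balance_point).clip(lower=0)
    # Sum of degree-hours within each calendar day, divided by 24, gives that
    # day's degree days; these replace the daily-average-temperature values.
    return degree_hours.resample("D").sum() / 24.0
```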

Results:
There was a significant impact on model coefficients. When using degree hours instead of degree days:

  • Cooling balance points, cooling coefficients and heating coefficients increased for most buildings
  • Heating balance points increased for half the sample and decreased for the other half
  • The intercept term decreased overall, drastically in some cases
  • These combined changes meant that the models were apportioning more of the buildings’ energy consumption to the weather sensitive components rather than the base load component
  • There was no significant difference in the in-sample model CVRMSE (median of 30.1% using degree days vs. 30.5% using degree hours) and avoided energy use saw a 5% increase (average of 1265 kWh using degree days vs. 1329 kWh using degree hours)
  • The difference between degree days and degree hours was also assessed across most NOAA weather stations in the US. Overall, using average degree hours always results in a larger value of degree days. The percent difference in CDD is more striking in certain locations with mild summers and large daily fluctuations (West Coast, Rockies, northern Minnesota and Appalachia). HDD based on degree-hours is also larger, but the percent difference appears less significant.

Recommendations:

  • Degree hours capture weather fluctuations in areas with mild climates and large daily swings, and aren't expected to have a significant impact in most other areas.
  • Our recommendation is to accept this suggestion and modify sections 3.3.4 and 3.3.5 to use average degree hours for calculating CDD and HDD.

[Figure: Effect of using degree-hours on CalTRACK model coefficients]

[Figure: Effect of using degree-hours on CalTRACK model CVRMSE and avoided energy use]

[Figure: Ratio of CDD based on degree-hours to CDD based on daily average temperatures for different NOAA weather stations (red implies a larger difference)]

[Figure: Ratio of HDD based on degree-hours to HDD based on daily average temperatures for different NOAA weather stations (red implies a larger difference)]

@steevschmidt
Author

Wonderful analysis Hassan. Thank you for volunteering and doing such a thorough job.

You wrote:

...avoided energy use saw a 5% increase...

Wuhoo! How soon can this be deployed? :-)

Our recommendation is to accept this suggestion and modify sections 3.3.4 and 3.3.5 to use average degree hours for calculating CDD and HDD.

HEA would second this recommendation.

Jon, pending discussion of course, could we vote tomorrow whether to formally submit it to the Steering committee for review?

@jkoliner

We can discuss today at the meeting. @hshaban , did you investigate bias in this data set? If CVRMSE didn't change but avoided energy use predictions did, I am assuming the bias of the model changed.

@hshaban
Collaborator

hshaban commented Aug 8, 2019

@jkoliner we didn't get a chance to talk about this on the last call. Which bias would you be interested in looking at? The bias of the models themselves (with the training data) is zero because these are linear regressions. I didn't set this test up with out-of-sample data because we only had one year of baseline. Should we sample some test data in the baseline and check the bias there? I imagine the difference in bias between the two methods would be relatively small (going by the avoided energy use results, around 0.5% of baseline or so). Let me know what you're thinking.

@jkoliner

jkoliner commented Aug 8, 2019

@hshaban Yes, that makes sense to me. 10% holdback per home and predict that. I would calculate bias without normalizing, but I understand in the past it's been calculated with a normalized metric. Ultimately, I understand that it will be a small change, but when our overall calculations for avoided energy use change, I'd like to know whether we've gotten more biased (artificially generous) or less (correctly less stingy). @mcgeeyoung might dispute whether that's knowable, but let's take a stab at it.

@hshaban
Collaborator

hshaban commented Aug 14, 2019

Some out-of-sample results using the same dataset as above (with 20% holdback):

The absolute bias using degree hours averaged +0.528 kWh/day compared to +0.466 kWh/day using degree days - so the difference over a year would be about 22 kWh. It's a pretty small number relative to savings and baseline consumption - also it's based on one out-of-sample run, so might be dataset-specific and could benefit from cross-validation. The site-level comparison is also shown below - the absolute value of bias goes up for 53% of homes in this sample and goes down for the other 47%. Overall, doesn't seem like a strong enough trend to change any of the recommendations above.

[Figure: site-level comparison of absolute bias, degree hours vs. degree days]
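
For anyone reproducing this check, a rough sketch of the per-home holdback and un-normalized daily bias calculation discussed above (the `fit_model` callable is a hypothetical stand-in for the CalTRACK daily fit, and the 20% holdback fraction follows this comment):

```python
import numpy as np
import pandas as pd

def holdout_bias(daily_usage: pd.Series, daily_features: pd.DataFrame,
                 fit_model, holdback_frac: float = 0.2, seed: int = 0) -> float:
    """Hold back a random fraction of days for one home, fit on the rest, and
    report the mean daily residual (predicted - actual) on the held-out days,
    in kWh/day. `fit_model` is a hypothetical callable that returns an object
    with a .predict() method; it is not part of this test's actual code."""
    rng = np.random.default_rng(seed)
    test_idx = rng.choice(daily_usage.index,
                          size=int(len(daily_usage) * holdback_frac),
                          replace=False)
    train_idx = daily_usage.index.difference(test_idx)
    model = fit_model(daily_features.loc[train_idx], daily_usage.loc[train_idx])
    predicted = model.predict(daily_features.loc[test_idx])
    # Un-normalized bias per home, averaged over the held-out days.
    return float(np.mean(predicted - daily_usage.loc[test_idx]))
```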

@steevschmidt
Author

As requested in yesterday's working group meeting, proposed changes to existing methods (strikethrough for deletions, bold for additions):

3.3.2.5. 𝐻𝐷𝐷𝑝 is the average number of heating degree days per day in period 𝑝, which is a function of the selected balance point temperature, the average daily temperatures from the weather station matched to site 𝑖 during the period 𝑝, and the number of days in period 𝑝 with matched usage and weather data for site 𝑖.3.3.4

3.3.2.6. 𝐶𝐷𝐷𝑝 is the average number of cooling degree days per day in period 𝑝, which is a function of the selected balance point temperature, the average daily temperatures from the weather station matched to site 𝑖 during the period 𝑝, and the number of days in period 𝑝 with matched usage and weather data for site 𝑖.

3.3.4.1. If only daily average temperatures are available, CDD values are calculated as follows:
[no other changes to this section]

[this entire section is new]
3.3.4.2. When hourly average temperatures are available, CDD values are calculated as follows:
3.3.4.2.1. 𝐶𝐷𝐷𝑝=(1/N𝑑,𝑝)∗Σ(𝐶𝐷𝐷𝑑,𝑏), where
3.3.4.2.2. 𝐶𝐷𝐷𝑝 = Cooling degree days for period 𝑝.
3.3.4.2.3. N𝑑,𝑝 is the total number of days elapsed between the start time of the period 𝑝 and the end time of the period 𝑝.
3.3.4.2.4. Σ() = the sum of values in () over each day 𝑑 in period 𝑝.
3.3.4.2.5. 𝐶𝐷𝐷𝑑,𝑏=Σ(𝐶𝐷𝐻𝑏,h)/24, where
3.3.4.2.6. Σ() = the sum of values in () over each hour h in day 𝑑.
3.3.4.2.7. 𝐶𝐷𝐻𝑏,h=𝑚𝑎𝑥(𝑎𝑣𝑔(𝑇𝑑,h)−𝐶𝐷𝐷𝑏,0), where
3.3.4.2.8. 𝑚𝑎𝑥() = the maximum of the two values in ().
3.3.4.2.9. 𝑎𝑣𝑔(𝑇𝑑,h) = the average temperature for day 𝑑, hour h (see 2.3.4).
3.3.4.2.10. 𝐶𝐷𝐷𝑏 = the CDD balance point that provides best model fit.

[same changes for 3.3.5, with attention paid to reversed order of terms in 3.3.5.2.7]
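
A minimal sketch of the proposed 3.3.4.2 calculation, assuming each day's hourly average temperatures are available as a 24-element list (illustrative only; the 3.3.5 heating version simply reverses the subtraction):

```python
def cdd_for_period(hourly_temps_by_day, balance_point):
    """Proposed 3.3.4.2: CDD_p = (1 / N_d,p) * sum over days of CDD_d,b,
    where CDD_d,b = (sum over hours of CDH_b,h) / 24 and
    CDH_b,h = max(avg(T_d,h) - balance_point, 0).
    `hourly_temps_by_day` is assumed to be a list of 24-element lists of
    hourly average temperatures (degF), one list per day in the period."""
    n_days = len(hourly_temps_by_day)  # N_d,p, assuming full data coverage
    cdd_total = 0.0
    for day in hourly_temps_by_day:
        # Daily cooling degree days from the day's cooling degree hours.
        cdd_total += sum(max(t - balance_point, 0.0) for t in day) / 24.0
    return cdd_total / n_days

# For HDD (3.3.5), the truncation becomes max(balance_point - t, 0.0).
```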

@jkoliner

jkoliner commented Sep 5, 2019

Based on the acceptance criteria in the original issue post, did this pass?

@steevschmidt
Author

In my opinion the "significant impact" of this change on model parameters is due to more accurate counting of nonlinear degree days, which makes the model more accurately reflect ground truth.

But apparently that has nothing to do with CVRMSE, which I proposed as a metric based on discussions with Recurve. I'd prefer if there was some way to measure the accuracy of the new model parameters against some form of ground truth, but that's a different Issue (#122 & #129).

Perhaps Recurve can add their thoughts.

@hshaban
Collaborator

hshaban commented Sep 9, 2019

Sorry for the late response -- I'm out on leave.

Our conclusion from these tests is that the move from daily degree days to hourly degree days would only have an impact in certain geographic locations with large swings in temperature over the course of the day. The impact is relatively small when considering model fit metrics (bias and precision). There is a large impact on model coefficients, with the models appearing to better capture weather-sensitive energy consumption (this is an improvement - especially when evaluating weather-sensitive measures). Given that model fit appears to be largely unaffected, we support this change and @steevschmidt 's recommendations for updated Caltrack language.

@jkoliner

jkoliner commented Sep 11, 2019

Given that Recurve and HEA didn't directly address my concern, I will be a little more explicit. The acceptance criteria, defined in the original issue, stated that "Monthly bias should be reduced in most cases, and never increase." While I see that this is a small change, and that it has some favorable features (better fitting to weather-sensitive portion of the data), I would like to note that it failed our test criteria. According to out-of-sample testing:

  • CVRMSE increased slightly (worsened)
  • Bias increased slightly (worsened)
  • Payable savings increased due to this increased bias (the model became erroneously generous).

In my opinion, we should test this change on a different data set and assess whether the bias is consistently upward for all data sets, but I understand there is some momentum behind voting. Instead, I am going to surmise why the model metrics degraded, and we can think about how to deal with that later.

Truncated temperature metrics (heating degree xxx) are a type of feature engineering we use to relate temperature to heating or cooling use. Our grid search to determine a balance point operates under the assumption that at some temperature, the heating or cooling system kicks on. However, anyone who has worked with this data for any length of time knows that systems turn on at variable outdoor temperatures, affected by the lag to penetrate the building shell, humidity changing the heat content of the air, wind speeding up heat transfer, variable insolation, and user behaviors governing the desired indoor temperature (or even turning systems off!). In other words, a fixed balance point as an assumption, applied to a full year of data, creates an imperfect proxy for the "true" independent variable. The true predictive variable is really something like "amount of heat the system needs to shift to attain the desired indoor temperature at any given time".

For this test, we took an imperfect proxy and we counted it at a more granular level. By doing so, we had to put more faith in our feature engineering assumptions because we introduced dissimilarities between days where there were none before. If our feature was good enough to be predictive at a more granular level, we would see CVRMSE and bias improve. However, if we had a feature with some sort of intrinsic flaw, perhaps introduced through our assumptions, then counting it at a more granular level would worsen its relationship to heating and cooling system use, thereby worsening CVRMSE and bias. This is my operating understanding of what has happened here.
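
For context, a minimal sketch of the kind of balance-point grid search described above, assuming a plain least-squares fit of daily usage against HDD (illustrative; the actual CalTRACK search also covers CDD candidates and model-qualification rules):

```python
import numpy as np

def grid_search_balance_point(daily_mean_temps, daily_usage, candidates=range(55, 76)):
    """For each candidate balance point, build HDD from daily mean temperatures,
    fit usage = intercept + slope * HDD by ordinary least squares, and keep the
    candidate with the lowest sum of squared residuals."""
    best = None
    for bp in candidates:
        hdd = np.clip(bp - np.asarray(daily_mean_temps, dtype=float), 0.0, None)
        X = np.column_stack([np.ones_like(hdd), hdd])
        coef, residuals, *_ = np.linalg.lstsq(X, np.asarray(daily_usage, dtype=float),
                                              rcond=None)
        sse = float(residuals[0]) if residuals.size else 0.0
        if best is None or sse < best[1]:
            best = (bp, sse, coef)
    return best  # (balance_point, sse, [intercept, hdd_coefficient])
```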

@BruceMast

As purely a point of process, it strikes me as important to the integrity of the CalTRACK methods that we establish our acceptance criteria for changes upfront and then respect them when the test results come in. If I'm following this thread correctly, my impression is that the proposal is to substitute a post-hoc acceptance criterion of impact on payable savings in lieu of the criteria that were originally proposed. Can someone assure me that this isn't what's going on? Alternatively, can someone help me understand why it should be OK?

@jkoliner

@BruceMast I think Recurve's position is that this alleviates a known issue with fitting to the weather-sensitive portion of usage, and that is the post-hoc acceptance criterion they are applying. I believe the argument is that this is permissible because the (bad) changes to CVRMSE and bias are small.

@hshaban
Collaborator

hshaban commented Sep 12, 2019

I think this would make a good topic for the next CalTRACK meeting in general: how should acceptance criteria be worded and applied?

@steevschmidt
Author

HEA objects to this assumption:

Payable savings increased due to this increased bias (the model became erroneously generous).

We have noted in prior comments (some of which have been deleted) our assessment that current methods consistently underreport savings. We believe this enhancement -- which takes advantage of more accurate weather data -- reduces that underreporting.

As possible supporting evidence, it appears the magnitude of the change in savings reported by Hassan significantly exceeded the change in bias.

@jkoliner jkoliner added Voting (Issue has been opened for voting) and removed Phase 2: Draft (Approved by working group as a deliverable. Basis of ongoing work.) labels Sep 26, 2019
@jkoliner

The current recommendation is to propose migration to the degree-hour method, as specified by Steve above, to the Steering Committee. If a member of the working group disagrees, they should log their dissent in a comment below.

@steevschmidt
Author

As requested, Hassan and I summarized the issue for the Steering Committee in this 3-page document:
CalTRACK Working Group Recommendation #120.docx

@jkoliner jkoliner added Phase 4: Final Approval (Vote on steering committee approval) and removed Voting (Issue has been opened for voting) labels Oct 2, 2019
@philngo-recurve
Contributor

Closing stale issue in preparation for new working group
