New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expanded balance point search range #72
Comments
Testing: For a test dataset calculate savings using Caltrack as well as arbitrary expanded search ranges |
Explore balance point estimates by climate zone and building type? Document the diversity of results. |
To achieve accurate regressions HEA has found it is necessary to adjust balance point temperatures for each home independently across a broad temperature range, even in mild climates like the SF Bay Area.
|
That's helpful. Are you able to provide test results that show differences in results if you constrain your balance points? For example, see some of the work that was done in CalTRACK 1.0 where we established a methodology for floating balance points for daily methods. |
@margaretsheridan I wonder if you could say more about your proposal? Are you suggesting that there might be variable ranges of balance points based on the particular climate zone or building type? So we would advise limiting balance point ranges in temperate climates, but allowing more extreme values in more extreme climates? |
Energy Trust has also observed a wide range of balance points creating site-level models with good fits in various billing analyses over the years. We support increasing the range of balance point temperatures tested, but I'm not sure that it should be a user-defined. I think we should use the same ranges of balance points to be tested for every site, to be consistent. I don't have a specific recommendation for ranges of balance point temperatures to test, but we have often tested a 30 degree range of balance point temps for HDD and CDD. For example: HDD 40-70 and CDD 60-90. The HDD and CDD balance point ranges should be allowed to overlap. You may want to test both HDD60/CDD60 and HDD65/CDD70. So long as you do not allow the CDD balance point to be less than HDD balance point, I think it is fine and maybe even desirable to have some overlap between the ranges of balance point values tested. We have a slight preference for a one-degree increment in balance point temperatures tested, but it probably doesn't make that big of a difference. |
I have been looking at balance points in reference to some work I have been doing here at SMUD and I thought we may see some variation based on a number of characteristics such as climate zone, state, building type, and building age. Looking at heating and cooling slopes and balance points could help to identify outliers in cohort groups. For example, looking at large samples of homes within California building standard vintages can provide quite a bit of insight into the efficiency and use of individual homes. |
Proposed test methodology:
Acceptance criteria:
|
FWIW, I use a multi-pass grid search so that a wider search increment can be used initially. |
Great data! |
Good question, @steevschmidt . Most models didn't change with an expanded search grid and I think it's just a matter of the dense plot masking the points that did change. I went back and pulled out a 100 buildings where the optimal balance point and total savings did change and as you can see below the changes are very modest. It appears that when we change the balance point search grid, the savings breakdown (heating vs cooling vs baseload) does change quite a bit, however, the aggregate savings are relatively unchanged (usually only a 2-3% change). I'll dig in deeper when I get an opportunity. |
One comment about this: Since we will eventually, if not now, be concerned with the timing of savings, e.g. the load shape of savings, and since you wrote that the savings breakdown (heating vs cooling vs baseload) does change quite a bit, getting the appropriate balance or change point may be much more important than just for total savings, especially for interval data.
Bill Koran, P.E., CMVP
SBW Consulting, INC<http://www.sbwconsulting.com/>
503-974-9741
From: hshaban [mailto:notifications@github.com]
Sent: Thursday, March 1, 2018 11:52 AM
To: CalTRACK-2/caltrack
Cc: Bill Koran; Comment
Subject: Re: [CalTRACK-2/caltrack] Expanded balance point search range (#72)
Good question, @steevschmidt<https://github.com/steevschmidt> . Most models didn't change with an expanded search grid and I think it's just a matter of the dense plot masking the points that did change. I went back and pulled out a 100 buildings where the optimal balance point and total savings did change and as you can see below the changes are very modest. It appears that when we change the balance point search grid, the savings breakdown (heating vs cooling vs baseload) does change quite a bit, however, the aggregate savings are relatively unchanged (usually only a 2-3% change). I'll dig in deeper when I get an opportunity.
[image]<https://user-images.githubusercontent.com/13769526/36866252-a5dc1be2-1d46-11e8-847e-39576d7a4425.png>
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub<#72 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/Ai08ipbKJXlYS7LRLuCoPq3CYNRQHoJOks5taFFCgaJpZM4R1exa>.
|
Hassan wrote:
If you have an inaccurate disaggregation of heating & cooling loads, and significant changes in weather between the base and reporting years, I believe your savings calculations will be inaccurate. |
@bkoran @steevschmidt I agree that we'd ideally get an accurate load disaggregation. The results show that for some buildings, the balance points and load disaggregation change when expanding the search range, but we don't have the ground truth. The question raised by this issue is what search range we should explore when selecting models. Does using a wide search range give us a better chance at finding the best balance point or is a constrained one more appropriate? Your thoughts? |
I think wider is better, but for balance points outside of expected ranges more work can (& should?) be done. For example, I mentioned pool heating during today's call. There are homes with two different heating balance points throughout the year: one very high for pool heating during an extended summer swim period, and another during the "home heating only" parts of the year. Depending on whether your goal is "fit" or "accurate load disaggregation", one may chose different approaches. |
I also would like to emphasize the quality of the temperature data going into the model - particularly for hourly estimation. I performed correlations of hourly weather station data by proximity to fill in missing data and to help identify outliers. This cleaning activity greatly improved my adjusted R squared values. I am now using a third-party product to ensure the quality of the weather data. This should be foundational and should be part of a standardized process. |
I agree with you @steevschmidt . A wider range makes sense and we could take anomalous balance points into account with building qualification (those buildings may need custom models separate from the default Caltrack path) |
@margaretsheridan Definitely useful input. If you don't mind, could you add some thoughts around weather data cleaning and how to treat missing values in Issue #65? Or any relevant literature/documentation would also help. |
Also agree with margaretsheridan. We've seen cases where nearby NWS sites don't track each other; there are anomalous deviations for a period of time of months or more. (I've also seen where temperatures deviate dependent upon which way the wind is blowing, which is not anomalous.) This is in addition to the dropouts and bad data. Data cleaning, in general, is an important issue. |
Several related comments:
These models will almost inevitably be used in ways you don’t expect. For example, I have seen similar models used in industrial applications. In those cases, the constrained search range could be far from the truth. More below about how constrained searches can be an issue.
There is little merit, in my view, to using a narrow search range, nor to use a wide tolerance. A search for a single change point is pretty fast:
Checking using ECAM, with a search resolution of 0.05 ºF:
365 (daily) data points: less than 0.5 seconds
5110 (hourly) data points: less than 1.0 seconds
(I’m not using a decent resolution timer to get a better number.)
Of course, finding 2 change points greatly increases the iterations.
846 (hourly) data points: less than 3 seconds with a resolution of 0.18 ºF; add less than 1 second to get the resolution <0.02 ºF.
(Yes, I know these resolutions are much, much finer than we actually know the temperatures at the site.)
This is in a spreadsheet with interpreted code and worksheet functions! With compiled code and better programmers than me, on a server, maybe you can probably beat that by two orders of magnitude?
These searches use the full range of the data, merely requiring at least 3 points on each side of a change point.
Also, there is no reason to use degree-days with daily or hourly data. Just use temperature directly. There is no loss of information since degree-days are already at the daily level. Why use an intermediate calculation? It is extra, less transparent, and doesn’t plot as easily. (In most climates, we have seen very little if any value to even using degree-days at the monthly level instead of monthly average temperature, but theoretically degree-days should be better, and likely is in CA climates.)
Another point is that you need to use a wide search range because the temperatures found are not necessarily heating and cooling base temperatures: For example, in a residence with a heat pump with supplemental electric heat, there will be 2 heating slopes and hence 2 heating coefficients: One coefficient at moderately cool temperatures for heat pump heating, and a different coefficient at colder temperatures when the supplemental electric heat kicks in. Similar things occur on the cooling side with economizers and mechanical cooling. Yet another change point can occur due to equipment running out of capacity. This can occur at quite cold or quite warm temperatures.
A final, less important point: The best fit balance point temperature or change point is not always the maximum likelihood change point. This doesn’t seem to be an issue with higher-frequency data, but it can rarely be an issue with monthly billing data, and can result in one of those change points that doesn’t seem to make sense. I’ve never implemented it, but a maximum likelihood change point might avoid a few change points that fall outside of expectations.
Bill Koran, P.E., CMVP
Sr. Energy analytics engineer
SBW Consulting, Inc.
bill.koran@sbwconsulting.com<mailto:bill.koran@sbwconsulting.com>
503-974-9741
3945 parker rd.
west linn, or 97068
www.SBWConsulting.com<http://www.sbwconsulting.com/>
From: hshaban [mailto:notifications@github.com]
Sent: Thursday, March 1, 2018 1:29 PM
To: CalTRACK-2/caltrack
Cc: Bill Koran; Mention
Subject: Re: [CalTRACK-2/caltrack] Expanded balance point search range (#72)
@bkoran<https://github.com/bkoran> @steevschmidt<https://github.com/steevschmidt> I agree that we'd ideally get an accurate load disaggregation. The results show that for some buildings, the balance points and load disaggregation change when expanding the search range, but we don't have the ground truth. The question raised by this issue is what search range we should explore when selecting models. Does using a wide search range give us a better chance at finding the best balance point or is a constrained one more appropriate? Your thoughts?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub<#72 (comment)>, or mute the thread<https://github.com/notifications/unsubscribe-auth/Ai08isB3GAn7JL83ugJeCnUUuzVvX9NPks5taGgXgaJpZM4R1exa>.
|
Since I think these methods are planned to also be used for groups of homes, attached is a daily model of ~200 homes, showing 3 slopes. This is from Figure 16 in |
@bkoran I've been thinking a lot about the "elbow" issue. There's no doubt that it's observable. There are three questions that seem important to answer about it. First, what does the poor model fit do to savings estimates? If the model errors are evenly distributed over the course of a year, would we be concerned about its effects? Second, are there quantitative criteria for determining when an "elbow" would be valuable to search for? Should it be default? Should it be configurable? Should it be based on probability of occurrence? Finally, how do we test for this? The image you showed above is a kW function, not a kWh function. Is that relevant? It strikes me that when we get to hourly modeling that this is going to be a much more relevant conversation. Don't give up on this. We are benefitting from your insights on this issue. We just need a way to generalize it and test it so that it can be fit into the methods specification in a clear, broadly applicable fashion. |
I can't answer most of your questions very well. I covered savings up above; I do think it can have an effect. It could be that it's effects could be minimized by using a different fit than least squares; not sure about that. |
A bit more--this is more specific to hourly models, although it can affect daily models: The first 2 charts below are EnergyPlus models of a DOE prototype building. The first is a daily model, the second hourly. The third chart is an hourly model of a real building. |
When I was forecasting hourly system load for systems operations at SMUD years ago, I used multiple techniques including a regression with a third order polynomial for the temperature-load relationship. This would track fairly well with piecewise linear load-temp relationship. The greater simplicity of modeling would need to be examined relative to any loss in accuracy. |
This update has been integrated in CalTRACK 2. Closing this issue |
Default grid search parameters (heating: 55-65 F, cooling: 66-75 F) may be restrictive in more extreme climates.
We propose the following:
Allowing non-overlapping user-defined grid search ranges.
Allowing the balance point search to be conducted in temperature increments up to 2 deg F.
The text was updated successfully, but these errors were encountered: