Specify maximum baseline and reporting period lengths #68
Comments
HEA has found that 18 months of daily data is optimal for heating and cooling regressions. This amount provides more accuracy than just 12 months, but does not overwhelm recent trends (a risk with longer periods).
Interesting. It's a little counterintuitive to select 18 months (given the likelihood of over-fitting to whichever season gets counted twice). Do you have test results that show why 18 months yields better results?
Is any thought given to known or predicted variations in operational characteristics? For example, if you know that the building operation/schedule was recently changed, you probably wouldn't want to include the data from before the change. (You'd need to do a preemptive non-routine adjustment.) One would prefer to have data on the operational modes that one expects to see in the reporting period.
Energy Trust's view is that data should be selected in increments of 12 months so that a seasonal bias is not introduced when fitting the model, as pointed out by McGee. We also have a preference for limiting the baseline and reporting periods to the 12 months of data closest to the treatment period to limit the impact of factors unrelated to the treatment, as noted by Hassan. You may get better fit statistics using 24 months of data, but it may not represent the pre-retrofit conditions as closely. A longer time series may contain a blend of current and past physical and operational conditions at the site.
Proposed test methodology:
Acceptance criteria:
Adding some more clarification to the choice of a 12 month baseline period based on stakeholder input:
This update has been integrated in CalTRACK 2. Closing this issue.
Minimum baseline and reporting period lengths are defined in CalTRACK's data sufficiency requirements, but using long baseline/reporting periods results in significantly different model fits than constrained periods (due to naturally occurring savings, non-routine events, etc.).
We propose setting a limit on the data included in the baseline and reporting periods: 12 months for daily data and 24 months for monthly data.
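To make the proposed limit concrete, here is a minimal sketch of how a baseline could be trimmed to the most recent 12 or 24 months before the intervention. It is not CalTRACK's reference implementation: the names `MAX_BASELINE_MONTHS`, `months_before`, and `trim_baseline` are hypothetical, readings are assumed to be `(date, value)` pairs, and the window is assumed to end (exclusively) at the intervention start date.

```python
from datetime import date, timedelta

# Assumed limits, taken from the proposal above: months of data allowed per frequency.
MAX_BASELINE_MONTHS = {"daily": 12, "monthly": 24}


def months_before(day: date, months: int) -> date:
    """Return the date `months` calendar months before `day`, clamping the day of month."""
    total = day.year * 12 + (day.month - 1) - months
    year, month = divmod(total, 12)
    month += 1
    # Clamp to the last valid day of the target month (e.g. Mar 31 -> Feb 28).
    first_of_next = date(year + (month == 12), month % 12 + 1, 1)
    last_day = (first_of_next - timedelta(days=1)).day
    return date(year, month, min(day.day, last_day))


def trim_baseline(readings, intervention_start, frequency):
    """Keep only readings inside the maximum-length window ending at intervention_start."""
    window_start = months_before(intervention_start, MAX_BASELINE_MONTHS[frequency])
    return [(d, v) for d, v in readings if window_start <= d < intervention_start]
```

For example, with daily readings starting in January 2016 and an intervention on 2018-01-01, `trim_baseline(readings, date(2018, 1, 1), "daily")` would keep only the readings dated 2017-01-01 through 2017-12-31. A reporting-period limit could be sketched the same way by counting months forward from the end of the intervention.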