CalTRACK Issue: Vote on/agree to uniform testing approach #129
Comments
Another way of going about this would be to submit a test methodology with a particular issue and reach consensus on the testing protocol prior to working on the issue. There are likely going to be different testing requirements and different thresholds for different issues, and probably different data requirements as well. If we try to set this up beforehand, it's likely that we'll spend all of our time creating exceptions to the rules we've laid down for ourselves.
[Pushing forward, per our working group discussion on 10/23/19.] Restating the goal: Agree on a uniform testing approach for "empirical tests to evaluate methodological choices and assumptions" (1.3.1.3) related specifically to modeling approaches. The testing approach should include:
In this context I'd like to define modeling approaches as the subset of CalTRACK methods used to identify and predict the portion of a building's energy use that fluctuates due to weather. By my count, 12 of 17 open CalTRACK issues fall within this scope, while five do not (#129 and #122 relate to testing; #132 and #135 are procedural; #125 relates to AEU). Modeling approaches therefore appear to be the top priority for the working group. Furthermore, 10 of the 12 open modeling issues are associated with improved identification (i.e., quantification) of weather-related loads during the baseline period, while only two (#123 and #127) focus on prediction of energy use during reporting periods. Given this concentration of issues, and taking McGee's comment into account ("...likely going to be different testing requirements and different thresholds for different issues..."), I propose we initially sharpen the focus of this issue on testing the identification of weather-related loads. In other words, testing how well CalTRACK methods identify the building loads to be normalized during a baseline period, and how each proposed enhancement affects that ability. Benefits of this narrowed focus:
FIRST QUESTION for Caroline (original author) and the team: Is this an acceptable narrowing of scope? (Please click on the happy face icon at the top right of this comment to log either a "thumbs up" or "thumbs down" on this question.) If this narrowing is supported, I will next propose one such testing approach and describe how it could be applied to our first fully approved issue (#120).
Agree that a standard test data set could be useful, especially for issue #122 in testing the impacts of version updates on results, but not so sure that standardized test methods will be that useful for testing all the different types of issues that may come up. It seems the kind of testing that makes sense will depend on what the issue is.
Closing stale issue in preparation for new working group |
Prerequisites
Article reference number in CalTRACK documentation (optional): 1.3.1.2 and 1.3.1.3 indicate this is in scope, but it's not currently directly addressed in the methods
Description
I'd like to propose that this group agree to a uniform approach we would use to evaluate changes to modeling approaches and/or new modeling approaches. This is motivated by a desire to have a standard understanding of how versions/updates improve CalTRACK, and to take the burden of determining a testing approach off of group members who want to propose a change.
The testing approach the group settles on should include a process for testing and appropriate metrics (note: I'm talking about modeling here; you would probably need different metrics for other sections of the methods). The metrics should be applicable across models with data at different time resolutions (e.g., hourly and daily), and usable in out-of-sample testing. We'd also need to discuss what counterfactual you'd compare against (current CalTRACK? An older version?).
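As one possible starting point (not part of the original proposal), metrics along the lines of CVRMSE and NMBE, as used in ASHRAE Guideline 14, are normalized by mean observed usage and so can be computed the same way on hourly or daily series. A minimal sketch, assuming observed and predicted values come from an out-of-sample (e.g., reporting-period) comparison:

```python
import numpy as np

def cvrmse(observed, predicted):
    """Coefficient of variation of RMSE, as a percent of mean observed use.

    Lower is better; 0 means a perfect fit on this sample.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return 100.0 * rmse / np.mean(observed)

def nmbe(observed, predicted):
    """Normalized mean bias error (percent).

    Positive values indicate the model under-predicts on average,
    negative values indicate over-prediction.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.sum(observed - predicted) / (observed.size * np.mean(observed))
```

Because both metrics are unitless percentages, a candidate method change and the current-CalTRACK counterfactual could be scored on the same held-out data and compared directly, regardless of whether the underlying models are hourly or daily.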
Ideally this approach would come with a standard data set (or a few), but that is not necessarily required.
Proposed test methodology
Acceptance Criteria
A supermajority or consensus vote of group members would choose a testing methodology.