CalTRACK Issue: Vote on/agree to uniform testing approach #129
Comments
Another way of going about this would be to submit a test methodology with a particular issue and reach consensus on the testing protocol prior to working on the issue. There are likely going to be different testing requirements and different thresholds for different issues, and probably different data requirements as well. If we try to set this up beforehand, it's likely that we'll spend all of our time creating exceptions to the rules we've laid down for ourselves.
[Pushing forward, per our working group discussion on 10/23/19.] Restating the goal: Agree on a uniform testing approach for "empirical tests to evaluate methodological choices and assumptions" (1.3.1.3) related specifically to modeling approaches. The testing approach should include:
In this context I'd like to define modeling approaches as the subset of CalTRACK methods used to identify and predict the portion of a building's energy use that fluctuates due to weather. By my count, 12 of 17 open CalTRACK issues fall within this scope, while five do not (#129 and #122 relate to testing; #132 and #135 are procedural; #125 relates to AEU). Modeling approaches therefore appear to be the top priority for the working group. Furthermore, 10 of the 12 open modeling issues are associated with improved identification (i.e., quantification) of weather-related loads during the baseline period, while only two (#123 and #127) focus on prediction of energy use during reporting periods. Given this concentration of issues, and taking McGee's comment into account ("...likely going to be different testing requirements and different thresholds for different issues..."), I propose we initially sharpen the focus of this issue on testing the identification of weather-related loads. In other words, testing how well CalTRACK methods identify the building loads to be normalized during a baseline period, and how each proposed enhancement affects that ability. Benefits of this narrowed focus:
FIRST QUESTION for Caroline (original author) and the team: Is this an acceptable narrowing of scope? (Please click on the happy face icon at the top right of this comment to log either a "thumbs up" or "thumbs down" on this question.) If this narrowing is supported, I will next propose one such testing approach and describe how it could be applied to our first fully approved issue (#120).
Agree that a standard test data set could be useful, especially for issue #122 in testing the impacts of version updates on results, but not so sure that standardized test methods will be that useful for testing all the different types of issues that may come up. It seems the kind of testing that makes sense will depend on what the issue is.
Closing stale issue in preparation for new working group |
Prerequisites
Article reference number in CalTRACK documentation (optional): 1.3.1.2 and 1.3.1.3 indicate this is in scope, but it's not currently directly addressed in the methods
Description
I'd like to propose that this group agree to a uniform approach we would use to evaluate changes to modeling approaches and/or new modeling approaches. This is motivated by a desire to have a standard understanding of how versions/updates improve CalTRACK, and to take the burden of determining a testing approach off of group members who want to propose a change.
The testing approach the group settles on should include a process for testing and appropriate metrics (note: I'm talking about modeling here; you would probably need different metrics for other sections of the methods). The metrics should be applicable across models with data at different time resolutions (e.g., hourly and daily), and usable in out-of-sample testing. We'd also need to discuss what counterfactual you'd compare against (current CalTRACK? An older version?).
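As one possible starting point (not part of the original proposal), metrics along the lines of CVRMSE and NMBE, as used in ASHRAE Guideline 14, are normalized by mean observed usage and so can be computed the same way on hourly or daily series. A minimal sketch, assuming observed and predicted values come from an out-of-sample (e.g., reporting-period) comparison:

```python
import numpy as np

def cvrmse(observed, predicted):
    """Coefficient of variation of RMSE, as a percent of mean observed use.

    Lower is better; 0 means a perfect fit on this sample.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return 100.0 * rmse / np.mean(observed)

def nmbe(observed, predicted):
    """Normalized mean bias error (percent).

    Positive values indicate the model under-predicts on average,
    negative values indicate over-prediction.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * np.sum(observed - predicted) / (observed.size * np.mean(observed))
```

Because both metrics are unitless percentages, a candidate method change and the current-CalTRACK counterfactual could be scored on the same held-out data and compared directly, regardless of whether the underlying models are hourly or daily.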
Ideally this approach would come with a standard data set (or a few), but that is not necessarily required.
Proposed test methodology
Acceptance Criteria
A supermajority or consensus vote of group members would choose a testing methodology.