Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New sample calculation with proof #986

Closed
wants to merge 13 commits into from
Closed

New sample calculation with proof #986

wants to merge 13 commits into from

Conversation

rajeee
Copy link
Contributor

@rajeee rajeee commented Sep 9, 2022

Pull Request Description

(Proof refers to the assertions in the code that compares new samples to old samples)

Adds python version of run_sampling.rb
Simplifies sample calculation (for modest speed boost) in run_sampling.rb.

Checklist

Not all may apply:

  • Tests (and test files) have been updated
  • Documentation has been updated
  • Changelog has been updated
  • openstudio tasks.rb update_measures has been run
  • No unexpected regression test changes on CI (checked comparison artifacts)

@@ -219,8 +219,10 @@ def _sample_probability_distribution(prob_dist, num_samples)
end

return_samples = {}
return_samples_v2 = {}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New version.
Once the new version is accepted to be okay, we can remove the old version. Keeping both for now to make it possible to compare.

@@ -268,14 +281,33 @@ def _sample_probability_distribution(prob_dist, num_samples)
end
end

if not (return_samples == return_samples_v2)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For most cases, the new version is equal to old version; When they are not, making sure that the difference can be justified.

@shorowit
Copy link
Contributor

I sampled project_national w/ 350k datapoints on Windows Subsystem for Linux.

  • run_sampling.rb (v1): 4m6s
  • run_sampling.rb (v2): 3m32s
  • run_sampling.py: 16m28s

All three produced CSV files of roughly the same size, so that's a good sign.

Logs:
run_sampling_rb_v1.txt
run_sampling_rb_v2.txt
run_sampling_py.txt

@rajeee
Copy link
Contributor Author

rajeee commented Sep 10, 2022

@shorowit Thanks!
I am really surprised with the runtimes though. In my benchmarking, python version was running much faster! I ran it directly on windows though.

@rajeee
Copy link
Contributor Author

rajeee commented Sep 10, 2022

Here is what I got on Mac and on Eagle.
My M1 Mac
Python: 33.4s
Ruby: 1m53s

Eagle Login Node:
Python: 1m29s
Ruby: 3m18s

run_sampling.py.m1mac.txt
run_sampling.py.eagle_login_node.txt
run_sampling.rb.m1mac.txt
run_sampling.rb.eagle_login_node.txt

WSL seems to be doing something really bad for python!

@shorowit
Copy link
Contributor

Here are my results on Windows.

  • run_sampling.rb (v1): 8m12s
  • run_sampling.rb (v2): 7m36s
  • run_sampling.py: 1m1s

Very impressive! Definitely helps to make use of multiple cpu cores (my machine has 16). Not sure what's going on w/ my WSL results above.

@aspeake
Copy link
Contributor

aspeake commented Sep 13, 2022

@rajeee @shorowit thanks guys. Sounds like this could be really helpful to @vtnate and myself. We will run some checks and most likely use it for our custom sampler in the buildstock-projections library. I'll keep you updated on the progress of the down-selection feature.

@rajeee
Copy link
Contributor Author

rajeee commented Sep 22, 2022

There is a certain limitation of current sampling approach.

Consider this simplified case of just two TSVs.
Bedrooms.TSV

Option=1 Option=2 Option=3 Option=4 Option=5 sampling_probability
0.2 0.2 0.2 0.2 0.2 1

Ceiling Fan.TSV

Dependency=Bedrooms Option=None Option=Standard Efficiency Option=Premium Efficiency sampling_probability
1 0.4 0.4 0.2 0.2
2 0.4 0.4 0.2 0.2
3 0.4 0.4 0.2 0.2
4 0.4 0.4 0.2 0.2
5 0.4 0.4 0.2 0.2

For n=10, one would expect two samples each for bedrooms = 1 through 5.
One would also expect, 4 samples for Ceiling Fan = None, 4 samples for Ceiling Fan = Standard Efficiency and 2 samples for Ceiling Fan=Premium Efficiency.

However, run_sampling.rb on these TSVs would generate these samples on buildstock.csv:

Building Bedrooms Ceiling Fan
4 1 None
9 1 Standard Efficiency
2 2 None
5 2 Standard Efficiency
3 3 None
8 3 Standard Efficiency
1 4 None
10 4 Standard Efficiency
6 5 Standard Efficiency
7 5 None

There are 5 samples each for None and Standard Efficiency and no samples for Premium Efficiency.

This limitation is easily explained by the way current sampling works. It first samples Bedrooms that has no dependency, so it assigns two samples for each number of bedrooms. Then for each of these two samples, it tries to distribute the Fan Efficiency. Since there are 3 fan efficiencies, but only two samples, it picks the two option with the highest probability. While this ensures that the distribution follows the TSV specification as close as possible for this particular bedroom, on doing so for all bedrooms, we miss out on the fact that the probability for Option=Premium, small as it may be, should add up and we should eventually get some samples for it.

Although this issue is demonstrated for a case where the number of samples is very small, this phenomena can occur even for the case of very large n when sampling within a narrow slice created by a large number of dependencies and their options.

I am thinking of a slight modification to the sampling approach where we accumulate the probability of discarded options, and use that to ensure that low probability options also get some samples.

@shorowit
Copy link
Contributor

In my opinion, this is a feature, not a limitation/bug. Indeed the sampling algorithm was intentionally developed this way.

In your example, if you did end up end up with 2 Premium Efficiency ceiling fans, they would be randomly assigned to buildings. If you were to then look at ceiling fan energy use as a function of number of bedrooms, the trend would have random spikes based on this random assignment and a user would likely be confused. Or, a user might grab the two 2-bedroom building models and use them for an analysis assuming that these models reflect the highest probability building characteristics, but they would not.

The bottom line is that there is no good sampling result when you have too few samples for a given distribution. And it's important that ResStock maintain enough samples given its distributions. In the absence of having enough samples, your proposed approach may be what some users want and the current approach may be what other users want. The current algorithm reflects our consensus from many years ago as to which use cases are more important.

@afontani may have thoughts here, I think that he and I had some discussions about all of this many years ago.

@rajeee
Copy link
Contributor Author

rajeee commented Sep 22, 2022

@shorowit
Thanks for the commentary.

I agree it's trade-off between two things - making global distribution more accurate vs making local distribution more accurate.

My feelings was that we have more confidence on global distribution (what % of homes nationally have premium fan) than on local distribution (of homes that have 2 bedrooms, what % have premium fan), so it might be more suitable to prioritize that. But, yeah, my presumption maybe wrong.

Also, I might be raising sort of a false flag. I did notice some parameters had slightly (very tiny in fact; couple of samples in a 100K total samples), different global distribution than what one would expect based on sampling_probability calculation. I was investigating what could have caused that. I am working on a test to validate the generated buildstock.csv against the TSVs and was investigating this as a part of that. I will see if we even have any substantial discrepancy for any parameters; maybe we don't.

Anyway, this was an interesting exercise on understanding the behavior of sampling.

@shorowit
Copy link
Contributor

@rajeee I completely agree with your assessment. And to be clear, I'm not arguing that the current behavior is correct just because it's always been this way. I look forward to the results of your investigations and any improvements you can make! Developing a test around all of this is a definite good first step.

@afontani
Copy link
Contributor

@rajeee: Thanks for thinking through this. I also look forward to any results or comparisons at larger sample sizes to see if there are any issues.

I agree with @shorowit about this not actually being a bug but a feature. I am not positive, but I think this behavior helps the algorithm converge to the distributions at $O(n^{-1})$ vs $O(n^{-1/2})$. I think the point of quota-based sampling is to sample the center of the distribution so that the estimate of the mean quantity of interest has a reasonable estimate at smaller sample sizes. Ultimately, I would want any modification to the algorithm to preserve the $O(n^{-1})$ convergence rate.

I think we will have issues with small sample sizes regardless of the algorithm. In this case, I would prefer the algorithm as it is currently written. But for larger sample sizes, like for a national run, 500k samples, I am open to modifying if there are use cases that properly motivate the change. I do worry about how people slice our data from a large run and if the user is getting a more representative set of samples or a set of edge cases.

@rajeee
Copy link
Contributor Author

rajeee commented Sep 29, 2022

@shorowit @afontani Thanks for the comments. As per your feedback, I decided to keep the current approach since it has better accuracy in regions where we have low samples.

I continued the development of the python sampler here: NREL/buildstockbatch#308
In addition to sampling, it now also has a verifier that can verify buildstock.csv for correctness. It also calculates "sampling_errors" (for lack of a better word) for each TSV. You can learn more in that PR link.

I verified the samples generated by all three versions (ruby_v1, ruby_v2 and python) to be correct.

I did 5 sampling runs at n=350K between ruby_v1, ruby_v2 and python sampler using different random seed each time and compared the sampling_error (errors for both dependency groups, and errors for options). There is no qualitative difference between them. This is for 350K samples. So, we have up-to 10% option distribution error and 18% group distribution error for some of the large TSVs.

image

ZOOMED
image

image

@rajeee
Copy link
Contributor Author

rajeee commented Oct 3, 2022

Error distribution convergence:

image

image

Convergence of the average errors:
image

Log-log plot asymptomatically converges to slope of -1.
image

@afontani afontani closed this Oct 19, 2022
@shorowit
Copy link
Contributor

Moved to NREL/buildstockbatch#308

joseph-robertson added a commit that referenced this pull request Feb 3, 2023
…a8d95

fda505a8d95 Update measure argument description.
5ea2342916f Update changelog and docs.
a63142a44f7 Add reporting test for advanced output variables.
f574c52e842 Add reporting test and update template osw.
18193edc885 Add new optional argument for output variables.
602378b40ca Merge pull request #1001 from NREL/eri_reporting
ce9a1a6b90e Latest results.
1a9b90987d5 Increase number of output decimal places and a little code cleanup.
c67d0f7a780 Update xml
59bab55d0bf EC bugfix.
66d21ad412d Refactor/simplify ReportSimulationOutput for ERI purposes. No longer creates a separate *_ERI.csv.
0457f4db9f9 Revert "Add HPXML has_fuel helper method."
365c8e2fcdd Add HPXML has_fuel helper method.
c8b56ceb83a Fix case.
67e93b0c152 Clarify documentation for interior surfaces (or flat roofs) regarding Azimuth/Orientation defaulting.
a7f6341d493 Merge pull request #999 from NREL/shared_recirc_temperature
d5fc0daed83 Drop ABCDE
4bb47a2d00a Updated xml
5c3c071f772 Adds support for shared hot water recirculation systems controlled by temperature, per ANSI/RESNET/ICC 301-2019 Addendum C.
916014c91c8 Merge pull request #996 from NREL/battery-docs
48503939ee9 Merge pull request #995 from NREL/fossil_fuel_emissions_factors
d0a9528e078 Minor tweak.
9c4727d50e7 Adds some explanation for how batteries are controlled and other misc cleanup.
6d9edb4f618 Latest results.
e0369eb925d Rename "CO2" to "CO2e". Updated cambium data to be CO2e and combined (combustion + pre-combustion).
6c9cd855816 Updated default fossil fuel emissions factors to include pre-combustion (e.g., methane leakage for natural gas).
58bccc6420c Merge pull request #992 from NREL/minitest_helper2
9b7ce3f3c02 Merge branch 'master' of https://github.com/NREL/OpenStudio-HPXML into minitest_helper2
6a451ecdb45 Revert to original config
713fb844c91 Merge pull request #993 from NREL/goodbye_beta
bfa2365a843 Remove the "Beta" label from the BuildResidentialHPXML measure. [ci skip]
f5729c759b8 Aha, this test file was not loading minitest_helper.rb.
77ca82d4152 Try excluding build_residential_hpxml_test.rb
854d4ccb090 First, reproduce issue on CI.
d174ace1df1 Merge pull request #987 from NREL/hvac_iq_limits
9d40f9e25e6 Update Changelog.md [ci skip]
7a1644d70e9 Merge branch 'master' of https://github.com/NREL/OpenStudio-HPXML into hvac_iq_limits
06bd81964cc Merge pull request #986 from NREL/docs_on_ci
afe5d64f823 Merge branch 'master' of https://github.com/NREL/OpenStudio-HPXML into docs_on_ci
e5de16e128d Merge pull request #988 from NREL/minitest_helper
66bf8248d02 Possible fix to obtain code coverage on CI?
e6ee2daa0cb Try reverting minitest_helper.rb change.
33b4d180cf9 Adds more stringent limits for `AirflowDefectRatio` and `ChargeDefectRatio` (-0.9 to 9, which now allows values from 1/10th to 10x the design value).
f1c13fb6871 Re-enable CI steps
913f947f19a Fix error?
3c21bf713a8 Install pip
3afc9dcf5c0 Install sphinx
f8e4ca4a4c8 Merge branch 'master' of https://github.com/NREL/OpenStudio-HPXML into docs_on_ci
f0844a93edf Merge pull request #978 from NREL/leap-year-handling
040604b1106 Add newline
44224c6a06d Build docs on CI, store as artifacts.
852a36d084f Update workflow test file.
bc8ccc2868a Update error test.
219cbdfee88 Update error message.
69319016c04 Update changelog and docs.
f782deade4f Merge branch 'master' into leap-year-handling
39686d9de69 Merge pull request #980 from NREL/report-sim-output-setup-fix
a80a1f25183 Return if model not retrieved.
6dd4463a51d Latest results.
335370e3c0b Update defaults header test.
bbd8c3bcc1a Add test for throwing error.
24936969f9e Throw error if leap year with TMY.
14519b064f5 Clean up translator measure.
c7a8ad6a2ef Change to non leap year in test xml file.
6a1ea0ad3ba Update README.md

git-subtree-dir: resources/hpxml-measures
git-subtree-split: fda505a8d95267aaee4f8ae832f338f321981d00
joseph-robertson added a commit that referenced this pull request Feb 3, 2023
…f619e

75de13f619e Simpler fix to negative peak/design values.
51dbd82f910 Merge pull request #1007 from NREL/negative_peak_flow_rate
f5ee3eee6a6 Prevent possibility of trying to set, e.g., negative peak flow rate.
0a561c1369e Merge pull request #1006 from NREL/door_bugfix
b5440086603 Bugfix for cases where door R-value w/o air film goes negative. (Can occur for, e.g., CZ 1 where IECC door U-factor is 1.2.)
602378b40ca Merge pull request #1001 from NREL/eri_reporting
ce9a1a6b90e Latest results.
1a9b90987d5 Increase number of output decimal places and a little code cleanup.
c67d0f7a780 Update xml
59bab55d0bf EC bugfix.
66d21ad412d Refactor/simplify ReportSimulationOutput for ERI purposes. No longer creates a separate *_ERI.csv.
0457f4db9f9 Revert "Add HPXML has_fuel helper method."
365c8e2fcdd Add HPXML has_fuel helper method.
c8b56ceb83a Fix case.
67e93b0c152 Clarify documentation for interior surfaces (or flat roofs) regarding Azimuth/Orientation defaulting.
a7f6341d493 Merge pull request #999 from NREL/shared_recirc_temperature
d5fc0daed83 Drop ABCDE
4bb47a2d00a Updated xml
5c3c071f772 Adds support for shared hot water recirculation systems controlled by temperature, per ANSI/RESNET/ICC 301-2019 Addendum C.
916014c91c8 Merge pull request #996 from NREL/battery-docs
48503939ee9 Merge pull request #995 from NREL/fossil_fuel_emissions_factors
d0a9528e078 Minor tweak.
9c4727d50e7 Adds some explanation for how batteries are controlled and other misc cleanup.
6d9edb4f618 Latest results.
e0369eb925d Rename "CO2" to "CO2e". Updated cambium data to be CO2e and combined (combustion + pre-combustion).
6c9cd855816 Updated default fossil fuel emissions factors to include pre-combustion (e.g., methane leakage for natural gas).
58bccc6420c Merge pull request #992 from NREL/minitest_helper2
9b7ce3f3c02 Merge branch 'master' of https://github.com/NREL/OpenStudio-HPXML into minitest_helper2
6a451ecdb45 Revert to original config
713fb844c91 Merge pull request #993 from NREL/goodbye_beta
bfa2365a843 Remove the "Beta" label from the BuildResidentialHPXML measure. [ci skip]
f5729c759b8 Aha, this test file was not loading minitest_helper.rb.
77ca82d4152 Try excluding build_residential_hpxml_test.rb
854d4ccb090 First, reproduce issue on CI.
d174ace1df1 Merge pull request #987 from NREL/hvac_iq_limits
9d40f9e25e6 Update Changelog.md [ci skip]
7a1644d70e9 Merge branch 'master' of https://github.com/NREL/OpenStudio-HPXML into hvac_iq_limits
06bd81964cc Merge pull request #986 from NREL/docs_on_ci
afe5d64f823 Merge branch 'master' of https://github.com/NREL/OpenStudio-HPXML into docs_on_ci
e5de16e128d Merge pull request #988 from NREL/minitest_helper
66bf8248d02 Possible fix to obtain code coverage on CI?
e6ee2daa0cb Try reverting minitest_helper.rb change.
33b4d180cf9 Adds more stringent limits for `AirflowDefectRatio` and `ChargeDefectRatio` (-0.9 to 9, which now allows values from 1/10th to 10x the design value).
f1c13fb6871 Re-enable CI steps
913f947f19a Fix error?
3c21bf713a8 Install pip
3afc9dcf5c0 Install sphinx
f8e4ca4a4c8 Merge branch 'master' of https://github.com/NREL/OpenStudio-HPXML into docs_on_ci
f0844a93edf Merge pull request #978 from NREL/leap-year-handling
040604b1106 Add newline
44224c6a06d Build docs on CI, store as artifacts.
852a36d084f Update workflow test file.
bc8ccc2868a Update error test.
219cbdfee88 Update error message.
69319016c04 Update changelog and docs.
f782deade4f Merge branch 'master' into leap-year-handling
39686d9de69 Merge pull request #980 from NREL/report-sim-output-setup-fix
a80a1f25183 Return if model not retrieved.
6dd4463a51d Latest results.
335370e3c0b Update defaults header test.
bbd8c3bcc1a Add test for throwing error.
24936969f9e Throw error if leap year with TMY.
14519b064f5 Clean up translator measure.
c7a8ad6a2ef Change to non leap year in test xml file.
6a1ea0ad3ba Update README.md

git-subtree-dir: resources/hpxml-measures
git-subtree-split: 75de13f619e9b4b98f7f33889caf8de406f6ea07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants