Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support sample_weight in buildstock.csv #1056

Merged
merged 6 commits into from
Mar 21, 2023
Merged

Conversation

joseph-robertson
Copy link
Contributor

@joseph-robertson joseph-robertson commented Mar 15, 2023

Pull Request Description

Adds support for a sample_weight column in the precomputed buildstock.csv. By default, BuildExistingModel writes build_existing_model.sample_weight based on calculating n_buildings_represented / n_datapoints (this calculation is done in the workflow generator and then passed into the BuildExistingModel measure). Now, if sample_weight already exists in the buildstock.csv, it will write this value instead of the calculated one.

This PR also:

  • removes registering of weight - it doesn't appear to be used anywhere
  • removes the number_of_buildings_represented argument from BuildExistingModel - it appears it was only being used in the context of a PAT project, which we no longer support

Checklist

Not all may apply:

  • Tests (and test files) have been updated
  • Documentation has been updated
  • Changelog has been updated
  • openstudio tasks.rb update_measures has been run
  • No unexpected regression test changes on CI (checked comparison artifacts)

@joseph-robertson joseph-robertson added this to the ResStock v3.1.0 milestone Mar 15, 2023
@joseph-robertson joseph-robertson self-assigned this Mar 15, 2023
@joseph-robertson joseph-robertson marked this pull request as ready for review March 15, 2023 22:44
@joseph-robertson
Copy link
Contributor Author

Should the new allowable column, for consistency, be "Sample Weight" instead of "sample_weight"?

Copy link
Contributor

@aspeake aspeake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just had a couple of questions

Comment on lines 606 to 610
register_value(runner, 'sample_weight', sample_weight.to_s)
end

if !sample_weight.nil?
register_value(runner, 'weight', sample_weight.to_s)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are these registered as different variables? Should it just be one or the other gets stored? I do see there is a "weight" and a "sample_weight" in results.json, but then just "sample_weight" is output in results_up00.csv, can you help clarify this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering this, too. But erred on the side of not breaking the current behavior. @nmerket do you know why weight gets registered here? Was this a bsb thing at some point?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it gets registered so that it will show up in the outputs in results.csv. Then we can use it in downstream postprocessing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But doesn't it get removed in favor of using sample_weight?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would appear that it does.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any recommendations what should be done here in regard to weight? Nothing? Or should I try removing it and see what happens?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little lost on where sample_weight gets used downstream, because it seems like sample_weight never got set here before, only weight

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sample_weight is a measure argument for BuildExistingModel. As long as this argument gets assigned a value, it will "register" as build_existing_model.sample_weight (i.e., you don't need the register_value(runner, 'sample_weight', ...) call). This PR "re-registers" (overwrites) the sample_weight value as it exists in the buildstock.csv.

workflow_generator:
type: residential_hpxml
args:
build_existing_model:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if we also add a sample_weight field here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe you can already do that. It will override the default calculation here: https://github.com/NREL/buildstockbatch/blob/develop/buildstockbatch/workflow_generator/residential_hpxml.py#L266-L272.

Orrrr maybe the yml schema doesn't permit a sample_weight argument. I'll look into this. You'd have a use case for making sample_weight available here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wondering what happens if you have a sample_weight here and one in the buildstock, but looking closer at that workflow generator, I guess you can't explicitly set one.

I guess my question still stands though, one is assigned weight and the other sample_weight?

Copy link
Contributor Author

@joseph-robertson joseph-robertson Mar 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently the yml won't validate if you include sample_weight under the build_existing_model block:

yamale.yamale_error.YamaleError: Error validating data
        build_existing_model.sample_weight: Unexpected element

We could update the schema to allow sample_weight. In the case that it's provided, it would just override the n_buildings_represented / n_datapoints calculation. The stuff downstream that this PR addresses would then apply accordingly...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to support explicitly setting sample_weight from the yml file?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, I don't think that is necessary, bsb still passes sample_weight just not explicitly from the user, so I'm wondering what happens when sample_weight gets passed as an argument and there is a sample_weight column in the buildstock. Seems like the buildstock one overwrites it, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's correct. The workflow generator will continue to calculate sample_weight. But this PR will allow the presence of a sample_weight column in buildstock.csv to overwrite that calculated value.

Copy link
Contributor

@aspeake aspeake left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest changes make sense to me, seems like the weight variable is deprecated, so assuming nothing surprising pops up in CI, this looks good.

@joseph-robertson joseph-robertson merged commit 746eea1 into develop Mar 21, 2023
@joseph-robertson joseph-robertson deleted the sample-weight branch March 21, 2023 17:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants