cmdstan cleanup - issues 1123, 1124, 1129, 1132, 1133 #1134

Merged
WardBrian merged 75 commits into develop from issue/1133-log_prob on Dec 15, 2022

@mitzimorris mitzimorris commented Nov 25, 2022

Submission Checklist

  • Run tests: ./runCmdStanTests.py src/test
  • Declare copyright holder and open-source license: see below

Summary:

  • Add method laplace_sample
  • Add argument jacobian to method optimize; default is FALSE (current behavior)
  • For method log_prob, add functionality to allow sets of constrained parameters as inputs in the form of a Stan CSV file.
  • Improve maintainability of CmdStan by splitting out helper functions into separate file command_helper.hpp.
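
To illustrate what the jacobian argument changes for optimize, here is a minimal sketch using a single positive parameter sigma with an Exponential(1) density, searched over the unconstrained value u = log(sigma). This is not CmdStan code: the toy density, the function names, and the grid search are all assumptions made for illustration.

```cpp
#include <cmath>

// Toy target (assumed for illustration): sigma ~ Exponential(1),
// so log p(sigma) = -sigma for sigma > 0.
// The search runs over the unconstrained value u = log(sigma).
double log_density_unconstrained(double u, bool jacobian) {
  double sigma = std::exp(u);
  double lp = -sigma;
  // The log Jacobian of sigma = exp(u) is log|d sigma / d u| = u.
  if (jacobian) lp += u;
  return lp;
}

// Crude grid search for the maximizing u; enough for a demonstration.
double argmax_u(bool jacobian, double lo = -10.0, double hi = 5.0,
                int n = 15001) {
  double best_u = lo;
  double best_lp = log_density_unconstrained(lo, jacobian);
  for (int i = 1; i < n; ++i) {
    double u = lo + (hi - lo) * i / (n - 1);
    double lp = log_density_unconstrained(u, jacobian);
    if (lp > best_lp) {
      best_lp = lp;
      best_u = u;
    }
  }
  return best_u;
}
```

Without the adjustment the search drifts to the lower edge of the grid (sigma near 0, the mode of the constrained density); with it, the maximum sits at u = 0, that is sigma = 1, the mode of the density over u.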

Apologies for an omnibus PR. This PR replaces #1128, which unified the treatment of argument jacobian for methods laplace_sample and optimize and also added file command_helper.hpp. The functions in command_helper.hpp encapsulate the checking and marshalling of inputs for each method, with the goal of making the command function easier to maintain.

The logic for the generate_quantities method was broken out into a few helper functions, one of which gets the constrained parameters from the Stan CSV file. This is exactly the functionality we need to send multiple sets of constrained parameters to method log_prob, so we refactored the logic for the log_prob method into a series of helper functions.

The function command still runs to 800+ non-comment lines. Some code bloat could be fixed by using the helper function get_arg_val throughout and possibly refactoring the logic for the sample and variational methods. Is it worth the effort here? Unclear.

Intended Effect:

Add new features to CmdStan and improve readability and maintainability of code base.

How to Verify:

Unit tests

Side Effects:

Documentation:

Documentation will be added to the CmdStan User's Guide in a separate PR on the docs repo.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Columbia University

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@mitzimorris

> parameters are not being unconstrained when they are read in, which is obviously a big problem. I assumed that must have been happening inside the services layer, but it seems like it should be happening here instead.

yes, big problem and thanks for catching this one!

I fixed this and renamed a bunch of the command_helper functions. I feel like it could or should be cleaner, but given that standalone_generated quantities does the unconstraining for you, and log_prob accepts constrained parameters as either JSON, Rdump (single draw) or CSV (multiple draws), and laplace and generated quantities want Eigen::MatrixXd inputs while log_prob is simpler, I don't know how to make this cleaner, or if it's worth it.
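
For readers unfamiliar with the term, "unconstraining" maps a parameter value on its declared scale to the unbounded scale on which the log density is evaluated. A minimal sketch of the two most common scalar cases follows. The function names are hypothetical, and rejecting values that sit exactly on a bound is a simplification of this sketch; in practice the compiled model performs this mapping for every parameter.

```cpp
#include <cmath>
#include <stdexcept>

// Unconstrain a value declared with a lower bound: u = log(x - lb).
double unconstrain_lower(double x, double lb) {
  if (!(x > lb))
    throw std::domain_error("value must be greater than the lower bound");
  return std::log(x - lb);
}

// Unconstrain a value declared on (lb, ub):
// u = logit((x - lb) / (ub - lb)).
double unconstrain_interval(double x, double lb, double ub) {
  if (!(x > lb && x < ub))
    throw std::domain_error("value must lie strictly inside (lb, ub)");
  double p = (x - lb) / (ub - lb);
  return std::log(p / (1.0 - p));
}
```

Skipping this step means a constrained value such as sigma = 2.0 would be treated as if it were the unconstrained value u = 2.0, so the log density would be evaluated at the wrong point.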

@mitzimorris

ready for another round if you are!

@WardBrian

I can go through the changes first thing tomorrow. Excited to get this merged soon

@WardBrian WardBrian left a comment

This looks really great. One more nagging set of CSV validity issues, and a couple questions on top of that

src/cmdstan/command_helper.hpp
Comment on lines 467 to 468
msg << "CSV file is incomplete, expecting at least "
    << (cparam_names.size() + 1) << " values." << std::endl;

Confirmed the segfault I previously had is resolved and this now looks safe.

Two (minor) things:

  • Instead of "values", should we say "columns"? For example, a column header could be missing while the value is still there.
  • This message is a little odd specifically in the names != values case, since I do have at least 3 values; I just also have a fourth value.
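
One way to cover both points is to report the two counts and say which direction the mismatch goes. The sketch below is only a suggestion: check_column_count is a hypothetical name and the wording is not the message that ended up in the code.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Returns an empty string when the row matches the header; otherwise a
// message that says whether the row has too few or too many columns.
std::string check_column_count(std::size_t num_names,
                               std::size_t num_values) {
  if (num_names == num_values) return "";
  std::stringstream msg;
  msg << "CSV file is malformed, header has " << num_names
      << " columns but row has " << num_values << " columns ("
      << (num_values < num_names ? "too few" : "too many") << ").";
  return msg.str();
}
```

With three names and four values this reports "too many" instead of claiming that at least three values were expected.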

@mitzimorris

OK, will change message

@WardBrian

Not sure if you are still tweaking in the background but based on what you've currently pushed this seems pretty much good to go to me

@mitzimorris

> Not sure if you are still tweaking in the background but based on what you've currently pushed this seems pretty much good to go to me

still thinking about whether it's possible to do further cleanup of the args config - another PR for that.

I will make the one change and then it's ready to merge. thanks so much for the careful review!

@WardBrian WardBrian left a comment

LGTM!

Thanks for such a nice set of refactors and features.

@WardBrian WardBrian merged commit 528a39a into develop Dec 15, 2022
@WardBrian WardBrian deleted the issue/1133-log_prob branch December 15, 2022 20:46
@mitzimorris

more CSV tests for log_prob and laplace_sample

a. file only contains header comments
b. file contains config as header comment, column names row, but no data.
