Skip to content

Fix tests and src_info for 2.29#530

Merged
WardBrian merged 6 commits intodevelopfrom
test-issues-29
Jan 31, 2022
Merged

Fix tests and src_info for 2.29#530
WardBrian merged 6 commits intodevelopfrom
test-issues-29

Conversation

@WardBrian
Copy link
Copy Markdown
Member

Submission Checklist

  • Run unit tests
  • Declare copyright holder and open-source license: see below

Summary

This fixes a few issues discovered while testing the 2.29 release candidate:

  1. The fixed_param changes are not in 2.29, which we were assuming they would be
  2. src_info() would fail if the model had any warnings, since do_command would combine the stderr and stdout streams. This bug exists even without 2.29, it just wasn't uncovered until the new warnings were added to stanc
  3. I patched around atexit in the tests to handle the intermitten issues with logging and pytest
  4. The text of the "diagnose" command changed (it outputs more digits of permission)

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company):
Simons Foundation

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@WardBrian WardBrian requested a review from mitzimorris January 31, 2022 16:50
@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Jan 31, 2022

Codecov Report

Merging #530 (0a2539f) into develop (f945d50) will decrease coverage by 0.00%.
The diff coverage is n/a.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #530      +/-   ##
===========================================
- Coverage    79.13%   79.12%   -0.01%     
===========================================
  Files           45       45              
  Lines         9441     9438       -3     
===========================================
- Hits          7471     7468       -3     
  Misses        1970     1970              
Impacted Files Coverage Δ
cmdstanpy/cmdstanpy/model.py 87.44% <0.00%> (-0.03%) ⬇️
...runner/work/cmdstanpy/cmdstanpy/cmdstanpy/model.py 87.44% <0.00%> (-0.03%) ⬇️
a/cmdstanpy/cmdstanpy/cmdstanpy/model.py 88.32% <0.00%> (-0.03%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f945d50...0a2539f. Read the comment docs.

@WardBrian
Copy link
Copy Markdown
Member Author

@mitzimorris This should be ready for review now if you're able to take a look

@WardBrian
Copy link
Copy Markdown
Member Author

Comment thread test/test_sample.py
'due to this limit will result in slow exploration.',
'For optimal performance, increase this limit.',
]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hate how brittle this test is - all we really care is that if mentions "maximum treedepth limit"

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we have a bunch of tests which are pretty reliant on exact wording. In this case I just made the minimum change required, but it would be worth doing a pass on all of these to clean them up. I'm not always sure what the intent of the test is just from reading it, though

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you definitely did the right thing - minimal change, leave this mess for later.

we need models that will always fail in known ways and tests for existing diagnostics to check that the problem is correctly diagnosed - which requires more understanding of the problem than I had when I wrote this test back in the beginning.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it’s sort of a question of purpose, but I could see an argument toward simply testing that the diagnose command returns a (non empty) string at all, not trying to test specific outputs. Presumably cmdstan is testing it actually diagnoses as it should, we just need to test that functionality is wrapped properly

Copy link
Copy Markdown
Member

@mitzimorris mitzimorris Jan 31, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

simply testing that the diagnose command returns a (non empty) string at all, not trying to test specific outputs.

CmdStan diagnose utility was changed to always return a string - https://github.com/stan-dev/cmdstan/blob/166712876de05c8548af05bb7acbedd0410735e0/src/cmdstan/diagnose.cpp#L214-L215

assuming that CmdStanPy will return this as well? if so, check that string is non-empty and doesn't match "no problems detected"

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant even simpler than that: call diagnose on any model (for example Bernoulli, or this one, or a few different ones) and only check that the command didn’t fail and a string was returned.

It returning a correct string is above the wrappers pay grade, in my opinion, and leads to tests being sensitive to wording etc.

It’s similar to the discussion of whether we should be testing accuracy of fits in tests, which is where we’d get occasional failures in the VI tests etc. We could instead just test that the results have the shape we want. If we want to test and verify that we’re correctly reading in values, we can do that from a known CSV file with fixed values

Now, this methodology would have downsides: we might not catch a bug in cmdstan if one slipped by the upstream tests, or if a change isn’t communicated like the default ID changing in 2.28. But our tests would be a lot cleaner

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're absolutely correct. keep it simple!

Copy link
Copy Markdown
Member

@mitzimorris mitzimorris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great!

@WardBrian WardBrian merged commit 2b14038 into develop Jan 31, 2022
@WardBrian WardBrian deleted the test-issues-29 branch January 31, 2022 20:34
@WardBrian WardBrian mentioned this pull request Feb 1, 2022
This was referenced Feb 17, 2022
@WardBrian WardBrian mentioned this pull request Aug 15, 2022
2 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants