Conversation

@paul-buerkner
Contributor

I have started writing tests for the diagnostics module, specifically the diagnostic metrics. I have a few questions for @LarsKue before I continue adding more tests.

  • Is the way I wrote the tests generally in line with how bayesflow tests should look?
  • When I run these tests, each test runs twice, for batch_size 2 and 3. I don't want or need batch_size to play any role here; it's not something I use in these tests, nor do I know where it's coming from. How can I stop the tests from being run multiple times?
  • Next on my list would be the diagnostic plots. Is there anything I should test about the plots beyond the fact that they run through?

@paul-buerkner paul-buerkner added the draft Draft Pull Request, Work in Progress label Feb 21, 2025
@codecov-commenter

codecov-commenter commented Feb 21, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

| Files with missing lines      | Coverage Δ                    |
|-------------------------------|-------------------------------|
| bayesflow/utils/dict_utils.py | 60.62% <100.00%> (+39.37%) ⬆️ |

... and 14 files with indirect coverage changes

Contributor

@LarsKue LarsKue left a comment

Looks good. The tests and fixtures are generally idiomatic and in line with the rest of bayesflow. I would avoid testing against constant shapes, unless this is explicitly what you want. For example, should the shape of `ce["values"]` be `(3,)` or identical to some other shape, e.g. `y.shape[1]`?
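A minimal sketch of the suggestion above (the names `ce` and `y` are taken from the comment; the data and shapes here are made up for illustration):

```python
import numpy as np

# Hypothetical stand-ins: `y` is an array of targets with one column per
# variable, `ce` is a metric result holding one value per variable.
y = np.random.rand(16, 3)                    # (num_samples, num_variables)
ce = {"values": np.random.rand(y.shape[1])}  # (num_variables,)

# Brittle: hard-codes the number of variables used in the fixture.
assert ce["values"].shape == (3,)

# Robust: ties the expected shape to the input, so the test still holds
# if the fixture's number of variables changes.
assert ce["values"].shape == (y.shape[1],)
```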

As for disabling the parametrization over the batch size, I think we can turn autouse off on that fixture. If this causes other problems, I will have to make some other changes first, so it should be fine if these tests are run twice for now.
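A sketch of what turning autouse off could look like (the fixture names and parameter values here are assumptions, not the actual bayesflow conftest):

```python
import pytest

# With autouse=True and params, every test in scope runs once per value of
# the parameter, even tests that never reference the fixture:
@pytest.fixture(autouse=True, params=[2, 3])
def batch_size_autouse(request):
    return request.param

# With autouse off, only tests that request `batch_size` by name are
# parametrized; tests that ignore it run exactly once:
@pytest.fixture(params=[2, 3])
def batch_size(request):
    return request.param
```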

I do not think there is much we can/should do about testing plots. You could verify that the functions return the correct object. But in general, as long as the function runs, we can expect the plotting libraries to do the heavy lifting here.
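For example, a return-type check for a plotting function might look like this (`recovery_plot` is a made-up stand-in, not a bayesflow API):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, safe on CI
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical diagnostic plot standing in for a bayesflow plotting function.
def recovery_plot(estimates, targets):
    fig, ax = plt.subplots()
    ax.scatter(targets, estimates)
    ax.set_xlabel("ground truth")
    ax.set_ylabel("estimate")
    return fig

rng = np.random.default_rng(0)
targets = rng.random(50)
fig = recovery_plot(targets + 0.1 * rng.random(50), targets)

# Beyond "it runs": check the returned object and that an axes was drawn.
assert isinstance(fig, plt.Figure)
assert len(fig.axes) == 1
plt.close(fig)
```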

@paul-buerkner
Contributor Author

Great, thank you!

@paul-buerkner
Contributor Author

I have now added tests for all metrics and most plots. @Kucharssim will add some tests for the model comparison plots soon in another PR. I will merge this PR as soon as all checks are passing.

@paul-buerkner paul-buerkner changed the title Draft: Add initial tests for diagnostic metrics Add initial tests for diagnostic metrics Feb 21, 2025
@paul-buerkner paul-buerkner merged commit 95d98c6 into dev Feb 21, 2025
13 checks passed
@paul-buerkner paul-buerkner deleted the test-diagnostics branch February 21, 2025 17:23