Conversation

@paul-buerkner
Contributor

I have started writing tests for the diagnostics module, specifically the diagnostic metrics. I have a few questions for @LarsKue before I continue adding more tests.

  • Is the way I wrote the tests generally in line with how bayesflow tests should look?
  • When I run these tests, each test runs twice, for batch_size 2 and 3. I don't want or need batch_size to play any role here; it's not something I use in these tests, nor do I know where it's coming from. How can I stop the tests from being run multiple times?
  • Next on my list would be the diagnostic plots. Is there anything I should test about the plots beyond the fact that they run through?

@paul-buerkner paul-buerkner added the draft Draft Pull Request, Work in Progress label Feb 21, 2025
@codecov-commenter

codecov-commenter commented Feb 21, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

| Files with missing lines      | Coverage Δ                    |
|-------------------------------|-------------------------------|
| bayesflow/utils/dict_utils.py | 60.62% <100.00%> (+39.37%) ⬆️ |

... and 14 files with indirect coverage changes

Contributor

@LarsKue LarsKue left a comment

Looks good. The tests and fixtures are generally idiomatic and in line with the rest of bayesflow. I would avoid testing against constant shapes, unless this is explicitly what you want. For example, should the shape of `ce["values"]` be `(3,)` or identical to some other shape, e.g. `y.shape[1]`?
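A minimal sketch of the suggestion above (the names `ce` and `y` are taken from the comment; the data and shapes here are made up for illustration):

```python
import numpy as np

# Hypothetical stand-ins: `y` is an array of targets with one column per
# variable, `ce` is a metric result holding one value per variable.
y = np.random.rand(16, 3)                    # (num_samples, num_variables)
ce = {"values": np.random.rand(y.shape[1])}  # (num_variables,)

# Brittle: hard-codes the number of variables used in the fixture.
assert ce["values"].shape == (3,)

# Robust: ties the expected shape to the input, so the test still holds
# if the fixture's number of variables changes.
assert ce["values"].shape == (y.shape[1],)
```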

As for disabling the parametrization over the batch size, I think we can turn autouse off on that fixture. If this causes other problems, I will have to make some other changes first, so it should be fine if these tests are run twice for now.
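A sketch of what turning autouse off could look like (the fixture names and parameter values here are assumptions, not the actual bayesflow conftest):

```python
import pytest

# With autouse=True and params, every test in scope runs once per value of
# the parameter, even tests that never reference the fixture:
@pytest.fixture(autouse=True, params=[2, 3])
def batch_size_autouse(request):
    return request.param

# With autouse off, only tests that request `batch_size` by name are
# parametrized; tests that ignore it run exactly once:
@pytest.fixture(params=[2, 3])
def batch_size(request):
    return request.param
```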

I do not think there is much we can/should do about testing plots. You could verify that the functions return the correct object. But in general, as long as the function runs, we can expect the plotting libraries to do the heavy lifting here.
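For example, a return-type check for a plotting function might look like this (`recovery_plot` is a made-up stand-in, not a bayesflow API):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, safe on CI
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical diagnostic plot standing in for a bayesflow plotting function.
def recovery_plot(estimates, targets):
    fig, ax = plt.subplots()
    ax.scatter(targets, estimates)
    ax.set_xlabel("ground truth")
    ax.set_ylabel("estimate")
    return fig

rng = np.random.default_rng(0)
targets = rng.random(50)
fig = recovery_plot(targets + 0.1 * rng.random(50), targets)

# Beyond "it runs": check the returned object and that an axes was drawn.
assert isinstance(fig, plt.Figure)
assert len(fig.axes) == 1
plt.close(fig)
```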

@paul-buerkner
Contributor Author

Great, thank you!

@paul-buerkner
Contributor Author

I have now added tests for all metrics and most plots. @Kucharssim will add some tests for the model comparison plots soon in another PR. I will merge this PR as soon as all checks are passing.

@paul-buerkner paul-buerkner changed the title Draft: Add initial tests for diagnostic metrics Add initial tests for diagnostic metrics Feb 21, 2025
@paul-buerkner paul-buerkner merged commit 95d98c6 into dev Feb 21, 2025
13 checks passed
@paul-buerkner paul-buerkner deleted the test-diagnostics branch February 21, 2025 17:23