Refactor evaluation loop for empty frames #858
Merged
Collaborator
Author
TensorBoard is no longer a required module in PyTorch Lightning, so it needs to be explicitly installed. I am updating the tests, since this was caught as a side effect and has nothing to do with TensorBoard itself.
Collaborator
Author
@henrykironde this is ready. Sorry for the delay with all the docstring commits; my Sphinx version was outdated.
henrykironde
reviewed
Jan 7, 2025
Contributor
henrykironde
left a comment
@bw4sz, this looks good! One thought: could we handle the try blocks more specifically? They feel a bit broad as they are.
692bcc7 to
68a69d7
Collaborator
Author
I rebased. I noticed that the batch tests were somehow commented out; I'll make a separate PR for that, since it is unrelated.
henrykironde
approved these changes
Jan 8, 2025
This PR aims to standardize, document, and better test two situations: 1) the model makes no predictions, but there is ground truth to evaluate, and 2) there is no ground truth in validation, but the model makes predictions. I introduce a new evaluation metric, 'empty frame accuracy', that will be useful for many users, and I added tests and docs. Along the way, I found that the test called validation_step wasn't fully testing validation_step, but rather trainer.validate, which is related but not the same. I added a proper validation step test, which required silencing the loggers, because PyTorch Lightning is overeager in throwing errors if self.validation_step is run outside of trainer.validate, something that is only needed in testing situations.
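To make the idea of 'empty frame accuracy' concrete, here is a minimal sketch in plain Python. It assumes the metric is the fraction of images with no ground truth boxes for which the model also makes no predictions; that definition, the function name `empty_frame_accuracy`, and the dict-of-boxes layout are all illustrative assumptions, not the code merged in this PR.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def empty_frame_accuracy(
    ground_truth: Dict[str, List[Box]],
    predictions: Dict[str, List[Box]],
) -> Optional[float]:
    """Fraction of ground-truth-empty images where the model predicted nothing.

    Hypothetical helper for illustration only. Returns None when there are
    no empty frames, because the metric is undefined in that case.
    """
    # An "empty frame" here is an image whose ground truth has no boxes.
    empty_frames = [name for name, boxes in ground_truth.items() if not boxes]
    if not empty_frames:
        return None
    # A frame counts as correct if the image has no predictions at all
    # (missing from the dict, or present with an empty list).
    correct = sum(1 for name in empty_frames if not predictions.get(name))
    return correct / len(empty_frames)


ground_truth = {
    "a.tif": [(10, 10, 50, 50)],  # annotated, model predicts nothing (situation 1)
    "b.tif": [],                  # empty frame, model predicts nothing
    "c.tif": [],                  # empty frame, model predicts a box (situation 2)
}
predictions = {
    "c.tif": [(5, 5, 20, 20)],
}

score = empty_frame_accuracy(ground_truth, predictions)
print(score)  # 0.5
```

Image `a.tif` does not affect this metric; under this sketch it would only show up as missed ground truth in the usual recall calculation.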