Fixes for batch_size-related issues #799 and #755 #809
Conversation
Hello @yhn112! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:
Comment last updated at 2020-05-21 16:21:03 UTC
A few comments on these changes:
catalyst/core/state.py (outdated)

    self.loader_step: int = 0
    self.loader_samples: int = 0
    self.loader_len: int = 0
    self.loader_batch_size = 0

    self.batch_size: int = 0

    self.global_samples: int = 0
    self.global_step: int = 0
    self.global_epoch: int = 1
What do you think about a naming change?
For example:
global_epoch - epoch counter
global_batch_step - batch counter (not sure if we need it)
global_sample_step - samples counter (should be used for logging)
loader_batch_step - current loader batch step
loader_sample_step - current loader sample step
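The proposed renaming could be sketched as a minimal state container. This is a hypothetical illustration of the counters above (the class shape and method names are assumptions, not catalyst's actual `State`):

```python
class RunnerState:
    """Sketch of the proposed counter naming (illustrative, not the real class)."""

    def __init__(self):
        # global counters, preserved across loaders
        self.global_epoch: int = 1        # epoch counter
        self.global_batch_step: int = 0   # batch counter
        self.global_sample_step: int = 0  # samples counter (useful for logging)
        # per-loader counters, reset at the start of each loader
        self.loader_batch_step: int = 0
        self.loader_sample_step: int = 0

    def on_loader_start(self):
        self.loader_batch_step = 0
        self.loader_sample_step = 0

    def on_batch_start(self, batch_size: int):
        self.global_batch_step += 1
        self.loader_batch_step += 1
        self.global_sample_step += batch_size
        self.loader_sample_step += batch_size
```

With this split, the `loader_*` counters answer "where am I in the current loader?" while the `global_*` counters give monotonic x-axes for logging.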
I agree with any consistent naming here. =)
About global_batch_step: I would like to make it a selectable option for loggers, so you can log over samples or over batches. Also, IMHO it's better to have a consistent state API, both for future use and for users.
catalyst/core/runner.py (outdated)

    if isinstance(batch, list):
        self.state.batch_size = len(batch[0])
    else:
        self.state.batch_size = next(iter(batch.values())).shape[0]
Moreover, what do you think about the performance of this solution?
A Python `if` statement is really computation-heavy, but so are deep learning pipelines :)
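To put a rough number on the concern, here is a micro-benchmark sketch of the branching in the diff above. It uses plain lists instead of tensors to stay dependency-free, so the absolute timings are only indicative (the helper name and the benchmark setup are assumptions, not code from the PR):

```python
import timeit

batch = {"features": list(range(32)), "targets": list(range(32))}

def infer_batch_size(batch):
    # same branching shape as in the PR diff
    if isinstance(batch, list):
        return len(batch[0])
    return len(next(iter(batch.values())))

# one isinstance check + one dict lookup per batch
n = 100_000
per_call = timeit.timeit(lambda: infer_batch_size(batch), number=n) / n
print(f"~{per_call * 1e9:.0f} ns per call")
```

On typical hardware this is well under a microsecond per batch, i.e. negligible next to a forward/backward pass.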
I don't like this solution at all.
It would be better if there were only dict-like batches in catalyst, IMHO.
(Or even better, if torch dataloaders had something like a next_batch_size property, but I definitely cannot do anything about that.)
This does not seem to affect performance very much, but ok, I'll run some formal tests.
@yhn112 could you please check the codestyle?
@Scitator, FYI. But ok, fixed.
@yhn112 This issue only occurs in functions with inner functions :)
One last thing: could you please update the CHANGELOG?
I've just started writing the CHANGELOG update...
Never mind, already done; last preparations before…
Commits:
* Init batch_size changes
* PEP8 (line too long) fixes
* global_samples rename and docs update
* Fix for batches of type `list`
* New renaming and fixes
* Fix for tuple batches
* pep8 line length
* Added tests for AverageValueMeter
* Code style (kind of)
Before submitting
Did you run catalyst-make-codestyle && catalyst-check-codestyle (pip install -U catalyst-codestyle)?
Did you check the docs with make check-docs?
Description
Fixes the incorrect epoch metric averaging. Also makes the sample and batch counters more correct and consistent.
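The averaging fix can be illustrated with a small weighted-mean sketch (a hypothetical helper, not the PR's actual code): naively averaging per-batch metric values over-weights a smaller final batch, while weighting each batch by its size recovers the true per-sample epoch mean.

```python
def epoch_mean(batch_metrics, batch_sizes):
    """Batch-size-weighted mean of per-batch metric values."""
    total = sum(m * n for m, n in zip(batch_metrics, batch_sizes))
    return total / sum(batch_sizes)

# Two batches: 100 samples with mean loss 1.0, then 10 samples with mean loss 0.0.
naive = (1.0 + 0.0) / 2                        # 0.5: over-weights the small batch
weighted = epoch_mean([1.0, 0.0], [100, 10])   # 100/110, the true per-sample mean
```

When every batch has the same size the two formulas agree, which is why the bug only shows up when the last batch of a loader is smaller (the case reported in #799 and #755).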
Related Issue
#799
#755
Type of Change
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
You can use 'Login as guest' to see TeamCity build logs.