Compresses Diff format and fix diff bug in Transforms #1379
Codecov Report
@@ Coverage Diff @@
## main #1379 +/- ##
==========================================
- Coverage 92.91% 92.39% -0.53%
==========================================
Files 175 175
Lines 13049 12945 -104
==========================================
- Hits 12125 11961 -164
- Misses 924 984 +60
self.data_updated: Set[int] = set()
self.info_updated = False
For now, info_updated will always be False. I've added it to the format so that we don't break backward compatibility in the future.
🚀 🚀 Pull Request
Checklist:
coverage-rate up
Changes
Compresses data_added in the diff so that it takes up less storage: only the first and last added indexes are stored instead of the full set.
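As a rough illustration of the compression idea, the sketch below collapses a contiguous set of appended indexes into a two-element range and expands it back. The helper names `compress_added` and `expand_added` are hypothetical and not part of this PR; the end-exclusive convention is inferred from the `[3, 6]` example in this description, and the `[0, 0]` result for an empty set is an assumption.

```python
from typing import List, Set


def compress_added(indexes: Set[int]) -> List[int]:
    """Hypothetical helper: collapse contiguous added indexes into [start, end)."""
    if not indexes:
        # Assumption: an empty range is represented with equal bounds.
        return [0, 0]
    start, end = min(indexes), max(indexes) + 1
    # Appended samples are assumed to be contiguous; fail loudly otherwise.
    if len(indexes) != end - start:
        raise ValueError("added indexes are not contiguous")
    return [start, end]


def expand_added(data_added: List[int]) -> List[int]:
    """Hypothetical helper: list the sample indexes covered by [start, end)."""
    start, end = data_added
    return list(range(start, end))


print(compress_added({3, 4, 5}))  # [3, 6]
print(expand_added([3, 6]))       # [3, 4, 5]
print(expand_added([3, 3]))       # []
```

Storing two integers keeps the size of data_added constant no matter how many samples were appended, which is presumably where the storage saving comes from.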
The format of the dicts returned in diff has also been changed.
In the new format, data_added is a range of sample indexes that were added to the tensor. The start of the range is inclusive and the end is exclusive.
For example, [3, 6] means that samples 3, 4, and 5 were added.
As another example, [3, 3] means that no samples were added, because the range is empty.
data_updated, on the other hand, is a set of sample indexes that were updated.
For example, {0, 2} means that samples 0 and 2 were updated.
"data_transformed_in_place" indicates whether an in-place transformation happened between the commits being compared.
Also fixes a bug in which the diff wasn't being taken into account in transforms.
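To make the three diff fields above concrete, here is a minimal sketch of how a consumer might read one tensor's entry. The function `summarize_tensor_diff` and the flat dict layout are assumptions for illustration only; the description names the keys but does not show the exact structure that diff returns.

```python
from typing import Any, Dict


def summarize_tensor_diff(change: Dict[str, Any]) -> str:
    """Hypothetical reader for one tensor's diff entry (layout assumed)."""
    start, end = change["data_added"]          # [start, end), end exclusive
    updated = sorted(change["data_updated"])   # set of updated sample indexes
    parts = []
    if end > start:
        parts.append(f"added samples {start}..{end - 1}")
    if updated:
        parts.append(f"updated samples {updated}")
    if change.get("data_transformed_in_place", False):
        parts.append("transformed in place")
    return "; ".join(parts) or "no changes"


example = {
    "data_added": [3, 6],
    "data_updated": {0, 2},
    "data_transformed_in_place": False,
}
print(summarize_tensor_diff(example))
# added samples 3..5; updated samples [0, 2]
```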