Feed forward chunking others #6365
Conversation
Update from source
Update from source
Update from source
Update from source
Update from source
Update from source
fix the shuffle argument usage and the default (huggingface#6307)
Update from source
Codecov Report

@@            Coverage Diff             @@
##           master    #6365      +/-   ##
==========================================
+ Coverage   78.42%   80.47%   +2.04%
==========================================
  Files         156      156
  Lines       28129    28152      +23
==========================================
+ Hits        22061    22655     +594
+ Misses       6068     5497     -571

Continue to review the full report at Codecov.
#6024 is merged :-) Great work @Pradhy729! It would be a good idea to rebase this PR onto current master so that you can easily leverage the tests that were added in #6024 just by setting the `test_chunking` flag.
Yes - definitely will do. Was just waiting for the merge. Thanks for adding the tests.
Update from source: force-pushed from 4346fe2 to b35f648
Update from source: force-pushed from b35f648 to 44efa91
@patrickvonplaten Feed forward chunking has been added for the following:
Also, changed the model signature so the callable is the first positional argument.
Hi @patrickvonplaten, can you review and approve if this looks good?
@@ -188,6 +188,7 @@ def __init__(self, **kwargs):
         self.pad_token_id = kwargs.pop("pad_token_id", None)
         self.eos_token_id = kwargs.pop("eos_token_id", None)
         self.decoder_start_token_id = kwargs.pop("decoder_start_token_id", None)
+        self.chunk_size_feed_forward = kwargs.pop("chunk_size_feed_forward", 0)
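For context, a minimal sketch of how this new config attribute could be used once merged (the choice of BERT and a chunk size of 64 are illustrative, not taken from this PR):

```python
from transformers import BertConfig, BertForMaskedLM

# chunk_size_feed_forward = 0 (the default) disables chunking.
# A positive value runs each feed forward layer over chunks of that many
# sequence positions at a time, trading extra compute time for less memory.
config = BertConfig(chunk_size_feed_forward=64)
model = BertForMaskedLM(config)
```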
great, thanks for adding this!
Can you move the docstring from Reformer to this file and delete the corresponding docstring / config variable from Reformer?
Actually it's already done, never mind.
@@ -1447,7 +1447,7 @@ def prune_layer(

 def apply_chunking_to_forward(
-    chunk_size: int, chunk_dim: int, forward_fn: Callable[..., torch.Tensor], *input_tensors
+    forward_fn: Callable[..., torch.Tensor], chunk_size: int, chunk_dim: int, *input_tensors
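To illustrate the new argument order, here is a hedged sketch of how a layer might call `apply_chunking_to_forward` after this change (the `FeedForward` module and its sizes are made up for the example; the import path is the one used at the time of this PR):

```python
import torch
from torch import nn
from transformers.modeling_utils import apply_chunking_to_forward

class FeedForward(nn.Module):
    def __init__(self, hidden_size=64, chunk_size_feed_forward=4):
        super().__init__()
        self.chunk_size_feed_forward = chunk_size_feed_forward
        self.seq_len_dim = 1  # chunk along the sequence dimension
        self.dense = nn.Linear(hidden_size, hidden_size)

    def forward_chunk(self, hidden_states):
        return self.dense(hidden_states)

    def forward(self, hidden_states):
        # The callable now comes first, followed by chunk size and chunk dim.
        return apply_chunking_to_forward(
            self.forward_chunk, self.chunk_size_feed_forward, self.seq_len_dim, hidden_states
        )

hidden_states = torch.rand(2, 16, 64)  # (batch, seq_len, hidden)
print(FeedForward()(hidden_states).shape)  # torch.Size([2, 16, 64])
```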
Thanks for changing that. @LysandreJik, as you said, this is the better order for the arguments and should be fine in terms of backward compatibility.
tests/test_modeling_common.py
@@ -60,7 +60,7 @@ class ModelTesterMixin:
     test_resize_embeddings = True
     test_head_masking = True
     test_missing_keys = True
-    test_chunking = False
+    test_chunking = True
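For context, the shared test gated by this flag checks that enabling chunking does not change model outputs; a rough sketch of that idea (the helper name and tolerance are illustrative, not the exact test code):

```python
import torch

def check_feed_forward_chunking(model_class, config, inputs):
    # Outputs with feed forward chunking enabled should match the
    # unchunked baseline, since chunking is mathematically a no-op.
    torch.manual_seed(0)
    model = model_class(config)
    model.eval()
    baseline = model(**inputs)[0]

    torch.manual_seed(0)
    config.chunk_size_feed_forward = 1  # smallest possible chunks
    chunked_model = model_class(config)
    chunked_model.eval()
    chunked = chunked_model(**inputs)[0]

    assert torch.allclose(baseline, chunked, atol=1e-3)
```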
Can you remove the `test_chunking = True` statement in other test files as well?
Hey @Pradhy729 - this looks great!
Force-pushed from dfe0497 to 82f79df
@patrickvonplaten
LGTM! @Pradhy729 - great work!
Great addition, thanks a lot!
Merging! Good job @Pradhy729
* Feed forward chunking for Distilbert & Albert
* Added ff chunking for many other models
* Change model signature
* Added chunking for XLM
* Cleaned up by removing some variables.
* remove test_chunking flag

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
This reverts commit c45253a.
Adding feed forward chunking to other models. Based on #6024
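As background on the technique itself, feed forward chunking (introduced for the Reformer) applies the feed forward layer to slices of the sequence one at a time, so the large intermediate activations exist for only one chunk at once. A standalone sketch of the idea, not code from this PR (the helper name is made up):

```python
import torch

def chunked_feed_forward(ff, hidden_states, chunk_size, chunk_dim=1):
    # Mathematically equivalent to ff(hidden_states), because a feed
    # forward layer acts on each sequence position independently; but the
    # intermediate activations inside ff are materialized per chunk,
    # lowering peak memory at the cost of some speed.
    if chunk_size == 0:
        return ff(hidden_states)
    chunks = hidden_states.split(chunk_size, dim=chunk_dim)
    return torch.cat([ff(chunk) for chunk in chunks], dim=chunk_dim)
```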