-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[gpt2pre 1.2] Start End Packer call method #7791
[gpt2pre 1.2] Start End Packer call method #7791
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few suggestions. Nice test coverage!
@mattsoulanille your comments have been addressed. Thanks for the review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Implements the full StartEndPacker class including it's
call()
method.This is PR 2/2 of the full StartEndPacker implementation.
Depends on #7790.
NOTE:
The keras implementation includes a
return_padding_mask
parameter in the constructor. Setting thisTrue
makes the call method return the padded ragged tensor AND a mask of padded tokens as a tuple. This tuple return was not possible in the TypeScript implementation without changing thecall
type signature to[Tensor|Tensor[], Tensor|Tensor[]]
. Therefore, I opted to create a separate method,callAndReturnPaddingMask()
that returns just that. Thecall
method simply takes the first result of this method call.Room for optimization:
Currently, a mask will be calculated every time
call
is called, even when not needed. Decomposing the function further can help and can be done in the cleanup PR.