Implements sequence_length param #3221
It seems like a potentially awkward experience for a user to set sequence_length=300, then get a config validation error requiring them to manually update a second parameter, max_sequence_length=300.
What do you think about going with a dynamic update option, which would mean adding a function to the ModelConfig's post_init (example)?
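A minimal sketch of what such a dynamic update could look like, assuming a plain Python dataclass. The class name SequencePreprocessingConfig is hypothetical and is not the project's actual ModelConfig; only the two parameter names and the default of 256 come from this PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SequencePreprocessingConfig:  # hypothetical class, for illustration only
    sequence_length: Optional[int] = None
    max_sequence_length: int = 256

    def __post_init__(self) -> None:
        # Dynamic update: if the user asks for a longer fixed length than the
        # cap allows, raise the cap instead of failing validation.
        if (
            self.sequence_length is not None
            and self.sequence_length > self.max_sequence_length
        ):
            self.max_sequence_length = self.sequence_length


config = SequencePreprocessingConfig(sequence_length=300)
print(config.max_sequence_length)  # 300
```

With this approach the user only sets sequence_length, and the cap follows it silently. The trade-off is that an explicitly configured max_sequence_length can be overridden without warning.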
Good point @justinxzhao – updated the PR!
Nice! The logic looks fine to me.
@w4nderlust, thanks for the comments. I've added a new test that confirms this fix addresses this GitHub issue, as well as updated the schema so that the description of The unit test that validates the
Added a couple minor comments. One thing I haven't checked, though, is backwards compatibility, so be mindful to check if this change makes models with the old overall this looks good for now, but I still believe that this could be solved differently and potentially better in the future. Writing it here so that there's a record of it (maybe we can create an issue somewhere to capture it too). A couple options:
Both things could complement the current solution instead of replacing it. Moreover, I like the idea of an explicit sequence length parameter, as it's similar to what we do for image height and width, so even if in the future we were to implement 1 and 2, this work is still valuable and still holds.
Sounds good, thanks @w4nderlust! Regarding backwards compatibility, we should be in the clear. The Will address your comments and merge after tests pass. Thanks!
This PR implements a new config param for text and sequence features, sequence_length, a nullable positive integer.
sequence_length defaults to None. This means that the sequence length for a given feature will be inferred from the dataset. The inferred sequence length will be capped at max_sequence_length, which defaults to 256.
If sequence_length is not None, then the sequence length for a given feature will be the specified value, and samples will be padded and truncated as needed. If the specified value for sequence_length is greater than max_sequence_length, then the experiment will fail fast with an error message recommending that you set max_sequence_length to a value greater than or equal to sequence_length.
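The rules in this description can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code from this PR: the function names resolve_sequence_length and pad_or_truncate, the pad token, and the exact error wording are all hypothetical.

```python
from typing import List, Optional


def resolve_sequence_length(
    inferred_length: int,
    sequence_length: Optional[int] = None,
    max_sequence_length: int = 256,
) -> int:
    """Return the effective sequence length for a feature (hypothetical helper)."""
    if sequence_length is None:
        # Inferred from the dataset, capped at max_sequence_length.
        return min(inferred_length, max_sequence_length)
    if sequence_length > max_sequence_length:
        # Fail fast with a message pointing at the fix.
        raise ValueError(
            f"sequence_length ({sequence_length}) is greater than "
            f"max_sequence_length ({max_sequence_length}). Set max_sequence_length "
            f"to a value greater than or equal to sequence_length."
        )
    return sequence_length


def pad_or_truncate(tokens: List[str], length: int, pad_token: str = "<PAD>") -> List[str]:
    """Truncate to `length`, then right-pad with `pad_token` if too short."""
    truncated = tokens[:length]
    return truncated + [pad_token] * (length - len(truncated))


length = resolve_sequence_length(inferred_length=12, sequence_length=5)
print(pad_or_truncate(["a", "b", "c"], length))  # ['a', 'b', 'c', '<PAD>', '<PAD>']
```

Right-padding is an assumption made for the example; the description only states that samples are padded and truncated as needed.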