Dynamically set max_new_tokens based on output feature length, GMSL and model window size #3713
Conversation
@@ -57,6 +59,29 @@ def test_set_pad_token_already_exists():
    assert tokenizer.pad_token_id == 1


class TestSetContextLen:
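For context, a minimal sketch of the class-grouping pattern under discussion. The test bodies and the `set_context_len` stand-in are invented placeholders, not the PR's actual tests:

```python
# Stand-in for the function under test; the real signature in the PR may differ.
def set_context_len(output_len: int, window_size: int) -> int:
    return min(output_len, window_size)


class TestSetContextLen:
    """pytest collects methods of Test* classes just like top-level test functions,
    so a class acts as a lightweight namespace for related scenarios."""

    def test_uses_output_feature_length(self):
        # The output feature length wins when it fits in the window.
        assert set_context_len(output_len=32, window_size=2048) == 32

    def test_capped_by_model_window(self):
        # Never exceed the model's context window.
        assert set_context_len(output_len=4096, window_size=2048) == 2048
```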
Interesting mechanic for grouping tests - curious if you saw this pattern recommended somewhere?
@justinxzhao Definitely didn't see it recommended anywhere, but I wanted to find a logical way to group these tests together since they're about the same "topic" but testing different aspects of it, so I decided to write a class. Is that fine, or would you like me to just write 4 individual tests?
The idea in general is that since they're all testing the same function but different scenarios, it makes sense either to put them all in the same dedicated module for clarity or in some sort of container like a class. Typically you could just use parametrization, but this one would require a lot of conditionals inside the test, so I decided to skip it. Alternatively, I could also combine them all into one test. All options are OK - no strong preference.
I can see this being a useful way to organize tests particularly for very large test files. It's a bit more to maintain, but seems sufficiently lightweight.
Got it, let me split them up!
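The resolution, with the grouped tests split into individual top-level functions, might look like this (again a sketch with invented bodies, reusing the `set_context_len` stand-in from the earlier sketch):

```python
# Hypothetical split of the grouped tests into standalone functions.
def test_set_context_len_uses_output_feature_length():
    assert set_context_len(output_len=32, window_size=2048) == 32


def test_set_context_len_capped_by_model_window():
    assert set_context_len(output_len=4096, window_size=2048) == 2048
```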
TL;DR: This fixes a bug in model.evaluate() that caused underreporting of performance metrics in a majority of cases. By setting max_new_tokens to the largest possible value needed, we ensure that the metric numbers from model.evaluate() are a good representation of true model performance.

This PR updates the config validation setter methods to ensure that the generation.max_new_tokens parameter in the model configuration is set correctly based on the maximum sequence length of the output features. This ensures accurate token generation for LLM model types. It also includes handling for different cases and provides informative logging for configuration changes. Specifically:

- max_new_tokens is set based on the max sequence length of the output feature, since that is the max number of tokens the model learns to generate during fine-tuning.
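A rough sketch of the fallback logic described above. The function name, config shape, and exact precedence between the output feature length, the global max sequence length (GMSL), and the model window size are assumptions, not the PR's actual code:

```python
import logging

logger = logging.getLogger(__name__)


def _set_max_new_tokens(
    config: dict,
    output_feature_max_seq_len: int | None,
    global_max_sequence_length: int | None,
    model_window_size: int,
) -> None:
    """Pick the largest max_new_tokens the model could need, capped by its window."""
    if output_feature_max_seq_len is not None:
        # The model never learns to generate more tokens than the longest
        # output sequence seen during fine-tuning.
        max_new_tokens = output_feature_max_seq_len
    elif global_max_sequence_length is not None:
        # Fall back to the global max sequence length (GMSL) if no
        # per-feature length is available.
        max_new_tokens = global_max_sequence_length
    else:
        # Last resort: allow generation up to the model's context window.
        max_new_tokens = model_window_size

    # Never ask for more new tokens than the context window can hold.
    max_new_tokens = min(max_new_tokens, model_window_size)

    old = config["generation"].get("max_new_tokens")
    if old != max_new_tokens:
        logger.info(f"Setting generation.max_new_tokens from {old} to {max_new_tokens}.")
        config["generation"]["max_new_tokens"] = max_new_tokens


# Usage example with made-up numbers:
config = {"generation": {"max_new_tokens": 32}}
_set_max_new_tokens(
    config,
    output_feature_max_seq_len=256,
    global_max_sequence_length=512,
    model_window_size=2048,
)
# config["generation"]["max_new_tokens"] is now 256
```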