Alternative to globals() #8667
Conversation
(RobertaConfig, (BertweetTokenizer, None)),
(RobertaConfig, (PhobertTokenizer, None)),
Those were not in the values() of this dict since they are replaced in the next line.
Hmm good point.
I feel like, in the end, this issue comes from the fact that we should have a tokenizer config instead of piggy-backing on the model config. What do you think @LysandreJik @julien-c ?
Yes, we should. This is an issue I'll try to solve next week.
@@ -195,7 +193,6 @@
(LayoutLMConfig, (LayoutLMTokenizer, LayoutLMTokenizerFast)),
(DPRConfig, (DPRQuestionEncoderTokenizer, DPRQuestionEncoderTokenizerFast)),
(SqueezeBertConfig, (SqueezeBertTokenizer, SqueezeBertTokenizerFast)),
(BertConfig, (HerbertTokenizer, HerbertTokenizerFast)),
Those were not in the values() of this dict since they are replaced in the next line.
SLOW_TOKENIZER_MAPPING = {
    k: (v[0] if v[0] is not None else v[1])
    for k, v in TOKENIZER_MAPPING.items()
    if (v[0] is not None or v[1] is not None)
}
SLOW_TOKENIZER_MAPPING[HerbertTokenizer] = HerbertTokenizerFast
This was not in TOKENIZER_MAPPING.items(), so it was not set.
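To make the point concrete, here is a toy sketch of the pattern under discussion, with placeholder strings standing in for the real config and tokenizer classes: the comprehension only sees keys present in TOKENIZER_MAPPING, so an entry like HerbertTokenizer only ends up in SLOW_TOKENIZER_MAPPING because of the explicit assignment afterwards.

```python
# Toy stand-ins for the real config/tokenizer classes (hypothetical names).
TOKENIZER_MAPPING = {
    "ConfigA": ("SlowA", "FastA"),   # both slow and fast tokenizers exist
    "ConfigB": (None, "FastB"),      # only a fast tokenizer exists
}

# Same comprehension shape as in the diff: prefer the slow tokenizer,
# fall back to the fast one, skip pairs where both are None.
SLOW_TOKENIZER_MAPPING = {
    k: (v[0] if v[0] is not None else v[1])
    for k, v in TOKENIZER_MAPPING.items()
    if (v[0] is not None or v[1] is not None)
}

# "HerbertTokenizer" is not a key of TOKENIZER_MAPPING, so it can only
# appear in the result through this manual patch after the comprehension.
SLOW_TOKENIZER_MAPPING["HerbertTokenizer"] = "HerbertTokenizerFast"
```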
As seen offline, probably unnecessary.
SLOW_TOKENIZER_MAPPING[HerbertTokenizer] = HerbertTokenizerFast

def tokenizer_class_from_name(class_name: str):
This function can also be transformed to a dict if we prefer.
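A minimal sketch of that suggestion, with hypothetical class names standing in for the real tokenizer classes: instead of resolving a class name through globals() (or a chain of conditionals), map names to classes explicitly in a dict. Returning None for unknown names matches the commit note "Error is raised later so return None".

```python
# Hypothetical stand-ins for real tokenizer classes.
class BertTokenizer: ...
class RobertaTokenizer: ...

# Explicit name -> class registry replacing the globals() lookup.
TOKENIZER_CLASSES = {
    "BertTokenizer": BertTokenizer,
    "RobertaTokenizer": RobertaTokenizer,
}

def tokenizer_class_from_name(class_name: str):
    # Unknown names yield None; the caller raises the error later.
    return TOKENIZER_CLASSES.get(class_name)
```

A dict keeps the set of resolvable classes explicit and auditable, whereas globals() silently exposes whatever happens to be imported in the module.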
LGTM
* Make ci fail
* Try to make tests actually run?
* CI finally failing?
* Fix CI
* Revert "Fix CI" (this reverts commit ca7923b)
* Ooops wrong one
* one more try
* Ok ok let's move this elsewhere
* Alternative to globals() (#8667)
  * Alternative to globals()
  * Error is raised later so return None
  * Sentencepiece not installed make some tokenizers None
  * Apply Lysandre wisdom
  * Slightly clearer comment? cc @sgugger

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
What does this PR do?
This PR does two things:
globals