Allow per-version configurations #14344
Merged
Similarly to #12713, this allows per-version configurations. This is necessary for LayoutXLM, which up to now was using the configuration-defined `XLMRobertaTokenizer`, but which should now use the `LayoutXLMTokenizer`.

Updating the configuration would mean breaking all previous versions of `transformers` that were using LayoutXLM. Not updating this parameter means that LayoutXLM will never benefit from `LayoutXLMTokenizer` through the `AutoTokenizer` API.

Resolves #14275

This implements similar tests to the tokenizer, but instead of using `bert-base-cased`, it uses the actual model that is at issue (`microsoft/layoutxlm-base`). This model should continue using the `XLMRobertaTokenizer` until a new minor version is released, as the configuration I uploaded is named `config.4.13.0.json`: https://huggingface.co/microsoft/layoutxlm-base/blob/main/config.4.13.0.json
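
For readers unfamiliar with the idea, here is a minimal sketch of how a per-version configuration file could be selected. It is not the code from this PR: the function name `select_config_file`, the helper `parse_version`, and the assumption that versions are plain dotted integers (no `dev` or `rc` suffixes) are all hypothetical and only illustrate the rule that a file named `config.X.Y.Z.json` applies to library versions at or above `X.Y.Z`, with `config.json` as the fallback.

```python
import re

# Hypothetical pattern: matches names such as "config.4.13.0.json".
_VERSIONED_CONFIG = re.compile(r"^config\.(\d+(?:\.\d+)*)\.json$")


def parse_version(text):
    """Turn "4.13.0" into (4, 13, 0). Assumes plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


def select_config_file(available_files, library_version):
    """Pick the configuration file to load for a given library version.

    The versioned file with the highest version that does not exceed
    `library_version` wins; otherwise the default "config.json" is used.
    """
    current = parse_version(library_version)
    chosen = "config.json"
    best = None
    for name in available_files:
        match = _VERSIONED_CONFIG.match(name)
        if match is None:
            continue  # not a versioned configuration file
        candidate = parse_version(match.group(1))
        if candidate <= current and (best is None or candidate > best):
            best = candidate
            chosen = name
    return chosen


files = ["config.json", "config.4.13.0.json"]
print(select_config_file(files, "4.12.5"))
print(select_config_file(files, "4.13.0"))
```

Under these assumptions, a library version older than 4.13.0 falls back to `config.json` (and so keeps the tokenizer named there), while 4.13.0 and later pick up `config.4.13.0.json`.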