[Tokenizer] batch_encode_plus method cannot encode List[Tuple[str]] with is_pretokenized=True #5169
Closed
Labels
Core: Tokenization
Internals of the library; Tokenization.
🐛 Bug
Information
Model I am using: BERT
Language I am using the model on: English
The problem arises when using:
The tasks I am working on are:
To reproduce
Steps to reproduce the behavior:
Expected behavior
batch_encode_plus should be able to encode List[Tuple[List[int], List[int]]], as described in the function's docstring. The example's input_ids would then be:
[[101, 2023, 2573, 102, 2023, 2573, 2205, 102]]
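For readers following along, the layout of the expected input_ids above can be sketched without the library. This is only an illustration of the [CLS] A [SEP] B [SEP] pair layout that BERT uses, not the tokenizer's actual code path. The ids are copied from the expected output in this issue; the token strings in the toy vocabulary and the helper names (`VOCAB`, `encode_pair`, `batch_encode_pairs`) are hypothetical placeholders, since the original reproduction input is not shown here.

```python
# Toy vocabulary (assumption): ids come from the expected output in this issue,
# the token strings attached to them are illustrative placeholders.
VOCAB = {"[CLS]": 101, "[SEP]": 102, "this": 2023, "works": 2573, "too": 2205}


def encode_pair(tokens_a, tokens_b):
    """Encode one pretokenized pair as [CLS] A [SEP] B [SEP]."""
    ids_a = [VOCAB[token] for token in tokens_a]
    ids_b = [VOCAB[token] for token in tokens_b]
    return [VOCAB["[CLS]"]] + ids_a + [VOCAB["[SEP]"]] + ids_b + [VOCAB["[SEP]"]]


def batch_encode_pairs(batch):
    """Encode a batch shaped like List[Tuple[List[str], List[str]]]."""
    return [encode_pair(tokens_a, tokens_b) for tokens_a, tokens_b in batch]


# One pair of already-split sequences, wrapped in a batch of size one.
batch = [(["this", "works"], ["this", "works", "too"])]
input_ids = batch_encode_pairs(batch)
print(input_ids)  # [[101, 2023, 2573, 102, 2023, 2573, 2205, 102]]
```

The point of the sketch is that each tuple in the batch is treated as one (first sequence, second sequence) pair, rather than as two independent pretokenized sequences.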
Environment info
transformers version: 2.11.0