Description
Hi,
`truncation='do_not_truncate'` does not behave equivalently to `truncation=False`.
When `truncation=False` is used and `max_length` is provided, the tokenizer falls back to the `'longest_first'` truncation strategy.
Whether or not this default is natural, isn't `False` supposed to be identical to `'do_not_truncate'`?
This leads to a situation where the user explicitly specifies `truncation=False` but the text is still truncated.
Both this guide: https://huggingface.co/docs/transformers/pad_truncation and this doc: https://huggingface.co/docs/transformers/main_classes/tokenizer say:

> `False` or `'do_not_truncate'`: no truncation is applied. This is the default behavior.
This means the two values are supposed to be equivalent (regardless of what they do, they should behave the same).
I suggest that `False` should simply mean "no truncation", regardless of whether `max_length` was supplied.
Here is a short example:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
sent = 'The quick brown fox jumps over the lazy dog'

len(tokenizer.encode(sent, max_length=5, truncation='do_not_truncate'))  # prints: 11
len(tokenizer.encode(sent, max_length=5, truncation=False))              # prints: 5
```
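To make the suggestion concrete, here is a minimal, hypothetical sketch of the strategy-resolution logic being proposed. The function name `resolve_truncation_strategy` is invented for illustration and is not the actual code inside `transformers`; the point is only that `False` would map to `'do_not_truncate'` unconditionally, even when `max_length` is given:

```python
# Hypothetical sketch, NOT the real transformers internals: resolve the
# user-supplied `truncation` argument to a strategy name, with the proposed
# rule that False always means "no truncation".
def resolve_truncation_strategy(truncation, max_length=None):
    """Map a `truncation` argument to a truncation strategy name."""
    if truncation is False or truncation == "do_not_truncate":
        # Proposed behavior: max_length is ignored here, so False and
        # 'do_not_truncate' are always equivalent.
        return "do_not_truncate"
    if truncation is True:
        # True selects the default truncating strategy.
        return "longest_first"
    if truncation in ("longest_first", "only_first", "only_second"):
        return truncation
    raise ValueError(f"Unknown truncation argument: {truncation!r}")

# Under this proposal both spellings behave identically, with or without
# max_length:
assert resolve_truncation_strategy(False, max_length=5) == "do_not_truncate"
assert resolve_truncation_strategy("do_not_truncate", max_length=5) == "do_not_truncate"
```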
Thanks,
Uri