Error when tokenizer is set to string: `AttributeError: 'str' object has no attribute 'pad_token_id'` #36731

TrevinAvery · 2025-03-14T21:03:49Z

I'm using the SageMaker HuggingFace inference toolkit, with the image-text-to-text task. I am getting the following error from this line in transformers/pipelines/base.py:

AttributeError: 'str' object has no attribute 'pad_token_id'

This toolkit passes a string for the tokenizer parameter to the pipeline function. The passed string matches model_dir (as seen here).

I expect the string to either 1) be used to load a valid tokenizer object, 2) be dropped because it is unused, or 3) raise an error because it is invalid. Instead, the pipeline function passes the string through unchanged to pipeline_class (in this case ImageTextToText), which does not accept a string for the tokenizer parameter.
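The failure mode can be illustrated without transformers installed. This is only a minimal sketch, not the library's code: `FakePipeline` and `try_pad_token_id` are hypothetical stand-ins for a pipeline class that reads `pad_token_id` from whatever it was given, and `/opt/ml/model` is an assumed model directory.

```python
class FakePipeline:
    """Hypothetical stand-in for a pipeline class; not transformers code."""

    def __init__(self, tokenizer=None):
        # The value is stored as-is, with no check that it is a tokenizer object.
        self.tokenizer = tokenizer

    def pad_token_id(self):
        # Attribute access on whatever was passed in; fails for a plain string.
        return self.tokenizer.pad_token_id


def try_pad_token_id(tokenizer):
    """Return the AttributeError message for this tokenizer value, or None."""
    try:
        FakePipeline(tokenizer=tokenizer).pad_token_id()
    except AttributeError as exc:
        return str(exc)
    return None


# Assumed model directory, passed as a plain string the way the toolkit does.
message = try_pad_token_id("/opt/ml/model")
print(message)
```

The point of the sketch is that nothing between the caller and the attribute access validates the type, so the error only surfaces at first use rather than at construction time.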

Suggested fix:

When load_tokenizer evaluates to false, the pipeline function should either set tokenizer to None or raise an exception.
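A rough sketch of what that check could look like follows. `resolve_tokenizer` and its `strict` flag are hypothetical names introduced here for illustration; this is not the actual transformers implementation, and where such a check would live in the pipeline function is an assumption.

```python
from typing import Any, Optional


def resolve_tokenizer(tokenizer: Any, load_tokenizer: bool, strict: bool = False) -> Optional[Any]:
    """Hypothetical helper sketching the suggested fix; not transformers code."""
    if load_tokenizer or not isinstance(tokenizer, str):
        # Either the string will be used to load a tokenizer elsewhere,
        # or the caller already passed a tokenizer object (or None).
        return tokenizer
    if strict:
        # Option 2 from the suggestion: fail early with a clear message.
        raise ValueError(
            f"tokenizer={tokenizer!r} is a string, but no tokenizer will be "
            "loaded for this pipeline; pass a tokenizer object or None."
        )
    # Option 1 from the suggestion: drop the unused string.
    return None
```

For example, `resolve_tokenizer("/opt/ml/model", load_tokenizer=False)` returns `None`, while the same call with `strict=True` raises a `ValueError` before the string can reach the pipeline class.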

Rocketknight1 · 2025-03-19T13:16:13Z

Is this happening with Gemma3?

zucchini-nlp · 2025-03-19T13:26:06Z

It should have been fixed by a recent patch on the release branch. We had a bug where the tokenizer wasn't added to the model mapping.

TrevinAvery mentioned this issue Mar 14, 2025

Task image-text-to-text fails with AttributeError: 'str' object has no attribute 'pad_token_id' aws/sagemaker-huggingface-inference-toolkit#135

Open
