[Bug Report] Padding side inconsistency with Huggingface Transformers #801

@spfrommer

Description

Describe the bug
The HookedTransformer tokenizer has padding_side set to "right" for Gemma 2 2B, while the Hugging Face AutoTokenizer for the same model sets it to "left". I'm not sure why these are inconsistent.

Code example

from transformer_lens import HookedTransformer
from transformers import AutoTokenizer

model = HookedTransformer.from_pretrained('google/gemma-2-2b')
tokenizer = AutoTokenizer.from_pretrained('google/gemma-2-2b')

print(model.tokenizer.padding_side)
print(tokenizer.padding_side)

Output:

right
left
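To show why this inconsistency matters (not part of the original report): the padding side determines where pad tokens land in a batched input, which changes which positions hold real tokens. A minimal self-contained sketch of left vs. right padding on token-id lists, with a hypothetical pad id of 0:

```python
PAD_ID = 0  # hypothetical pad token id for illustration

def pad_batch(seqs, padding_side="right", pad_id=PAD_ID):
    """Pad a list of token-id lists to the length of the longest one."""
    max_len = max(len(s) for s in seqs)
    out = []
    for s in seqs:
        pad = [pad_id] * (max_len - len(s))
        # "right" appends pads after the tokens; "left" prepends them
        out.append(s + pad if padding_side == "right" else pad + s)
    return out

batch = [[5, 6, 7], [8, 9]]
print(pad_batch(batch, "right"))  # [[5, 6, 7], [8, 9, 0]]
print(pad_batch(batch, "left"))   # [[5, 6, 7], [0, 8, 9]]
```

With right padding the last position of the shorter sequence is a pad token, so code that reads "the final token's logits" at index -1 silently reads a pad position; left padding keeps real tokens at the end. This is presumably why the two defaults give different downstream behavior.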

System Info
Linux; installed with pip in a Python 3.10.12 virtualenv. Package versions:

  • transformer_lens: 2.9.0
  • transformers: 4.46.1

Checklist

  • I have checked that there is no similar issue in the repo (required)


Labels

  • bug: Something isn't working
  • complexity-moderate: Moderately complicated issues for people who have intermediate experience with the code
  • needs-investigation: Issues that need to be recreated, or investigated before work can be done
