
[Question] Why does Transformer Lens only support quantized LLaMA models? #684

Open
miguel-kjh opened this issue Jul 26, 2024 · 1 comment

Hi everyone,

I'm trying to use the transformer_lens library to study the activations of a quantized Mistral 7B model (unsloth/mistral-7b-instruct-v0.2-bnb-4bit). However, when I try to load it, I encounter a problem.

This is the code I'm using:

# `model` is a PEFT/LoRA-wrapped quantized Mistral loaded earlier, and
# `tokenizer` its matching tokenizer; merge the adapter weights back into
# the base model before handing it to TransformerLens
model_merged = model.merge_and_unload()
model_hooked = transformer_lens.HookedTransformer.from_pretrained(
    "unsloth/mistral-7b-instruct-v0.2-bnb-4bit",
    hf_model=model_merged,
    hf_model_4bit=True,
    fold_ln=False,
    fold_value_biases=False,
    center_writing_weights=False,
    center_unembed=False,
    tokenizer=tokenizer,
)
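
For context, `model` and `tokenizer` above were created along roughly these lines (a reconstruction, since I didn't paste my loading code; the adapter path is a placeholder):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# The unsloth checkpoint ships with a bitsandbytes 4-bit quantization config,
# so this loads the base model already quantized
base = AutoModelForCausalLM.from_pretrained(
    "unsloth/mistral-7b-instruct-v0.2-bnb-4bit",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(
    "unsloth/mistral-7b-instruct-v0.2-bnb-4bit"
)
# Attach the fine-tuned LoRA adapter (hypothetical path)
model = PeftModel.from_pretrained(base, "path/to/lora-adapter")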

When I run this, I get an assertion error stating that quantization is only supported for LLaMA models in this library. This is the error message I receive:

---------------------------------------------------------------------------
AssertionError  Traceback (most recent call last)
AssertionError: Quantization is only supported for Llama models

I find it illogical and frustrating that quantized models are only supported for LLaMA in transformer_lens. Can anyone explain why this decision was made? Is there a technical reason behind it, or is there a way to work around this restriction so that I can use my Mistral 7B model?
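
One fallback I'm experimenting with (an untested sketch: it assumes the full-precision mistralai/Mistral-7B-Instruct-v0.2 checkpoint fits in memory and appears in TransformerLens's supported-model list) is to skip quantization entirely and hand the unquantized weights to HookedTransformer, so the quantization assertion is never reached:

from transformers import AutoModelForCausalLM, AutoTokenizer
import transformer_lens

# Load the full-precision official checkpoint instead of the bnb-4bit one
hf_model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    torch_dtype="auto",
)
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")

# No 4-bit flag here, so the "Quantization is only supported for Llama
# models" check should never trigger
model_hooked = transformer_lens.HookedTransformer.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    hf_model=hf_model,
    fold_ln=False,
    fold_value_biases=False,
    center_writing_weights=False,
    center_unembed=False,
    tokenizer=tokenizer,
)

This obviously gives up the memory savings of 4-bit loading, which is the whole reason I wanted the quantized checkpoint in the first place.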

I appreciate any guidance or solutions you can provide.

Thanks!
