
Add Yi support + benchmark results #27

Merged · 5 commits · Nov 21, 2023

Conversation

MekkCyber
Contributor

I noticed that there is no implementation of mpt_pos_shift_attention_forward. I know it isn't strictly necessary, since MPT has no positional encoding and the forward pass therefore needs no changes, but for consistency I think it's better to have it. Feel free to accept this pull request or not :). I will try working on adding other models to the library. Thank you for your time.
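
For context, MPT uses ALiBi attention biases rather than rotary position embeddings, so a pos-shift forward has nothing to shift. A minimal sketch (illustrative only, not necessarily the exact code added in this PR) could simply delegate to the stock attention forward:

from transformers.models.mpt.modeling_mpt import MptAttention

def mpt_pos_shift_attention_forward(self, *args, **kwargs):
    # MPT applies ALiBi biases inside attention and has no rotary embeddings,
    # so there are no position ids to re-shift when the KV cache is windowed;
    # just defer to the original attention forward.
    return MptAttention.forward(self, *args, **kwargs)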

@MekkCyber
Contributor Author

Hello @tomaarsen

Do you have any suggestions for models to implement attention_sinks for?

@tomaarsen
Owner

Perhaps the very recent Yi models?

@MekkCyber
Contributor Author

MekkCyber commented Nov 6, 2023

I tried to add Yi support. I think the Yi tokenizer is not integrated into AutoTokenizer yet, so to test it I used the YiTokenizer code provided with the model, with tokenizer.model as the vocab_file. If you have any remarks, please let me know.

import torch
# AutoModelForCausalLM here comes from the attention_sinks package,
# which wraps the transformers class with attention-sink support
from attention_sinks import AutoModelForCausalLM

model_id = "01-ai/Yi-6B"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    # for efficiency:
    device_map="auto",
    torch_dtype=torch.float16,
    # `attention_sinks`-specific arguments:
    attention_sink_size=4,
    attention_sink_window_size=252,  # <- Low for the sake of faster generation
    trust_remote_code=True,
)
model.eval()
# YiTokenizer is the custom tokenizer class shipped with the Yi model repository,
# instantiated directly from the downloaded `tokenizer.model` vocab file
tokenizer = YiTokenizer("tokenizer.model")
tokenizer.pad_token_id = tokenizer.eos_token_id

@tomaarsen
Owner

Hello!

Apologies for delaying this for a while. Regarding the tokenizer, I think that is because the AutoTokenizer also requires trust_remote_code=True, e.g.:

model_id = "01-ai/Yi-6B"
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    # for efficiency:
    device_map="auto",
    torch_dtype=torch.float16,
    # `attention_sinks`-specific arguments:
    attention_sink_size=4,
    attention_sink_window_size=252,  # <- Low for the sake of faster generation
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
tokenizer.pad_token_id = tokenizer.eos_token_id

And then it should be fine!
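
For a quick sanity check, generation then works as with any transformers model; a small illustrative example (the prompt and generation settings below are placeholders):

import torch

prompt = "Attention sinks let a model keep generating beyond its context window because"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64, pad_token_id=tokenizer.pad_token_id)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))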

I've added some experiments, run them, and put the results in the README. I also credited you for this addition there!

  • Tom Aarsen

@tomaarsen tomaarsen changed the title add mpt_pos_shift_attention_forward Add Yi support + benchmark results Nov 21, 2023
@tomaarsen tomaarsen merged commit 34d071c into tomaarsen:main Nov 21, 2023