
[Feature request] Support attention logits cap with tanh #257

Closed
merrymercy opened this issue May 24, 2024 · 5 comments


merrymercy commented May 24, 2024

The grok model uses tanh to cap the attention logits. Could you support this feature in flashinfer? If you need community help, any instructions on how to add this will be appreciated.

Grok (jax):
https://github.com/xai-org/grok-1/blob/7050ed204b8206bb8645c7b7bbef7252f79561b0/model.py#L864-L865

SGLang implementation (triton):
https://github.com/sgl-project/sglang/blob/2cea6146d8735780da602c0dfa0569b0fb5d47ba/python/sglang/srt/layers/extend_attention.py#L101-L102
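
For reference, the capping itself is a one-line transform on the pre-softmax attention scores: each logit is squashed into (-cap, cap) via `cap * tanh(logits / cap)`. A minimal PyTorch sketch of the idea (the `logit_cap` name and the default value 30.0 are illustrative, not taken from either linked implementation):

```python
import torch
import torch.nn.functional as F

def attention_with_logit_cap(q, k, v, logit_cap: float = 30.0):
    """Scaled dot-product attention with a tanh cap on the pre-softmax logits.

    The cap bounds each logit to (-logit_cap, logit_cap):
        logits <- logit_cap * tanh(logits / logit_cap)
    The value of `logit_cap` is a model hyperparameter.
    """
    scale = q.shape[-1] ** -0.5
    logits = (q @ k.transpose(-2, -1)) * scale            # [..., q_len, kv_len]
    logits = logit_cap * torch.tanh(logits / logit_cap)   # soft cap on logits
    return F.softmax(logits, dim=-1) @ v
```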

yzh119 self-assigned this May 24, 2024

yzh119 commented May 24, 2024

Sounds good, should be easy to support.


yzh119 commented May 24, 2024

Is there a formal name for this "Attention with Logits Cap" method?

merrymercy (Author) commented

There is no formal name. Maybe just call it "logit cap".

merrymercy (Author) commented

@yzh119 Any progress on this issue?
FYI, TensorRT-LLM recently added this feature https://github.com/NVIDIA/TensorRT-LLM/blob/db4edea1e1359bcfcac7bbb87c1b639b5611c721/tensorrt_llm/functional.py#L4519-L4521

yzh119 added a commit that referenced this issue Jun 14, 2024

yzh119 commented Jun 14, 2024

Done in #298
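
For anyone landing here later, a hedged usage sketch of the landed feature: recent flashinfer releases expose the cap as an optional `logits_soft_cap` argument on the attention entry points, but whether that exact argument name and entry point shipped in #298 is an assumption on my part, so check the PR for the interface it actually added.

```python
import torch
import flashinfer

# Assumed NHD layout: q is [qo_len, num_qo_heads, head_dim],
# k and v are [kv_len, num_kv_heads, head_dim].
q = torch.randn(128, 32, 128, dtype=torch.float16, device="cuda")
k = torch.randn(1024, 32, 128, dtype=torch.float16, device="cuda")
v = torch.randn(1024, 32, 128, dtype=torch.float16, device="cuda")

# logits_soft_cap applies cap * tanh(logits / cap) before softmax.
# 30.0 is just an example value here; the real value is a model hyperparameter.
out = flashinfer.single_prefill_with_kv_cache(
    q, k, v, causal=True, logits_soft_cap=30.0
)
```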

yzh119 closed this as completed Jun 14, 2024