Feature request
Support static shapes for LLMs: automatically pad inputs to a maximum length, and use a static-shape KV cache.
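To make the padding part concrete, here is a minimal sketch of what I mean by auto padding. It assumes right padding and a pad token id of 0. The helper name `pad_to_max_length` is hypothetical, not an existing API in this project.

```python
from typing import List, Tuple


def pad_to_max_length(
    input_ids: List[int],
    max_length: int,
    pad_token_id: int = 0,  # assumption: the real id depends on the tokenizer
) -> Tuple[List[int], List[int]]:
    """Right-pad token ids to a fixed length and build the matching attention mask."""
    if len(input_ids) > max_length:
        raise ValueError(
            f"input has {len(input_ids)} tokens, which exceeds max_length={max_length}"
        )
    n_pad = max_length - len(input_ids)
    padded_ids = input_ids + [pad_token_id] * n_pad
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [1] * len(input_ids) + [0] * n_pad
    return padded_ids, attention_mask


padded_ids, attention_mask = pad_to_max_length([5, 6, 7], max_length=5)
```

With this, every input has the same shape (`max_length`), and the attention mask tells the model which positions are padding.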
Motivation
The lack of static shape support becomes a problem when enabling NPU, WebNN, or CoreML.
Your contribution
I can submit a PR.
However, I'm not an expert on the models, so I'm not sure how to implement a static KV cache correctly.
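As a starting point for discussion, this is roughly how I understand a static KV cache: the buffers are preallocated at the maximum length and written in place, so their shape never changes between decoding steps. It is only a sketch under my own assumptions (a single attention head, plain Python lists instead of tensors). The class `StaticKVCache` and its methods are hypothetical names, and I'm not sure this matches what the models actually need.

```python
from typing import List, Sequence


class StaticKVCache:
    """Fixed-capacity key/value buffer for one attention head (illustrative only)."""

    def __init__(self, max_length: int, head_dim: int) -> None:
        self.max_length = max_length
        self.head_dim = head_dim
        # Preallocate full-size buffers; their shape stays (max_length, head_dim).
        self.keys = [[0.0] * head_dim for _ in range(max_length)]
        self.values = [[0.0] * head_dim for _ in range(max_length)]
        self.length = 0  # number of slots filled so far

    def append(self, key: Sequence[float], value: Sequence[float]) -> None:
        """Write one token's key/value into the next free slot, in place."""
        if self.length >= self.max_length:
            raise IndexError("static KV cache is full")
        if len(key) != self.head_dim or len(value) != self.head_dim:
            raise ValueError("key/value must have head_dim elements")
        self.keys[self.length] = list(key)
        self.values[self.length] = list(value)
        self.length += 1

    def attention_mask(self) -> List[int]:
        """1 for filled slots, 0 for slots that are still empty."""
        return [1] * self.length + [0] * (self.max_length - self.length)


cache = StaticKVCache(max_length=4, head_dim=2)
cache.append([1.0, 2.0], [3.0, 4.0])
```

The idea is that attention always runs over all `max_length` slots, and the mask hides the slots that have not been filled yet.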