Issues: huggingface/text-generation-inference

Issues list

Cannot load model HuggingFaceM4/idefics2-8b-AWQ
#2036 opened Jun 7, 2024 by jla346
2 of 4 tasks
Fp8 support KV-Cache
#2027 opened Jun 6, 2024 by philschmid
API_KEY argument
#2026 opened Jun 5, 2024 by nbroad1881
Can't I run llama3 with cuda 12.0?
#2001 opened Jun 4, 2024 by uyeongkim
2 of 4 tasks
Missing Schema in API Documentation
#2000 opened Jun 4, 2024 by jkawamoto
Support for openbmb/MiniCPM-Llama3-V-2_5
#1998 opened Jun 3, 2024 by sfbemerk
2 tasks done
warmup doesn't work as expected
#1993 opened Jun 3, 2024 by meitalbensinai
2 of 4 tasks
Deberta V3 not supported
#1992 opened Jun 1, 2024 by Stealthwriter
2 of 4 tasks
Unable to load quantized commandrplus-medusa on H100
#1991 opened Jun 1, 2024 by sdadas
2 of 4 tasks
Gemma not starting with tensor parallelism
#1987 opened May 31, 2024 by arunpatala
2 of 4 tasks
Intel XPU Docker image import error on start
#1983 opened May 30, 2024 by grafail
2 of 4 tasks