Issues: InternLM/lmdeploy
[Benchmark] benchmarks on different cuda architecture with mo...
#815
opened Dec 11, 2023 by
lvhan028
Issues list
[Bug] InternVL2-40B generates nonsense outputs
#1965
opened Jul 9, 2024 by
pseudotensor
[Feature] Support for CogVLM2-Video-LLama3-Chat in TorchEngine
#1964
opened Jul 9, 2024 by
ericzhou571
[Feature] Please provide INT4 quantization for InternVL2-26B and InternVL2-40B
#1955
opened Jul 8, 2024 by
tairen99
[Bug] Turbomind Docker deployment fails under high load
#1954
opened Jul 8, 2024 by
Tushar-ml
AttributeError: 'AsyncEngine' object has no attribute 'get_ppl'
#1950
opened Jul 8, 2024 by
poisonwine
[Bug] Why is the value of logprobs None?
#1948
opened Jul 8, 2024 by
airaria
[Bug] Llama3 chat template is not consistent with the Hugging Face implementation
#1945
opened Jul 8, 2024 by
efsotr
[Bug] v0.5.0 crashes with CUDA OOM error while v0.4.2 does not (in exactly the same scenario - 30 concurrent requests to LLama2 70B)
#1943
opened Jul 7, 2024 by
josephrocca
[Feature] Prefix cache hit/miss/eviction statistics to detect cache thrashing
#1942
opened Jul 7, 2024 by
josephrocca
[Bug] The same code works on A800 but gets stuck on A10 with MiniCPM-Llama3-V-2_5
#1938
opened Jul 6, 2024 by
llmrainer
[Bug] unified_attention splitting KV for prefill with a larger workspace causes a core dump
#1935
opened Jul 6, 2024 by
snippetzero
[Bug] TCP error (port already in use) when deploying with PytorchEngine
awaiting response
#1925
opened Jul 5, 2024 by
Desein-Yang
[Feature] Is there any plan to support InternLM-XComposer2.5 inference?
#1920
opened Jul 4, 2024 by
Charles-Xie