
Can I run vLLM with a 5090 + 5070 Ti for Llama 70B Q4 inference (needs approximately 42 GB)? Or do I need identical GPUs? #14706

Unanswered
jayavanth asked this question in Q&A
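For reference, here is roughly the launch I have in mind, a minimal sketch assuming vLLM's standard tensor-parallel path; the model id and the `quantization` value are illustrative placeholders, not tied to a specific checkpoint:

```python
# Minimal sketch of a two-GPU vLLM launch with tensor parallelism.
# Assumptions: the model id and quantization scheme below are placeholders;
# swap in whichever Q4 checkpoint (e.g. an AWQ or GPTQ build) is actually used.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-70B-Instruct",  # assumed 70B checkpoint
    quantization="awq",           # assumed 4-bit scheme
    tensor_parallel_size=2,       # shard weights across both GPUs
    gpu_memory_utilization=0.90,  # fraction of each GPU vLLM may claim
)

out = llm.generate("Hello, world", SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)
```

My rough memory arithmetic, assuming vLLM shards weights evenly across tensor-parallel ranks: the 5090 has 32 GB and the 5070 Ti has 16 GB, so the smaller card caps the per-rank budget, giving roughly 2 × 16 GB = 32 GB of usable capacity, which falls short of the ~42 GB above even though the cards total 48 GB. That is what makes me unsure whether mixed GPUs can work here.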


Replies: 0
