Popular repositories
- qwen-coder-w4a16-demo (Public): DeepSeek-R1 7B INT4 at 69.3 tok/s on a $300 RTX 3060. Faster than llama.cpp, vLLM, and NVIDIA TensorRT-LLM. Is one developer + AI really better than the entire industry?
Python 1