
Announcing easyquant for speeding up LLM inference via quantization

@peterjc123 peterjc123 released this 31 May 09:41
· 54 commits to main since this release
841294e

With the help of quantization, LLM inference can run faster and with lower memory usage. Please install the package below and try out the examples here. We look forward to your feedback.
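To illustrate why quantization reduces resource usage, here is a minimal sketch of symmetric int8 quantization in plain Python. This is a conceptual example only, not easyquant's actual API: storing weights as 8-bit codes plus one scale factor uses roughly 4x less memory than float32, at the cost of a small, bounded rounding error.

```python
def quantize_int8(values):
    # Symmetric per-tensor quantization: map floats onto
    # integer codes in [-127, 127] using a single scale factor.
    scale = max(abs(v) for v in values) / 127.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize_int8(codes, scale):
    # Recover approximate float values from the integer codes.
    return [c * scale for c in codes]

weights = [0.5, -1.2, 0.03, 0.9]
codes, scale = quantize_int8(weights)
recovered = dequantize_int8(codes, scale)
# Round-trip error per element is at most scale / 2.
```

Real libraries typically add per-channel scales, zero points for asymmetric ranges, and fused int8 kernels, but the core idea is the same.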