This repo contains a Jupyter notebook that uses the GPTQ technique to quantize LLMs. The notebook includes an in-depth explanation with examples, which you can follow to quantize an LLM of your choice. For simplicity, I quantized dlite-v2-355m, an open-source language model from Hugging Face.
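To give a feel for what quantizing weights means before opening the notebook, here is a minimal pure-Python sketch of 4-bit round-to-nearest quantization. This is not GPTQ itself and is not taken from the notebook: GPTQ goes further by adjusting the not-yet-quantized weights to compensate for the rounding error, using second-order information from calibration data. The function names `quantize_rtn` and `dequantize` and the sample weights are hypothetical, chosen only for illustration.

```python
def quantize_rtn(weights, bits=4):
    """Map floats to integer codes in [0, 2**bits - 1] using a min-max grid."""
    qmax = 2 ** bits - 1
    w_min, w_max = min(weights), max(weights)
    # Step size of the grid; fall back to 1.0 if all weights are identical.
    scale = (w_max - w_min) / qmax or 1.0
    # Round each weight to the nearest grid point and clamp to the valid range.
    codes = [min(qmax, max(0, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min


def dequantize(codes, scale, zero):
    """Recover approximate float weights from the integer codes."""
    return [c * scale + zero for c in codes]


# Illustrative weights only; a real layer has thousands of values per row.
weights = [-0.8, -0.1, 0.0, 0.3, 0.7]
codes, scale, zero = quantize_rtn(weights, bits=4)
restored = dequantize(codes, scale, zero)

# The worst-case error of round-to-nearest is half a grid step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error)
```

Each weight is stored as a 4-bit code plus a shared scale and offset, which is where the memory savings come from. GPTQ keeps this storage format but chooses the codes more carefully than simple rounding does.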
SujanNeupane42/LLM_Quantization
About
Quantizing LLMs using GPTQ