forked from intel/neural-compressor
SOTA low-bit LLM quantization (INT8/FP8/INT4/FP4/NF4) & sparsity; leading model compression techniques on TensorFlow, PyTorch, and ONNX Runtime
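As a conceptual illustration of what the low-bit quantization listed above means, here is a minimal sketch of symmetric per-tensor INT8 quantization in plain Python. This is a pedagogical example only, not neural-compressor's implementation (the library applies calibration, per-channel scales, and framework-specific kernels on top of this idea):

```python
# Minimal sketch of symmetric INT8 quantization (conceptual only;
# NOT how neural-compressor implements it internally).

def quantize_int8(weights):
    """Map float weights to int8 range [-127, 127] with one per-tensor scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from the int8 values."""
    return [v * scale for v in q]

weights = [0.5, -1.2, 0.03, 2.4]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
```

The round trip loses at most half a quantization step per weight, which is the accuracy/size trade-off the library's calibration and tuning machinery is designed to manage.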
machworklab/neural-compressor
Languages
- Python 99.1%
- Other 0.9%