Release v1.3.5 🚀

kooyunmo released this 25 May 04:11

Optimize CPU RAM usage during quantization with offloading
Support FP8 conversion for DBRX, Mixtral, and Command R+
