Repositories
- GPTQModel Public Forked from ModelCloud/GPTQModel
Production-ready LLM compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
- ax-samples Public
Sample code for world-class artificial intelligence SoCs for computer vision applications.
- SmolVLM-256M-Instruct.axera Public Forked from techshoww/SmolVLM-256M-Instruct.axera
Demo for SmolVLM-256M-Instruct on AXERA 650N