Repositories
- GPTQModel Public Forked from ModelCloud/GPTQModel
Production-ready LLM compression/quantization toolkit with hardware-accelerated inference support for both CPU and GPU via HF, vLLM, and SGLang.
- ax-samples Public
Sample code for world-class artificial intelligence SoCs for computer vision applications.
- SmolVLM-256M-Instruct.axera Public Forked from techshoww/SmolVLM-256M-Instruct.axera
Demo for SmolVLM-256M-Instruct on AXERA 650N
- ax-llm-SmolVLM-256M Public Forked from techshoww/ax-llm
Explore LLM deployment on AXera's AI chips