VPTQ
Popular repositories
- ik_llama.cpp (C++, forked from ikawrakow/ik_llama.cpp): llama.cpp clone with additional SOTA quants and improved CPU performance
- vllm (Python, forked from vllm-project/vllm): A high-throughput and memory-efficient inference and serving engine for LLMs
- text-generation-webui (Python, forked from oobabooga/text-generation-webui): A Gradio web UI for Large Language Models.
- transformers (Python, forked from huggingface/transformers): 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
Repositories
- vllm (forked from vllm-project/vllm): A high-throughput and memory-efficient inference and serving engine for LLMs
- sglang (forked from sgl-project/sglang): SGLang is a fast serving framework for large language models and vision language models.
- hessian_collector
- ktransformers (forked from kvcache-ai/ktransformers): A flexible framework for experiencing cutting-edge LLM inference optimizations
- VITA (forked from VITA-MLLM/VITA): ✨✨ VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction
- ipex-llm (forked from intel/ipex-llm): Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., a local PC with iGPU and NPU, or a discrete GPU such as Arc, Flex, and Max); integrates seamlessly with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc.
- transformers (forked from huggingface/transformers): 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
- llm-compressor (forked from vllm-project/llm-compressor): Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
People
This organization has no public members. You must be a member to see who’s a part of this organization.