VPTQ

Popular repositories

  1. ik_llama.cpp Public

    Forked from ikawrakow/ik_llama.cpp

    llama.cpp clone with additional SOTA quants and improved CPU performance

    C++ 1

  2. DeepSeek-V3 Public

    Forked from deepseek-ai/DeepSeek-V3

    Python 1

  3. vllm Public

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python

  4. llama.cpp Public

    Forked from ggml-org/llama.cpp

    LLM inference in C/C++

    C++

  5. text-generation-webui Public

    Forked from oobabooga/text-generation-webui

    A Gradio web UI for Large Language Models.

    Python

  6. transformers Public

    Forked from huggingface/transformers

    🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

    Python

Repositories

Showing 10 of 23 repositories
  • DeepSeek-V3 Public Forked from deepseek-ai/DeepSeek-V3
    Python 1 MIT 14,700 0 1 Updated Mar 3, 2025
  • vllm Public Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python 0 Apache-2.0 6,110 0 0 Updated Feb 26, 2025
  • sglang Public Forked from sgl-project/sglang

    SGLang is a fast serving framework for large language models and vision language models.

    Python 0 Apache-2.0 1,140 0 0 Updated Feb 26, 2025
  • hessian_collector
    Jupyter Notebook 0 0 0 0 Updated Feb 24, 2025
  • ktransformers Public Forked from kvcache-ai/ktransformers

    A Flexible Framework for Experiencing Cutting-edge LLM Inference Optimizations

    Python 0 Apache-2.0 802 0 0 Updated Feb 12, 2025
  • VITA Public Forked from VITA-MLLM/VITA

    ✨✨VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

    Python 0 165 0 0 Updated Jan 22, 2025
  • ipex-llm Public Forked from intel/ipex-llm

    Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc

    Python 0 Apache-2.0 1,362 0 0 Updated Jan 17, 2025
  • flashinfer Public Forked from flashinfer-ai/flashinfer

    FlashInfer: Kernel Library for LLM Serving

    Cuda 0 Apache-2.0 238 0 0 Updated Jan 9, 2025
  • transformers Public Forked from huggingface/transformers

    🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.

    Python 0 Apache-2.0 28,638 0 0 Updated Dec 20, 2024
  • llm-compressor Public Forked from vllm-project/llm-compressor

    Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

    Python 0 Apache-2.0 96 0 0 Updated Dec 17, 2024

People

This organization has no public members. You must be a member to see who’s a part of this organization.