ModelTC

Model Infra

Pinned Loading

  1. MQBench Public

    Model Quantization Benchmark

    Python, 803 stars, 142 forks

  2. United-Perception Public

    United Perception

    Python, 432 stars, 67 forks

  3. Dipoorlet Public

    Offline Quantization Tools for Deploy.

    Python, 128 stars, 17 forks

  4. lightllm Public

    LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

    Python, 3.2k stars, 254 forks

  5. llmc Public

    [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

    Python, 473 stars, 53 forks

  6. OmniBal Public

    Python, 21 stars, 3 forks

Repositories

Showing 10 of 49 repositories
  • lightllm Public

    LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

    Python 3,221 Apache-2.0 254 76 11 Updated May 16, 2025
  • llmc Public

    [EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

    Python 473 Apache-2.0 53 28 0 Updated May 15, 2025
  • lightx2v Public

    Light Video Generation Inference Framework

    Python 25 6 0 2 Updated May 14, 2025
  • HarmoniCa Public

    [ICML 2025] This is the official PyTorch implementation of "HarmoniCa: Harmonizing Training and Inference for Better Feature Caching in Diffusion Transformer Acceleration".

    Python 2 Apache-2.0 0 1 0 Updated May 3, 2025
  • 0 0 0 0 Updated Apr 28, 2025
  • Dockerfile 0 0 0 0 Updated Apr 24, 2025
  • general-sam-py Public

    Python bindings for general-sam and some utilities

    Python 3 Apache-2.0 0 0 1 Updated Apr 22, 2025
  • MQBench Public

    Model Quantization Benchmark

    Python 803 Apache-2.0 142 7 5 Updated Apr 20, 2025
  • flash-attention Public Forked from Dao-AILab/flash-attention

    Fast and memory-efficient exact attention

    Python 0 BSD-3-Clause 1,699 0 0 Updated Apr 17, 2025
  • greedy-tokenizer Public

    Greedily tokenize strings with the longest tokens iteratively.

    Python 0 Apache-2.0 0 0 1 Updated Mar 24, 2025
