    Repositories list

    • Shell
      Other
      53050 Updated Jul 4, 2025
    • Get up and running with Llama 3.3, DeepSeek-R1, Phi-4, Gemma 3, Mistral Small 3.1 and other large language models.
      Go
      Other
      12k621 Updated Jul 4, 2025
    • TypeScript
      Other
      0001 Updated Jul 3, 2025
    • Python
      Other
      0100 Updated Jul 2, 2025
    • torch_musa is an open source repository based on PyTorch, which can make full use of the super computing power of MooreThreads graphics cards.
      Python
      Other
      30417600 Updated Jun 26, 2025
    • Jupyter Notebook
      Other
      0310 Updated May 29, 2025
    • LiteGS

      Public
      A refactored codebase for Gaussian Splatting. Fastest(4.7x)!! Modular!! Pure Python or CUDA Extension
      Python
      Other
      715020 Updated May 28, 2025
    • muThrust

      Public
      The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
      C++
      Other
      763200 Updated May 23, 2025
    • kineto

      Public
      HTML
      Other
      3100 Updated May 21, 2025
    • StableGS

      Public
      0210 Updated Apr 17, 2025
    • TurboSplat-Viz is a 3D Gaussian Splatting (GS) renderer implemented using DirectX 12. Leveraging the exceptional performance of Mesh Shaders, DX12GSViewer achieves unparalleled speed improvements.
      C++
      MIT License
      0600 Updated Apr 1, 2025
    • Python
      1300 Updated Mar 19, 2025
    • A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
      Python
      Apache License 2.0
      447500 Updated Mar 11, 2025
    • Go
      Apache License 2.0
      41400 Updated Mar 1, 2025
    • 0800 Updated Feb 28, 2025
    • MT-DeepEP

      Public
      DeepEP: an efficient expert-parallel communication library
      C++
      Other
      833600 Updated Feb 27, 2025
    • A bidirectional pipeline parallelism algorithm for computation-communication overlap in V3/R1 training.
      Python
      Other
      298100 Updated Feb 27, 2025
    • C++
      MIT License
      01400 Updated Feb 26, 2025
    • mutlass

      Public
      MUSA Templates for Linear Algebra Subroutines
      C++
      Other
      1.3k2710 Updated Feb 26, 2025
    • MooER

      Public
      MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering but not limited to end-to-end speech interaction, end-to-end speech translation and speech recognition.
      Python
      Other
      1521330 Updated Jan 8, 2025
    • TurboRAG

      Public
      Python
      118050 Updated Nov 25, 2024
    • vllm_musa

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Other
      8.5k5150 Updated Oct 28, 2024
    • SimuMax

      Public
      A static analytical model for LLM distributed training
      Python
      Other
      11500 Updated Oct 18, 2024
    • RetinaGS

      Public
      Python
      Other
      72400 Updated Oct 17, 2024
    • Repository for OpenCV's extra modules
      C++
      Other
      5.8k200 Updated Sep 25, 2024
    • opencv

      Public
      Open Source Computer Vision Library
      C++
      Other
      56k1900 Updated Sep 25, 2024
    • muAlg

      Public
      Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl
      Cuda
      Other
      457300 Updated Sep 13, 2024
    • dynolog

      Public
      Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the linux kernel, CPU, disks, Intel PT, GPUs etc. Dynolog also integrates with pytorch and can trigger traces for distributed training applications.
      C++
      Other
      01300 Updated Aug 7, 2024
    • qtbase

      Public
      Qt Base (Core, Gui, Widgets, Network, ...)
      C++
      1.1k000 Updated Jun 20, 2024
    • C++
      Other
      123000 Updated Jun 20, 2024