@gpustack

GPUStack

Simple, scalable AI model deployment on GPU clusters

Pinned

  1. gpustack Public

    Simple, scalable AI model deployment on GPU clusters

    Python · 3.1k stars · 314 forks

  2. gguf-parser-go Public

    Review/Check GGUF files and estimate the memory usage and maximum tokens per second.

    Go · 189 stars · 19 forks

  3. llama-box Public

    LM inference server implementation based on *.cpp.

    C++ · 240 stars · 22 forks

  4. vox-box Public

    A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.

    Python · 140 stars · 21 forks

Repositories

Showing 10 of 10 repositories
  • .github Public

    Meta-Github repository for all GPUStack repositories.

    Dockerfile · 0 stars · Apache-2.0 license · 1 fork · 0 issues · 0 pull requests · Updated Jul 24, 2025
  • Python · 0 stars · Apache-2.0 license · 2 forks · 1 issue · 0 pull requests · Updated Jul 24, 2025
  • gpustack Public

    Simple, scalable AI model deployment on GPU clusters

    Python · 3,136 stars · Apache-2.0 license · 314 forks · 442 issues (1 needs help) · 9 pull requests · Updated Jul 24, 2025
  • gpustack-ui Public
    TypeScript · 42 stars · Apache-2.0 license · 29 forks · 0 issues · 0 pull requests · Updated Jul 23, 2025
  • llama-box Public

    LM inference server implementation based on *.cpp.

    C++ · 240 stars · MIT license · 22 forks · 3 issues · 1 pull request · Updated Jul 23, 2025
  • vox-box Public

    A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.

    Python · 140 stars · Apache-2.0 license · 21 forks · 12 issues · 0 pull requests · Updated Jul 19, 2025
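Because vox-box advertises OpenAI API compatibility, a client can reuse the OpenAI wire format when talking to it. The sketch below builds (but does not send) a text-to-speech request in that format; the base URL, port, and the `model`/`voice` values are placeholders for whatever a given deployment exposes, and the endpoint path follows the OpenAI convention (`/v1/audio/speech`) rather than anything vox-box-specific documented on this page:

```python
import json
import urllib.request

# Hypothetical address of a local vox-box deployment -- adjust to yours.
BASE_URL = "http://localhost:8000/v1"

def build_tts_request(text: str, model: str = "tts-1", voice: str = "alloy"):
    """Build (but do not send) a speech request in the OpenAI wire format."""
    body = json.dumps({"model": model, "voice": voice, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/audio/speech",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_tts_request("Hello from vox-box")
# With a server running, urllib.request.urlopen(req) would return audio bytes.
```

The same pattern applies in reverse for speech-to-text, which in the OpenAI convention posts multipart audio to `/v1/audio/transcriptions`.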
  • gguf-parser-go Public

    Review/Check GGUF files and estimate the memory usage and maximum tokens per second.

    Go · 189 stars · MIT license · 19 forks · 0 issues · 0 pull requests · Updated Jul 18, 2025
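Tools like gguf-parser-go can report on a GGUF file from its metadata alone, without loading tensor data. To illustrate what sits at the start of such a file, here is a small Python sketch (not the repository's Go API) that decodes the fixed header fields defined by the public GGUF spec: a `GGUF` magic, a uint32 version, a uint64 tensor count, and a uint64 metadata key-value count, all little-endian:

```python
import struct

GGUF_MAGIC = b"GGUF"

def parse_gguf_header(buf: bytes) -> dict:
    """Decode the fixed GGUF header: magic, version, tensor count, metadata KV count."""
    if buf[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file")
    # uint32 version, then two uint64 counts, little-endian, starting after the magic.
    version, tensor_count, kv_count = struct.unpack_from("<IQQ", buf, 4)
    return {"version": version, "tensors": tensor_count, "metadata_kv": kv_count}

# Synthetic header: version 3, 2 tensors, 5 metadata key-value pairs.
sample = GGUF_MAGIC + struct.pack("<IQQ", 3, 2, 5)
print(parse_gguf_header(sample))  # → {'version': 3, 'tensors': 2, 'metadata_kv': 5}
```

The metadata key-value pairs that follow this header carry the model's architecture, tokenizer, and quantization details, which is what makes memory-usage estimates possible without reading the tensors themselves.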
  • HTML · 0 stars · 2 forks · 0 issues · 0 pull requests · Updated Jun 10, 2025
  • fastfetch Public (forked from fastfetch-cli/fastfetch)

    Like neofetch, but much faster because it is written mostly in C.

    C · 1 star · MIT license · 547 forks · 0 issues · 0 pull requests · Updated Oct 24, 2024
  • gguf-packer-go Public

    Deliver LLMs of GGUF format via Dockerfile.

    Go · 13 stars · MIT license · 3 forks · 0 issues · 0 pull requests · Updated Oct 24, 2024

People

This organization has no public members.