@gpustack

GPUStack

Simple, scalable AI model deployment on GPU clusters

Pinned

  1. gpustack Public

    Simple, scalable AI model deployment on GPU clusters

    Python 3.1k 316

  2. gguf-parser-go Public

    Review/Check GGUF files and estimate the memory usage and maximum tokens per second.

    Go 189 19

  3. llama-box Public

    LM inference server implementation based on *.cpp.

    C++ 241 22

  4. vox-box Public

    A text-to-speech and speech-to-text server compatible with the OpenAI API, supporting Whisper, FunASR, Bark, and CosyVoice backends.

    Python 140 21
