hasan-raja/InferHub


InferHub

Unified multimodal AI inference platform exposing LLM, ASR, TTS and Vision through low-latency APIs with streaming, observability and rollout controls.

Phase 1 creates the production-shaped foundation:

  • FastAPI API gateway bootstrap
  • typed configuration
  • PostgreSQL, Redis, ClickHouse and Kafka local dependencies
  • health and readiness endpoints
  • Groq integration scaffold
  • Docker Compose and floci local cloud notes
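The typed configuration and health/readiness items above can be illustrated with a minimal, standard-library-only sketch. It is not the project's actual code: the `Settings` fields, the `INFERHUB_` environment prefix, the default ports, and the `liveness`/`readiness` function names are all assumptions made for illustration. In a FastAPI gateway, functions like these would typically sit behind two route handlers; a plain TCP connect is only a coarse signal that a dependency is reachable.

```python
import os
import socket
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Settings:
    """Typed settings. Every field name and default here is hypothetical."""
    postgres_host: str = "localhost"
    postgres_port: int = 5432
    redis_host: str = "localhost"
    redis_port: int = 6379
    clickhouse_host: str = "localhost"
    clickhouse_port: int = 8123
    kafka_host: str = "localhost"
    kafka_port: int = 9092
    probe_timeout_s: float = 0.5

    @classmethod
    def from_env(cls, env=None):
        """Build settings from INFERHUB_* variables (assumed prefix)."""
        env = os.environ if env is None else env
        overrides = {}
        for f in fields(cls):
            key = "INFERHUB_" + f.name.upper()
            if key in env:
                # Coerce the raw string to the type of the field's default.
                overrides[f.name] = type(f.default)(env[key])
        return cls(**overrides)


def tcp_reachable(host, port, timeout):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def liveness():
    """Liveness: the process is up; no dependencies are consulted."""
    return {"status": "ok"}


def readiness(settings, probe=tcp_reachable):
    """Readiness: every dependency must be reachable.

    Returns (http_status, body); 503 if any probe fails.
    """
    t = settings.probe_timeout_s
    checks = {
        "postgres": probe(settings.postgres_host, settings.postgres_port, t),
        "redis": probe(settings.redis_host, settings.redis_port, t),
        "clickhouse": probe(settings.clickhouse_host, settings.clickhouse_port, t),
        "kafka": probe(settings.kafka_host, settings.kafka_port, t),
    }
    ready = all(checks.values())
    status = "ready" if ready else "not_ready"
    return (200 if ready else 503), {"status": status, "checks": checks}
```

Keeping liveness free of dependency checks means an orchestrator does not restart the gateway just because a backing service is briefly down, while readiness can still take the instance out of rotation. The `probe` parameter is injectable so the logic can be exercised without running PostgreSQL, Redis, ClickHouse or Kafka.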

Further documentation:

  • docs/phase-01.md: architecture, commands and Sarvam alignment
  • docs/phase-02.md: authentication, authorization, rate limiting and model registry
  • docs/phase-03.md: gRPC worker services and Groq integration
  • docs/phase-04.md: client-facing inference APIs and WebSockets
