Stars
A high-throughput and memory-efficient inference and serving engine for LLMs
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. All in a modern, AI-native editor.
Distributed ML Training and Fine-Tuning on Kubernetes
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
Backend.AI is a streamlined, container-based computing cluster platform that hosts popular computing/ML frameworks and diverse programming languages, with pluggable heterogeneous accelerator suppor…
A unified tool for collecting system logs and other debug information
xpk (Accelerated Processing Kit, pronounced x-p-k) is a software tool to help Cloud developers orchestrate training jobs on accelerators such as TPUs and GPUs on GKE.
An Airflow operator that executes a task in a Kubernetes cluster, given a Kubernetes YAML configuration or an image reference.
Deploy a Flux MiniCluster to Kubernetes with the operator
A tool to detect infrastructure issues on cloud native AI systems
Create and manage your Notebooks on Kubernetes with ease.