Ray is a unified framework for scaling AI and Python applications. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
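A minimal sketch of Ray's core task API, which is what the libraries above build on: decorate a plain Python function with @ray.remote and it can be scheduled across a cluster (or local cores).

```python
import ray

ray.init()  # connects to an existing cluster, or starts one locally

@ray.remote
def square(x):
    return x * x

# Each .remote() call is scheduled asynchronously and returns a future.
futures = [square.remote(i) for i in range(4)]
print(ray.get(futures))  # [0, 1, 4, 9]
```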
A high-throughput and memory-efficient inference and serving engine for LLMs
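A minimal sketch of vLLM's offline batch-inference entry point; the model name is just an illustrative placeholder, any Hugging Face causal LM works.

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # placeholder model
params = SamplingParams(temperature=0.8, max_tokens=32)

outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.outputs[0].text)
```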
The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!
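A minimal sketch of a BentoML 1.2-style inference service; the transformers pipeline and model name are illustrative placeholders, not part of the project's own examples.

```python
import bentoml
from transformers import pipeline

@bentoml.service(resources={"cpu": "2"})
class Summarizer:
    def __init__(self):
        # placeholder summarization model
        self.pipe = pipeline("summarization", model="sshleifer/distilbart-cnn-6-6")

    @bentoml.api
    def summarize(self, text: str) -> str:
        return self.pipe(text)[0]["summary_text"]
```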
Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint in the cloud.
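Because servers like this expose OpenAI-compatible endpoints, the stock openai client can be pointed at them; the base URL, port, and model name below are assumptions that depend on how the server was started.

```python
from openai import OpenAI

# assumed local endpoint; adjust to wherever the server is listening
client = OpenAI(base_url="http://localhost:3000/v1", api_key="not-needed")

resp = client.chat.completions.create(
    model="meta-llama/Llama-2-7b-chat-hf",  # placeholder model id
    messages=[{"role": "user", "content": "Say hello."}],
)
print(resp.choices[0].message.content)
```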
🔮 SuperDuperDB: Bring AI to your database! Build, deploy, and manage any AI application directly on your existing data infrastructure, without moving your data, including streaming inference, scalable model training, and vector search.
SkyPilot: Run LLMs, AI, and batch jobs on any cloud. Get maximum savings, the highest GPU availability, and managed execution, all through a simple interface.
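A hedged sketch of SkyPilot's Python API for launching a serving job on a cloud GPU; the setup/run commands, accelerator spec, and cluster name are placeholders.

```python
import sky

task = sky.Task(
    setup="pip install vllm",
    run="python -m vllm.entrypoints.openai.api_server --model facebook/opt-125m",
)
# request one A100; SkyPilot picks the cheapest cloud/region that has it
task.set_resources(sky.Resources(accelerators="A100:1"))
sky.launch(task, cluster_name="llm-serve")
```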
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
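A hedged sketch of querying a multi-LoRA server of this kind, which follows a text-generation-inference-style REST API where the request names the adapter to apply; the endpoint, port, and adapter id are assumptions.

```python
import requests

resp = requests.post(
    "http://127.0.0.1:8080/generate",
    json={
        "inputs": "Write a haiku about GPUs.",
        "parameters": {
            "max_new_tokens": 64,
            "adapter_id": "my-org/my-lora",  # hypothetical fine-tuned adapter
        },
    },
)
print(resp.json()["generated_text"])
```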
RayLLM - LLMs on Ray
A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute resources.
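A minimal sketch of a Mosec worker: forward() receives one deserialized request body (or a batch, if dynamic batching is enabled) and returns the reply; the echo logic is a stand-in for real model inference.

```python
from mosec import Server, Worker

class Echo(Worker):
    def forward(self, data: dict) -> dict:
        # replace with actual model inference
        return {"echo": data.get("text", "")}

if __name__ == "__main__":
    server = Server()
    server.append_worker(Echo, num=1)
    server.run()
```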
Efficient AI Inference & Serving
🪶 Lightweight OpenAI drop-in replacement for Kubernetes
Friendli: the fastest serving engine for generative AI
A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.
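A minimal sketch of what such an integration looks like: a Ray Serve deployment that owns a vLLM engine. For brevity this uses vLLM's blocking LLM class rather than the async engine a production setup would want; the model name is a placeholder.

```python
from ray import serve
from vllm import LLM, SamplingParams

@serve.deployment(ray_actor_options={"num_gpus": 1})
class VLLMServer:
    def __init__(self):
        self.llm = LLM(model="facebook/opt-125m")  # placeholder model

    async def __call__(self, request):
        prompt = (await request.json())["prompt"]
        out = self.llm.generate([prompt], SamplingParams(max_tokens=64))
        return {"text": out[0].outputs[0].text}

app = VLLMServer.bind()
# serve.run(app)  # then POST {"prompt": "..."} to http://localhost:8000/
```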
A GPT-3.5 & GPT-4 Workload Trace to Optimize LLM Serving Systems
A Production-Ready, Scalable RAG-powered LLM-based Context-Aware QA App
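A schematic sketch of the RAG pattern an app like this implements: embed the documents, retrieve the nearest ones for a query, and place them in the LLM prompt. The corpus and embedding model here are illustrative placeholders.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

docs = ["Ray scales Python workloads.", "vLLM serves LLMs efficiently."]
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedder
doc_vecs = encoder.encode(docs, normalize_embeddings=True)

query = "How do I serve an LLM?"
q_vec = encoder.encode([query], normalize_embeddings=True)[0]
best = docs[int(np.argmax(doc_vecs @ q_vec))]  # cosine similarity on unit vectors

prompt = f"Answer using this context:\n{best}\n\nQuestion: {query}"
# ...pass `prompt` to any of the serving engines above
```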
A production-ready REST API for vLLM.
Stitch simplifies and scales LLM application deployment, reducing infrastructure complexity and costs.
A framework for few-shot evaluation of autoregressive language models.
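A hedged sketch of the harness's Python entry point for scoring a served or local model on benchmark tasks; the model arguments, task, and sample limit are illustrative.

```python
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",                      # evaluate a Hugging Face model
    model_args="pretrained=gpt2",    # placeholder checkpoint
    tasks=["hellaswag"],
    limit=10,                        # small sample for a quick smoke test
)
print(results["results"])
```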
Automating the deployment of the Takeoff Server on AWS for LLMs