The easiest way to serve AI/ML models in production - Build Model Inference Service, LLM APIs, Multi-model Inference Graph/Pipelines, LLM/RAG apps, and more!
Pretrain, finetune, and deploy 20+ LLMs on your own data. Uses state-of-the-art techniques: flash attention, FSDP, 4-bit quantization, LoRA, and more.
Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint in the cloud.
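Several of the projects above expose open-source models behind the OpenAI chat-completions request format, so existing client code works unchanged. A minimal sketch of that request shape follows; the model name is an illustrative assumption, not tied to any specific project listed here.

```python
import json

def build_chat_request(model, messages, temperature=0.7):
    """Build an OpenAI-style /v1/chat/completions request body.

    OpenAI-compatible servers accept this same JSON shape, with the
    model field naming whichever open-source model they are serving.
    """
    return {
        "model": model,
        "messages": messages,
        "temperature": temperature,
    }

# Illustrative model name; a compatible server would list its own models.
body = build_chat_request(
    "llama-2-7b-chat",
    [{"role": "user", "content": "Hello!"}],
)
print(json.dumps(body, indent=2))
```

In practice you would POST this body to the server's `/v1/chat/completions` route (or point an OpenAI SDK client's base URL at it) rather than print it.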
🔮 SuperDuperDB: Bring AI to your database! Build, deploy, and manage any AI application directly with your existing data infrastructure, without moving your data, including streaming inference, scalable model training, and vector search.
Generative AI reference workflows optimized for accelerated infrastructure and microservice architecture.
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Code examples and resources for DBRX, a large language model developed by Databricks
⚡ Build your chatbot within minutes on your favorite device; offers SOTA compression techniques for LLMs; runs LLMs efficiently on Intel platforms ⚡
Sparsity-aware deep learning inference runtime for CPUs
irresponsible innovation. Try now at https://chat.dev/
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
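Serving thousands of fine-tuned LLMs is feasible because each LoRA fine-tune is only a low-rank delta (B·A) added to frozen base weights, so many adapters can share one base model in memory. A dependency-free sketch of that forward pass, with illustrative toy shapes (rank 1, 2-dimensional weights), not taken from any project above:

```python
def matmul(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

def lora_forward(W, A, B, x, alpha=1.0):
    """y = W x + alpha * B (A x): frozen base output plus the
    low-rank adapter delta. Swapping (A, B) swaps the fine-tune."""
    base = matmul(W, x)
    delta = matmul(B, matmul(A, x))
    return [b + alpha * d for b, d in zip(base, delta)]

W = [[1.0, 0.0], [0.0, 1.0]]  # frozen 2x2 base weight (identity here)
A = [[1.0, 1.0]]              # rank-1 down-projection (1x2)
B = [[0.5], [0.5]]            # rank-1 up-projection (2x1)
y = lora_forward(W, A, B, [1.0, 2.0])
```

Because W never changes, a server can keep one copy of the base model and batch requests that use different (A, B) pairs.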
RayLLM - LLMs on Ray
LLMs and Machine Learning done easily
A library to communicate with ChatGPT, Claude, Copilot, Gemini, HuggingChat, and Pi
[ICML'24] EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
AI-powered cybersecurity chatbot designed to provide helpful and accurate answers to cybersecurity-related queries, as well as code analysis and scan analysis.
LLMFlows - Simple, Explicit and Transparent LLM Apps
The official repo of the Aquila2 series from BAAI, including pretrained and chat large language models.
A tool for generating function arguments and choosing what function to call with local LLMs
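Function calling with a local LLM typically works by prompting the model with JSON-schema tool definitions, having it emit a JSON call, and dispatching that call on the host. A hedged sketch of the host-side plumbing; the tool, registry, and model output below are made-up illustrations, not the API of the project above:

```python
import json

# Hypothetical tool definition in the common JSON-schema style that
# function-calling prompts use.
tools = [{
    "name": "get_weather",
    "description": "Get current weather for a city",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def dispatch(call_json, registry):
    """Parse the model's JSON function call and invoke the named tool."""
    call = json.loads(call_json)
    return registry[call["name"]](**call["arguments"])

# Stand-in for a real tool implementation and a real model completion.
registry = {"get_weather": lambda city: f"Sunny in {city}"}
model_output = '{"name": "get_weather", "arguments": {"city": "Paris"}}'
result = dispatch(model_output, registry)
```

The interesting part such tools handle is the other half: constraining the local model so its output is guaranteed to parse as a valid call against the schema.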
Efficient AI Inference & Serving