llm-serving
Here are 27 public repositories matching this topic...
- A framework for few-shot evaluation of autoregressive language models. (Updated Sep 30, 2023, Python)
- You can run any large language model on your local machine with this repository. (Updated Dec 19, 2023, Python)
- Efficient AI inference and serving. (Updated Jan 8, 2024, Python)
- A production-ready, scalable, RAG-powered, LLM-based context-aware QA app. (Updated Jan 8, 2024, Python)
- Automates the deployment of the Takeoff Server on AWS for LLMs. (Updated Jan 16, 2024, Python)
- Streaming of LLM responses in real time using FastAPI and Streamlit. (Updated Jan 21, 2024, Python)
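Streaming repos like the one above typically push tokens to the client as they are generated rather than waiting for the full completion. A minimal standard-library sketch of that idea is below; real projects wrap the same pattern in FastAPI's `StreamingResponse`, and the token source here is a made-up stand-in for a model.

```python
# Minimal sketch of real-time token streaming over chunked HTTP, using
# only the standard library. The "model" below is a hypothetical stub.
import http.server
import threading
import time
import urllib.request


def fake_llm_tokens(prompt):
    """Stand-in for a model: yields tokens one at a time."""
    for token in ["Hello", ", ", "world", "!"]:
        time.sleep(0.01)  # simulate per-token generation latency
        yield token


class StreamHandler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # required for chunked transfer encoding

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        for token in fake_llm_tokens("hi"):
            data = token.encode()
            # Chunked encoding: hex length, CRLF, payload, CRLF.
            self.wfile.write(f"{len(data):x}\r\n".encode() + data + b"\r\n")
            self.wfile.flush()  # push each token as soon as it is ready
        self.wfile.write(b"0\r\n\r\n")  # terminating chunk

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    server = http.server.HTTPServer(("127.0.0.1", 0), StreamHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    chunks = []
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as resp:
        while True:
            chunk = resp.read(64)  # arrives incrementally, token by token
            if not chunk:
                break
            chunks.append(chunk.decode())
    server.shutdown()
    return "".join(chunks)


streamed = run_demo()
print(streamed)  # -> Hello, world!
```

The same loop maps directly onto an async generator handed to FastAPI, with Streamlit (or any browser client) rendering each chunk as it arrives.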
- A guide on how to run LLMs on Intel CPUs. (Updated Jan 23, 2024, Python)
- 🪶 Lightweight OpenAI drop-in replacement for Kubernetes. (Updated Feb 5, 2024, Python)
- Building static web applications with large language models: from hand-sketched documents, images, and screenshots to proper web pages. (Updated Mar 12, 2024, Python)
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. (Updated Apr 6, 2024, Python)
- A framework for intelligence farming. (Updated May 18, 2024, Python)
- RayLLM: LLMs on Ray. (Updated May 28, 2024, Python)
- Okik is a command-line interface (CLI) tool for LLM, RAG, and model serving. (Updated Jun 14, 2024, Python)
- Stitch simplifies and scales LLM application deployment, reducing infrastructure complexity and costs. (Updated Jun 2, 2024, Python)
- A production-ready REST API for vLLM. (Updated Jun 11, 2024, Python)
- A GPT-3.5 and GPT-4 workload trace for optimizing LLM serving systems. (Updated Jun 14, 2024, Python)
- A tiny yet powerful LLM inference system tailored for research purposes; vLLM-equivalent performance with only 2k lines of code (2% of vLLM). (Updated Jun 16, 2024, Python)
- A high-performance ML model serving framework offering dynamic batching and CPU/GPU pipelines to fully exploit your compute hardware. (Updated Jun 16, 2024, Python)
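Dynamic batching, which the serving framework above advertises, means grouping requests that arrive within a short window and running them through the model as one batch, trading a little latency for much higher throughput. A toy standard-library sketch of the idea follows; the "model" is a hypothetical stand-in that doubles its inputs, not anything from the repo.

```python
# Toy sketch of dynamic batching: a worker drains the request queue up
# to a batch-size cap or a wait budget, then runs one batched call.
import queue
import threading


def batched_model(xs):
    """Pretend model call; a real server would run one GPU forward pass."""
    return [x * 2 for x in xs]


class DynamicBatcher:
    def __init__(self, max_batch=8, max_wait_s=0.01):
        self.requests = queue.Queue()
        self.max_batch = max_batch
        self.max_wait_s = max_wait_s
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, x):
        """Enqueue an input; returns a slot the caller can wait on."""
        slot = {"input": x, "done": threading.Event(), "output": None}
        self.requests.put(slot)
        return slot

    def _worker(self):
        while True:
            batch = [self.requests.get()]  # block until work arrives
            try:
                # Greedily pull more requests until the batch is full
                # or the wait budget runs out.
                while len(batch) < self.max_batch:
                    batch.append(self.requests.get(timeout=self.max_wait_s))
            except queue.Empty:
                pass
            outputs = batched_model([s["input"] for s in batch])
            for slot, out in zip(batch, outputs):
                slot["output"] = out
                slot["done"].set()


batcher = DynamicBatcher()
slots = [batcher.submit(i) for i in range(5)]
for s in slots:
    s["done"].wait()
results = [s["output"] for s in slots]
print(results)  # -> [0, 2, 4, 6, 8]
```

Production systems layer scheduling policies, padding, and continuous (in-flight) batching on top, but the queue-plus-wait-budget core is the same.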
- Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint in the cloud. (Updated Jun 17, 2024, Python)
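"OpenAI-compatible," as several entries above use the term, means the server speaks the same chat-completions JSON as the OpenAI API, so existing clients work by swapping the base URL. A sketch of the request and response shapes follows; the endpoint path reflects the standard `/v1/chat/completions` convention, while the model name and reply text are placeholders, not taken from any repo above.

```python
# What an OpenAI-compatible request/response exchange looks like on the
# wire. Model name and reply content are illustrative placeholders.
import json

request_body = {
    "model": "mistral-7b-instruct",  # whatever model the server has loaded
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "temperature": 0.7,
    "stream": False,  # set True to receive tokens incrementally
}

# A client would POST this JSON to http://<host>/v1/chat/completions.
payload = json.dumps(request_body)

# ...and read the assistant's reply out of the standard response shape:
example_response = {
    "choices": [{"message": {"role": "assistant", "content": "Hi there!"}}]
}
reply = example_response["choices"][0]["message"]["content"]
print(reply)  # -> Hi there!
```

Because the shape matches, official OpenAI client libraries can usually target such servers by overriding only the base URL and API key.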