llm-serving
Here are 27 public repositories matching this topic...
- A framework for few-shot evaluation of autoregressive language models. (Updated Sep 30, 2023, Python)
- You can run any large language model on your local machine with this repository. (Updated Dec 19, 2023, Python)
- Efficient AI inference and serving. (Updated Jan 8, 2024, Python)
- A production-ready, scalable, RAG-powered, LLM-based context-aware QA app. (Updated Jan 8, 2024, Python)
- Automates the deployment of the Takeoff Server on AWS for LLMs. (Updated Jan 16, 2024, Python)
- Streaming of LLM responses in real time using FastAPI and Streamlit. (Updated Jan 21, 2024, Python)
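Streaming repos like the one above typically push tokens to the client as they are generated rather than waiting for the full completion. A minimal standard-library sketch of that idea is below; real projects wrap the same pattern in FastAPI's `StreamingResponse`, and the token source here is a made-up stand-in for a model.

```python
# Minimal sketch of real-time token streaming over chunked HTTP, using
# only the standard library. The "model" below is a hypothetical stub.
import http.server
import threading
import time
import urllib.request


def fake_llm_tokens(prompt):
    """Stand-in for a model: yields tokens one at a time."""
    for token in ["Hello", ", ", "world", "!"]:
        time.sleep(0.01)  # simulate per-token generation latency
        yield token


class StreamHandler(http.server.BaseHTTPRequestHandler):
    protocol_version = "HTTP/1.1"  # required for chunked transfer encoding

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Transfer-Encoding", "chunked")
        self.end_headers()
        for token in fake_llm_tokens("hi"):
            data = token.encode()
            # Chunked encoding: hex length, CRLF, payload, CRLF.
            self.wfile.write(f"{len(data):x}\r\n".encode() + data + b"\r\n")
            self.wfile.flush()  # push each token as soon as it is ready
        self.wfile.write(b"0\r\n\r\n")  # terminating chunk

    def log_message(self, *args):
        pass  # keep the demo quiet


def run_demo():
    server = http.server.HTTPServer(("127.0.0.1", 0), StreamHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    chunks = []
    with urllib.request.urlopen(f"http://127.0.0.1:{port}/") as resp:
        while True:
            chunk = resp.read(64)  # arrives incrementally, token by token
            if not chunk:
                break
            chunks.append(chunk.decode())
    server.shutdown()
    return "".join(chunks)


streamed = run_demo()
print(streamed)  # -> Hello, world!
```

The same loop maps directly onto an async generator handed to FastAPI, with Streamlit (or any browser client) rendering each chunk as it arrives.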
- A guide on how to run LLMs on Intel CPUs. (Updated Jan 23, 2024, Python)
- 🪶 Lightweight OpenAI drop-in replacement for Kubernetes. (Updated Feb 5, 2024, Python)
- Building static web applications with large language models: from hand-sketched documents, images, and screenshots to proper web pages. (Updated Mar 12, 2024, Python)
- A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving. (Updated Apr 6, 2024, Python)
- A framework for intelligence farming. (Updated May 18, 2024, Python)
- RayLLM: LLMs on Ray. (Updated May 28, 2024, Python)
- Okik is a command-line interface (CLI) tool for LLM, RAG, and model serving. (Updated Jun 14, 2024, Python)
- Stitch simplifies and scales LLM application deployment, reducing infrastructure complexity and costs. (Updated Jun 2, 2024, Python)
- A production-ready REST API for vLLM. (Updated Jun 11, 2024, Python)
- A GPT-3.5 and GPT-4 workload trace for optimizing LLM serving systems. (Updated Jun 14, 2024, Python)
- A tiny yet powerful LLM inference system tailored for research purposes; vLLM-equivalent performance with only 2k lines of code (2% of vLLM). (Updated Jun 16, 2024, Python)
- A high-performance ML model serving framework offering dynamic batching and CPU/GPU pipelines to fully exploit your compute hardware. (Updated Jun 16, 2024, Python)
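Dynamic batching, which the serving framework above advertises, means grouping requests that arrive within a short window and running them through the model as one batch, trading a little latency for much higher throughput. A toy standard-library sketch of the idea follows; the "model" is a hypothetical stand-in that doubles its inputs, not anything from the repo.

```python
# Toy sketch of dynamic batching: a worker drains the request queue up
# to a batch-size cap or a wait budget, then runs one batched call.
import queue
import threading


def batched_model(xs):
    """Pretend model call; a real server would run one GPU forward pass."""
    return [x * 2 for x in xs]


class DynamicBatcher:
    def __init__(self, max_batch=8, max_wait_s=0.01):
        self.requests = queue.Queue()
        self.max_batch = max_batch
        self.max_wait_s = max_wait_s
        threading.Thread(target=self._worker, daemon=True).start()

    def submit(self, x):
        """Enqueue an input; returns a slot the caller can wait on."""
        slot = {"input": x, "done": threading.Event(), "output": None}
        self.requests.put(slot)
        return slot

    def _worker(self):
        while True:
            batch = [self.requests.get()]  # block until work arrives
            try:
                # Greedily pull more requests until the batch is full
                # or the wait budget runs out.
                while len(batch) < self.max_batch:
                    batch.append(self.requests.get(timeout=self.max_wait_s))
            except queue.Empty:
                pass
            outputs = batched_model([s["input"] for s in batch])
            for slot, out in zip(batch, outputs):
                slot["output"] = out
                slot["done"].set()


batcher = DynamicBatcher()
slots = [batcher.submit(i) for i in range(5)]
for s in slots:
    s["done"].wait()
results = [s["output"] for s in slots]
print(results)  # -> [0, 2, 4, 6, 8]
```

Production systems layer scheduling policies, padding, and continuous (in-flight) batching on top, but the queue-plus-wait-budget core is the same.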
- Run any open-source LLM, such as Llama 2 or Mistral, as an OpenAI-compatible API endpoint in the cloud. (Updated Jun 17, 2024, Python)
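"OpenAI-compatible," as several entries above use the term, means the server speaks the same chat-completions JSON as the OpenAI API, so existing clients work by swapping the base URL. A sketch of the request and response shapes follows; the endpoint path reflects the standard `/v1/chat/completions` convention, while the model name and reply text are placeholders, not taken from any repo above.

```python
# What an OpenAI-compatible request/response exchange looks like on the
# wire. Model name and reply content are illustrative placeholders.
import json

request_body = {
    "model": "mistral-7b-instruct",  # whatever model the server has loaded
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    "temperature": 0.7,
    "stream": False,  # set True to receive tokens incrementally
}

# A client would POST this JSON to http://<host>/v1/chat/completions.
payload = json.dumps(request_body)

# ...and read the assistant's reply out of the standard response shape:
example_response = {
    "choices": [{"message": {"role": "assistant", "content": "Hi there!"}}]
}
reply = example_response["choices"][0]["message"]["content"]
print(reply)  # -> Hi there!
```

Because the shape matches, official OpenAI client libraries can usually target such servers by overriding only the base URL and API key.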