# vllm

Here are 52 public repositories matching this topic...

xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

Updated Nov 15, 2024
Python

katanaml / sparrow

Data processing with ML, LLM and Vision LLM

computer-vision machinelearning gpt nlp-machine-learning rag huggingface-transformers llm vllm

Updated Nov 14, 2024
Python

OpenRLHF / OpenRLHF

An Easy-to-use, Scalable and High-performance RLHF Framework (70B+ PPO Full Tuning & Iterative DPO & LoRA & RingAttention)

reinforcement-learning raylib transformers deepspeed large-language-models reinforcement-learning-from-human-feedback vllm

Updated Nov 17, 2024
Python

prometheus-eval / prometheus-eval

Evaluate your LLM's response with Prometheus and GPT4 💯

python evaluation gpt4 llm llmops vllm litellm llm-as-a-judge llm-as-evaluator

Updated Sep 9, 2024
Python

ModelTC / llmc

[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Updated Nov 18, 2024
Python

microsoft / vidur

A large-scale simulation framework for LLM inference

simulation inference transformer llm vllm

Updated Oct 10, 2024
Python

runpod-workers / worker-vllm

The RunPod worker template for serving our large language model endpoints. Powered by vLLM.

language-model llm runpod vllm

Updated Oct 31, 2024
Python

chtmp223 / topicGPT

Official Implementation of TopicGPT: A Prompt-Based Framework for Topic Modeling (NAACL '24)

python nlp openai topic-modeling llm vllm

Updated Nov 11, 2024
Python

Trainy-ai / llm-atc

Fine-tuning and serving LLMs on any cloud

finetuning llms vllm llama2

Updated Dec 2, 2023
Python

OpenCSGs / llm-inference

llm-inference is a platform for publishing and managing LLM inference, providing a wide range of out-of-the-box features for model deployment, such as a UI, RESTful API, auto-scaling, computing resource management, monitoring, and more.

transformer ray deepspeed llama-cpp vllm llm-inference

Updated May 17, 2024
Python

wangcx18 / llm-vscode-inference-server

An endpoint server for efficiently serving quantized open-source LLMs for code.

vscode-extension llm vllm llm-inference

Updated Oct 15, 2023
Python

asprenger / ray_vllm_inference

A simple service that integrates vLLM with Ray Serve for fast and scalable LLM serving.

inference pytorch transformer ray model-serving mlops llm llmops llm-serving vllm

Updated Apr 6, 2024
Python

zRzRzRzRzRzRzR / lm-fly

Accelerating large-model inference frameworks to make LLMs fly

mlx tgi openvino llm vllm llm-inference tensorrt-llm

Updated May 10, 2024
Python

France-Travail / happy_vllm

A REST API for vLLM, production ready

production api-rest llm llm-serving vllm

Updated Nov 15, 2024
Python

YY0649 / ICE-PIXIU

ICE-PIXIU: A Cross-Language Financial Megamodeling Framework

nlp llama pixiu large-language-models vllm internlm

Updated Oct 21, 2024
Python

sasha0552 / vllm-ci

CI scripts designed to build a Pascal-compatible version of vLLM.

Updated Aug 10, 2024
Python

LLM-inference-router / vllm-router

vLLM Router

kubernetes huggingface llm vllm llm-inference llama2

Updated Mar 11, 2024
Python

iNeil77 / vllm-code-harness

Run code inference-only benchmarks quickly using vLLM

transformers code-generation nlp-machine-learning vllm

Updated Sep 15, 2024
Python

hcd233 / Aris-AI-Model-Server

An OpenAI-compatible API that integrates LLM, Embedding, and Reranker.

ai embedding mlx reranker rag fastapi sentence-transformers awq llm vllm gptq openai-compatible-api

Updated Jul 13, 2024
Python

kyegomez / SimpleUnet

A simple implementation of Unet, because all the implementations I've seen are way too complicated.

image computer-vision artificial-intelligence image-classification image-segmentation unet biomedical biomedical-image-processing gpt4 vllm texttovide

Updated Nov 12, 2024
Python
