Xorbits Inference: Model Serving Made Easy 🤖

Xorbits Inference(Xinference) is a powerful and versatile library designed to serve language, speech recognition, and multimodal models. With Xorbits Inference, you can effortlessly deploy and serve your or state-of-the-art built-in models using just a single command. Whether you are a researcher, developer, or data scientist, Xorbits Inference empowers you to unleash the full potential of cutting-edge AI models.

👉 Join our Slack community!

🔥 Hot Topics

Framework Enhancements

Auto recover: #694
Function calling API: #701, here's example: https://github.com/xorbitsai/inference/blob/main/examples/FunctionCall.ipynb
Support rerank model: #672
Speculative decoding: #509
Incorporate vLLM: #445

New Models

Built-in support for phi-2: #828
Built-in support for mistral-instruct-v0.2: #796
Built-in support for deepseek-llm and deepseek-coder: #786
Built-in support for Mixtral-8x7B-v0.1: #782
Built-in support for OpenHermes 2.5: #776
Built-in support for Yi: #629
Built-in support for zephyr-7b-alpha and zephyr-7b-beta: #597

Integrations

Dify: an LLMOps platform that enables developers (and even non-developers) to quickly build useful applications based on large language models, ensuring they are visual, operable, and improvable.
Chatbox: a desktop client for multiple cutting-edge LLM models, available on Windows, Mac and Linux.

Key Features

🌟 Model Serving Made Easy: Simplify the process of serving large language, speech recognition, and multimodal models. You can set up and deploy your models for experimentation and production with a single command.

⚡️ State-of-the-Art Models: Experiment with cutting-edge built-in models using a single command. Inference provides access to state-of-the-art open-source models!

🖥 Heterogeneous Hardware Utilization: Make the most of your hardware resources with ggml. Xorbits Inference intelligently utilizes heterogeneous hardware, including GPUs and CPUs, to accelerate your model inference tasks.

⚙️ Flexible API and Interfaces: Offer multiple interfaces for interacting with your models, supporting OpenAI compatible RESTful API (including Function Calling API), RPC, CLI and WebUI for seamless model management and interaction.

🌐 Distributed Deployment: Excel in distributed deployment scenarios, allowing the seamless distribution of model inference across multiple devices or machines.

🔌 Built-in Integration with Third-Party Libraries: Xorbits Inference seamlessly integrates with popular third-party libraries including LangChain, LlamaIndex, Dify, and Chatbox.

Why Xinference

Feature	Xinference	FastChat	OpenLLM	RayLLM
OpenAI-Compatible RESTful API	✅	✅	✅	✅
vLLM Integrations	✅	✅	✅	✅
More Inference Engines (GGML, TensorRT)	✅	❌	✅	✅
More Platforms (CPU, Metal)	✅	✅	❌	❌
Multi-node Cluster Deployment	✅	❌	❌	✅
Image Models (Text-to-Image)	✅	✅	❌	❌
Text Embedding Models	✅	❌	❌	❌
More OpenAI Functionalities (Function Calling)	✅	❌	❌	❌

Getting Started

Please give us a star before you begin, and you'll receive instant notifications for every new release on GitHub!

Quick Start

Install Xinference by using pip as follows. (For more options, see Installation page.)

pip install "xinference[all]"

To start a local instance of Xinference, run the following command:

$ xinference-local

Once Xinference is running, there are multiple ways you can try it: via the web UI, via cURL, via the command line, or via the Xinference’s python client. Check out our docs for the guide.

Getting involved

Platform	Purpose
Github Issues	Reporting bugs and filing feature requests.
Slack	Collaborating with other Xorbits users.
Twitter	Staying up-to-date on new features.

Name		Name	Last commit message	Last commit date
Latest commit History 423 Commits
.github		.github
assets		assets
benchmark		benchmark
doc		doc
examples		examples
xinference		xinference
.gitattributes		.gitattributes
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.readthedocs.yaml		.readthedocs.yaml
LICENSE		LICENSE
MANIFEST.in		MANIFEST.in
README.md		README.md
README_ja_JP.md		README_ja_JP.md
README_zh_CN.md		README_zh_CN.md
pyproject.toml		pyproject.toml
setup.cfg		setup.cfg
setup.py		setup.py
versioneer.py		versioneer.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Xorbits Inference: Model Serving Made Easy 🤖

🔥 Hot Topics

Framework Enhancements

New Models

Integrations

Key Features

Why Xinference

Getting Started

Quick Start

Getting involved

About

Releases

Packages

Languages

License

Bojun-Feng/inference

Folders and files

Latest commit

History

Repository files navigation

Xorbits Inference: Model Serving Made Easy 🤖

🔥 Hot Topics

Framework Enhancements

New Models

Integrations

Key Features

Why Xinference

Getting Started

Quick Start

Getting involved

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages