
A localized open-source AI server that is better than ChatGPT.


💯AI00 RWKV Server


English | 中文 | 日本語


AI00 RWKV Server is an inference API server based on the RWKV model.

It supports Vulkan inference acceleration and can run on any GPU that supports Vulkan. No Nvidia card required! AMD cards and even integrated graphics can be accelerated!

No bulky PyTorch, CUDA, or other runtime environments are needed; it's compact and ready to use out of the box!

Compatible with OpenAI's ChatGPT API interface.

100% open source and commercially usable, under the MIT license.

If you are looking for a fast, efficient, and easy-to-use LLM API server, then AI00 RWKV Server is your best choice. It can be used for various tasks, including chatbots, text generation, translation, and Q&A.

Join the AI00 RWKV Server community now and experience the charm of AI!

QQ Group for communication: 30920262

💥Features

  • Based on the RWKV model, with high performance and accuracy
  • Supports Vulkan inference acceleration: enjoy GPU acceleration without CUDA, on AMD cards, integrated graphics, and any GPU that supports Vulkan
  • No bulky PyTorch, CUDA, or other runtime environments needed; compact and ready to use out of the box
  • Compatible with OpenAI's ChatGPT API interface

⭕Usages

  • Chatbots
  • Text generation
  • Translation
  • Q&A
  • Any other tasks that LLM can do

👻Other

Installation, Compilation, and Usage

📦Direct Download and Installation

  1. Download the latest release from the Releases page

  2. Download a model and place it under the assets/models/ path, for example, assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st

  3. Run in the command line

    $ ./ai00_rwkv_server --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st
  4. Open http://127.0.0.1:65530 in your browser to access the WebUI

📜Compile from Source Code

  1. Install Rust

  2. Clone this repository

    $ git clone https://github.com/cgisky1980/ai00_rwkv_server.git
    $ cd ai00_rwkv_server
  3. Download a model and place it under the assets/models/ path, for example, assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st

  4. Compile

    $ cargo build --release
  5. After compilation, run

    $ cargo run --release -- --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st
  6. Open http://127.0.0.1:65530 in your browser to access the WebUI

📝Supported Arguments

  • --model: Path to the model
  • --tokenizer: Path to the tokenizer
  • --port: Port to listen on
  • --quant: Number of layers to quantize
  • --adapter: Adapter (GPU and backend) selection

Example

The following command makes the server listen on port 3000, load the 0.4B model with all layers quantized (--quant 32 exceeds the model's 24 layers, so every layer is quantized), and auto-select the high-performance adapter.

$ cargo run --release -- --model assets/models/RWKV-4-World-0.4B-v1-20230529-ctx4096.st --port 3000 --quant 32 --adapter auto

📙Currently Available APIs

The API service listens on port 65530 by default, and the input and output data formats follow the OpenAI API specification.

  • /v1/models
  • /models
  • /v1/chat/completions
  • /chat/completions
  • /v1/completions
  • /completions
  • /v1/embeddings
  • /embeddings
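Because these endpoints follow the OpenAI specification, any OpenAI-compatible client can talk to the server. A minimal sketch in Python using only the standard library, assuming the server is running on the default port; the model name and prompt are placeholders, not values required by the server:

```python
import json
import urllib.request

# Request body per the OpenAI chat completions specification.
payload = {
    "model": "rwkv",  # placeholder; the server serves whichever model it loaded
    "messages": [{"role": "user", "content": "Hello!"}],
    "max_tokens": 100,
}

req = urllib.request.Request(
    "http://127.0.0.1:65530/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

# With the server running, sending the request returns an OpenAI-style
# completion object:
# response = json.load(urllib.request.urlopen(req))
# print(response["choices"][0]["message"]["content"])
```

The same request shape works against /chat/completions, and the /v1/completions and /v1/embeddings endpoints accept the corresponding OpenAI request bodies.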

📙WebUI Screenshots


📝TODO List

  • Support for text_completions and chat_completions
  • Support for SSE push
  • Add embeddings
  • Integrate a basic front-end
  • Parallel inference via batch serving
  • Support for int8 quantization
  • Support for SpQR quantization
  • Support for LoRA models
  • Hot loading and switching of LoRA models

👥Join Us

We are always looking for people interested in helping us improve the project. If you are interested in any of the following, please join us!

  • 💀Writing code
  • 💬Providing feedback
  • 🔆Proposing ideas or needs
  • 🔍Testing new features
  • ✏Translating documentation
  • 📣Promoting the project
  • 🏅Anything else that would be helpful to us

No matter your skill level, we welcome you to join us. You can join us in the following ways:

  • Join our Discord channel
  • Join our QQ group
  • Submit issues or pull requests on GitHub
  • Leave feedback on our website

We can't wait to work with you to make this project better! We hope the project is helpful to you!

Thanks to these insightful and outstanding individuals for their support and selfless dedication to the project:

  • 顾真牛: 📖 💻 🖋 🎨 🧑‍🏫
  • 研究社交: 💻 💡 🤔 🚧 👀 📦
  • josc146: 🐛 💻 🤔 🔧
  • l15y: 🔧 🔌 💻

Stargazers over time

