Release v1.9.11 · josStorer/RWKV-Runner

Changes

Added the albatross inference backend, supporting batch inference while keeping API compatibility. The backend automatically handles batching. It works out of the box on Windows with RTX 30XX series and newer GPUs; select the option below in the client to get an immediate performance boost. Under concurrent workloads, 3060 and newer GPUs can typically reach 2000-10000 token/s inference speed for 3B and 7B scale models.

Added a batch generation button to the client, providing a friendlier batch generation preview interface.

2026-05-08.22-46-45_compressed.mp4

Bumped precompiled llama.cpp vulkan.

Note: If you encounter WebView2 crash issues, please try opening the Windows Settings, click on Apps, search for
WebView2, click Modify -> Repair to update your WebView2 runtime.

Install

Windows: https://github.com/josStorer/RWKV-Runner/blob/master/build/windows/Readme_Install.txt
MacOS: https://github.com/josStorer/RWKV-Runner/blob/master/build/darwin/Readme_Install.txt
Linux: https://github.com/josStorer/RWKV-Runner/blob/master/build/linux/Readme_Install.txt
Simple Deploy Example: https://github.com/josStorer/RWKV-Runner/blob/master/README.md#simple-deploy-example
Server Deploy Examples: https://github.com/josStorer/RWKV-Runner/tree/master/deploy-examples
Windows 7 Patches: https://github.com/josStorer/wails/releases/tag/v2.9.2x

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v1.9.11

Choose a tag to compare

Sorry, something went wrong.

Sorry, something went wrong.

Uh oh!

No results found

Changes

Install

Uh oh!