Skip to content

Python Runtime Examples

Mike edited this page May 28, 2026 · 2 revisions

Runtime Examples

Multi-unit runtime

units = [
    xlocllm.unit("LLM", "Qwen-3.5-0.8b"),
    xlocllm.unit("embedding", "multilingual-e5-small"),
    xlocllm.unit("reranker", "bge-reranker-base"),
]
with xlocllm.runtime(units, port=1146) as rt:
    rt.run()
    print(rt.models())

OpenAI-compatible server for local clients

from openai import OpenAI

llm = xlocllm.unit("LLM", "Qwen-3.5-0.8b")
with xlocllm.runtime([llm], port=1146) as rt:
    rt.run()
    client = OpenAI(base_url=rt.url, api_key="xlocllm")
    print(client.models.list())

Chat UI

with xlocllm.runtime([llm]) as rt:
    rt.run()
    rt.chatui(session="demo", use_rag=True)

Production-style explicit port

runtime = xlocllm.runtime([llm], port=12000, mode="native")
runtime.run()
print("Use this base_url:", runtime.url)

Clone this wiki locally