Python Unit
Mike edited this page May 28, 2026 · 1 revision
A Unit represents a single capability/model pair, a service unit, or a composite unit.
It is usually created via xlocllm.unit(...), xlocllm.vectorstorage(...), or xlocllm.rag(...).
| Property | Description |
|---|---|
| `unit.id` | `<type>:<modelId>` |
| `unit.type` | normalized unit type |
| `unit.model` | resolved exact model id |
| `unit.label` | label from the catalog, or the model id |
| `unit.model_info` | `ModelInfo \| None` |
| `unit.mode` | `native` or `web` |
| `unit.quant` | selected GGUF quantization for a native LLM |
| `unit.reasoning` | default reasoning control for supported LLMs |
| `unit.options` | runtime options |
| `unit.rag` | attached RAG unit for an LLM |
| `unit.supports_reasoning` | whether the model supports thinking/reasoning control |
| `unit.is_custom` | custom ONNX/sklearn/torch source |
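The `<type>:<modelId>` shape of `unit.id` and the label fallback can be illustrated with a small stand-in. This is only a sketch: `UnitIdSketch`, its `catalog_label` field, the `"llm"` type string, and the `example-model` id are hypothetical names for illustration and are not part of xlocllm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UnitIdSketch:
    """Hypothetical stand-in for a Unit; NOT the xlocllm class."""

    type: str                            # unit type, assumed to be normalized already
    model: str                           # resolved exact model id
    catalog_label: Optional[str] = None  # hypothetical: label from the catalog, if any

    @property
    def id(self) -> str:
        # Documented shape: <type>:<modelId>
        return f"{self.type}:{self.model}"

    @property
    def label(self) -> str:
        # Documented fallback: catalog label, otherwise the model id
        return self.catalog_label or self.model


# "llm" and "example-model" are placeholder values, not real catalog entries.
unit = UnitIdSketch(type="llm", model="example-model")
print(unit.id)     # llm:example-model
print(unit.label)  # example-model
```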
| Method | Purpose | Key parameters |
|---|---|---|
| `to_payload()` | bridge payload | - |
| `to_dict()` | full dict representation | - |
| `status()` | state of the attached runtime, or the offline selected state | - |
| `remove()` | remove the unit from the runtime without deleting the cache | - |
| `delete_cache(bridge=None)` | delete the model cache | `bridge` (optional) |
| `delete(ids=None, filter=None, delete_cache=True, bridge=None, **params)` | for a model unit: delete/remove it; for RAG/vector units: delete records | `ids`, `filter`, `delete_cache` |
| `set_reasoning(enabled)` | enable or disable reasoning on the fly | `True`, `False`, `None` |
| `as_runtime(port=1146)` | create a single-unit runtime | `port` |
| `install(port=1146)` | shortcut for runtime install | `port` |
| `run(port=1146)` | shortcut for runtime run | `port` |
| `stop()` | stop the single-unit runtime | - |
| `hibernate()` | unload active models | - |
| `heatup()` | start/warm up active models | - |
| `invoke(endpoint, payload, timeout=None)` | call an endpoint through the single-unit runtime | `endpoint`, `payload`, `timeout` |
| `add(documents, ids=None, metadatas=None, embeddings=None, **params)` | add documents to a RAG unit, or vector records | `documents`, `ids`, `metadatas`, `embeddings` |
| `search(query=None, embedding=None, top_k=None, filter=None, **params)` | search in RAG/vectorstorage | `query`, `embedding`, `top_k`, `filter` |
| `clear(**params)` | clear the RAG/vector namespace | `params` |
| `stats()` | RAG/vector statistics | - |
| `reindex(**params)` | re-embed RAG chunks | `params` |
| `predict(inputs, **params)` | custom ONNX/regression inference | `inputs`, `params` |
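To make the call shapes of `add(...)` and `search(...)` concrete, here is a toy in-memory stand-in that accepts the same keyword arguments. It is a sketch under several assumptions, none of which come from the xlocllm source: `ToyVectorUnit` is a hypothetical class, the return shapes (a list of ids, a list of hit dicts) are invented, `filter` is treated as exact-match on metadata keys, and only explicit embeddings with cosine similarity are supported.

```python
import math
from typing import Any, Dict, List, Optional, Sequence


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorUnit:
    """Hypothetical in-memory stand-in; NOT the xlocllm implementation."""

    def __init__(self) -> None:
        self._records: Dict[str, Dict[str, Any]] = {}
        self._next_id = 0  # counter for auto-generated ids

    def add(self, documents, ids=None, metadatas=None, embeddings=None, **params) -> List[str]:
        if embeddings is None:
            raise ValueError("this toy requires explicit embeddings")
        count = len(documents)
        if ids is None:
            ids = [f"doc-{self._next_id + i}" for i in range(count)]
            self._next_id += count
        if metadatas is None:
            metadatas = [{} for _ in range(count)]
        if not (len(ids) == len(metadatas) == len(embeddings) == count):
            raise ValueError("documents, ids, metadatas and embeddings must have equal length")
        for doc_id, doc, meta, emb in zip(ids, documents, metadatas, embeddings):
            self._records[doc_id] = {"document": doc, "metadata": meta, "embedding": list(emb)}
        return list(ids)

    def search(self, query=None, embedding=None, top_k: Optional[int] = None,
               filter=None, **params) -> List[Dict[str, Any]]:
        if embedding is None:
            raise ValueError("this toy only supports search by embedding")
        hits = []
        for doc_id, rec in self._records.items():
            # Assumed filter semantics: every key must match the metadata exactly.
            if filter and any(rec["metadata"].get(k) != v for k, v in filter.items()):
                continue
            hits.append({"id": doc_id, "document": rec["document"],
                         "score": _cosine(embedding, rec["embedding"])})
        hits.sort(key=lambda hit: hit["score"], reverse=True)
        return hits if top_k is None else hits[:top_k]


store = ToyVectorUnit()
store.add(
    ["red apple", "green pear", "blue car"],
    metadatas=[{"kind": "fruit"}, {"kind": "fruit"}, {"kind": "vehicle"}],
    embeddings=[[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]],
)
hits = store.search(embedding=[1.0, 0.0], top_k=2, filter={"kind": "fruit"})
print([hit["document"] for hit in hits])  # ['red apple', 'green pear']
```

With a real unit, the equivalent calls would go through the attached runtime rather than a local dict, and a text `query` would presumably be embedded on the runtime side.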
The add/search/delete/clear/stats/reindex/predict methods require the unit to be attached to a running Runtime.
Attachment happens automatically when you create runtime([unit]).
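The attachment rule can be pictured as a guard that each runtime-backed method checks before doing any work. The sketch below is an assumption about the pattern, not xlocllm code: `BoundUnitSketch`, `RuntimeSketch`, the `_runtime` attribute, the `call` method, and the error message are all hypothetical.

```python
class BoundUnitSketch:
    """Hypothetical stand-in for a Unit; NOT the xlocllm class."""

    def __init__(self, unit_id: str) -> None:
        self.id = unit_id
        self._runtime = None  # filled in when a runtime is created with this unit

    def _require_runtime(self) -> "RuntimeSketch":
        # Guard shared by add/search/delete/clear/stats/reindex/predict in this sketch.
        if self._runtime is None or not self._runtime.running:
            raise RuntimeError(f"{self.id}: unit is not attached to a running runtime")
        return self._runtime

    def stats(self) -> dict:
        return self._require_runtime().call(self.id, "stats")


class RuntimeSketch:
    """Hypothetical runtime that attaches its units on construction."""

    def __init__(self, units) -> None:
        self.running = False
        self.units = list(units)
        for unit in self.units:
            unit._runtime = self  # automatic attachment

    def run(self) -> None:
        self.running = True

    def call(self, unit_id: str, method: str) -> dict:
        # A real runtime would dispatch to a model or store; this just echoes the request.
        return {"unit": unit_id, "method": method}


unit = BoundUnitSketch("rag:example")
try:
    unit.stats()  # not attached yet, so the guard raises
except RuntimeError as exc:
    print(exc)

runtime = RuntimeSketch([unit])  # attaches the unit
runtime.run()
print(unit.stats())
```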