Prompt Assistant
Mooshieblob1 edited this page Jun 19, 2026 · 1 revision
MooshieUI ships a local LLM (a llama.cpp GGUF server) that helps you write prompts. It runs on your own machine, so prompts are not sent to any external service.
- Enhance - rewrites and improves your existing prompt.
- Compose - generates a prompt from a plain-language description. You choose a length (short, medium, or detailed) and whether to include artist tags.
- The assistant starts a local llama.cpp server on demand.
- It is hardware-aware: it offloads layers to the GPU when there is room and falls back to CPU if ComfyUI is already using your VRAM.
- An idle watchdog stops the server automatically when it has not been used for a while, freeing memory.
- In GPU Docker builds you can point the assistant at a specific llama.cpp binary directory with the `MOOSHIEUI_LLAMA_BIN_DIR` environment variable.
- On multi-GPU hosts you may want to pin the llama server to a specific device (for example `CUDA_VISIBLE_DEVICES`) so it does not compete with ComfyUI.
- The model used by the assistant is configured by the deployment. If no assistant model is configured, a fallback tag-generation model may load instead.
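The behaviour described above can be pictured with a rough Python sketch. This is not MooshieUI's actual code: `build_launch`, `IdleWatchdog`, the VRAM threshold, and the layer count of 99 are all hypothetical names and values chosen for illustration. The only pieces taken from real tools are the `llama-server` binary name and its `-m`, `--port`, and `--n-gpu-layers` flags from llama.cpp, plus the two environment variables named on this page.

```python
import os
import time
from pathlib import Path


def build_launch(model_path, gpu_index=None, free_vram_mb=0,
                 needed_vram_mb=6000, port=8080, environ=None):
    """Return (argv, env) for a llama.cpp server process (illustrative only)."""
    env = dict(os.environ if environ is None else environ)

    # Use the binary directory from MOOSHIEUI_LLAMA_BIN_DIR when it is set,
    # otherwise rely on llama-server being on PATH (an assumption).
    bin_dir = env.get("MOOSHIEUI_LLAMA_BIN_DIR")
    binary = str(Path(bin_dir) / "llama-server") if bin_dir else "llama-server"

    # Pin the server to one GPU so it does not compete with ComfyUI.
    if gpu_index is not None:
        env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)

    # Hypothetical policy: offload all layers if there is room, else use CPU.
    n_gpu_layers = 99 if free_vram_mb >= needed_vram_mb else 0

    argv = [binary, "-m", str(model_path), "--port", str(port),
            "--n-gpu-layers", str(n_gpu_layers)]
    return argv, env


class IdleWatchdog:
    """Tracks the last use and reports when the idle timeout has elapsed."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self._timeout_s = timeout_s
        self._clock = clock
        self._last = clock()

    def touch(self):
        # Call on every request to reset the idle timer.
        self._last = self._clock()

    def should_stop(self):
        return self._clock() - self._last >= self._timeout_s
```

A caller would pass the returned `argv` and `env` to `subprocess.Popen`, call `touch()` on each Enhance or Compose request, and terminate the process once `should_stop()` returns true. The real thresholds and timeout used by MooshieUI are not documented on this page.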
- See the Prompting Guide for wildcards, presets, scheduling, and the rest of the prompt system.