Prompt Assistant

MooshieUI ships a local LLM, a GGUF model served by llama.cpp, that helps you write prompts. It runs on your own machine, so your prompts are not sent to any external service.

What it does

Enhance - rewrites and improves your existing prompt.
Compose - generates a prompt from a plain-language description. You choose a length (short, medium, or detailed) and whether to include artist tags.

How it runs

The assistant starts a local llama.cpp server on demand.
It is hardware-aware: it offloads layers to the GPU when there is room and falls back to CPU if ComfyUI is already using your VRAM.
An idle watchdog stops the server automatically after a period of inactivity, which frees memory.
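
The offload decision and the idle watchdog described above can be illustrated with a minimal Python sketch. This is not MooshieUI's actual code: the function and class names, the per-layer size estimate, and the 512 MB reserve are all assumptions chosen for illustration.

```python
import time


def choose_gpu_layers(free_vram_mb, layer_size_mb, total_layers, reserve_mb=512):
    """Return how many layers to offload to the GPU (0 means CPU only).

    Hypothetical heuristic: keep reserve_mb of VRAM free for other
    programs (such as ComfyUI) and fit as many layers as the rest allows.
    """
    usable_mb = free_vram_mb - reserve_mb
    if usable_mb <= 0 or layer_size_mb <= 0:
        return 0  # no room on the GPU: fall back to CPU
    return min(total_layers, int(usable_mb // layer_size_mb))


class IdleWatchdog:
    """Tracks the last use and reports when the idle timeout has elapsed."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock  # injectable so the logic can be tested
        self._last_used = clock()

    def touch(self):
        """Record activity; call this on every assistant request."""
        self._last_used = self._clock()

    def should_stop(self):
        """Return True once no request has arrived for timeout_s seconds."""
        return self._clock() - self._last_used >= self.timeout_s
```

In a sketch like this, a background loop would poll `should_stop()` and shut the server down when it returns True, and the next Enhance or Compose request would start the server again on demand.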

Hosted / Docker notes

In GPU Docker builds you can point the assistant at a specific llama.cpp binary directory with the MOOSHIEUI_LLAMA_BIN_DIR environment variable.
On multi-GPU hosts you may want to pin the llama server to a specific device (for example, by setting CUDA_VISIBLE_DEVICES) so it does not compete with ComfyUI.
The model used by the assistant is configured by the deployment. If no assistant model is configured, a fallback tag-generation model may load instead.
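
To make the two environment variables above concrete, here is a hedged Python sketch of how a launcher might build the server command. `llama-server` is the name of llama.cpp's server binary, and `-m`, `--port`, and `-ngl` are its model, port, and GPU-layer options. How MooshieUI itself resolves the binary is not documented here, so the helper names, the fallback to `PATH`, and the example values are assumptions.

```python
import os
from pathlib import Path


def llama_server_command(model_path, port, gpu_layers, environ=None):
    """Build the argv list and environment for a llama.cpp server process.

    Hypothetical helper: uses MOOSHIEUI_LLAMA_BIN_DIR when it is set,
    otherwise relies on llama-server being found on PATH.
    """
    env = dict(os.environ if environ is None else environ)
    bin_dir = env.get("MOOSHIEUI_LLAMA_BIN_DIR")
    binary = str(Path(bin_dir) / "llama-server") if bin_dir else "llama-server"
    cmd = [
        binary,
        "-m", str(model_path),     # GGUF model file
        "--port", str(port),       # local HTTP port
        "-ngl", str(gpu_layers),   # layers to offload to the GPU
    ]
    return cmd, env


def pin_to_gpu(env, device_index):
    """Return a copy of env that exposes only one CUDA device to the process."""
    pinned = dict(env)
    pinned["CUDA_VISIBLE_DEVICES"] = str(device_index)
    return pinned
```

The resulting `cmd` and `env` could then be passed to `subprocess.Popen(cmd, env=env)`. Because `CUDA_VISIBLE_DEVICES` is set only in the child's environment, ComfyUI's own device selection is left untouched.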
