Home

ComfyUI-GGUF (Randy420Marsh fork) — Wiki

This wiki covers the fork-specific features and the end-to-end build + conversion pipeline for the tools/ folder. Everything here applies to Randy420Marsh/ComfyUI-GGUF, not the upstream city96/ComfyUI-GGUF repo (which does not carry these features).

If you only want to load GGUF models inside ComfyUI, you can stop reading here: git clone the repo into ComfyUI/custom_nodes/, run pip install --upgrade gguf, and you're done. The rest of the wiki only matters if you want to convert and quantize your own models from .safetensors to .gguf.
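As a rough sketch, the load-only install above could look like the script below. The `COMFYUI_DIR` default, the `APPLY` switch, and the `run` helper are illustrative names, not part of the repo, and the GitHub URL is assumed from the repo name Randy420Marsh/ComfyUI-GGUF. The script only prints the commands unless `APPLY=1` is set.

```shell
#!/bin/sh
set -eu

# Hypothetical helper: prints each command by default,
# and executes it only when APPLY=1 is set in the environment.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Assumption: your ComfyUI checkout lives here; override COMFYUI_DIR if not.
COMFYUI_DIR="${COMFYUI_DIR:-$HOME/ComfyUI}"
# Assumption: the fork's GitHub URL, inferred from the repo name.
REPO_URL="https://github.com/Randy420Marsh/ComfyUI-GGUF"

# Clone the node pack into ComfyUI's custom_nodes folder.
run git clone "$REPO_URL" "$COMFYUI_DIR/custom_nodes/ComfyUI-GGUF"
# Install or upgrade the gguf Python package; use the same Python
# environment that ComfyUI itself runs in.
run pip install --upgrade gguf
```

Run it once as-is to review the commands, then again with `APPLY=1` to execute them. Restart ComfyUI afterwards so it picks up the new custom nodes.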

Pages

- Build the patched llama-quantize: the C++ build step, using the city96 branch of Randy420Marsh/llama.cpp, which already carries the patch. One git clone, one cmake, no manual patch step. Previously this step was documented only inside tools/README.md and had no discoverable wiki page.
- Conversion Guide: long-form per-model walkthrough (Flux, SD3.5, ERNIE-Image, Lumina2 / Z-Image / RedCraft ZiB, Wan, Hunyuan-Video, …). Mirrors docs/CONVERSION_GUIDE.md for easier browsing.

The in-repo tools/README.md remains the reference doc; the wiki is a more navigable view of the same material plus the pre-patched-build shortcut.

Fork-specific features

These are the additions in this fork that are not in upstream city96/ComfyUI-GGUF. The conversion-guide and build instructions on this wiki assume them.

| Feature | Where | PR |
| --- | --- | --- |
| `mistral3` text-encoder arch (Ministral-3-3B, used by ERNIE-Image) + tekken tokenizer reconstruction | `loader.py` | #5, #6 |
| Gemma-4 text-encoder via `tokenizer.json` sidecar (fixes upstream `'str' object has no attribute 'decode'` crash) | `loader.py` | #11 |
| Optional `mmproj_name` picker on `CLIPLoader (GGUF)` for explicit multimodal-projector selection | `loader.py`, `nodes.py` | #10 |
| ERNIE-Image arch + ComfyUI scaled-fp8 dequant in `convert.py` | `tools/convert.py` | #1, #2 |
| Qt GUI front-end with bf16 auto-detect (`nvidia-smi`) + full `llama-quantize` output-type selector + `Analyze` button | `tools/gguf_gui.py` | #3, #4, #7 |
| `tools/inspect_gguf.py --metadata` for per-arch GGUF metadata inspection (replaces a safetensors-only script that can't open GGUF) | `tools/inspect_gguf.py` | #5 |
| Rewritten `tools/README.md` (venv + `LD_LIBRARY_PATH` + GUI workflow) | `tools/README.md` | #8 |
| Long-form `docs/CONVERSION_GUIDE.md` | `docs/CONVERSION_GUIDE.md` | #9 |
| README repo-attribution updates pointing install URLs at this fork | `README.md`, `tools/README.md`, `pyproject.toml` | #12 |

Bugs in features outside that list (e.g. UNet loader for Flux, the original Q4_K quantizer, LoRA loading) live upstream — file those at city96/ComfyUI-GGUF/issues.
