liconstudio/ComfyUI-SNR-quant

SNR-Quant

ComfyUI nodes for adaptive FP8/BF16 mixed precision quantization.

Features

  • Automatic model quantization - Quantize any diffusion model to a target size
  • Adaptive threshold selection - Automatically detects model "fragility" and adjusts quantization strategy
  • SNR-based layer selection - Protects sensitive layers, quantizes robust ones
  • Outlier protection - Layers with extreme weights are kept in BF16
  • Progress bar - Shows quantization progress in ComfyUI console
  • CSV report - Generates detailed report showing which layers were quantized
  • Re-quantization safe - Already quantized FP8 layers are preserved (no size increase on re-run)

Installation

  1. Clone or copy this folder to ComfyUI/custom_nodes/SNR-quant
  2. Restart ComfyUI

Dependencies (usually already installed):

  • torch>=2.1.0 (for FP8 support)
  • safetensors>=0.4.0
  • numpy>=1.24.0

Usage

SNR Quant (FP8/BF16)

Quantize a model to fit within a target size.

Inputs:

  • model_name - Select from ComfyUI/models/diffusion_models
  • target_size_GB - Target size in GB (e.g., 22.0)

Output:

  • Saved directly to ComfyUI/output/diffusion_models/quant_{model_name}_{target}GB.safetensors
  • CSV report: ComfyUI/output/diffusion_models/quant_{model_name}_{target}GB_report.csv

Example:

model_name: wan2.2_fp16.safetensors
target_size_GB: 22.0
→ output: quant_wan2.2_fp16_22.0GB.safetensors
→ report: quant_wan2.2_fp16_22.0GB_report.csv

SNR Model Analyzer

Analyze a model's SNR and outlier distribution.

Input:

  • model_name - Select from ComfyUI/models/diffusion_models

Outputs:

  • report - Text report with statistics and quantization recommendations

How It Works

Load model → Analyze layers → Detect fragility → Select layers → Quantize → Save
                ↓                    ↓              ↓
            SNR + Outlier      fragile/moderate/  FP8 or BF16
                               robust             per layer
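The SNR and outlier statistics drive everything downstream. As a rough illustration only (not this node's actual code), per-layer quantization SNR can be estimated as the ratio of signal power to quantization-error power; here a uniform-step quantizer stands in for FP8 rounding, and `quant_snr_db` and its `step` parameter are hypothetical names:

```python
import numpy as np

def quant_snr_db(weights: np.ndarray, step: float = 0.0625) -> float:
    """Hypothetical sketch: SNR (dB) of a layer under a simple
    uniform-step quantizer standing in for FP8 rounding."""
    quantized = np.round(weights / step) * step   # simulated quantization
    noise = weights - quantized
    signal_power = float(np.mean(weights ** 2))
    noise_power = float(np.mean(noise ** 2)) + 1e-20  # avoid log(0)
    return 10.0 * np.log10(signal_power / noise_power)

rng = np.random.default_rng(0)
w = rng.normal(scale=1.0, size=10_000)
print(f"{quant_snr_db(w):.1f} dB")  # higher SNR = safer to quantize
```

A coarser quantization step produces more rounding noise and therefore a lower SNR, which is why high-SNR layers are the safest FP8 candidates.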

Fragility Detection

Fragility   SNR Range   SNR Median   Strategy
Fragile     < 2 dB      < 32 dB      Protect more (90th percentile)
Moderate    -           -            Balanced (75th percentile)
Robust      > 5 dB      > 34 dB      Quantize more (50th percentile)
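One plausible reading of the table above, sketched in plain Python. The function name `classify_fragility`, the use of min-max spread for "SNR Range", and the and-combination of the two conditions are all assumptions, not confirmed details of the node:

```python
def classify_fragility(snr_db: list[float]) -> str:
    """Hypothetical sketch of the fragility rule: a narrow, low SNR
    distribution reads as fragile; a wide, high one as robust."""
    snr = sorted(snr_db)
    median = snr[len(snr) // 2]
    spread = snr[-1] - snr[0]          # assumed meaning of "SNR Range"
    if spread < 2.0 and median < 32.0:
        return "fragile"    # protect more: 90th-percentile outlier cutoff
    if spread > 5.0 and median > 34.0:
        return "robust"     # quantize more: 50th-percentile cutoff
    return "moderate"       # balanced: 75th-percentile cutoff

print(classify_fragility([30.0, 30.5, 31.0]))  # → fragile
```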

Layer Selection

  1. Filter layers by outlier threshold (adaptive percentile)
  2. Sort by SNR (descending - highest SNR = safest to quantize)
  3. Select layers until target size is reached
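The three steps above amount to a greedy knapsack-style pass. A minimal sketch, assuming BF16 costs 2 bytes/param and FP8 costs 1 byte/param as stated below; the function name, the `layers` layout (`name -> (numel, snr_db, outlier)`), and the example layer names are hypothetical:

```python
def select_fp8_layers(layers, target_bytes, outlier_cutoff):
    """Hypothetical sketch of the three selection steps."""
    # 1. Filter out layers whose outlier index exceeds the adaptive cutoff.
    candidates = [(name, numel, snr)
                  for name, (numel, snr, outlier) in layers.items()
                  if outlier <= outlier_cutoff]
    # 2. Sort by SNR, descending: highest SNR quantizes most safely.
    candidates.sort(key=lambda c: c[2], reverse=True)
    # 3. Convert layers to FP8 until the running size fits the target.
    size = sum(numel * 2 for numel, _, _ in layers.values())  # all-BF16 baseline
    chosen = []
    for name, numel, _ in candidates:
        if size <= target_bytes:
            break
        size -= numel          # FP8 saves 1 byte per parameter
        chosen.append(name)
    return chosen

layers = {
    "blocks.0.attn.qkv": (4_000_000, 38.0, 0.1),
    "blocks.0.mlp.fc1":  (8_000_000, 35.0, 0.2),
    "blocks.0.attn.out": (4_000_000, 20.0, 0.9),  # outlier-heavy: excluded
}
print(select_fp8_layers(layers, target_bytes=24_000_000, outlier_cutoff=0.5))
# → ['blocks.0.attn.qkv', 'blocks.0.mlp.fc1']
```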

Size Calculation

  • Baseline: All 2D layers as BF16 (2 bytes/param) + 1D layers as BF16
  • Minimum: All 2D layers as FP8 (1 byte/param) + 1D layers as BF16
  • If target < minimum: auto-adjust to minimum
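The bounds above follow directly from the byte costs (BF16 = 2 bytes/param, FP8 = 1 byte/param, 1D layers always BF16). A sketch under those assumptions; `size_bounds`, `clamp_target`, and the `layers` layout (`name -> (numel, ndim)`) are hypothetical names:

```python
def size_bounds(layers):
    """Hypothetical sketch: baseline and minimum model sizes in bytes."""
    two_d = sum(numel for numel, ndim in layers.values() if ndim >= 2)
    one_d = sum(numel for numel, ndim in layers.values() if ndim == 1)
    baseline = two_d * 2 + one_d * 2   # everything BF16: 2 bytes/param
    minimum = two_d * 1 + one_d * 2    # all 2D layers FP8: 1 byte/param
    return baseline, minimum

def clamp_target(target_bytes, layers):
    # If the requested target is below the hard floor, auto-adjust up.
    _, minimum = size_bounds(layers)
    return max(target_bytes, minimum)

layers = {"w": (10_000_000, 2), "b": (10_000, 1)}
print(size_bounds(layers))                 # → (20020000, 10020000)
print(clamp_target(5_000_000, layers))     # → 10020000 (auto-adjusted)
```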

Re-quantization Behavior

When quantizing an already quantized model:

  • Existing FP8 layers are preserved (not converted back to BF16)
  • Only remaining BF16 layers are considered for further quantization
  • File size will not increase on re-run
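The rule above can be sketched as a simple dtype check: layers already stored as FP8 keep their precision, and only BF16 layers remain eligible, so re-running can only shrink the file. The function and argument names here are hypothetical:

```python
def requant_plan(state_dtypes, new_fp8_names):
    """Hypothetical sketch of the re-quantization rule: existing FP8
    layers are preserved; only BF16 layers may still be converted."""
    plan = {}
    for name, dtype in state_dtypes.items():
        if dtype == "float8_e4m3fn":
            plan[name] = "FP8 (existing)"   # preserved, never upcast
        elif name in new_fp8_names:
            plan[name] = "FP8"              # newly quantized this run
        else:
            plan[name] = "BF16"
    return plan

print(requant_plan(
    {"a": "float8_e4m3fn", "b": "bfloat16", "c": "bfloat16"},
    new_fp8_names={"b"},
))
# → {'a': 'FP8 (existing)', 'b': 'FP8', 'c': 'BF16'}
```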

Output Format

Quantized models are saved as .safetensors with mixed precision:

  • FP8 layers: torch.float8_e4m3fn
  • BF16 layers: torch.bfloat16
  • 1D layers (bias, norms): always BF16

CSV Report Format

The generated report includes:

  • layer_name - Full layer path
  • precision - Final precision (FP8, BF16, or FP8 (existing))
  • params_numel - Number of parameters
  • SNR_dB - signal-to-noise ratio in dB (higher = safer to quantize)
  • outlier - Outlier index (lower = safer to quantize)
  • precision_change - Whether layer was quantized or preserved
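The columns above map directly onto a standard CSV writer. A minimal sketch using Python's stdlib `csv` module; the row values are illustrative, not output from a real run:

```python
import csv
import io

# Hypothetical example rows following the column list above.
rows = [
    {"layer_name": "blocks.0.attn.qkv", "precision": "FP8",
     "params_numel": 4_000_000, "SNR_dB": 38.2, "outlier": 0.11,
     "precision_change": "quantized"},
    {"layer_name": "blocks.0.norm1", "precision": "BF16",
     "params_numel": 1_280, "SNR_dB": "", "outlier": "",
     "precision_change": "preserved"},  # 1D layer: always BF16
]

buf = io.StringIO()  # in a real run this would be the *_report.csv file
writer = csv.DictWriter(buf, fieldnames=[
    "layer_name", "precision", "params_numel",
    "SNR_dB", "outlier", "precision_change"])
writer.writeheader()
writer.writerows(rows)
print(buf.getvalue())
```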

License

Apache-2.0

