Professional video processing, scene detection, and utility nodes for ComfyUI.
One-hotkey cleanup for messy groups. Select any group(s), hit Shift+Alt+A, and every child node snaps into a left-to-right layered layout based on connection order: sources on the left, sinks on the right, columns ordered by topological depth. Within a column, nodes are stacked top-to-bottom by the mean y of their downstream targets so wires stay roughly parallel. Collapsed nodes right-align inside their column so their output sockets stay close to the next column. The group then resizes to wrap the result. One Ctrl+Z reverts everything. Find it under the TrentNodes menu.
Swap backgrounds in messy PSDs where there's no clean single "background" layer. PSDLayerCompositor now supports three replacement modes:
- single: swap one layer by index (original behavior).
- replace_range: swap a contiguous range of layers [replacement_index..replacement_end_index] with one image, sized to the canvas or the layers' union bbox.
- underlay: paste the replacement under all original layers (escape hatch when you can't pin down the background range).
PSDBackgroundDetect scores layers from the splitter manifest using name regex, full-canvas coverage, opacity, blend mode, and bottom-bias, and outputs a recommended bg_start/bg_end range to wire into the compositor. Use the detect → range → underlay ladder when one mode doesn't quite fit.
v1 limitation: PSD re-export (output_psd_path) only works in single mode. Find them under Trent/PSD.
Neural green screen keying using CorridorKey (Corridor Digital). Instead of producing binary masks, it unmixes true foreground color from the green background, preserving semi-transparent details like hair, motion blur, and out-of-focus edges. Outputs clean foreground, alpha matte, and composite preview. Uses BiRefNet for automatic alpha hint generation when no mask is provided. Find it under Trent/Video.
Apply configurable, temporally coherent degradation to video frames for generating synthetic training pairs. 15 built-in presets (VHS, dashcam, security cam, old film, underwater, and more) plus full custom control over motion blur, defocus, noise, compression artifacts, chromatic aberration, interlacing, rolling shutter, and lens distortion. Find it under Trent/Video.
One-hotkey swap of native ComfyUI video nodes to VHS equivalents. Replaces LoadVideo with VHS_LoadVideo and SaveVideo with VHS_VideoCombine, automatically collapsing intermediate GetVideoComponents and CreateVideo nodes. Works on selected nodes or the entire graph. Hotkey: Shift+V. Find it under the TrentNodes menu.
Pairs with VHS Swap. Once you have a VHS Load Video and a VHS_VideoCombine in the graph, hit Shift+Alt+V to: add (or reuse) a VHS_VideoInfo off the Load's video_info output, convert the Combine's frame_rate widget to an input and wire the source loaded_fps into it, and set the Combine to video/h264-mp4 with crf=13. Selection-aware (uses selected nodes when present, otherwise falls back to the single matching pair in the graph). A single Ctrl+Z reverts everything. Find it under the TrentNodes menu.
Search for "Trent Nodes" in ComfyUI Manager and click Install.
comfy node registry-install trentnodes
cd ComfyUI/custom_nodes
git clone https://github.com/TrentHunter82/TrentNodes.git
cd TrentNodes
pip install -r requirements.txt
All nodes are organized under the Trent/ category for easy navigation.
Chop Cuts Accurate scene detection and video splitting. Automatically detects cuts, fades, and transitions using multi-metric analysis, then exports each scene as a separate MP4 file with a detailed report of cut locations and timestamps.
Video Folder Analyzer Scans directories for video files and generates detailed reports including resolution, frame rate, codec, duration, and file size. Outputs as text, JSON, or markdown.
Latest Video Last N Frames Extracts the final N frames from the most recently modified video in a specified directory. Useful for monitoring render outputs.
Latest Video Final Frame Retrieves the last frame from the newest video file in a folder. Streamlines iterative video generation workflows.
Cross Dissolve with Overlap Creates smooth frame transitions with configurable overlap duration. Blends adjacent frames for professional video effects.
Batch Slowdown GPU-accelerated frame duplication to slow down image, mask, or latent batches. Supports multiple input modes: direct multiplier (2x, 3x, 1.5x), target frame count, or FPS conversion (24fps to 60fps). Features smart decimal distribution for non-integer slowdowns and optional speedup mode for sampling every Nth frame.
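To make the idea of spreading duplicates evenly for a non-integer multiplier concrete, here is a minimal sketch. It is not the node's actual code: the function name `slowdown_indices` and the rounding rule for the output length are assumptions made for illustration.

```python
def slowdown_indices(num_frames: int, factor: float) -> list[int]:
    """Map each output frame to the source frame it duplicates.

    Assumption (not taken from the node's source): the output length is
    round(num_frames * factor) and duplicates are spread evenly, so a 1.5x
    slowdown alternates between two copies and one copy of each frame.
    """
    out_len = round(num_frames * factor)
    # Output frame j reads the source frame at floor(j * num_frames / out_len).
    return [j * num_frames // out_len for j in range(out_len)]


# 4 frames at 1.5x become 6 frames: frames 0 and 2 are shown twice.
print(slowdown_indices(4, 1.5))
```

On a real batch the resulting index list would be used to gather frames along the batch dimension.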
Frame Ramp Boogie GPU-accelerated frame interpolation that inserts blended intermediate frames between consecutive frame pairs. Features configurable easing curves (linear, ease in/out, cubic bezier with presets) and region targeting (full batch, start, middle, end). Creates smooth slow-motion with actual frame blending instead of simple duplication.
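The blending step can be pictured as a linear interpolation between two neighbouring frames, with the blend weight passed through an easing curve. The sketch below uses plain floats in place of image tensors. The names `smoothstep`, `blend_weights`, and `lerp` are hypothetical, and smoothstep is only one common ease-in/out curve, not necessarily the one the node uses.

```python
from typing import Callable


def smoothstep(t: float) -> float:
    """A common ease-in/out curve on [0, 1] (assumed here for illustration)."""
    return t * t * (3.0 - 2.0 * t)


def blend_weights(inserted: int, easing: Callable[[float], float] = smoothstep) -> list[float]:
    """Eased blend weights for `inserted` in-between frames of one frame pair."""
    return [easing(i / (inserted + 1)) for i in range(1, inserted + 1)]


def lerp(a: float, b: float, weight: float) -> float:
    """Linear blend: weight 0 returns a, weight 1 returns b."""
    return a + (b - a) * weight


# Three in-betweens with linear easing sit at 25%, 50%, and 75% of the way.
weights = blend_weights(3, easing=lambda t: t)
in_between = [lerp(0.0, 1.0, w) for w in weights]
```

On image tensors the same blend would typically be a single `torch.lerp(frame_a, frame_b, weight)` call per inserted frame.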
Save Transparent Video Export image batches as video with alpha channel transparency. Three output formats: Animated WebP (good compression, browser-ready), ProRes 4444 MOV (lossless for DaVinci Resolve / After Effects), and PNG image sequence (universal lossless fallback). Alpha is sourced from an optional MASK input, the 4th channel of RGBA images, or defaults to fully opaque. Supports mask auto-resize, single-mask-to-batch broadcast, and configurable quality/FPS. All alpha compositing is GPU-accelerated.
Video Folder Cowboy Directory iterator for video files with natural sorting (vid1 < vid2 < vid10). Browse folders with a built-in file browser dialog, load frames via OpenCV with configurable frame skipping, max frame limits, and start frame offset. Supports glob patterns, sorted subdirectory processing, and configurable index overflow handling (wrap, clamp, error). Returns frames as IMAGE batch plus filename, total video count, file path, frame count, and FPS.
MatAnyone Video Matte Temporally-consistent video matting using MatAnyone (CVPR 2025). Given a single initial mask (or auto-generated via BiRefNet), propagates it across all video frames with memory-based temporal consistency. Produces flicker-free alpha mattes for compositing over chroma key or custom backgrounds. All compositing is GPU-accelerated via torch.lerp.
CorridorKey Green Screen Keyer Neural green screen keying using CorridorKey (Corridor Digital). Instead of producing binary masks, it unmixes true foreground color from the green background, preserving semi-transparent details like hair, motion blur, and out-of-focus edges. Outputs clean straight foreground color, a linear alpha matte, and a composited preview. Uses BiRefNet for automatic alpha hint generation when no mask is provided. Features CNN refiner control, green spill removal, auto-despeckle for tracking markers, edge feathering, and mask expansion. Supports custom background images, transparent RGBA output, and multiple chroma key colors (green, blue, aqua, white, black).
Video Degradation Apply configurable, temporally coherent degradation to video frame batches for generating synthetic training pairs. 15 built-in presets (mild, moderate, severe, phone_indoor, social_media_reupload, zoom_call, dashcam, night_handheld, old_youtube, old_vhs, shaky_handheld, security_cam, livestream, old_film, underwater) or full custom control. Supports motion blur (directional with angle modes), defocus blur (uniform, breathing, rack focus, edge softness), noise (gaussian, poisson, film grain, sensor, mixed), compression artifacts (JPEG, H.264, blockiness), chromatic aberration, temporal flicker, resolution degradation, color degradation, interlacing, rolling shutter, vignette, and lens distortion. Outputs degraded frames plus a JSON degradation map. All operations GPU-accelerated.
Video Layer Ho Down Multi-layer compositing node with interactive drag-to-position canvas preview. Place up to 5 transparent foreground layers onto a background image or video batch. Dynamic layer inputs -- connect one and the next appears automatically (up to 5). Each layer has independent scale, opacity, and blend mode (normal, multiply, screen, overlay, add). Supports RGBA 4-channel images for transparency, automatic batch size alignment (single frame layers repeat across entire video), and partial off-screen placement. All compositing is GPU-accelerated via PyTorch. Features a live canvas with checkerboard transparency indicator, click-to-select layers, drag-to-position, crosshair alignment guides, coordinate display, and a center-reset button.
PSD Layer Splitter
Rasterizes a .psd into per-layer PNGs and writes a _manifest.json capturing index, position, size, opacity, blend mode, visibility, group path, and (v2) bbox_area_ratio, covers_canvas, is_fully_opaque for downstream tools. Supports canvas or cropped layer sizing, optional group extraction, hidden-layer inclusion, and per-kind filters (pixel, type, shape, smartobject, fill, group, adjustment).
PSD Layer Loader
Loads previously-split layers from a folder (the splitter's output) without re-rasterizing. Use this when iterating on downstream nodes; it skips the slow PSD parse. Range-selects via start_index / end_index, optionally loads alpha as a separate mask.
PSD Layer Compositor
Recomposites layers from a splitter folder back into a single image, reading positions/opacity/visibility from _manifest.json. Three replacement modes for swapping out a background: single (swap one layer by index), replace_range (swap a contiguous index range with one image, for messy PSDs where the "background" is multiple stacked layers), and underlay (paste replacement under all originals as an escape hatch). replacement_fit (stretch/fit/cover/center) and range_fit (canvas/union_bbox) control sizing. Optionally re-exports a modified .psd via output_psd_path (single mode only in v1).
PSD Background Detect
Scores layers from a _manifest.json and recommends a contiguous index range that's likely the background. Signals: name regex match (bg|background|backdrop|sky|wall|floor by default), full-canvas coverage, opacity + normal blend mode, bbox area ratio, bottom-bias; penalises text and adjustment layers. Outputs bg_start, bg_end, confidence, and a human-readable rationale you can wire to easy showAnything to see exactly which layers were picked and why. Wire bg_start/bg_end into the compositor's replacement_index/replacement_end_index and switch the compositor to replace_range mode for a fully-automated background swap.
PSD Layer Names
Lists every layer name in a .psd file: no rasterization, no manifest side-effects, just walks the layer tree. Use this to find the exact target_layer_name to feed PSDLayerSaveAsPSD. Optional group inclusion and group-path display.
PSD Layer Save As PSD
Replaces one layer's pixels in an existing .psd with a provided image and saves to a new path (the original is never overwritten). Locates the layer by name (recursively walks groups), preserves the original layer's name, opacity, blend mode, and visibility, and respects non-Latin layer names through the proper Unicode tagged block.
When the "background" isn't a single clean layer, ladder through the three modes:
- Detect → range: wire PSDBackgroundDetect to the compositor's replacement_index/replacement_end_index, set mode to replace_range. The detector's rationale output (piped to easy showAnything) shows which layers it picked and why, so you can sanity-check the call.
- Manual range: if the detector misfires, leave the mode as replace_range and type the indices directly. The detector is a convenience, not a dependency.
- Underlay: when you genuinely can't pin down a range, switch to underlay. The replacement paints under all originals and you toggle visibility on the offending opaque layers in the source PSD.
Enhanced Animation Timing Processor Analyzes animation sequences to detect duplicate frames and replaces them with gray frames for video generation workflows. Features multiple similarity detection methods (hybrid, SSIM, histogram, perceptual), configurable preservation options for sequence first/last frames, and keyframe alignment that automatically inserts padding frames to ensure keyframes land on multiples of 4 (or any configurable multiple) for glitch-free video generation. Outputs include processed frames, duplicate mask, timing report, and removal indices for the companion Frame Remover node.
Animation Frame Remover
Removes padding frames inserted by the Enhanced Animation Timing Processor. Connect the removal_indices output from the processor to automatically strip the temporary padding frames after video generation, returning to the original frame count while preserving the generated content.
Image+Text Grid Creates a grid layout of images with text captions below each. Features auto-grid layout (set images_per_row to 0) that picks optimal columns via ceil(sqrt(n)), aspect-aware cell sizing based on median batch aspect ratio, and automatic centering of the last row when it has fewer images. Configure grid layout with images per row, image size, caption height, font size, padding, and background color. Note: when receiving images from a list-based node (e.g. StringListCowboy), use an ImageListToImageBatch node upstream to collect all images into a single batch before the grid. Perfect for contact sheets, comparison grids, or captioned image galleries.
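The auto-grid rule is small enough to show directly. This is a sketch of the arithmetic only: the column count follows the ceil(sqrt(n)) rule stated above, while the helper names and the last-row centering formula are assumptions for illustration.

```python
import math


def auto_grid(n_images: int) -> tuple[int, int]:
    """Return (columns, rows) with columns = ceil(sqrt(n))."""
    if n_images <= 0:
        return (0, 0)
    cols = math.ceil(math.sqrt(n_images))
    rows = math.ceil(n_images / cols)
    return (cols, rows)


def last_row_offset(n_images: int, cols: int, cell_width: int) -> int:
    """Horizontal shift that centers a partially filled last row (assumed formula)."""
    remainder = n_images % cols
    if remainder == 0:
        return 0
    return (cols - remainder) * cell_width // 2


# 7 images -> 3 columns x 3 rows; the single image in the last row shifts right.
cols, rows = auto_grid(7)
offset = last_row_offset(7, cols, cell_width=100)
```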
Align Stylized Frame
Aligns AI-stylized images (Flux img2img restyles etc.) back to their original source frame. Global alignment uses differentiable affine estimation (FFT phase-correlation seeding plus Adam refinement on contrast-normalized edge maps), recovering translation, scale, and optionally rotation and anisotropic scale to sub-pixel precision, with an ECC fallback that guarantees the output is never worse than the input. Subject-preserving mode uses BiRefNet for segmentation, DWPose shoulder matching (centroid/area fallback for non-person subjects), and pastes the untouched stylized subject at the corrected position. Ghost removal and border-gap fills use big-lama by default (purpose-built removal model, Apache-2.0, ~196MB auto-download from GitHub, single forward pass) with Netflix VOID (Apache-2.0, via ComfyUI core, weights in standard model folders) selectable as a diffusion option. Note that VOID is a video model run on a replicated 5-frame clip for stills, is much slower, and in testing LaMa matched or beat it on single frames; VOID is the intended engine for future video-batch fills. The old sd_inpaint option maps to lama automatically (SD 1.5 backend removed). The score_map visualization shows per-pixel alignment residual before/after. Notes: the difference_map output is a 2x-wide side-by-side diagnostic image, not a same-size frame; with inpaint_method=none the inpaint_mask output excludes the pasted subject so it is safe for an external inpainting pass; batches use frame-0 geometry for all frames.
Cherry Pick Frames Flexible frame selector with multiple modes for extracting specific frames from image batches. Supports first N frames, last N frames, specific indices (comma-separated like "0,5,10,75"), or every Nth frame. Dynamic outputs adjust based on your selection. Perfect for grabbing keyframes, endpoints, or evenly-spaced samples from video batches.
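As a rough illustration of the selection modes, the sketch below turns each mode into a list of frame indices. The function names are hypothetical, and silently skipping out-of-range indices is an assumption; the node may report an error or clamp instead.

```python
def parse_indices(spec: str, batch_size: int) -> list[int]:
    """Parse a comma-separated spec like "0,5,10,75" into valid frame indices."""
    indices = []
    for token in spec.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate trailing commas and stray spaces
        index = int(token)
        if 0 <= index < batch_size:  # assumption: out-of-range entries are skipped
            indices.append(index)
    return indices


def first_n(batch_size: int, n: int) -> list[int]:
    return list(range(min(n, batch_size)))


def last_n(batch_size: int, n: int) -> list[int]:
    return list(range(max(batch_size - n, 0), batch_size))


def every_nth(batch_size: int, n: int) -> list[int]:
    return list(range(0, batch_size, n))
```

For a 20-frame batch, `parse_indices("0,5,10,75", 20)` keeps 0, 5, and 10 and drops 75.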
Bevel/Emboss Effect Applies depth and dimensionality to images through configurable bevel and emboss filters. Includes adjustable angle, depth, and smoothing parameters.
Image Batch Analyzer Comprehensive statistical analysis of image batches. Generates histograms, color distribution charts, and detailed reports on brightness, contrast, and color composition.
Multi-Batch Combine Concatenates multiple image batches into a single output batch. Accepts up to 8 optional inputs - unconnected inputs are simply skipped. Handles dimension mismatches automatically with configurable resize modes: largest (resize all to max dimensions), first (match first batch), or custom (specify target width/height). GPU-accelerated resizing via bilinear, nearest, bicubic, or area interpolation.
Black Bar Cinema Scope Adds cinematic black bars (letterbox/pillarbox) to images for widescreen aspect ratios. Supports standard presets (16:9, 2.35:1 Cinemascope, 2.39:1 Anamorphic, 2.76:1 Ultra Panavision, etc.) plus custom ratio override. GPU-accelerated.
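The bar geometry follows from comparing the target ratio with the frame's own ratio. A minimal sketch, assuming the bars are drawn over the existing frame rather than added as padding (the description above does not say which); `black_bars` is a hypothetical helper name.

```python
def black_bars(width: int, height: int, target_ratio: float) -> tuple[str, int]:
    """Return the bar orientation and the thickness in pixels of each bar."""
    if target_ratio > width / height:
        # Target is wider than the frame: bars go on the top and bottom.
        visible_height = round(width / target_ratio)
        return ("letterbox", (height - visible_height) // 2)
    # Target is narrower than (or equal to) the frame: bars go left and right.
    visible_width = round(height * target_ratio)
    return ("pillarbox", (width - visible_width) // 2)


# A 1920x1080 frame masked to 2.39:1.
kind, thickness = black_bars(1920, 1080, 2.39)
```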
Image Folder Cowboy Directory iterator that loads images with proper natural sorting (img1 < img2 < img10). Fixes common issues with filename ordering by splitting text and numeric chunks. Features configurable index overflow handling (wrap, clamp, error) and sorted subdirectory processing.
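Natural sorting by splitting text and numeric chunks can be sketched with a sort key like the one below. This illustrates the general technique rather than the node's own code; `natural_key` is a hypothetical name, and lowercasing for case-insensitive order is an assumption.

```python
import re


def natural_key(name: str) -> list[tuple[int, object]]:
    """Split a filename into text and number chunks so img10 sorts after img2."""
    chunks = re.split(r"(\d+)", name.lower())
    # Tag each chunk so integers are only ever compared with integers.
    return [(0, int(c)) if c.isdigit() else (1, c) for c in chunks]


files = ["img10.png", "img2.png", "img1.png"]
print(sorted(files))                   # plain string sort puts img10 before img2
print(sorted(files, key=natural_key))  # natural sort orders by numeric value
```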
Easiest Green Screen One-click background removal and chroma key replacement using BiRefNet AI segmentation. Composites foreground over a solid color background (green, blue, aqua, white). Features edge refinement with dilate/erode/feather controls, temporal smoothing for flicker-free video batches, optional custom background images, and resolution presets (512/768/1024). All operations GPU-accelerated.
Grab First Frame Returns the first frame from a batch of images. One input, one output, zero settings.
Just Pad or Crop It Pad or crop an image to match a reference image's dimensions. Each axis is handled independently: axes smaller than the target are padded with configurable gray fill, axes larger are center-cropped. Outputs a binary mask (1.0 = real pixel, 0.0 = padded region). Supports center or top-left alignment.
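Because each axis is handled independently, the logic reduces to a one-dimensional decision. The sketch below computes the crop window and padding for a single axis. The helper name and return layout are hypothetical, and it assumes the alignment option only affects padding, since the description says larger axes are center-cropped.

```python
def pad_or_crop_axis(size: int, target: int, align: str = "center") -> tuple[int, int, int, int]:
    """Return (crop_start, kept_length, pad_before, pad_after) for one axis."""
    if size >= target:
        # Axis is too large: keep a centered window of `target` pixels.
        return ((size - target) // 2, target, 0, 0)
    # Axis is too small: keep everything and pad up to `target`.
    total_pad = target - size
    pad_before = total_pad // 2 if align == "center" else 0
    return (0, size, pad_before, total_pad - pad_before)


# Width 100 -> 64 is cropped; height 50 -> 64 is padded 7 px on each side.
width_plan = pad_or_crop_axis(100, 64)
height_plan = pad_or_crop_axis(50, 64)
```

The output mask would then be 1.0 inside the kept region and 0.0 in the `pad_before` and `pad_after` bands.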
Smart File Transfer (Auto-Rename) Intelligent file management with automatic conflict resolution, checksums, and organized directory structures. Safely transfers files with duplicate detection.
Custom Filename Generator Creates structured filenames using templates with support for timestamps, counters, and metadata variables. Ensures consistent file naming across workflows.
Filename Extractor Parses filenames to extract embedded metadata, timestamps, and structured information. Converts filenames into usable workflow data.
JSON Multi-Line Summary Converts complex JSON data into human-readable multi-line summaries. Formats nested structures for display and logging.
JSON Extractor Extracts specific values from JSON objects using path notation. Simplifies working with structured data in workflows.
Number Counter Generates sequential numbers with configurable start, step, and padding. Essential for batch processing and frame numbering.
Text File Line Loader Loads individual lines from text files by index. Useful for iterating through prompt lists or configuration files.
File List Lists files in a directory with filtering options. Returns file paths for batch processing workflows.
Create Text File Creates text files with custom content. Specify a file path and content to write. Automatically adds .txt extension if none provided. Creates parent directories as needed.
Wan2.1 Frame Adjuster Adjusts the frame count to always satisfy the Wan 4x+1 requirement by appending gray frames to the end of a batch. Before combining video, use a Get Frame Range from Batch node with the original frame count to trim the padding, for fewer headaches when using Wan.
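The 4x+1 rule means the frame count must leave a remainder of 1 when divided by 4 (1, 5, 9, ..., 81, ...). The padding needed is a one-line modular calculation. The helper names below are hypothetical and this is a sketch of the arithmetic, not the node's code.

```python
def wan_padding(num_frames: int) -> int:
    """Gray frames to append so the total has the form 4*k + 1."""
    return (1 - num_frames) % 4


def padded_length(num_frames: int) -> int:
    return num_frames + wan_padding(num_frames)


# 82 frames need 3 gray frames to reach 85 = 4*21 + 1; 81 frames need none.
print(wan_padding(82), padded_length(82))
print(wan_padding(81), padded_length(81))
```

Keeping the original frame count around makes it straightforward to trim the padding off again after generation.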
Latent Aligned Mask Creates precision masks aligned to latent space dimensions. Ensures proper mask scaling for latent-based video and image processing.
Latent Aligned Mask (Advanced) Extended version with additional parameters for fine-tuned mask generation including feathering, inversion, and composite operations.
Latent Aligned Mask (Simple) Streamlined mask creation with minimal inputs for quick latent-aligned masks in simple workflows.
Latent Aligned Mask (Wan) Specialized variant optimized for Wan video model requirements with automatic 4x+1 frame alignment.
Wan Vace Keyframe Builder Dynamic keyframe sequencing for Wan Vace video generation. Features interactive UI with drag-and-drop image inputs, frame-accurate positioning, automatic resizing, and synchronized mask generation. Supports up to 256 frames with customizable filler frames.
Vace Mask AutoComping Composites solid gray over masked areas of input images for Wan VACE inpainting workflows. Feed in an image batch and a mask batch (e.g. from SAM3), and it outputs the original video with gray overlaid on the masked regions plus a matching clean binary mask -- saving you from manually compositing the gray-over-original setup that VACE expects. Features adjustable mask expansion (hard edge, no feather) to grow the inpaint region, and a configurable gray level (default 0.5 matches VACE filler). Handles single-mask-to-batch broadcast, automatic spatial resizing, and GPU-accelerated dilation.
Auto Style Dataset Generates 35 prompt strings for synthetic dataset creation. Reads prompts from an external config file and applies optional prepend/append text to each output. Perfect for batch generation of training data with consistent formatting.
String List Cowboy Lassos strings together into a list with optional prefix/suffix branding. Works like Impact Pack's MakeAnyList but specialized for strings - connect any inputs and they get collected into a string list. Each string gets the prefix prepended and suffix appended. Dynamic inputs expand as you connect more values.
This, That, or The Other Parallel gating node with 3 independent input/output channels. Each input passes to its corresponding output ONLY if truthy (non-None, non-zero, non-empty). Falsy inputs block their downstream path via ExecutionBlocker. Uses lazy evaluation to avoid evaluating inputs until needed. Dynamic inputs expand as you connect (1-3 slots).
First Valid Fallback chain node that outputs the FIRST truthy value. Checks inputs in priority order: first → second → third. Returns the first truthy value found, or blocks if all are falsy. Uses lazy evaluation to skip evaluating later inputs once a truthy value is found. Perfect for providing fallback values like default images, prompts, or parameters.
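The fallback-chain behaviour can be modelled in plain Python by passing each input as a zero-argument callable, so later inputs are only computed if earlier ones turn out falsy. This is an analogy for the lazy evaluation described above, not ComfyUI's execution API. `first_valid` and the thunk convention are assumptions, and returning `None` stands in for blocking the downstream path.

```python
from typing import Any, Callable, Optional


def first_valid(*thunks: Callable[[], Any]) -> Optional[Any]:
    """Evaluate inputs in priority order and return the first truthy value."""
    for thunk in thunks:
        value = thunk()  # evaluated only when every earlier input was falsy
        if value:
            return value
    return None  # stand-in for "block downstream" when every input is falsy


evaluated = []


def tracked(name: str, value: Any) -> Callable[[], Any]:
    """Wrap a value so we can see which inputs were actually evaluated."""
    def thunk() -> Any:
        evaluated.append(name)
        return value
    return thunk


result = first_valid(tracked("first", None), tracked("second", "fallback"), tracked("third", "unused"))
```

After this runs, `evaluated` contains only the first two names: the third input was never touched.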
LoRA Test Prompt Generator Generates 10 test prompts specifically designed to validate different types of LoRA models. Supports four LoRA categories:
- subject_person: Portrait/character LoRAs with varied lighting, poses, and environments
- style: Artistic style LoRAs across diverse subjects and scenes
- product: Object/product LoRAs with studio and lifestyle contexts
- vehicle: Car/vehicle LoRAs covering angles, lighting, and motion
Outputs 10 individual prompt strings plus a combined all_prompts output for easy batch processing. Includes optional quality suffix to append tags like "8k, detailed" to all prompts.
VidScribe MiniCPM Beta GPU-accelerated vision-language model for describing images and video frames using MiniCPM-V 4.5. Features:
- int4 quantization (~6-8GB VRAM)
- Smart frame sampling (auto-selects ~32 frames from longer videos)
- Auto-unload after 60s idle to free VRAM
- System prompt presets (default, detailed, concise, narrator, technical, accessible, creative)
- Three modes: single image, multi-image comparison, video frame sequence with temporal understanding
- Deep thinking mode for more thorough analysis
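Smart frame sampling can be pictured as picking evenly spaced indices across the clip. The sketch below is one plausible scheme under that assumption. The node's actual sampling strategy is not documented here, and `sample_indices` is a hypothetical name.

```python
def sample_indices(total_frames: int, target: int = 32) -> list[int]:
    """Pick up to `target` evenly spaced frame indices, always including both ends."""
    if total_frames <= target:
        return list(range(total_frames))  # short clips are used in full
    step = (total_frames - 1) / (target - 1)
    return [round(i * step) for i in range(target)]


# A 101-frame clip sampled down to 5 frames.
print(sample_indices(101, target=5))
```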
Unload MiniCPM Manually unload MiniCPM model to immediately free VRAM. Connect any output to trigger. Useful when you need GPU memory for other operations without waiting for the 60-second auto-unload timeout.
VRAM Gated Checkpoint Loader Loads a checkpoint only after receiving a VRAM-cleared signal from VidScribe MiniCPM. Ensures the large VLM model is fully unloaded before loading diffusion models.
VRAM Gated VAE Loader Loads a VAE only after receiving a VRAM-cleared signal. Same sequencing pattern as the gated checkpoint loader.
VRAM Gated UNET Loader Loads a UNET model only after receiving a VRAM-cleared signal. Use with FLUX or other UNET-based architectures.
VRAM Gated LoRA Loader (Model Only) Loads a LoRA (model-only) after receiving a VRAM-cleared signal. Applies LoRA weights to a model with configurable strength.
These are canvas-level tools that operate on the ComfyUI graph directly -- no Python backend nodes required. They register as commands with keybindings, menu entries, and selection toolbox buttons.
Grid Paste Duplicate any selection of nodes, groups, reroutes, or subgraphs into an automatically-arranged grid. Select your nodes, hit Ctrl+Shift+;, type how many copies you want, and they appear in a clean grid at your cursor position. The grid auto-calculates a roughly-square layout (e.g. 9 copies = 3x3) with 50px padding, sizes each cell to the bounding box of your selection, and preserves all internal connections between copied nodes. Widget values, node colors, group rectangles -- everything comes along for the ride.
Grid Paste Connected Same grid layout, but every copy's external inputs are wired back to the original source nodes -- the same behavior as Ctrl+Shift+V but applied in bulk. Hit Ctrl+Shift+Alt+; to use this mode. Perfect for scenarios like pasting 6 KSamplers that all need to connect to the same checkpoint loader, or duplicating a ControlNet processing chain where every copy should read from the same source image.
Both modes wrap the entire operation in a single undo transaction, so one Ctrl+Z reverts everything. Maximum 100 copies per operation.
VHS Swap One-hotkey swap of native ComfyUI video nodes to VHS (Video Helper Suite) equivalents. Replaces LoadVideo with VHS_LoadVideo and SaveVideo with VHS_VideoCombine, automatically collapsing intermediate GetVideoComponents and CreateVideo nodes and rewiring all connections. Works on selected nodes or the entire graph. Transfers widget values (filename, fps) and reconnects IMAGE/AUDIO outputs. Requires VHS to be installed. Hotkey: Shift+V, also available in the TrentNodes menu.
Wire VHS Combine
Companion to VHS Swap that finishes the wiring. Drops in (or reuses) a VHS_VideoInfo node off the Load Video's video_info output, converts the VHS_VideoCombine frame_rate widget to an input and connects the source's loaded_fps into it, and sets the Combine's format to video/h264-mp4 with crf=13. Selection-aware: with one VHS Load Video + one VHS Video Combine selected, those are used; otherwise it falls back to the single matching pair in the graph. Whole operation is one undo. Requires VHS to be installed. Hotkey: Shift+Alt+V, also available in the TrentNodes menu.
Organize Group as Grid Lays out a group's child nodes in a clean left-to-right grid based on connection order. Columns are assigned by topological depth (longest path from a source node), so upstream nodes always sit to the left of their downstream targets, exactly mirroring how the wires read. Within a column, nodes are sorted by the mean y of their downstream targets (barycenter heuristic) to minimize wire crossings, with current y as a fallback for sinks and disconnected nodes. Collapsed nodes right-align inside their column so their output socket stays close to the next column instead of leaving a long horizontal gap. The group resizes to wrap the result, and the entire arrange is wrapped in a single undo transaction. Works on multi-selected groups. Hotkey: Shift+Alt+A, also available in the TrentNodes menu.
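The two layout rules, longest-path column assignment and barycenter ordering within a column, can be sketched in a few lines. The tool itself runs on the canvas rather than as a Python node, so this is only an illustration of the algorithm. All function names are hypothetical, and the sketch assumes an acyclic graph whose edges only reference the listed nodes.

```python
from collections import defaultdict


def successors(edges):
    """Build a node -> downstream-targets map from (source, target) pairs."""
    succ = defaultdict(list)
    for src, dst in edges:
        succ[src].append(dst)
    return succ


def column_depths(nodes, edges):
    """Column per node = longest path from any source, via Kahn's algorithm."""
    succ = successors(edges)
    indegree = {n: 0 for n in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    depth = {n: 0 for n in nodes}
    ready = [n for n in nodes if indegree[n] == 0]
    while ready:
        node = ready.pop()
        for target in succ[node]:
            # A target sits at least one column right of every upstream node.
            depth[target] = max(depth[target], depth[node] + 1)
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    return depth


def sort_column(column, edges, y):
    """Order one column by the mean y of downstream targets (own y as fallback)."""
    succ = successors(edges)

    def barycenter(node):
        targets = succ.get(node, [])
        if not targets:
            return y[node]
        return sum(y[t] for t in targets) / len(targets)

    return sorted(column, key=barycenter)


edges = [("a", "c"), ("b", "c"), ("c", "d"), ("a", "d")]
depths = column_depths(["a", "b", "c", "d"], edges)
```

Here `d` lands in column 2 even though `a` connects to it directly, because the longest path `a -> c -> d` decides the column.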
FAL Kling V2V (O3 Pro) Calls the FAL AI Kling O3 Pro video-to-video reference API to generate new video from a reference video and text prompt. Encodes input IMAGE batch frames to mp4, uploads to FAL CDN, and returns the generated video as frames. Supports optional style reference images (@Image1, @Image2) and character/element injection (@Element1, @Element2) with frontal face + reference image pairs. Optional AUDIO input embeds audio into the uploaded video via ffmpeg muxing -- use with keep_audio=True to preserve it in the generated output. Features auto-appending of @tags for connected inputs so you can write natural prompts, plus a built-in @ autocomplete dropdown in the prompt widget that shows available tags based on which inputs are connected. Images are auto-compressed to JPEG and downscaled if needed to stay within FAL's 10 MB upload limit. Costs $0.336 per second of generated video.
Audio Length in Seconds Calculates the duration of an audio input. Returns both the rounded-up integer (always ceiling to the nearest second) and the exact float duration. Handles all ComfyUI audio formats including VideoHelperSuite LazyAudioMap.
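The two outputs are related by a ceiling operation. A minimal sketch, assuming the sample count and sample rate have already been read from the audio input (in ComfyUI the AUDIO type is commonly a dict carrying a `waveform` tensor and a `sample_rate`, but that detail is an assumption here); `audio_length_seconds` is a hypothetical helper.

```python
import math


def audio_length_seconds(num_samples: int, sample_rate: int) -> tuple[int, float]:
    """Return (duration rounded up to whole seconds, exact duration in seconds)."""
    exact = num_samples / sample_rate
    return (math.ceil(exact), exact)


# 1.5 seconds of 44.1 kHz audio reports 2 whole seconds and 1.5 exact.
whole, exact = audio_length_seconds(66150, 44100)
```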
Complete lip sync pipeline for non-human character animation. Converts audio to mouth shapes and composites them onto tracked positions in video frames.
Audio To Phonemes Extracts phonemes from audio using Vosk speech recognition. Returns timestamped phoneme data for mouth shape mapping. Automatically downloads the required Vosk model on first use.
Phoneme To Mouth Shapes Converts phoneme data to a sequence of mouth shape indices (A-H + X for silence). Maps speech sounds to the 9 standard mouth positions used in animation.
Mouth Shape Loader Loads 9 mouth shape images from a folder. Expects files named A.png through H.png plus X.png (silence). Validates all shapes are present and correctly sized.
Mouth Shape Preview Previews mouth shapes with their corresponding phoneme labels. Useful for verifying mouth shape assets before use.
Mouth Shape Compositor Basic compositor that places mouth shapes on frames at a fixed position. Use for static characters or simple animations.
Mouth Shape Compositor (Tracked) Advanced compositor with tracking support. Places mouth shapes at positions determined by either:
- Point tracking: Use tracked (x,y) coordinates from Point Tracker
- Mask tracking: Use per-frame masks from SAM3 to find mouth centroids
Features BiRefNet background removal, scaling, offset adjustment, and optional RGBA output for further compositing.
Creature Lip Sync All-in-one lip sync node combining audio analysis, mouth shape selection, and compositing in a single streamlined node. Ideal for quick character animation setups.
Point Tracker Robust point tracking using pyramidal Lucas-Kanade optical flow. Click a point on frame 1 and track it through the entire video. Features:
- Sub-pixel accuracy with Scharr gradients
- Multi-stage recovery (adaptive template, original template, full-frame search)
- Periodic drift validation against original template
- GPU-accelerated template matching for large search areas
- Configurable window size up to 1025px for full-frame tracking
Point Preview Click-to-pick interface for selecting the initial tracking point. Click anywhere on the image to set coordinates, which pass directly to Point Tracker.
Points To Masks Converts point sequences to gaussian masks for use with mask-based compositing.
Remove Mouth Background Standalone background removal using BiRefNet or color keying. Returns mouth shapes with alpha channel for custom compositing workflows.
- Audio To Phonemes - Extract speech from audio
- Phoneme To Mouth Shapes - Convert to mouth indices
- Mouth Shape Loader - Load your 9 mouth images
- Point Preview - Click to select tracking point
- Point Tracker - Track the point through video
- Mouth Shape Compositor (Tracked) - Composite mouths onto frames
- ComfyUI (latest version recommended)
- Python 3.10+
- opencv-python >= 4.8.0
- numpy >= 1.24.0
- pillow >= 10.0.0
- matplotlib >= 3.7.0
- colorama >= 0.4.6
- vosk >= 0.3.45 (for lip sync speech recognition)
- transformers >= 4.40.0 (for BiRefNet and MiniCPM-V)
- accelerate (for MiniCPM-V model loading)
- fal-client >= 0.4.0 (for FAL AI API nodes)
- requests >= 2.28.0 (for FAL video download)
- timm >= 1.0.0 (for CorridorKey Hiera backbone)
- einops >= 0.8.0 (for CorridorKey tensor operations)
- 69 professional nodes for video, image, audio, API, VLM, flow control, and lip sync workflows
- Canvas tools - Grid Paste, VHS Swap, Wire VHS Combine, and Organize Group as Grid (topological group layout)
- Organized categories - all nodes under Trent/ namespace
- Auto-discovery - drop nodes in nodes/ folder and restart
- Colorful startup banner with load validation
- Comprehensive error checking on initialization
- Registry published - semantic versioning support
# Clone the repository
git clone https://github.com/TrentHunter82/TrentNodes.git
cd TrentNodes
# Install dependencies
pip install -r requirements.txt
# Add new nodes
# Just drop .py files in nodes/ folder - they auto-register!
Pull requests welcome! Please:
- Follow existing code style
- Add docstrings to new nodes
- Test thoroughly before submitting
- Update this README with new nodes
- Issues: GitHub Issues
- Registry: Comfy Registry
- ComfyUI Discord: Join Server
MIT License - see LICENSE for details.
Trent - Trent Films
Made with ❤️ for the ComfyUI community