# **Faster WAN 2.1 IMAGE TO VIDEO WITH CAUSVID, LIGHTX2V, & FUSION-X LoRAs**
- You can use this notebook for basic image to video generation using the Wan2.1 LoRAs listed in the title. For a notebook that also does video upscale, enhancement, face correction/swap, and frame interpolation, visit this link: https://isinse.gumroad.com/l/wan2point1withLoRAs
- To read a guide on using this notebook, see this link: https://penioj.blogspot.com/2025/07/i-dumped-kling-ai-for-these-colab.html  
- If you deselect the `use480p` checkbox, the 720p Wan 2.1 base model will be downloaded and used rather than the 480p Wan 2.1 base model. When used with the `walking to viewers` LoRA, the 720p model is about 3 minutes faster than the 480p model for high definition (720p) videos.
- With the free T4 GPU, the Q5_K_M GGUF 480p model, and the default settings, a 5-second 480p video (frames=81) at 16fps takes less than 10 minutes to generate. A 4-second 720p video (frames=65) takes roughly 22 minutes with the Q4_K_M 480p model and about 19 minutes with the Q4_K_M 720p model. I recommend a more powerful GPU for bigger models, longer videos, and faster generations.
- **To use a LoRA, put its Hugging Face or Civitai download link in one of the `lora_1_download_url` to `lora_3_download_url` fields, set the matching `download_loRA_1` to `download_loRA_3` flag to `True`, and, if the link is from Civitai, enter your Civitai token in `token_if_civitai_url` before running the `Prepare Environment` cell. Remember to describe the main subject of the image and include the LoRA's trigger words in the prompt. For the default walking-forward LoRA link in `lora_2_download_url`, the trigger phrase is 'walking to viewers'. You can get LoRAs from this Hugging Face collection: https://huggingface.co/collections/Remade-AI/wan21-14b-480p-i2v-loras-67d0e26f08092436b585919b and from Civitai: https://civitai.com/models. On Civitai, set the `Wan Video` and `LoRA` filters to see the Wan LoRAs.** You can watch this video to learn how to get and use LoRAs from Hugging Face and Civitai, and how to create your Civitai token: https://youtu.be/49NkIV_QpBM
- Generating a video from this flux image (https://comfyanonymous.github.io/ComfyUI_examples/flux/) with the settings (480x480, 20 steps, 65 frames) using the Q4 GGUF model and the free T4 GPU took about 33 minutes without Teacache (i.e. with `rel_l1_thresh` set to zero in the Teacache settings), and less than 18 minutes with `rel_l1_thresh` set to 0.275, with little loss in quality. Increase `rel_l1_thresh` for faster generation at the cost of quality. For much faster generations, use the causvid, lightx2v, or fusionx model LoRAs. It is recommended that you set `rel_l1_thresh` to zero when using these LoRAs.
- **causvid recommended settings**: cfg_scale=1, steps=4, sampler_name=uni_pc, scheduler=simple, flow_shift=5, strength=0.8
- **lightx2v recommended settings**: cfg_scale=1, steps=4, sampler_name=lcm, scheduler=simple, flow_shift=8, strength=1
- **fusionx recommended settings**: cfg_scale=1, steps=6, sampler_name=uni_pc, scheduler=simple, flow_shift=5, strength=1
- You can enable both lightx2v & fusionx and adjust their strengths until you get a desirable result. fusionx already contains the causvid LoRA, but you can experiment with different combinations.
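
The recommended settings listed above can be kept in one lookup table so they are easy to copy into the `Generate Video` form. This is only a sketch: `LORA_PRESETS` and `preset_for` are illustrative names that do not exist in the notebook, and the values are copied from the three bullets above.

```python
# Recommended settings for each speed-up LoRA, copied from the notes above.
# LORA_PRESETS and preset_for are hypothetical helpers, not notebook functions.
LORA_PRESETS = {
    "causvid": {
        "cfg_scale": 1, "steps": 4, "sampler_name": "uni_pc",
        "scheduler": "simple", "flow_shift": 5, "strength": 0.8,
    },
    "lightx2v": {
        "cfg_scale": 1, "steps": 4, "sampler_name": "lcm",
        "scheduler": "simple", "flow_shift": 8, "strength": 1,
    },
    "fusionx": {
        "cfg_scale": 1, "steps": 6, "sampler_name": "uni_pc",
        "scheduler": "simple", "flow_shift": 5, "strength": 1,
    },
}


def preset_for(name: str) -> dict:
    """Return a copy of the recommended settings for one LoRA."""
    key = name.lower()
    if key not in LORA_PRESETS:
        raise ValueError(f"Unknown LoRA {name!r}; expected one of {sorted(LORA_PRESETS)}")
    # Return a copy so callers can tweak values without changing the table
    return dict(LORA_PRESETS[key])


settings = preset_for("lightx2v")
print(settings["steps"], settings["sampler_name"], settings["flow_shift"])
```

When combining lightx2v and fusionx, these values are a starting point only; the strengths still need adjusting by eye.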


In [None]:
# @markdown # 💥1. Prepare Environment

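# clear_output() is called throughout this cell, so import it before first use
from IPython.display import clear_output

# Install PyTorch and dependencies
!pip install torch==2.6.0 torchvision==0.21.0
!pip install -q torchsde einops diffusers accelerate xformers==0.0.29.post2 triton==3.2.0 sageattention
!pip install av spandrel albumentations insightface onnx opencv-python segment_anything ultralytics onnxruntime
# pip install does not accept -y (that flag belongs to pip uninstall)
!pip install onnxruntime-gpu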
clear_output()

# Clone ComfyUI repositories
%cd /content
!git clone https://github.com/Isi-dev/ComfyUI
clear_output()
%cd /content/ComfyUI/custom_nodes
!git clone https://github.com/Isi-dev/ComfyUI_GGUF.git
clear_output()
!git clone https://github.com/Isi-dev/ComfyUI_KJNodes.git
clear_output()

# Install custom nodes requirements
%cd /content/ComfyUI/custom_nodes/ComfyUI_GGUF
!pip install -r requirements.txt
clear_output()
%cd /content/ComfyUI/custom_nodes/ComfyUI_KJNodes
!pip install -r requirements.txt
clear_output()

# Clone Practical-RIFE for frame interpolation
%cd /content
!git clone https://github.com/Isi-dev/Practical-RIFE
%cd /content/Practical-RIFE
!pip install git+https://github.com/rk-exxec/scikit-video.git@numpy_deprecation
!mkdir -p /content/Practical-RIFE/train_log

# Download RIFE training scripts and flownet
!wget -q https://huggingface.co/Isi99999/Frame_Interpolation_Models/resolve/main/4.25/train_log/IFNet_HDv3.py -O /content/Practical-RIFE/train_log/IFNet_HDv3.py
!wget -q https://huggingface.co/Isi99999/Frame_Interpolation_Models/resolve/main/4.25/train_log/RIFE_HDv3.py -O /content/Practical-RIFE/train_log/RIFE_HDv3.py
!wget -q https://huggingface.co/Isi99999/Frame_Interpolation_Models/resolve/main/4.25/train_log/refine.py -O /content/Practical-RIFE/train_log/refine.py
!wget -q https://huggingface.co/Isi99999/Frame_Interpolation_Models/resolve/main/4.25/train_log/flownet.pkl -O /content/Practical-RIFE/train_log/flownet.pkl
clear_output()

# Install system packages
%cd /content/ComfyUI
!apt -y install -qq aria2 ffmpeg
clear_output()

# Environment variables
use480p = True # @param {"type":"boolean"}
import os
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'expandable_segments:True'

# Python imports
from pathlib import Path
import torch, numpy as np, cv2, gc, sys, random, subprocess, shutil, imageio
from PIL import Image
from google.colab import files
from IPython.display import display, HTML, Image as IPImage

sys.path.insert(0, '/content/ComfyUI')
from comfy import model_management
from nodes import (
    CheckpointLoaderSimple, CLIPLoader, CLIPTextEncode, VAEDecode, VAELoader,
    KSampler, UNETLoader, LoadImage, SaveImage, CLIPVisionLoader,
    CLIPVisionEncode, LoraLoaderModelOnly, ImageScale
)
from custom_nodes.ComfyUI_GGUF.nodes import UnetLoaderGGUF
from custom_nodes.ComfyUI_KJNodes.nodes.model_optimization_nodes import (
    WanVideoTeaCacheKJ, PathchSageAttentionKJ, SkipLayerGuidanceWanVideo
)
from comfy_extras.nodes_model_advanced import ModelSamplingSD3
from comfy_extras.nodes_images import SaveAnimatedWEBP
from comfy_extras.nodes_video import SaveWEBM
from comfy_extras.nodes_wan import WanImageToVideo
from comfy_extras.nodes_upscale_model import UpscaleModelLoader

# Utility functions for downloading models
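def download_with_aria2c(link, folder="/content/ComfyUI/models/loras"):
    os.makedirs(folder, exist_ok=True)
    # Drop any query string so the saved filename keeps its real extension
    filename = link.split("/")[-1].split("?")[0]
    # Quote the URL and paths so characters like '&' and '?' are not interpreted by the shell
    command = f'aria2c --console-log-level=error -c -x 16 -s 16 -k 1M "{link}" -d "{folder}" -o "{filename}"'
    print("Executing download command:", command)
    get_ipython().system(command)
    return filename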

def download_civitai_model(civitai_link, civitai_token, folder="/content/ComfyUI/models/loras"):
    import time
    os.makedirs(folder, exist_ok=True)
    model_id = civitai_link.split("/models/")[1].split("?")[0]
    civitai_url = f"https://civitai.com/api/download/models/{model_id}?type=Model&format=SafeTensor"
    if civitai_token:
        civitai_url += f"&token={civitai_token}"
    filename = f"model_{time.strftime('%Y%m%d_%H%M%S')}.safetensors"
    full_path = os.path.join(folder, filename)
    os.system(f"wget --max-redirect=10 --show-progress \"{civitai_url}\" -O \"{full_path}\"")
    return filename

def download_lora(link, folder="/content/ComfyUI/models/loras", civitai_token=None):
    if "civitai.com" in link.lower():
        if not civitai_token:
            raise ValueError("Civitai token required")
        return download_civitai_model(link, civitai_token, folder)
    else:
        return download_with_aria2c(link, folder)

def model_download(url: str, dest_dir: str, filename: str = None, silent: bool = True) -> str:
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    if filename is None:
        filename = url.split('/')[-1].split('?')[0]
    cmd = ['aria2c','--console-log-level=error','-c','-x','16','-s','16','-k','1M','-d',dest_dir,'-o',filename,url]
    if silent:
        cmd.extend(['--summary-interval=0','--quiet'])
        print(f"Downloading {filename}...", end=' ', flush=True)
    subprocess.run(cmd, check=True)
    if silent: print("Done!")
    return filename

# LoRA & model configuration
model_quant = "Q4_K_M" # @param ["Q4_0","Q4_K_M", "Q5_K_M", "Q6_K", "Q8_0"]
download_loRA_1, download_loRA_2, download_loRA_3 = False, False, False
lora_1_download_url = "Put your loRA here"
lora_2_download_url = "https://civitai.com/api/download/models/1636239?type=Model&format=SafeTensor"
lora_3_download_url = "https://huggingface.co/Remade-AI/Rotate/resolve/main/rotate_20_epochs.safetensors"
token_if_civitai_url = "Put your civitai token here"

# Output model storage
lora_1, lora_2, lora_3 = None, None, None
valid_extensions = {'.safetensors', '.ckpt', '.pt', '.pth', '.sft'}

if download_loRA_1:
    lora_1 = download_lora(lora_1_download_url, civitai_token=token_if_civitai_url)
if lora_1 and not any(lora_1.lower().endswith(ext) for ext in valid_extensions): lora_1=None
if download_loRA_2:
    lora_2 = download_lora(lora_2_download_url, civitai_token=token_if_civitai_url)
if lora_2 and not any(lora_2.lower().endswith(ext) for ext in valid_extensions): lora_2=None
if download_loRA_3:
    lora_3 = download_lora(lora_3_download_url, civitai_token=token_if_civitai_url)
if lora_3 and not any(lora_3.lower().endswith(ext) for ext in valid_extensions): lora_3=None

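# WAN2.1 model download
if use480p:
    if model_quant == "Q4_K_M":
        dit_model = model_download("https://huggingface.co/Isi99999/Wan2.1BasedModels/resolve/main/wan2.1-i2v-14b-480p-Q4_K_M.gguf","/content/ComfyUI/models/diffusion_models")
    else:
        # Only the Q4_K_M URL is configured in this cell; fail early instead of leaving dit_model undefined
        raise ValueError(f"No 480p download URL is configured for model_quant={model_quant!r}; select 'Q4_K_M'.")
else:
    # The 720p branch always downloads Q4_K_M, whatever model_quant is set to
    dit_model = model_download("https://huggingface.co/Isi99999/Wan2.1BasedModels/resolve/main/wan2.1-i2v-14b-720p-Q4_K_M.gguf","/content/ComfyUI/models/diffusion_models")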

# Other required model downloads
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/text_encoders/umt5_xxl_fp8_e4m3fn_scaled.safetensors -d /content/ComfyUI/models/text_encoders -o umt5_xxl_fp8_e4m3fn_scaled.safetensors
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/vae/wan_2.1_vae.safetensors -d /content/ComfyUI/models/vae -o wan_2.1_vae.safetensors
!aria2c --console-log-level=error -c -x 16 -s 16 -k 1M https://huggingface.co/Comfy-Org/Wan_2.1_ComfyUI_repackaged/resolve/main/split_files/clip_vision/clip_vision_h.safetensors -d /content/ComfyUI/models/clip_vision -o clip_vision_h.safetensors
clear_output()

print("✅ Environment Setup Complete!")
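
The cell above derives a filename from each download URL and then discards it if the extension is not a known model format. A minimal sketch of that check, using only the standard library, might look like the following. `filename_from_url` and `is_valid_lora_filename` are hypothetical names introduced here for illustration; the extension set is the `valid_extensions` set from the cell.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Same extensions as valid_extensions in the setup cell
VALID_EXTENSIONS = {'.safetensors', '.ckpt', '.pt', '.pth', '.sft'}


def filename_from_url(url: str) -> str:
    """Return the last path component of a URL, ignoring any query string."""
    return PurePosixPath(urlparse(url).path).name


def is_valid_lora_filename(filename: str) -> bool:
    """True if the filename ends in one of the accepted model extensions."""
    return bool(filename) and PurePosixPath(filename).suffix.lower() in VALID_EXTENSIONS


hf_url = "https://huggingface.co/Remade-AI/Rotate/resolve/main/rotate_20_epochs.safetensors"
civitai_url = "https://civitai.com/api/download/models/1636239?type=Model&format=SafeTensor"

for url in (hf_url, civitai_url):
    name = filename_from_url(url)
    print(name, is_valid_lora_filename(name))
```

The Civitai API link carries no file extension in its path, which is presumably why `download_civitai_model` saves the file under a timestamped `.safetensors` name instead of parsing the URL.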


In [None]:

# @markdown # 💥2. Upload Image
file_uploaded = upload_image()
display_upload = False # @param {type:"boolean"}
if display_upload:
    if file_uploaded.lower().endswith(('.png', '.jpg', '.jpeg')):
        display(IPImage(filename=file_uploaded))
    else:
        print("Image format cannot be displayed.")
# @markdown ---

Saving 1mode.png to 1mode.png
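
The `upload_image()` helper called in this cell is not defined in this excerpt. As a rough idea of what such a helper could do, here is a sketch built on `google.colab.files.upload()`, which returns a dict mapping each uploaded filename to its bytes. The function `save_uploaded_image`, the destination folder `/content/ComfyUI/input`, and the choice to keep only the first file are all assumptions, not the notebook's actual implementation.

```python
import os


def save_uploaded_image(uploaded: dict, dest_dir: str = "/content/ComfyUI/input") -> str:
    """Write the first uploaded file into dest_dir and return its path.

    `uploaded` maps filename -> bytes, the shape returned by
    google.colab.files.upload(). dest_dir is an assumed location.
    """
    if not uploaded:
        raise ValueError("No file was uploaded")
    name, data = next(iter(uploaded.items()))
    os.makedirs(dest_dir, exist_ok=True)
    # basename() guards against names that contain directory components
    path = os.path.join(dest_dir, os.path.basename(name))
    with open(path, "wb") as fh:
        fh.write(data)
    return path


def upload_image() -> str:
    """Open the Colab upload dialog and save the chosen image (Colab only)."""
    # Imported lazily so save_uploaded_image stays usable outside Colab
    from google.colab import files
    return save_uploaded_image(files.upload())
```

Returning the saved path matches how the next cell uses the result, since `file_uploaded` is passed to `generate_video` as `image_path`.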


In [None]:

# @markdown # 💥3. Generate Video
import time
start_time = time.time()
# @markdown ### Video Settings
positive_prompt = "The beautiful woman walks forward and smiles as the camera pulls out to reveal more of the scene. She is walking to viewers" # @param {"type":"string"}
negative_prompt = "色调艳丽，过曝，静态，细节模糊不清，字幕，风格，作品，画作，画面，静止，整体发灰，最差质量，低质量，JPEG压缩残留，丑陋的，残缺的，多余的手指，画得不好的手部，画得不好的脸部，畸形的，毁容的，形态畸形的肢体，手指融合，静止不动的画面，杂乱的背景，三条腿，背景人很多，倒着走" # @param {"type":"string"}
width = 256 # @param {"type":"number"}
height = 144 # @param {"type":"number"}
seed = 2994728291 # @param {"type":"integer"}
steps = 4 # @param {"type":"integer", "min":1, "max":100}
cfg_scale = 1 # @param {"type":"number", "min":1, "max":20}
sampler_name = "lcm" # @param ["uni_pc", "uni_pc_bh2", "ddim","euler", "euler_cfg_pp", "euler_ancestral", "euler_ancestral_cfg_pp", "heun", "heunpp2","dpm_2", "dpm_2_ancestral","lms", "dpm_fast", "dpm_adaptive", "dpmpp_2s_ancestral", "dpmpp_2s_ancestral_cfg_pp", "dpmpp_sde", "dpmpp_sde_gpu","dpmpp_2m", "dpmpp_2m_cfg_pp", "dpmpp_2m_sde", "dpmpp_2m_sde_gpu", "dpmpp_3m_sde", "dpmpp_3m_sde_gpu", "ddpm", "lcm","ipndm", "ipndm_v", "deis", "res_multistep", "res_multistep_cfg_pp", "res_multistep_ancestral", "res_multistep_ancestral_cfg_pp","gradient_estimation", "er_sde", "seeds_2", "seeds_3"]
scheduler = "simple" # @param ["simple","normal","karras","exponential","sgm_uniform","ddim_uniform","beta","linear_quadratic","kl_optimal"]
frames = 16 # @param {"type":"integer", "min":1, "max":120}
fps = 8 # @param {"type":"integer", "min":1, "max":60}
# output_format = "mp4" # @param ["mp4", "webm"]
#fps = 16
output_format = "mp4"
overwrite_previous_video = False # @param {type:"boolean"}

# @markdown ### Model Configuration
use_sage_attention = False # @param {type:"boolean"}
# use_sage_attention = True
use_flow_shift = False # @param {type:"boolean"}
flow_shift = 3 # @param {"type":"slider","min":0.0,"max":100.0,"step":0.01}


# @markdown ### Wan2.1 Based Models LoRA Configuration
use_causvid = False # @param {type:"boolean"}
causvid_Strength = 0.8 # @param {"type":"slider","min":-100,"max":100,"step":0.01}
causvid_steps = 4 # @param {"type":"integer", "min":1, "max":20}
use_lightx2v = False # @param {type:"boolean"}
lightx2v_Strength = 0.5 # @param {"type":"slider","min":-100,"max":100,"step":0.01}
lightx2v_steps = 4 # @param {"type":"integer", "min":1, "max":20}
use_fusionx = False # @param {type:"boolean"}
fusionx_Strength = 0.5 # @param {"type":"slider","min":-100,"max":100,"step":0.01}
fusionx_steps = 4 # @param {"type":"integer", "min":1, "max":20}


# @markdown ### LoRA Configuration
use_lora = False # @param {type:"boolean"}
LoRA_Strength = 1.0 # @param {"type":"slider","min":-100,"max":100,"step":0.01}
use_lora2 = False # @param {type:"boolean"}
LoRA_Strength2 = 1.0 # @param {"type":"slider","min":-100,"max":100,"step":0.01}
use_lora3 = False # @param {type:"boolean"}
LoRA_Strength3 = 1.0 # @param {"type":"slider","min":-100,"max":100,"step":0.01}

# @markdown ### Teacache Settings
rel_l1_thresh = 0 # @param {"type":"slider","min":0.0,"max":10,"step":0.001}
start_percent = 0.2 # @param {"type":"slider","min":0.0,"max":1.0,"step":0.01}
end_percent = 1.0 # @param {"type":"slider","min":0.0,"max":1.0,"step":0.01}

# @markdown ---

import random
seed = seed if seed != 0 else random.randint(0, 2**32 - 1)
print(f"Using seed: {seed}")

# with torch.inference_mode():
generate_video(
    image_path=file_uploaded,
    LoRA_Strength=LoRA_Strength,
    rel_l1_thresh=rel_l1_thresh,
    start_percent=start_percent,
    end_percent = end_percent,
    positive_prompt=positive_prompt,
    negative_prompt=negative_prompt,
    width=width,
    height=height,
    seed=seed,
    steps=steps,
    cfg_scale=cfg_scale,
    sampler_name=sampler_name,
    scheduler=scheduler,
    frames=frames,
    fps=fps,
    output_format=output_format,
    overwrite=overwrite_previous_video,
    use_lora = use_lora,
    use_lora2=use_lora2,
    LoRA_Strength2=LoRA_Strength2,
    use_lora3=use_lora3,
    LoRA_Strength3=LoRA_Strength3,
    use_causvid=use_causvid,
    causvid_Strength=causvid_Strength,
    causvid_steps=causvid_steps,
    use_lightx2v=use_lightx2v,
    lightx2v_Strength=lightx2v_Strength,
    lightx2v_steps=lightx2v_steps,
    use_fusionx=use_fusionx,
    fusionx_Strength=fusionx_Strength,
    fusionx_steps=fusionx_steps,
    use_sage_attention = use_sage_attention,
    enable_flow_shift = use_flow_shift,
    shift = flow_shift
)

end_time = time.time()
duration = end_time - start_time
mins, secs = divmod(duration, 60)
print(f"Seed: {seed}")
print(f"✅ Generation completed in {int(mins)} min {secs:.2f} sec")

clear_memory()
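
The `frames` and `fps` values in the cell above together determine the clip length. The sketch below shows the arithmetic and, as an assumption, snaps a requested frame count to the form 4n+1. The notes at the top of this notebook use 81 and 65 frames, which both fit that pattern, but this excerpt does not state that the pattern is required. `clip_seconds` and `nearest_4n_plus_1` are hypothetical helper names.

```python
def clip_seconds(frames: int, fps: float) -> float:
    """Length of the generated clip in seconds."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return frames / fps


def nearest_4n_plus_1(frames: int) -> int:
    """Snap a frame count to the nearest value of the form 4n + 1 (assumed convention)."""
    return max(1, 4 * round((frames - 1) / 4) + 1)


for requested in (16, 65, 81):
    snapped = nearest_4n_plus_1(requested)
    print(requested, snapped, clip_seconds(snapped, 16))
```

Counts that already have the form 4n+1, such as 65 and 81, pass through unchanged.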

In [None]:

# @markdown # 💥4. Apply Frame Interpolation
# interpolate_optional_video=False # @param {type:"boolean"}

# if interpolate_optional_video:
#     try:
#         output_path = oIoutput_path
#     except NameError:
#         pass


import glob
from IPython.display import Video as outVid
import time
start_time = time.time()

FRAME_MULTIPLIER = 2 # @param {"type":"number"}
vid_fps = 30 # @param {"type":"number"}
crf_value = 17 # @param {"type":"slider","min":0,"max":51,"step":1}

print(f"Converting video to {vid_fps} fps...")

%cd /content/Practical-RIFE

# Suppress ALSA errors
os.environ["XDG_RUNTIME_DIR"] = "/tmp"
os.environ["SDL_AUDIODRIVER"] = "dummy"

# Disable warnings from ffmpeg about missing audio
os.environ["PYGAME_HIDE_SUPPORT_PROMPT"] = "1"
os.environ["FFMPEG_LOGLEVEL"] = "quiet"

!python3 inference_video.py --multi={FRAME_MULTIPLIER} --fps={vid_fps} --video={output_path} --scale={1}
video_folder = "/content/ComfyUI/output/"

# Find the latest MP4 file
video_files = glob.glob(os.path.join(video_folder, "*.mp4"))

if video_files:
    latest_video = max(video_files, key=os.path.getctime)
    # !ffmpeg -i "{latest_video}" -vcodec libx264 -crf 18 -preset fast output_converted.mp4 -loglevel error -y
    !ffmpeg -i "{latest_video}" -vcodec libx264 -crf {crf_value} -preset fast output_converted.mp4 -loglevel error -y

    print(f"Displaying video: {latest_video}")
    # display(outVid("output_converted.mp4", embed=True))
    display_video("output_converted.mp4")
    # displayVid(outVid(latest_video, embed=True))
else:
    print("❌ No video found in output/")

del video_files

end_time = time.time()
duration = end_time - start_time
mins, secs = divmod(duration, 60)
print(f"✅ Frame Interpolation completed in {int(mins)} min {secs:.2f} sec")

clear_memory()

%cd /content/ComfyUI

# @markdown ---
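
The last part of the cell above finds the newest MP4 in the output folder and re-encodes it with ffmpeg. The same steps can be sketched in plain Python, building the command as an argument list so paths with spaces need no shell quoting. `latest_file` and `build_reencode_command` are hypothetical names; the ffmpeg options are the ones the cell already uses, and the sketch picks the newest file by modification time rather than by `getctime`.

```python
import glob
import os


def latest_file(folder: str, pattern: str = "*.mp4"):
    """Return the most recently modified file matching pattern, or None."""
    candidates = glob.glob(os.path.join(folder, pattern))
    return max(candidates, key=os.path.getmtime) if candidates else None


def build_reencode_command(src: str, dst: str = "output_converted.mp4", crf: int = 17) -> list:
    """Build the ffmpeg argument list for an H.264 re-encode."""
    if not 0 <= crf <= 51:
        raise ValueError("crf must be between 0 and 51")
    return [
        "ffmpeg", "-y", "-loglevel", "error",
        "-i", src,
        "-vcodec", "libx264", "-crf", str(crf), "-preset", "fast",
        dst,
    ]


cmd = build_reencode_command("/content/ComfyUI/output/example.mp4", crf=17)
print(" ".join(cmd))
# To run it: subprocess.run(cmd, check=True)
```

Lower `crf` values give higher quality and larger files, which is why the slider in the cell spans 0 to 51.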