Optimization suggestions #60

@sl33pyC01E

Have you considered the Wan variant of the Tiny AutoEncoder for Hunyuan (TAEHV)? It's a drop-in replacement for the VAE you use, and it needs only a fraction of the memory and latency. I swapped it in myself with great success.

and

Did you know there are GGUF quantization options for Wan-based models? I'm testing compatibility now, but I see no reason why a Q4 quant wouldn't run.
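
For a rough sense of what a Q4 quant buys, here is a back-of-envelope sketch of weight memory under GGUF block quantization. It relies on the Q4_0 and Q8_0 layouts (32 weights per block plus one fp16 scale); the 14e9 parameter count is an illustrative assumption, not a measurement of this project's model, and the estimate covers weights only (no activations, VAE, or text encoder).

```python
# Back-of-envelope weight-memory estimate for GGUF block quantization.
# Q4_0 and Q8_0 pack 32 weights per block and store one fp16 scale per block.
BLOCK_SIZE = 32
SCALE_BITS = 16  # one fp16 scale per block

BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": (BLOCK_SIZE * 8 + SCALE_BITS) / BLOCK_SIZE,  # 8.5 bits per weight
    "Q4_0": (BLOCK_SIZE * 4 + SCALE_BITS) / BLOCK_SIZE,  # 4.5 bits per weight
}


def weight_gib(n_params: float, fmt: str) -> float:
    """Approximate size of the weights alone, in GiB, for a given format."""
    return n_params * BITS_PER_WEIGHT[fmt] / 8 / 2**30


if __name__ == "__main__":
    n_params = 14e9  # assumption: illustrative parameter count only
    for fmt in BITS_PER_WEIGHT:
        print(f"{fmt}: {weight_gib(n_params, fmt):.1f} GiB")
```

Under these assumptions Q4_0 weights take 4.5/16 of the fp16 footprint, a bit under a third, which is why a 24 GB card looks plausible.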

Combining the two would likely bring real-time inference within reach of a single RTX 4090 and trajectory rollout within reach of a laptop.

Just some thoughts; I appreciate the project regardless.