- Nanyang Technological University (NTU)
- Singapore
- https://iceclear.github.io
- https://orcid.org/0000-0001-7025-3626
- @Iceclearwjy
Stars
🔥 InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity
Implementation of [CVPR 2025] "DiffSensei: Bridging Multi-Modal LLMs and Diffusion Models for Customized Manga Generation"
CAMixerSR: Only Details Need More “Attention” (CVPR 2024)
[ICLR 2025] Rectified Diffusion: Straightness Is Not Your Need
HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo
RichHF-18K dataset contains rich human feedback labels we collected for our CVPR'24 paper: https://arxiv.org/pdf/2312.10240, along with the file name of the associated labeled images (no urls or im…
[CVPR 2025] Teaching Large Language Models to Regress Accurate Image Quality Scores using Score Distribution
Wan: Open and Advanced Large-Scale Video Generative Models
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Solve Visual Understanding with Reinforced VLMs
mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)
Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis (arXiv, 2024)
Video Generation Foundation Models: https://saiyan-world.github.io/goku/
[CVPR 2025] MatAnyone: Stable Video Matting with Consistent Memory Propagation
[ICLR 2025] Pyramidal Flow Matching for Efficient Video Generative Modeling
Official PyTorch implementation of paper "CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up".
PyTorch code and models for the DINOv2 self-supervised learning method.
This repository provides the code and model checkpoints for AIMv1 and AIMv2 research projects.
Source code to the ProPainter website
Arbitrary-steps Image Super-resolution via Diffusion Inversion (CVPR 2025)
Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
RAG that intelligently adapts to your use case, data, and queries
victorchall / genmoai-smol
Forked from genmoai/mochi
The best OSS video generation models
[ICLR'25 Oral] Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
A comparison tool to aid image/video enhancement research