- University of Toronto, Canada
- https://wuziyi616.github.io/
- @Dazitu_616
Starred repositories
Diffusion Model as a Noise-Aware Latent Reward Model for Step-Level Preference Optimization
This package contains the original 2012 AlexNet code.
This repo contains the code for 1D tokenizer and generator
[ARXIV'25] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
[CVPR 2025] VGGT: Visual Geometry Grounded Transformer
Sky-T1: Train your own O1 preview model within $450
Qwen2.5-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Official implementation of LiFT: Leveraging Human Feedback for Text-to-Video Model Alignment.
Improving Video Generation with Human Feedback
HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo
Machine Learning Engineering Open Book
Official PyTorch implementation for "Large Language Diffusion Models"
PyTorch implementation of FractalGen https://arxiv.org/abs/2502.17437
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism
AC3D: Analyzing and Improving 3D Camera Control in Video Diffusion Transformers
Perceptual video quality assessment based on multi-method fusion.
Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers"
Wan: Open and Advanced Large-Scale Video Generative Models
This is the official implementation of SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation.
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
Official Repository of "Unpaired Image-to-Image Translation via Neural Schrödinger Bridge" (ICLR 2024)
Dual Diffusion Implicit Bridges for Image-to-Image Translation. ICLR 2023.