- CUHK
- Hong Kong, China
- https://wbhu.github.io/
- @wbhu_cuhk
- in/huwenbo
Stars
A Datacenter Scale Distributed Inference Serving Framework
Cosmos-Transfer1 is a world-to-world transfer model designed to bridge the perceptual divide between simulated and real-world environments.
[Single/Sparse View-to-Scene on a 4090(24G)] VistaDream: Sampling multiview consistent images for single-view scene reconstruction
Stereo Any Video: Temporally Consistent Stereo Matching
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
[Arxiv'25] BlobCtrl: A Unified and Flexible Framework for Element-level Image Generation and Editing
[ARXIV'25] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
[CVPR 2025] MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors
[CVPR 2025] VGGT: Visual Geometry Grounded Transformer
LBM: Latent Bridge Matching for Fast Image-to-Image Translation ✨
Implementation of "EasyControl: Adding Efficient and Flexible Control for Diffusion Transformer"
Next-gen fast plotting library running on WGPU using the pygfx rendering engine
Official implementation of Inductive Moment Matching
PE3R: Perception-Efficient 3D Reconstruction. Take 2-3 photos with your phone, upload them, wait a few minutes, and then start exploring your 3D world via text!
Any-length Video Inpainting and Editing with Plug-and-Play Context Control
Official implementation of TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
Simulating the Real World: Survey & Resources, which contains our survey "Simulating the Real World: A Unified Survey of Multimodal Generative Models" and Awesome-Text2X-Resources. Watch this repos…
ConceptAttention: A method for interpreting multi-modal diffusion transformers.
An open-source implementation of the GameNGen paper
[CVPR 2025] MonSter: Marry Monodepth to Stereo Unleashes Power