- Max Planck Institute for Intelligent Systems
- https://ps.is.mpg.de/person/mkocabas
Stars
Official PyTorch Implementation of Opt-CWM: Self-Supervised Learning of Motion Concepts by Optimizing Counterfactuals.
Simplifying reinforcement learning for complex game environments
Physics-based Noise Modeling for Extreme Low-light Photography (CVPR 2020 Oral & TPAMI 2021)
This package contains the original 2012 AlexNet code.
SpatialLM: Large Language Model for Spatial Understanding
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
Open-Sora: Democratizing Efficient Video Production for All
🦉 OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation
Official implementation of TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
HunyuanVideo-I2V: A Customizable Image-to-Video Model based on HunyuanVideo
Toolkit for linearizing PDFs for LLM datasets/training
[CVPR 2025] Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass
[CVPR 2025] Sparse Voxels Rasterization: Real-time High-fidelity Radiance Field Rendering
[CVPR 2025] DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention
Enjoy the magic of Diffusion models!
[CVPR 2025] MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors
FastVideo is a lightweight framework for accelerating large video diffusion models.
SkyReels V1: The first and most advanced open-source human-centric video foundation model
Official Implementation for our NeurIPS 2024 paper, "Don't Look Twice: Run-Length Tokenization for Faster Video Transformers".
Code and weights for the paper "Cluster and Predict Latent Patches for Improved Masked Image Modeling"
VGGSfM: Visual Geometry Grounded Deep Structure From Motion
Frontier Multimodal Foundation Models for Image and Video Understanding
An efficient video loader for deep learning with smart shuffling that's super easy to digest