
- Peking University
- Peking, China
- https://muzhancun.github.io/
Stars
Awesome In-Context RL: A curated list of In-Context Reinforcement Learning
Official Implementation of Paper "ROCKET-2: Steering Visuomotor Policy via Cross-View Goal Alignment"
DexGraspVLA: A Vision-Language-Action Framework Towards General Dexterous Grasping
Training and inference code for "Training Language Models for Social Deduction with Multi-Agent Reinforcement Learning"
🚀 Efficient implementations of state-of-the-art linear attention models in Torch and Triton
Displaying code blocks with line numbers and line highlighting.
Pretraining code for a large-scale depth-recurrent language model
[NeurIPS 2024] Pre-Trained Multi-Goal Transformers with Prompt Optimization for Efficient Online Adaptation
Official PyTorch Implementation of "History-Guided Video Diffusion"
Probabilistic programming with HuggingFace language models
👌[ICLR 2025] TFG-Flow: Training-free Guidance in Multimodal Generative Flow
⚡⚡ Lightning Fast (~300TPS) Reinforcement Learning environment on latest Minecraft 🏝️
[ICLR 2025] LAPA: Latent Action Pretraining from Videos
Cosmos is a world model development platform that consists of world foundation models, tokenizers and video processing pipeline to accelerate the development of Physical AI at Robotics & AV labs.
Re-implementation of pi0 vision-language-action (VLA) model from Physical Intelligence
The Large-scale Manipulation Platform for Scalable and Intelligent Embodied Systems
Implementation of Denoising Diffusion Probabilistic Model in Pytorch
Programmatic generation of high-quality CVs
Implementation of a framework for Genie2 in Pytorch
Pytorch implementation of "Genie: Generative Interactive Environments", Bruce et al. (2024).
[ICLR25] High-performance Image Tokenizers for VAR and AR
Code of Pyramidal Flow Matching for Efficient Video Generative Modeling
MineStudio: A Streamlined Package for Minecraft AI Agent Development
Official code for "Behavior Generation with Latent Actions" (ICML 2024 Spotlight)
[ECCV2024] Grounded Multimodal Large Language Model with Localized Visual Tokenization