Popular repositories

  1. Tune-A-Video Public

    [ICCV 2023] Tune-A-Video: One-Shot Tuning of Image Diffusion Models for Text-to-Video Generation

    Python · 4.3k stars · 389 forks

  2. Awesome-Video-Diffusion Public

    A curated list of recent diffusion models for video generation, editing, and various other applications.

    4.3k stars · 253 forks

  3. computer_use_ootb Public

    Out-of-the-box (OOTB) GUI Agent for Windows and macOS

    Python · 1.5k stars · 154 forks

  4. Show-o Public

    [ICLR 2025] Repository for Show-o, One Single Transformer to Unify Multimodal Understanding and Generation.

    Python · 1.4k stars · 59 forks

  5. ShowUI Public

    [CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.

    Python · 1.2k stars · 76 forks

  6. Show-1 Public

    [IJCV] Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation

    Python · 1.1k stars · 56 forks

Repositories

Showing 10 of 92 repositories
  • livecc Public

    LiveCC: Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025)

    Python · 9 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Apr 23, 2025
  • omg Public

    Open Multimodal Gathering workshop @ NUS

    JavaScript · 0 stars · 0 forks · 0 open issues · 0 open pull requests · Updated Apr 23, 2025
  • videollm-online Public

    VideoLLM-online: Online Video Large Language Model for Streaming Video (CVPR 2024)

    Python · 448 stars · Apache-2.0 · 45 forks · 24 open issues · 0 open pull requests · Updated Apr 23, 2025
  • FAR Public

    Code for: "Long-Context Autoregressive Video Modeling with Next-Frame Prediction"

    Python · 181 stars · MIT · 5 forks · 0 open issues · 0 open pull requests · Updated Apr 23, 2025
  • Awesome-Video-Diffusion Public

    A curated list of recent diffusion models for video generation, editing, and various other applications.

    4,316 stars · 253 forks · 1 open issue · 0 open pull requests · Updated Apr 23, 2025
  • Awesome-Robotics-Diffusion Public

    A curated list of recent robot learning papers incorporating diffusion models for robotics tasks.

    141 stars · 5 forks · 0 open issues · 0 open pull requests · Updated Apr 16, 2025
  • ROICtrl Public

    Code for [CVPR 2025] ROICtrl: Boosting Instance Control for Visual Generation

    Python · 107 stars · 0 forks · 1 open issue · 0 open pull requests · Updated Apr 16, 2025
  • computer_use_ootb Public

    Out-of-the-box (OOTB) GUI Agent for Windows and macOS

    Python · 1,516 stars · Apache-2.0 · 154 forks · 29 open issues · 7 open pull requests · Updated Apr 15, 2025
  • GUI-Thinker Public

    Enable AI to control your PC. This repo includes the WorldGUI Benchmark and GUI-Thinker Agent Framework.

    Python · 62 stars · 5 forks · 1 open issue · 0 open pull requests · Updated Apr 11, 2025
  • Awesome-MLLM-Hallucination Public

    📖 A curated list of resources dedicated to hallucination of multimodal large language models (MLLM).

    656 stars · 24 forks · 1 open issue · 0 open pull requests · Updated Apr 9, 2025
