Popular repositories

  1. VLM-R1 Public

    Solve Visual Understanding with Reinforced VLMs

    Python 5.7k 366

  2. OmAgent Public

    Build multimodal language agents for fast prototype and production

    Python 2.6k 279

  3. OmDet Public

    Real-time and accurate open-vocabulary end-to-end object detection

    Python 1.3k 111

  4. RS5M Public

    RS5M: a large-scale vision language dataset for remote sensing [TGRS]

    Python 287 13

  5. awesome-RSVLM Public

    Collection of Remote Sensing Vision-Language Models

    141 4

  6. VL-CheckList Public

    Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]

    Python 135 4

Repositories

Showing 10 of 19 repositories
  • VLM-FO1 Public

    VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs

    Python 19 1 0 0 Updated Nov 3, 2025
  • VLM-R1 Public

    Solve Visual Understanding with Reinforced VLMs

    Python 5,670 Apache-2.0 366 161 0 Updated Oct 21, 2025
  • ZoomEye Public

    [EMNLP-2025 Oral] ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration

    Python 59 5 6 0 Updated Aug 28, 2025
  • om-ai-lab.github.io Public

    Official website for the org

    HTML 0 1 0 0 Updated Aug 15, 2025
  • ImageRAG Public

    Enhancing Ultrahigh Resolution Remote Sensing Imagery Analysis With ImageRAG [GRSM]

    Jupyter Notebook 22 MIT 0 1 0 Updated Jul 10, 2025
  • open-agent-leaderboard Public

    Reproducible Language Agent Research

    Python 29 2 0 0 Updated Jun 25, 2025
  • VLM-R1.github.io Public

    Blog Site for VLM-R1

    HTML 1 0 0 0 Updated Mar 19, 2025
  • OmAgent Public

    Build multimodal language agents for fast prototype and production

    Python 2,569 Apache-2.0 279 6 12 Updated Mar 19, 2025
  • RS5M Public

    RS5M: a large-scale vision language dataset for remote sensing [TGRS]

    Python 287 MIT 13 4 0 Updated Mar 17, 2025
  • OmChat Public

    A suite of multimodal language models that are powerful and efficient

    Python 17 Apache-2.0 3 0 0 Updated Jan 13, 2025

People

This organization has no public members.