@om-ai-lab

Om AI Lab

Open Multimodal AGI Research

Popular repositories

  1. VLM-R1 Public

    Solve Visual Understanding with Reinforced VLMs

    Python · 5.4k stars · 338 forks

  2. OmAgent Public

    Build multimodal language agents for fast prototype and production

    Python · 2.5k stars · 283 forks

  3. OmDet Public

    Real-time and accurate open-vocabulary end-to-end object detection

    Python · 1.3k stars · 110 forks

  4. RS5M Public

    RS5M: a large-scale vision language dataset for remote sensing [TGRS]

    Python · 272 stars · 11 forks

  5. awesome-RSVLM Public

    Collection of Remote Sensing Vision-Language Models

    138 stars · 4 forks

  6. VL-CheckList Public

    Evaluating Vision & Language Pretraining Models with Objects, Attributes and Relations. [EMNLP 2022]

    Python · 132 stars · 5 forks

Repositories

Showing 10 of 18 repositories
  • om-ai-lab.github.io Public

    Official website for the org

    HTML · 0 stars · 0 forks · 0 issues · 0 pull requests · Updated Jul 17, 2025
  • ImageRAG Public

    Enhancing Ultrahigh Resolution Remote Sensing Imagery Analysis With ImageRAG [GRSM]

    Jupyter Notebook · 14 stars · MIT · 0 forks · 0 issues · 0 pull requests · Updated Jul 10, 2025
  • VLM-R1 Public

    Solve Visual Understanding with Reinforced VLMs

    Python · 5,439 stars · Apache-2.0 · 338 forks · 145 issues · 1 pull request · Updated Jun 26, 2025
  • open-agent-leaderboard Public

    Reproducible Language Agent Research

    Python · 30 stars · 2 forks · 0 issues · 0 pull requests · Updated Jun 25, 2025
  • VLM-R1.github.io Public

    Blog Site for VLM-R1

    HTML · 1 star · 0 forks · 0 issues · 0 pull requests · Updated Mar 20, 2025
  • OmAgent Public

    Build multimodal language agents for fast prototype and production

    Python · 2,540 stars · Apache-2.0 · 283 forks · 6 issues · 13 pull requests · Updated Mar 19, 2025
  • RS5M Public

    RS5M: a large-scale vision language dataset for remote sensing [TGRS]

    Python · 272 stars · MIT · 11 forks · 4 issues · 0 pull requests · Updated Mar 17, 2025
  • OmChat Public

    A suite of multimodal language models that are powerful and efficient

    Python · 18 stars · Apache-2.0 · 3 forks · 0 issues · 0 pull requests · Updated Jan 13, 2025
  • OmAgentDocs Public

    HTML · 3 stars · 4 forks · 0 issues · 0 pull requests · Updated Jan 8, 2025
  • ZoomEye Public

    ZoomEye: Enhancing Multimodal LLMs with Human-Like Zooming Capabilities through Tree-Based Image Exploration

    Python · 48 stars · 3 forks · 4 issues · 0 pull requests · Updated Jan 1, 2025
