@DAMO-NLP-SG

Language Technology Lab at Alibaba DAMO Academy

Pinned

  1. DAMO-SeaLLMs Public

    [ACL 2024 Demo] SeaLLMs - Large Language Models for Southeast Asia

    JavaScript · 160 stars · 15 forks

  2. VideoLLaMA3 Public

    Frontier Multimodal Foundation Models for Image and Video Understanding

    Jupyter Notebook · 623 stars · 42 forks

  3. CoI-Agent Public

    Official code for paper: Chain of Ideas: Revolutionizing Research via Novel Idea Development with LLM Agents

    Python · 440 stars · 27 forks

  4. Inf-CLIP Public

    [CVPR 2025] The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A highly memory-efficient CLIP training scheme.

    Python · 234 stars · 11 forks

  5. multimodal_textbook Public

    The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"

    Python · 145 stars · 16 forks

  6. VideoRefer Public

    [CVPR 2025] The code for "VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM"

    Python · 169 stars · 8 forks

Repositories

Showing 10 of 50 repositories
  • MMR1 Public Forked from LengSicong/MMR1

    MMR1: Advancing the Frontiers of Multimodal Reasoning

    0 stars · Apache-2.0 · 2 forks · 0 issues · 0 pull requests · Updated Mar 12, 2025
  • VideoLLaMA3 Public

    Frontier Multimodal Foundation Models for Image and Video Understanding

    Jupyter Notebook · 623 stars · Apache-2.0 · 42 forks · 43 issues (2 issues need help) · 2 pull requests · Updated Mar 12, 2025
  • VideoRefer Public

    [CVPR 2025] The code for "VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM"

    Python · 169 stars · 8 forks · 5 issues · 0 pull requests · Updated Mar 3, 2025
  • FineReason Public

    FineReason: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving

    Python · 2 stars · 0 forks · 1 issue · 0 pull requests · Updated Mar 3, 2025
  • LongPO Public

    [ICLR 2025] LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization

    Python · 29 stars · 4 forks · 0 issues · 0 pull requests · Updated Feb 27, 2025
  • VideoLLaMA2 Public

    VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

    Python · 1,105 stars · Apache-2.0 · 74 forks · 72 issues (2 issues need help) · 0 pull requests · Updated Jan 23, 2025
  • multimodal_textbook Public

    The official repository for "2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining"

    Python · 145 stars · Apache-2.0 · 16 forks · 3 issues · 1 pull request · Updated Jan 18, 2025
  • Inf-CLIP Public

    [CVPR 2025] The official CLIP training codebase of Inf-CL: "Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss". A highly memory-efficient CLIP training scheme.

    Python · 234 stars · Apache-2.0 · 11 forks · 2 issues · 0 pull requests · Updated Jan 16, 2025
  • CoI-Agent Public

    Official code for paper: Chain of Ideas: Revolutionizing Research via Novel Idea Development with LLM Agents

    Python · 440 stars · Apache-2.0 · 27 forks · 6 issues · 0 pull requests · Updated Jan 15, 2025
  • LLM-R2 Public
    Python · 41 stars · 8 forks · 6 issues · 0 pull requests · Updated Nov 26, 2024
