Pinned

  1. OLMo Public

    Modeling, training, eval, and inference code for OLMo

    Python 5.6k 605

  2. dolma Public

    Data and tools for generating and inspecting OLMo pre-training data.

    Python 1.2k 140

  3. ai2thor Public

    An open-source platform for Visual AI.

    C# 1.4k 238

  4. olmocr Public

    Toolkit for linearizing PDFs for LLM datasets/training

    Python 12.4k 856

  5. OLMoE Public

    OLMoE: Open Mixture-of-Experts Language Models

    Jupyter Notebook 751 67

Repositories

Showing 10 of 498 repositories
  • olmocr Public

    Toolkit for linearizing PDFs for LLM datasets/training

    Python 12,386 Apache-2.0 856 87 17 Updated May 16, 2025
  • datamap-rs Public

    Data mapping framework for Rust stuff

    Rust 3 1 0 0 Updated May 15, 2025
  • ScienceWorld Public

    ScienceWorld is a text-based virtual environment centered around accomplishing tasks from the standardized elementary science curriculum.

    Scala 261 Apache-2.0 28 13 (1 issue needs help) 0 Updated May 15, 2025
  • olmo-cookbook Public

    OLMost every training recipe you need to perform data interventions with the OLMo family of models.

    Python 26 Apache-2.0 6 0 9 Updated May 16, 2025
  • OLMo Public

    Modeling, training, eval, and inference code for OLMo

    Python 5,606 Apache-2.0 605 45 59 Updated May 16, 2025
  • ai2thor Public

    An open-source platform for Visual AI.

    C# 1,375 Apache-2.0 238 251 4 Updated May 15, 2025
  • beaker-gantry Public

    Gantry streamlines running Python experiments in Beaker by managing containers and boilerplate for you

    Python 23 Apache-2.0 6 2 2 Updated May 15, 2025
  • Python 9 Apache-2.0 2 11 10 Updated May 15, 2025
  • open-instruct Public

    AllenAI's post-training codebase

    Python 2,961 Apache-2.0 384 16 5 Updated May 15, 2025
  • rslearn Public

    A tool for developing remote sensing datasets and models.

    Python 35 Apache-2.0 4 9 6 Updated May 15, 2025