Pinned

  1. OLMo Public

    Modeling, training, eval, and inference code for OLMo

    Python 5.5k 587

  2. dolma Public

    Data and tools for generating and inspecting OLMo pre-training data.

    Python 1.2k 131

  3. ai2thor Public

    An open-source platform for Visual AI.

    C# 1.3k 233

  4. olmocr Public

    Toolkit for linearizing PDFs for LLM datasets/training

    Python 10.8k 725

  5. OLMoE Public

    OLMoE: Open Mixture-of-Experts Language Models

    Jupyter Notebook 699 60

Repositories

Showing 10 of 494 repositories
  • olmocr Public

    Toolkit for linearizing PDFs for LLM datasets/training

    Python 10,791 Apache-2.0 725 69 18 Updated Apr 2, 2025
  • open-instruct Public

    AllenAI's post-training codebase

    Python 2,866 Apache-2.0 369 19 13 Updated Apr 2, 2025
  • Python 6 Apache-2.0 2 6 3 Updated Apr 2, 2025
  • OLMo-core Public

    PyTorch building blocks for the OLMo ecosystem

    Python 183 Apache-2.0 33 1 18 Updated Apr 2, 2025
  • OLMo-in-loop-evals Public

    Code for in-loop evaluation tasks used by the OLMo training team

    Python 5 Apache-2.0 2 0 1 Updated Apr 2, 2025
  • ai2thor Public

    An open-source platform for Visual AI.

    C# 1,323 Apache-2.0 233 248 4 Updated Apr 2, 2025
  • Holodeck Public

    CVPR 2024: Language Guided Generation of 3D Embodied AI Environments.

    Python 394 Apache-2.0 38 15 0 Updated Apr 2, 2025
  • olmo-cookbook Public

    OLMost every training recipe you need to perform data interventions with the OLMo family of models.

    Python 17 Apache-2.0 5 1 5 Updated Apr 2, 2025
  • dolma Public

    Data and tools for generating and inspecting OLMo pre-training data.

    Python 1,174 Apache-2.0 131 26 18 Updated Apr 1, 2025
  • codescientist Public

    CodeScientist: An automated scientific discovery system for code-based experiments

    Python 88 Apache-2.0 10 0 0 Updated Apr 1, 2025