oLMpics

This work was performed at the Allen Institute for Artificial Intelligence (AI2).

This project is constantly being improved. Contributions, comments and suggestions are welcome!

This repository contains the code for our paper oLMpics - On what Language Model Pre-training Captures.


Probe Data

  • Always-Never: train, dev
  • Age-Comparison: train, dev
  • Objects-Comparison: train, dev
  • Antonym Negation: train, dev
  • Property Conjunction: train, dev
  • Taxonomy Conjunction: train, dev
  • Encyclopedic Composition: train, dev
  • Multi-Hop Composition: train, dev
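
Each probe file is gzipped JSON Lines (e.g. the multi_choice_language_modeling_train.jsonl.gz files referenced by the training commands below). The following is a minimal sketch for peeking at a local copy; the field names (question/stem/choices/answerKey) follow the common ARC-style multi-choice layout and are assumptions, as is the file name:

    import gzip
    import json

    # Hypothetical local copy of a probe file; adjust the path to wherever
    # you downloaded the data.
    path = "multi_choice_language_modeling_train.jsonl.gz"

    with gzip.open(path, "rt", encoding="utf-8") as f:
        for i, line in enumerate(f):
            example = json.loads(line)
            # Assumed ARC-style fields: a question stem plus a list of choices.
            stem = example["question"]["stem"]
            choices = [c["text"] for c in example["question"]["choices"]]
            print(f"{stem} -> {choices} (gold: {example['answerKey']})")
            if i == 2:  # peek at the first few examples only
                break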


Setting up a virtual environment

  1. First, clone the repository:

    git clone https://github.com/alontalmor/oLMpics
  2. Change your directory to where you cloned the files:

    cd oLMpics
  3. Create a virtual environment with Python 3.6 or above:

    virtualenv venv --python=python3.7 (or python3.7 -m venv venv, or conda create -n olmpics python=3.7)
  4. Activate the virtual environment. You will need to activate the venv environment in each terminal in which you want to use oLMpics:

    source venv/bin/activate (or source venv/bin/activate.csh, or conda activate olmpics)
  5. Install the required dependencies:

    pip install -r requirements.txt
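
Once the environment is active, a quick import check confirms the install worked. The package names below are assumed from the AllenNLP-based commands later in this README; exact versions depend on requirements.txt:

    # Assumed core dependencies of this AllenNLP-based repo.
    import allennlp
    import torch

    print("allennlp", allennlp.__version__, "| torch", torch.__version__)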

Multi-choice Masked Language Model (MC-MLM) training

The AllenNLP train command is used for training. The command below uses bert-base-uncased as a simple test on the multi-choice language modeling baseline task (it should currently reach an accuracy of ~0.99 at the last epoch).

python -m train allennlp_models/config/transformer_masked_lm.jsonnet -s ../models_cache/roberta_local -o "{'dataset_reader': {'num_choices': '3', 'sample': '250','pretrained_model':'bert-base-uncased'}, 'validation_dataset_reader': {'num_choices': '3', 'sample': '-1', 'pretrained_model':'bert-base-uncased'}, 'iterator': {'batch_size': 8}, 'random_seed': '3', 'train_data_path': 's3://olmpics/challenge/multi_choice_language_modeling/multi_choice_language_modeling_train.jsonl.gz', 'model':{'pretrained_model': 'bert-base-uncased'}, 'trainer': {'cuda_device': -1, 'num_gradient_accumulation_steps': 2, 'learning_rate_scheduler': {'num_steps_per_epoch': 15}, 'num_epochs': '4'}, 'validation_data_path': 's3://olmpics/challenge/multi_choice_language_modeling/multi_choice_language_modeling_dev.jsonl.gz'}" --include-package allennlp_models

  • 'pretrained_model': 'bert-base-uncased': one of the following LMs: bert-base-uncased, bert-large-uncased-whole-word-masking, bert-large-uncased, roberta-base, roberta-large
  • 'sample': '250': number of training examples to sample (used to produce a learning curve)
  • 'num_choices': '3': number of answer choices, which depends on the task
  • 'random_seed': an integer; we use (1, 2, 3, 4, 5, 6)

See allennlp_models/config/transformer_masked_lm.jsonnet for other options. A sketch for sweeping these overrides follows below.
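
Because 'sample' and 'random_seed' are the knobs behind the learning curves, it is convenient to script the sweep. The sketch below is a convenience wrapper, not part of the repo: the sweep values and output directory naming are illustrative, and the overrides shown are partial (the data paths and trainer settings come from the full command above or from the jsonnet config):

    import json
    import subprocess

    # Illustrative sweep; the paper uses seeds (1, 2, 3, 4, 5, 6) and several
    # sample sizes per task.
    for sample in [250, 500, 1000]:
        for seed in [1, 2, 3]:
            overrides = {
                "dataset_reader": {
                    "num_choices": "3",
                    "sample": str(sample),
                    "pretrained_model": "bert-base-uncased",
                },
                "random_seed": str(seed),
            }
            subprocess.run(
                ["python", "-m", "train",
                 "allennlp_models/config/transformer_masked_lm.jsonnet",
                 # Hypothetical serialization dir, one per (sample, seed) pair.
                 "-s", f"../models_cache/bert_sample{sample}_seed{seed}",
                 "-o", json.dumps(overrides),
                 "--include-package", "allennlp_models"],
                check=True,
            )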

Training the multi-choice question answering setup (MC-QA)

python -m train allennlp_models/config/transformer_mc_qa.jsonnet -s ../models_cache/roberta_local -o "{'dataset_reader': {'num_choices': '3', 'sample': '100','pretrained_model':'bert-base-uncased'}, 'validation_dataset_reader': {'num_choices': '3', 'sample': '-1', 'pretrained_model':'bert-base-uncased'}, 'iterator': {'batch_size': 8}, 'random_seed': '3', 'train_data_path': 's3://olmpics/challenge/multi_choice_language_modeling/multi_choice_language_modeling_train.jsonl.gz', 'model':{'pretrained_model': 'bert-base-uncased'}, 'trainer': {'cuda_device': -1, 'num_gradient_accumulation_steps': 2, 'learning_rate_scheduler': {'num_steps_per_epoch': 15}, 'num_epochs': '4'}, 'validation_data_path': 's3://olmpics/challenge/multi_choice_language_modeling/multi_choice_language_modeling_dev.jsonl.gz'}" --include-package allennlp_models

Training the MLM-Baseline

python -m train allennlp_models/config/mlm_baseline.jsonnet -s models/esim_local -o "{'dataset_reader':{'sample': '2000'}, 'iterator': {'batch_size': '16'}, 'random_seed': '2', 'train_data_path': '', 'trainer': {'cuda_device': -1, 'num_epochs': '60', 'num_serialized_models_to_keep': 0, 'optimizer': {'lr': 0.0004}}, 'validation_data_path': ''}" --include-package allennlp_models

Training the ESIM-Baseline

The ESIM baseline can be run as follows:

python -m train allennlp_models/config/esim_baseline.jsonnet -s models/esim_local -o "{'dataset_reader':{'sample': '500'}, 'iterator': {'batch_size': '16'}, 'random_seed': '2', 'train_data_path': '', 'trainer': {'cuda_device': -1, 'num_epochs': '60', 'num_serialized_models_to_keep': 0, 'optimizer': {'lr': 0.0004}}, 'validation_data_path': ''}" --include-package allennlp_models


A caching infrastructure is used, so make sure you have enough disk space; you can control the cache directory with the OLMPICS_CACHE_ROOT environment variable (see olmpics/common/).
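
For example, to point the cache at a larger disk (the path is hypothetical; you can equivalently export the variable in your shell before running the train commands):

    import os

    # Hypothetical cache location; set before any olmpics code resolves the
    # cache root (see olmpics/common/).
    os.environ["OLMPICS_CACHE_ROOT"] = "/mnt/data/olmpics_cache"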

