pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference
Mandar Joshi
Latest commit 0922b8f May 3, 2019


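This repository contains the code for replicating results from the paper pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference (Joshi et al., 2018; arXiv:1810.08854).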

Getting Started

  • Install the Python 3 requirements: pip install -r requirements.txt

Using pretrained pair2vec embeddings

  • Download pretrained pair2vec: ./
    • If you want to reproduce results from the paper on QA/NLI, please use the following:
      • Download and extract the pretrained models tar file
      • Run evaluation:
    python -m evaluate [--output-file OUTPUT_FILE]
                       --cuda-device 0
                       --include-package endtasks
                       ARCHIVE_FILE INPUT_FILE
    • If you want to train your own QA/NLI model:
    python -m train <config_file> -s <serialization_dir> --include-package endtasks

See the experiments directory for relevant config files.
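
The evaluation and training invocations above can also be assembled from a script, which helps when running several configs in a row. The following is a minimal sketch, not part of the repository: the helper names build_evaluate_command and build_train_command are hypothetical, and the file names in the usage lines are placeholders. Only the flags shown in the commands above are used.

```python
import shlex


def build_evaluate_command(archive_file, input_file, output_file=None, cuda_device=0):
    """Return the evaluation command from this README as an argument list.

    Hypothetical helper: it only mirrors the flags shown above.
    """
    cmd = ["python", "-m", "evaluate"]
    if output_file is not None:
        # --output-file is optional in the usage line above.
        cmd += ["--output-file", str(output_file)]
    cmd += [
        "--cuda-device", str(cuda_device),
        "--include-package", "endtasks",
        str(archive_file),
        str(input_file),
    ]
    return cmd


def build_train_command(config_file, serialization_dir):
    """Return the QA/NLI training command from this README as an argument list."""
    return [
        "python", "-m", "train",
        str(config_file),
        "-s", str(serialization_dir),
        "--include-package", "endtasks",
    ]


if __name__ == "__main__":
    # Placeholder arguments; substitute real paths before running anything.
    print(shlex.join(build_evaluate_command("ARCHIVE_FILE", "INPUT_FILE")))
    print(shlex.join(build_train_command("CONFIG_FILE", "SERIALIZATION_DIR")))
```

Each returned list can be passed to subprocess.run(cmd, check=True) from the repository root once the placeholders are replaced with real paths.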

Training your own embeddings

  • Download the preprocessed corpus if you want to train pair2vec from scratch: ./
  • Training: the command below starts training, which typically takes 7-10 days. It takes a config file and a directory in which to save checkpoints.
    python -m embeddings.train --config experiments/pair2vec_train.json --save_path <directory>
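
Because a run of this length is usually launched unattended, it can help to wrap the training command above in a small launcher that prepares the checkpoint directory and captures output in a log file. The following is a minimal sketch under stated assumptions: the function names and the train.log file name are hypothetical, and it is an assumption that creating the checkpoint directory in advance is harmless. Only the command-line flags shown above are used.

```python
import subprocess
from pathlib import Path


def embeddings_train_command(save_path, config="experiments/pair2vec_train.json"):
    """Return the embedding-training command from this README as an argument list."""
    return [
        "python", "-m", "embeddings.train",
        "--config", str(config),
        "--save_path", str(save_path),
    ]


def launch_training(save_path, config="experiments/pair2vec_train.json", dry_run=True):
    """Create the checkpoint directory, then run (or just return) the command.

    Hypothetical wrapper: with dry_run=True nothing is executed, so the
    command can be inspected first.
    """
    save_dir = Path(save_path)
    # Assumption: pre-creating the directory does not interfere with training.
    save_dir.mkdir(parents=True, exist_ok=True)
    cmd = embeddings_train_command(save_path, config)
    if dry_run:
        return cmd
    # train.log is a hypothetical name; stdout and stderr are merged into it.
    with open(save_dir / "train.log", "w") as log:
        subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT, check=True)
    return cmd


if __name__ == "__main__":
    # Dry run only: prints the argument list without starting training.
    print(launch_training("checkpoints/pair2vec", dry_run=True))
```

Calling launch_training(..., dry_run=False) from the repository root would start the actual run and block until it finishes.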


  • If you use the code, please cite the following paper:
@article{DBLP:journals/corr/abs-1810-08854,
  author    = {Mandar Joshi and
               Eunsol Choi and
               Omer Levy and
               Daniel S. Weld and
               Luke Zettlemoyer},
  title     = {pair2vec: Compositional Word-Pair Embeddings for Cross-Sentence Inference},
  journal   = {CoRR},
  volume    = {abs/1810.08854},
  year      = {2018},
  url       = {},
  archivePrefix = {arXiv},
  eprint    = {1810.08854},
  timestamp = {Wed, 31 Oct 2018 14:24:29 +0100},
  biburl    = {},
  bibsource = {dblp computer science bibliography}
}
