Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems

This is the official code repository for the CERBERUS model from our long paper in Findings of EMNLP 2022, "Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems".

Citation

[Paper] [Amazon Science] [Preprint]

@inproceedings{matsubara2022ensemble,
  title={{Ensemble Transformer for Efficient and Accurate Ranking Tasks: an Application to Question Answering Systems}},
  author={Matsubara, Yoshitomo and Soldaini, Luca and Lind, Eric and Moschitti, Alessandro},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2022},
  pages={7259--7272},
  year={2022}
}

Implementation

Our CERBERUS implementation is based on transformers.ElectraForSequenceClassification and tested under the following conditions:

  • Python 3.6 - 3.7
  • torch==1.6.0
  • transformers==3.0.2
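
These pinned versions can be installed with pip, for example (assuming a Python 3.6 or 3.7 environment is already set up):

pip install torch==1.6.0 transformers==3.0.2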

ASNQ: CERBERUS 11B-3B1

This CERBERUS model consists of 11 shared encoder body layers and 3 ranking heads (1 head layer each), learned from 3 teacher AS2 models: ALBERT-XXLarge, ELECTRA-Large, and RoBERTa-Large, each fine-tuned on the ASNQ dataset.

Download and unzip cerberus11-3_albert_electra_roberta_asnq.zip, then run:

from transformers import AutoTokenizer
from cerberus import CerberusModel

# Tokenizer for the ELECTRA-base encoder shared by the 11 body layers
tokenizer = AutoTokenizer.from_pretrained('google/electra-base-discriminator')

# Load the CERBERUS checkpoint; no head configs are needed here (None),
# and 11 is the number of shared encoder body layers
start_ckpt_file_path = './cerberus11-3_albert_electra_roberta_asnq/cerberus_model.pt'
model = CerberusModel(None, 11, start_ckpt_file_path=start_ckpt_file_path)
model.eval()

# Encode a (question, answer sentence) pair and score it with the model
input_dict = tokenizer([('question', 'answer sentence')],
                       return_tensors='pt',
                       max_length=128,
                       truncation=True)
output = model(**input_dict)
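
These models are meant for ranking: given a question, several candidate answer sentences are scored and sorted. The following sketch is not part of the original repository; it reuses the tokenizer and model loaded above and assumes the forward pass returns a logits tensor of shape (batch_size, 2) with the positive class at index 1. The actual output structure of CerberusModel may differ.

import torch

question = 'who wrote hamlet?'
candidates = ['Hamlet is a tragedy written by William Shakespeare.',
              'Hamlet is set in Denmark.']

# Tokenize all (question, candidate) pairs as a single padded batch
batch = tokenizer([(question, c) for c in candidates],
                  return_tensors='pt',
                  max_length=128,
                  truncation=True,
                  padding=True)

with torch.no_grad():
    logits = model(**batch)  # assumed shape: (batch_size, 2)

# Rank candidates by the positive-class probability (assumed at index 1)
scores = torch.softmax(logits, dim=-1)[:, 1]
for sentence, score in sorted(zip(candidates, scores.tolist()),
                              key=lambda pair: pair[1], reverse=True):
    print(f'{score:.4f}\t{sentence}')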

WikiQA: CERBERUS 11B-3B1

This CERBERUS model consists of 11 shared encoder body layers and 3 ranking heads (1 head layer each), learned from 3 teacher AS2 models: ALBERT-XXLarge, ELECTRA-Large, and RoBERTa-Large, each fine-tuned first on the ASNQ dataset and then on the WikiQA dataset.

Download and unzip cerberus11-3_albert_electra_roberta_asnq_wikiqa.zip and asnq-electra-base-discriminator, then run:

from transformers import AutoTokenizer
from cerberus import CerberusModel

# ELECTRA-base checkpoint fine-tuned on ASNQ; it also provides the tokenizer
asnq_ckpt_dir_path = './asnq-electra-base-discriminator'
tokenizer = AutoTokenizer.from_pretrained(asnq_ckpt_dir_path)

# One config per ranking head; all three heads are initialized from the
# same ASNQ fine-tuned ELECTRA checkpoint
head_configs = [
    {'model': {'pretrained_model_name_or_path': asnq_ckpt_dir_path},
     'base_model': 'electra', 'classifier': 'classifier'},
    {'model': {'pretrained_model_name_or_path': asnq_ckpt_dir_path},
     'base_model': 'electra', 'classifier': 'classifier'},
    {'model': {'pretrained_model_name_or_path': asnq_ckpt_dir_path},
     'base_model': 'electra', 'classifier': 'classifier'}
]

# Load the CERBERUS checkpoint with 11 shared encoder body layers
start_ckpt_file_path = './cerberus11-3_albert_electra_roberta_asnq_wikiqa/cerberus_model.pt'
model = CerberusModel(head_configs, 11, start_ckpt_file_path=start_ckpt_file_path)
model.eval()

# Encode a (question, answer sentence) pair and score it with the model
input_dict = tokenizer([('question', 'answer sentence')],
                       return_tensors='pt',
                       max_length=128,
                       truncation=True)
output = model(**input_dict)
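
As with the ASNQ model above, multiple candidate answers can be tokenized as one padded batch and ranked by their scores; the ranking sketch shown earlier applies unchanged to this model, under the same assumptions about the output structure.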

Security

See CONTRIBUTING for more information.

License

This library is licensed under the CC-BY-NC-4.0 License.