MODELS.md

Models

This file lists the trained models available for AllenNLP. While we highlight one model per task at https://allennlp.org/models, we often have other trained models that may work better for a particular application.

Machine Comprehension

Based on BiDAF (Seo et al, 2017)

$ docker run allennlp/allennlp:v0.7.0 \
    evaluate \
    https://s3-us-west-2.amazonaws.com/allennlp/models/bidaf-model-2017.09.15-charpad.tar.gz \
    https://s3-us-west-2.amazonaws.com/allennlp/datasets/squad/squad-dev-v1.1.json

Metrics:
start_acc: 0.642
  end_acc: 0.671
 span_acc: 0.552
       em: 0.683
       f1: 0.778
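
As a rough illustration of the `em` and `f1` numbers above, here is a minimal sketch that assumes they follow the usual SQuAD definitions: exact match between predicted and gold answer, and token-overlap F1. The function names `exact_match` and `token_f1` are hypothetical and not part of AllenNLP. The official SQuAD script also strips punctuation and articles and takes the best score over several gold answers, which this sketch omits.

```python
from collections import Counter


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the lower-cased, whitespace-split answers are identical."""
    return float(prediction.lower().split() == gold.lower().split())


def token_f1(prediction: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall between two answers."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    # Multiset intersection: a shared token counts only as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

Averaging these per-question scores over the dev set would give corpus-level figures comparable in kind to the ones reported above.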

Textual Entailment

Based on Parikh et al, 2016

$ docker run allennlp/allennlp:v0.7.0 \
    evaluate \
    https://s3-us-west-2.amazonaws.com/allennlp/models/decomposable-attention-elmo-2018.02.19.tar.gz \
    https://s3-us-west-2.amazonaws.com/allennlp/datasets/snli/snli_1.0_test.jsonl

Metrics:
accuracy: 0.864
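
The evaluation commands in this file all take a model archive followed by a dataset, either through the Docker image or through the `allennlp` executable. A minimal sketch of building such a command from Python follows; `evaluate_command` is a hypothetical helper, not an AllenNLP API, and it only assembles the argument list without running anything.

```python
from typing import List, Optional


def evaluate_command(archive_url: str, data_url: str,
                     docker_image: Optional[str] = None) -> List[str]:
    """Build the argument list for an evaluation run (hypothetical helper)."""
    if docker_image is not None:
        # Mirrors: docker run <image> evaluate <archive> <data>
        return ["docker", "run", docker_image, "evaluate", archive_url, data_url]
    # Mirrors: allennlp evaluate <archive> <data>
    return ["allennlp", "evaluate", archive_url, data_url]


if __name__ == "__main__":
    command = evaluate_command(
        "https://s3-us-west-2.amazonaws.com/allennlp/models/"
        "decomposable-attention-elmo-2018.02.19.tar.gz",
        "https://s3-us-west-2.amazonaws.com/allennlp/datasets/snli/snli_1.0_test.jsonl",
        docker_image="allennlp/allennlp:v0.7.0",
    )
    # Print the command instead of executing it; running it needs Docker
    # and network access.
    print(" ".join(command))
```

Passing the returned list to `subprocess.run(command, check=True)` would execute the evaluation, assuming Docker (or a local AllenNLP install) is available.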

Semantic Role Labeling

Based on He et al, 2017

f1: 0.849

Coreference Resolution

Based on End-to-End Coreference Resolution (Lee et al, 2017)

f1: 0.630

Named Entity Recognition

Based on Deep contextualized word representations

f1: 0.925

Constituency Parsing

Based on A Minimal Span-Based Neural Constituency Parser (Stern et al, 2017), but with ELMo embeddings

Dependency Parsing

Biaffine Parser

Based on Dozat and Manning, 2017

f1: 0.941

Semantic Parsing

Wikitables

Caveat: this model is trained on only part of the data and has not been officially evaluated.

Event2Mind

Based on Event2Mind: Commonsense Inference on Events, Intents, and Reactions. More information at: https://homes.cs.washington.edu/~msap/debug/event2mind/docs/

$ allennlp evaluate \
    https://s3-us-west-2.amazonaws.com/allennlp/models/event2mind-2018.09.17.tar.gz \
    https://raw.githubusercontent.com/uwnlp/event2mind/9855e83c53083b62395cc7e1af6ee9411515a14e/docs/data/test.csv

Metrics (unigram recall):
xintent: 0.36
xreact: 0.41
oreact: 0.65

BiMPM

Based on Bilateral Multi-Perspective Matching for Natural Language Sentences

ESIM

Based on Enhanced LSTM for Natural Language Inference, with ELMo embeddings