The Stanford Natural Language Inference (SNLI) corpus was released in the paper "A large annotated corpus for learning natural language inference".
Available: https://sigann.github.io/LAW-XI-2017/papers/LAW01.pdf
https://nlp.stanford.edu/projects/snli/snli_1.0.zip
The Stanford Natural Language Inference corpus is a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning. At 570K pairs, it is two orders of magnitude larger than all other resources of its type. This increase in scale allows lexicalized classifiers to outperform some sophisticated existing entailment models, and it allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.
- Training pairs: 550,152
- Development pairs: 10,000
- Test pairs: 10,000
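
As a quick sanity check, the split sizes listed above can be summed and compared with the "570K pairs" figure from the description. The sketch below also shows one possible way to tally labels in a split, assuming the archive contains JSON Lines files whose records carry a `gold_label` field. That layout is an assumption and is not stated in the text above, and the names `SPLIT_SIZES`, `total_pairs`, and `count_labels` are hypothetical helpers chosen for illustration.

```python
import json
from collections import Counter
from typing import Dict, Iterable

# Split sizes copied from the list above.
SPLIT_SIZES: Dict[str, int] = {"train": 550_152, "dev": 10_000, "test": 10_000}


def total_pairs(split_sizes: Dict[str, int]) -> int:
    """Sum the pair counts over all splits."""
    return sum(split_sizes.values())


def count_labels(lines: Iterable[str]) -> Counter:
    """Tally labels over an iterable of JSON Lines records.

    Assumption: each non-empty line is a JSON object with a "gold_label"
    field. Check the files in the downloaded archive before relying on this.
    """
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        counts[json.loads(line)["gold_label"]] += 1
    return counts


print(total_pairs(SPLIT_SIZES))  # 570152
```

Because `count_labels` accepts any iterable of strings, an open file handle for one of the extracted split files can be passed to it directly, for example `with open(path, encoding="utf-8") as f: count_labels(f)`, where `path` points to whichever file you extracted.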