This repository contains the data used in the following paper:
@inproceedings{xu2013gathering,
title = {Gathering and generating paraphrases from {Twitter} with application to normalization},
author = {Xu, Wei and Ritter, Alan and Grishman, Ralph},
booktitle = {Proceedings of the Sixth Workshop on Building and Using Comparable Corpora (BUCC)},
year = {2013},
url = {http://aclweb.org/anthology/W/W13/W13-2515.pdf}
}
The repository https://github.com/cocoxu/multip contains the source code for the Multiple-instance Learning Paraphrase (MultiP) model described in the following paper:
@article{Xu-EtAl-2014:TACL,
author = {Xu, Wei and Ritter, Alan and Callison-Burch, Chris and Dolan, William B. and Ji, Yangfeng},
title = {Extracting Lexically Divergent Paraphrases from {Twitter}},
journal = {Transactions of the Association for Computational Linguistics (TACL)},
volume = {2},
number = {1},
year = {2014},
url = {http://www.cis.upenn.edu/~xwe/files/tacl2014-extracting-paraphrases-from-twitter.pdf}
}