malongxuan/MSD

MSD

Data and code for the ACL 2023 Findings paper "I run as fast as a rabbit, can you? A Multilingual Simile Dialogue Dataset".

https://arxiv.org/abs/2306.05672

A simile in a dialogue differs considerably from a traditional simile expressed in a single sentence or a triplet. This paper studies complex simile phenomena in dialogue and proposes a simile dialogue dataset with both English and Chinese examples. The dataset statistics are given below.

| Category | Chinese (Ch) | English (En) |
| --- | --- | --- |
| Simile | 5,515 | 3,576 |
| Literal | 5,904 | 4,570 |
| Tenor in context | 32.8% | 48.9% |
| Tenor in response | 67.2% | 51.1% |
| Vehicle before Tenor | 5.7% | 0.9% |
| Tenor before Vehicle | 94.3% | 99.1% |
| Avg. context words in simile | 20.76 | 22.22 |
| Avg. response words in simile | 18.86 | 17.83 |
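As a quick reading aid for the statistics above, the share of simile examples in each language can be derived from the Simile and Literal counts. This is a minimal sketch that assumes the two rows count disjoint sets of examples; the names `counts` and `simile_share` are illustrative and not part of the released code.

```python
# Counts copied from the statistics table (Simile and Literal rows).
counts = {
    "Ch": {"simile": 5515, "literal": 5904},
    "En": {"simile": 3576, "literal": 4570},
}


def simile_share(simile, literal):
    """Fraction of simile examples among simile + literal examples."""
    return simile / (simile + literal)


for lang, c in counts.items():
    share = simile_share(c["simile"], c["literal"])
    print(f"{lang}: {share:.1%} simile")
# prints "Ch: 48.3% simile" and "En: 43.9% simile"
```

Under that assumption, both languages are fairly balanced between simile and literal examples, with the English portion leaning slightly more toward literal ones.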

The MSD-v1.0.zip archive contains five folders, corresponding to the five tasks we defined, with the data we used in the paper.

1 - Simile Recognition

2 - Simile Interpretation

3 - Simile Generation

4 - Simile Response Retrieval

5 - Simile Response Generation (Completion)
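To check which folders the archive actually contains before extracting it, a small sketch along these lines may help. It only uses Python's standard `zipfile` module; the helper name `top_level_folders` is hypothetical, and the path `MSD-v1.0.zip` assumes the archive sits in the current working directory. The folder names and file formats inside the archive are not documented here, so the sketch makes no assumption about them.

```python
import os
import zipfile
from pathlib import PurePosixPath


def top_level_folders(zip_path):
    """Return the sorted names of the top-level folders in a zip archive."""
    folders = set()
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            parts = PurePosixPath(member).parts
            # A member is inside a folder if its path has more than one
            # component, or if it is itself a directory entry ("name/").
            if parts and (len(parts) > 1 or member.endswith("/")):
                folders.add(parts[0])
    return sorted(folders)


if __name__ == "__main__":
    path = "MSD-v1.0.zip"  # assumed location of the downloaded archive
    if os.path.exists(path):
        for name in top_level_folders(path):
            print(name)
```

If the archive wraps everything in a single outer directory, this lists only that directory, and the five task folders would be one level further down.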

If you use the MSD data for research, please cite our paper "I run as fast as a rabbit, can you? A Multilingual Simile Dialogue Dataset":

@inproceedings{DBLP:conf/acl/MaZZSK023,
  author    = {Longxuan Ma and Weinan Zhang and Shuhan Zhou and Churui Sun and Changxin Ke and Ting Liu},
  editor    = {Anna Rogers and Jordan L. Boyd{-}Graber and Naoaki Okazaki},
  title     = {I run as fast as a rabbit, can you? {A} Multilingual Simile Dialogues Datasets},
  booktitle = {Findings of the Association for Computational Linguistics: {ACL} 2023, Toronto, Canada, July 9-14, 2023},
  pages     = {7223--7237},
  publisher = {Association for Computational Linguistics},
  year      = {2023},
  url       = {https://doi.org/10.18653/v1/2023.findings-acl.453},
  doi       = {10.18653/v1/2023.findings-acl.453},
  timestamp = {Wed, 23 Aug 2023 14:28:15 +0200},
  biburl    = {https://dblp.org/rec/conf/acl/MaZZSK023.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}
