Baseline code for the NAACL 2021 paper "Blow the Dog Whistle: A Chinese Dataset for Cant Understanding with Common Sense and World Knowledge".
Please read the terms here before you download the data. The dataset is distributed under the CC-BY-NC 3.0 license. The public data include train, dev, and test splits.
Download the data here.
https://competitions.codalab.org/competitions/30451#results
Please refer to the description here.
This repo contains the baseline code we used in our experiments. It is based on a multi-choice machine reading comprehension (MRC) framework. However, we encourage you to try other types of models, e.g., a two-tower model.
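As a rough illustration of the multi-choice framing (a sketch, not the code in this repo): the shared context is paired with each candidate, every pair is scored independently, and the highest-scoring candidate is chosen. The function names, the toy strings, and the character-overlap scorer are all assumptions made for this example; in practice the scorer would be a trained encoder.

```python
from typing import Callable, List, Tuple


def build_pairs(context: str, candidates: List[str]) -> List[Tuple[str, str]]:
    """Pair the shared context with each candidate (one pair per choice)."""
    return [(context, cand) for cand in candidates]


def overlap_score(context: str, candidate: str) -> float:
    """Hypothetical stand-in for an encoder.

    Returns the fraction of candidate characters that also occur in the context.
    """
    if not candidate:
        return 0.0
    ctx_chars = set(context)
    return sum(ch in ctx_chars for ch in candidate) / len(candidate)


def predict(
    context: str,
    candidates: List[str],
    scorer: Callable[[str, str], float] = overlap_score,
) -> int:
    """Score every (context, candidate) pair and return the index of the best one."""
    pairs = build_pairs(context, candidates)
    scores = [scorer(ctx, cand) for ctx, cand in pairs]
    # On ties, max() keeps the first index with the top score.
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Toy strings only; these are not examples from the dataset.
    print(predict("abc", ["xyz", "abz", "abc"]))
```

A two-tower variant would instead encode the context and each candidate separately and compare the two representations, so `scorer` is the only piece that would need to change in this sketch.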
@inproceedings{dogwhistle,
author = {Canwen Xu and
Wangchunshu Zhou and
Tao Ge and
Ke Xu and
Julian McAuley and
Furu Wei},
title = {Blow the Dog Whistle: A Chinese Dataset for Cant Understanding with Common Sense and World Knowledge},
booktitle = {{NAACL}},
year = {2021}
}