A public repository containing the dataset and annotation instructions for the paper "SGHateCheck: Functional Tests for Detecting Hate Speech in Low-Resource Languages of Singapore" (WOAH 2024).
Each folder contains one of the three main categories of data in this dataset. Each category contains data for each of the paper's target languages (MS: Malay, SS: Singlish, TA: Tamil, and ZH: Mandarin).
This folder contains test cases that were generated using the techniques described in the paper.
This folder contains annotations of selected test cases that were obtained using the techniques described in the paper.
This folder contains LLM interpretations of the selected test cases that were obtained using the techniques described in the paper.
TBC