A collection of keyword lists used to censor content on chat and live streaming apps in China.
Full details on the data collection and analysis methods and results are available in the reports below:
Keyword Content Analysis
The datasets include the raw keyword lists collected from the applications and processed versions that add translations of the keywords. Keywords were translated to English using a combination of machine and human translation. Interpreting these translations with contextual information, we coded each keyword into content categories, grouped under six general themes, according to a code book.
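As a rough illustration of how the processed datasets might be summarized, the sketch below tallies keywords per theme. It assumes a processed file is a CSV with columns named `keyword`, `translation`, `theme`, and `category`; these column names, the helper functions, and the placeholder rows are hypothetical and are not taken from the actual files, so adjust them to match the real headers.

```python
import csv
import io
from collections import Counter


def load_rows(handle):
    """Read a processed keyword file into a list of dicts, one per keyword."""
    return list(csv.DictReader(handle))


def count_by_theme(rows, theme_field="theme"):
    """Count keywords per theme; rows with no theme are counted as 'uncoded'."""
    counts = Counter()
    for row in rows:
        theme = (row.get(theme_field) or "").strip()
        counts[theme or "uncoded"] += 1
    return counts


# Placeholder data only: column names and values are assumptions, not real entries.
SAMPLE = """keyword,translation,theme,category
kw1,translation 1,Theme A,Category A1
kw2,translation 2,Theme A,Category A2
kw3,translation 3,Theme B,Category B1
kw4,translation 4,,
"""

rows = load_rows(io.StringIO(SAMPLE))
theme_counts = count_by_theme(rows)

for theme, n in theme_counts.most_common():
    print(theme, n)
```

To run this against a real file, replace the `io.StringIO(SAMPLE)` handle with `open(path, newline="", encoding="utf-8")`, where `path` points to one of the processed datasets, and pass the actual theme column name as `theme_field`.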