Supporting Online Toxicity Detection with Knowledge Graphs

Reference to the repository: DOI

Authors: Paula Reyero Lobo (paula.reyero-lobo@open.ac.uk), Enrico Daga (enrico.daga@open.ac.uk), Harith Alani (harith.alani@open.ac.uk)

This repository supports the paper "Supporting Online Toxicity Detection with Knowledge Graphs" (link to paper), presented at ICWSM 2022. In this work, we address the problem of annotating toxic speech corpora and use semantic knowledge about gender and sexual orientation to identify missing target annotations for these groups. The workflow for this experiment is shown below:

[Figure: dependency graph of the experiment workflow]
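The core idea is to link mentions in the text to concepts from the Gender, Sex, and Sexual Orientation (GSSO) ontology. As a minimal sketch of that idea (not the repository's actual implementation), the snippet below collects every rdfs:label from a local copy of the ontology and flags their mentions via plain word matching; the file name gsso.owl and the matching strategy are illustrative assumptions:

    import re
    from rdflib import Graph
    from rdflib.namespace import RDFS

    # Load a local copy of the GSSO ontology (file name is an assumption;
    # the ontology is distributed as an OWL file).
    g = Graph()
    g.parse("gsso.owl")

    # Map every rdfs:label in the ontology to its concept IRI.
    labels = {str(label).lower(): str(concept)
              for concept, label in g.subject_objects(RDFS.label)}

    def annotate(text):
        """Return (surface form, concept IRI) pairs mentioned in the text."""
        lowered = text.lower()
        return [(label, iri) for label, iri in labels.items()
                if re.search(r"\b" + re.escape(label) + r"\b", lowered)]

    print(annotate("The comment targets a transgender woman."))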

Running this code produces the directory tree below. We release these files in the following open repository:

icwsm22-supporting-toxicity-with-KG
│   readme.md
│
└───data
│   │   all_data_splits.csv
│   │   identity_data_splits.csv
│   │   readme.md
│   │
│   └───gsso_annotations
│   │   │   file11.csv
│   │
│   └───gsso_annotations_inferred
│       │   file21.csv
│
└───results
│   └───1_freq_tables
│   └───2_freq_plots
│   └───3_freq_plots_category
│   └───4_candidate_scores
│   └───saved_dict
│
└───scripts
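Once the data directory is in place, the released files can be inspected with pandas. The snippet below makes no assumptions about the column layout, which is documented in data/readme.md:

    import pandas as pd

    # Load the released data splits; see data/readme.md for the exact schema.
    splits = pd.read_csv("data/all_data_splits.csv")
    print(splits.shape)
    print(splits.head())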

To set up the project using a virtual environment:

    $ python -m venv <env_name>
    $ source <env_name>/bin/activate
    (<env_name>) $ python -m pip install -r requirements.txt

Example usage:

From the project folder, use the command line to detect gender and sexual orientation entities in the text:

    (<env_name>) $ python scripts/gsso_annotate.py
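Judging from the directory tree above, the resulting annotations should be written under data/gsso_annotations (an assumption; check the script's output messages). A quick way to verify that the run produced files:

    from pathlib import Path

    # List the annotation files in the output directory inferred from
    # the directory tree above.
    for path in sorted(Path("data/gsso_annotations").glob("*.csv")):
        print(path)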