Supporting Online Toxicity Detection with Knowledge Graphs

Reference to the repository: DOI

Authors: Paula Reyero Lobo (paula.reyero-lobo@open.ac.uk), Enrico Daga (enrico.daga@open.ac.uk), Harith Alani (harith.alani@open.ac.uk)

This repository supports the paper "Supporting Online Toxicity Detection with Knowledge Graphs" (link to paper), presented at ICWSM 2022. In this work, we address the problem of annotating toxic speech corpora and use semantic knowledge about gender and sexual orientation to identify missing target annotations for these groups. The workflow for this experiment is shown below:

[Figure: dependency graph of the experiment workflow]
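The core idea is to link mentions in the text to concepts from the Gender, Sex, and Sexual Orientation (GSSO) ontology. As a minimal sketch of that idea (not the repository's actual implementation), the snippet below collects every rdfs:label from a local copy of the ontology and flags their mentions via plain word matching; the file name gsso.owl and the matching strategy are illustrative assumptions:

    import re
    from rdflib import Graph
    from rdflib.namespace import RDFS

    # Load a local copy of the GSSO ontology (file name is an assumption;
    # the ontology is distributed as an OWL file).
    g = Graph()
    g.parse("gsso.owl")

    # Map every rdfs:label in the ontology to its concept IRI.
    labels = {str(label).lower(): str(concept)
              for concept, label in g.subject_objects(RDFS.label)}

    def annotate(text):
        """Return (surface form, concept IRI) pairs mentioned in the text."""
        lowered = text.lower()
        return [(label, iri) for label, iri in labels.items()
                if re.search(r"\b" + re.escape(label) + r"\b", lowered)]

    print(annotate("The comment targets a transgender woman."))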

Running this code produces the directory tree below. We release these files in the following open repository:

icwsm22-supporting-toxicity-with-KG
│   readme.md
│
└───data
│   │   all_data_splits.csv
│   │   identity_data_splits.csv
│   │   readme.md
│   │
│   └───gsso_annotations
│   │   │   file11.csv
│   │
│   └───gsso_annotations_inferred
│       │   file21.csv
│
└───results
│   └───1_freq_tables
│   └───2_freq_plots
│   └───3_freq_plots_category
│   └───4_candidate_scores
│   └───saved_dict
│
└───scripts
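Once the data directory is in place, the released files can be inspected with pandas. The snippet below makes no assumptions about the column layout, which is documented in data/readme.md:

    import pandas as pd

    # Load the released data splits; see data/readme.md for the exact schema.
    splits = pd.read_csv("data/all_data_splits.csv")
    print(splits.shape)
    print(splits.head())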

To set up the project using a virtual environment:

    $ python -m venv <env_name>
    $ source <env_name>/bin/activate
    (<env_name>) $ python -m pip install -r requirements.txt

Example usage:

From the project folder, use the command line to detect gender and sexual orientation entities in the text:

    (<env_name>) $ python scripts/gsso_annotate.py
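Judging from the directory tree above, the resulting annotations should be written under data/gsso_annotations (an assumption; check the script's output messages). A quick way to verify that the run produced files:

    from pathlib import Path

    # List the annotation files in the output directory inferred from
    # the directory tree above.
    for path in sorted(Path("data/gsso_annotations").glob("*.csv")):
        print(path)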