The Target-Based Offensive Language Identification (TBO) dataset.
The TBO dataset contains post-level annotations of the harmfulness of an offensive post and token-level annotations comprising the target and the offensive argument expression. Popular offensive language identification datasets for social media use annotation taxonomies only at the post level, and more recently some datasets have been released that feature only token-level annotations. The TBO dataset bridges the gap between these two kinds of dataset by introducing a single unified annotation taxonomy that covers both levels.
The TBO dataset contains over 4,500 instances in English collected from Twitter.
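
To make the two annotation levels concrete, the sketch below shows one way a single instance could be held in memory: a post-level harmfulness label alongside token-level spans for the target and the offensive argument. It is an illustration only. The class names, field names, label value, and example sentence are all hypothetical and do not reflect the dataset's actual file format or label set.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    """Half-open token range [start, end) within a post."""
    start: int
    end: int


@dataclass
class AnnotatedPost:
    """Hypothetical container for one instance; not the official schema."""
    tokens: List[str]
    harmfulness: str  # post-level label (assumed to be a string here)
    targets: List[Span] = field(default_factory=list)    # token-level target spans
    arguments: List[Span] = field(default_factory=list)  # token-level offensive argument spans

    def span_text(self, span: Span) -> str:
        """Return the tokens covered by a span, joined by spaces."""
        return " ".join(self.tokens[span.start:span.end])


# Invented example sentence, not taken from the dataset.
example = AnnotatedPost(
    tokens=["that", "referee", "is", "utterly", "useless"],
    harmfulness="harmful",
    targets=[Span(1, 2)],
    arguments=[Span(3, 5)],
)

target_text = example.span_text(example.targets[0])
argument_text = example.span_text(example.arguments[0])
```

Keeping the spans as token offsets rather than copied strings means the post-level label and the token-level annotations always refer to the same tokenization of the post.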