BrainT: two models for Implicit Emotion Detection

Introduction

This repository contains two models that we built in the context of the WASSA-2018 Implicit Emotion Shared Task and the Teamlab course at the University of Stuttgart. The methods and results of these models are described in our paper, Fine-tuning Multiclass Perceptron For Implicit Emotion Classification.

Here we give only a general overview of the task, the dataset, and the results. For the documentation of each model, please navigate to the corresponding subfolder.

Our two models achieved comparable results: a macro-averaged F1 (F-Macro) of 0.63 for the multi-class Perceptron and 0.64 for the Deep Learning model. We submitted the predictions of the mcPerceptron model to IESA since, at the time, it performed slightly better than the DL model.
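For reference, macro-averaged F1 is the unweighted mean of the per-class F1 scores, so every emotion contributes equally regardless of its frequency. A minimal plain-Python sketch of the metric (not the shared task's official evaluation script):

```python
def f1_macro(gold, pred, labels):
    """Macro-averaged F1: unweighted mean of per-class F1 scores."""
    scores = []
    for label in labels:
        # Per-class counts of true positives, false positives, false negatives.
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        scores.append(f1)
    # Equal weight per class, independent of class size.
    return sum(scores) / len(scores)
```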

Task Description

Our task was to predict emotions in a large dataset of tweets annotated via distant supervision. The label to predict was one of anger, disgust, fear, joy, sadness, or surprise. This was an implicit emotion detection task: the words explicitly expressing these six emotions (or their synonyms) were masked in the training data.

An example of an original tweet:

I spent 24 hours with my boyfriend yet I was still sad when he dropped me off

— вяeadney|-/ (@katzlover64) 8:33 AM - 7 Aug 2017

The same tweet as it appears in the dataset:

I spent 24 hours with my boyfriend yet I was still [#TRIGGERWORD#] when he dropped me off
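The masking above can be pictured as a simple substitution over a trigger-word inventory. The sketch below is purely illustrative: the `TRIGGERS` list is a hypothetical stand-in, not the organizers' actual inventory, which also covered synonyms and inflected forms.

```python
import re

# Hypothetical trigger words -- NOT the shared task's actual word list.
TRIGGERS = ["angry", "disgusted", "afraid", "happy", "sad", "surprised"]

# \b word boundaries keep "sad" from matching inside e.g. "sadness".
_PATTERN = re.compile(r"\b(" + "|".join(TRIGGERS) + r")\b", re.IGNORECASE)

def mask_triggers(tweet):
    """Replace any trigger word with the [#TRIGGERWORD#] placeholder."""
    return _PATTERN.sub("[#TRIGGERWORD#]", tweet)
```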

Datasets

The train and test datasets were provided by the shared task and contain:

  • train-v3.csv : 153,383 tweets
  • test-text-labels.csv : 28,757 tweets

Both of our models expect these files to be located in the subfolder data.
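Loading the files from data amounts to a few lines of Python. The column layout below is an assumption (tab-separated, emotion label first, tweet text second); check the actual files before relying on it.

```python
import csv

def load_tweets(path):
    """Load (label, text) pairs from a tab-separated dataset file.

    Assumes one tweet per line with the emotion label in the first
    column and the tweet text in the second -- adjust if the shared
    task files use a different layout.
    """
    pairs = []
    with open(path, encoding="utf-8", newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) >= 2:
                pairs.append((row[0], row[1]))
    return pairs
```

Usage would then be e.g. `train = load_tweets("data/train-v3.csv")`.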

Class distribution in train data

The six emotions are roughly evenly distributed in the training data.

| surprise | anger | disgust | sad | fear | joy | total |
|---|---|---|---|---|---|---|
| 16.7% (25,565) | 16.7% (25,562) | 16.7% (25,558) | 15.1% (23,165) | 16.7% (25,575) | 18.2% (27,958) | 100.0% (153,383) |
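A distribution like the one above can be recomputed from the label column in a few lines of Python (a generic sketch, not part of either model):

```python
from collections import Counter

def class_distribution(labels):
    """Return {label: (count, share)} sorted by descending count."""
    counts = Counter(labels)
    total = len(labels)
    return {label: (count, count / total)
            for label, count in counts.most_common()}
```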

References