Emotion Intensity Prediction for Tweets. This repository contains the IMS System submission for the WASSA-2017 Shared Task on Emotion Intensity (EmoInt)


Task: Given a tweet and an emotion X ∈ {anger, fear, joy, sadness}, determine the intensity or degree of emotion X felt by the speaker, expressed as a real-valued score between 0 and 1.

The official shared task paper

-- Requirements:

1] We use Weka to extract the baseline features and to perform the random forest regression

2] We use Keras with the Theano backend for our regression feature based on embeddings

3] We need lemmas and part-of-speech tags; these were obtained using TweetNLP (we used ark-tweet-nlp-0.3.2)

4] Some features were created by the baseline system (AffectiveTweets), which is an additional Weka package

5] The Automatically Extended Norms (created by us) are available for download here (65MB). This file contains extended resources of affective norms and emotion lexicons.
Note: If you use the extended norms, please cite our work and the original resource (see the paper for reference details).

Example Usage:

Assume you want to use IMS to predict emotion intensity for a given input file. We provide a full pipeline for the example in run_through_example/anger_example/anger_plain.txt. Note that you need to adjust several paths to the required tools (TweetNLP, Weka, ...) to match your local machine. Then follow these steps:

1) Parse the input file

 - using a plain text file you can run scripts/

  • This will transform a one sentence/tweet per line format into a one word per line format    
2) Run the CNN-LSTM Regression model
  • The script trains one model per emotion for the given test file
  • By default we rely on the official training data for training
  • Note that we provide here only a subset of our vectors
  • keras_regression/twitter_sgns_subset.txt.gz covers the shared task vocabulary
  • vectors are in word2vec format (can be gz, txt or binary)
  • The output of the regression is a single file per training emotion. For further processing, we create one file containing all four predictions using paste anger.txt fear.txt joy.txt sadness.txt > afjs.txt
3) Create an input file for Weka
  • This can be done using scripts/createarff.jar (full code in scripts/createarff_java/)
  • This step combines the outputs of the previous steps
  • Run java -jar createarff.jar <parsedFile> <inputfile w.Ratings> ratings/Ratings.csv.gz <CNN-LSTM output>
4) Add baseline features from AffectiveTweets
  • Using the GUI or the command line we add the features from AffectiveTweets
  • We apply default settings of TweetToSentiStrengthFeatureVector & TweetToLexiconFeatureVector  
5) Run Weka's Random Forest
  • scripts/ or scripts/
  • To apply the script, link to the folder containing the training (arff) files (official_train_arff/)
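
The format change described in step 1 (one tweet per line becomes one word per line) can be illustrated with a minimal Python sketch. This is not the repository's parsing script, which presumably relies on TweetNLP to add lemmas and part-of-speech tags. The function name tweets_to_token_lines, the plain whitespace tokenization, and the blank line used as a tweet separator are all assumptions made for illustration.

```python
from typing import Iterable, Iterator


def tweets_to_token_lines(tweets: Iterable[str]) -> Iterator[str]:
    """Yield one token per line, followed by an empty line after each tweet."""
    for tweet in tweets:
        tokens = tweet.split()  # assumption: plain whitespace tokenization
        if not tokens:
            continue  # skip empty input lines
        yield from tokens
        yield ""  # assumption: a blank line marks the tweet boundary


if __name__ == "__main__":
    sample = ["I am so angry right now", "this is fine"]
    print("\n".join(tweets_to_token_lines(sample)))
```

A real tokenizer for tweets treats hashtags, emoticons, and URLs specially, so the actual parsed file will likely differ from what this sketch produces.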

A full and more detailed description of how to use IMS emotion prediction can be found in the `` script.
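
On systems without the paste utility, the column-wise merge of the four per-emotion prediction files from step 2 could be done in Python along these lines. The function name paste_columns is hypothetical, and the tab separator mirrors the default behaviour of paste. Unlike paste, this sketch refuses to merge files with differing line counts, on the assumption that each file holds one prediction per tweet of the same test set.

```python
from pathlib import Path
from typing import Sequence


def paste_columns(paths: Sequence[Path], out_path: Path, sep: str = "\t") -> int:
    """Join line i of every input file into line i of out_path; return the row count."""
    columns = [p.read_text(encoding="utf-8").splitlines() for p in paths]
    lengths = {len(col) for col in columns}
    if len(lengths) != 1:
        # Stricter than paste: mismatched files usually mean a failed run.
        raise ValueError(f"input files differ in length: {sorted(lengths)}")
    rows = [sep.join(values) for values in zip(*columns)]
    out_path.write_text("".join(row + "\n" for row in rows), encoding="utf-8")
    return len(rows)
```

For the example above, the call would be paste_columns([Path("anger.txt"), Path("fear.txt"), Path("joy.txt"), Path("sadness.txt")], Path("afjs.txt")), which keeps the anger, fear, joy, sadness column order implied by the file name afjs.txt.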

Citation info

If you use the code or the created feature norms, please cite our paper (Bibtex) PDF

Contact info

Contact: maximilian.koeper AT

Project Homepage
University Homepage Maximilian
University Homepage Roman
University Homepage Evgeny
