An emotion-polarity classifier specifically trained on developers' communication channels


Senti4SD is an emotion polarity classifier specifically trained to support sentiment analysis in developers' communication channels. Senti4SD is trained and evaluated on a gold standard of over 4K posts extracted from Stack Overflow.

Fair Use Policy

Please cite the following paper if you intend to use our tool for your own research:

F. Calefato, F. Lanubile, F. Maiorano, N. Novielli. “Sentiment Polarity Detection for Software Development”, to appear in Empirical Software Engineering, DOI: 10.1007/s10664-017-9546-9

NOTE: You will need to install the Git LFS extension to check out this project. Once it is installed and initialized, run:

$ git lfs clone

How do I get set up?

To set up the tool, simply run the following script from the command line:

$ sh

To run the script you need:

  • Java 8
  • R
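Before running the setup script, it may help to confirm that both prerequisites are reachable from the command line. The following is a minimal sketch, not part of Senti4SD: the helper name check_prereqs is hypothetical, and it assumes that Java is exposed as the `java` command and R as the `Rscript` command on your PATH. It only checks that the commands exist; it does not verify that the Java version is 8.

```shell
# Hypothetical helper (not shipped with Senti4SD): report whether each
# command passed as an argument can be found on the PATH.
check_prereqs() {
  missing=0
  for cmd in "$@"; do
    if command -v "$cmd" >/dev/null 2>&1; then
      echo "found: $cmd"
    else
      echo "missing: $cmd"
      missing=1
    fi
  done
  # Exit status is 0 only when every command was found.
  return "$missing"
}

# Assumption: Java and R are exposed as `java` and `Rscript`.
check_prereqs java Rscript || echo "Install the missing tools before running the setup script."
```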

The script will also install, if not already present, three R packages:


To classify your data using Senti4SD, run the following commands from the command line:

$ cd ClassificationTask
$ sh inputCorpus.csv outputPredictions.csv

where inputCorpus.csv is a file containing the data you want to classify, with one document per line, and outputPredictions.csv is the file where the predictions will be saved. The second parameter is optional: if it is omitted, the predictions are saved in a file called predictions.csv.
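Because the input must hold one document per line, a document that spans several lines has to be flattened first. The sketch below shows one possible way to do that; the function name make_corpus and the file names in the usage comment are hypothetical. It assumes that replacing line breaks with spaces is acceptable for your data, and it does not add any quoting, escaping, or header row, since the exact requirements of the classifier are not described here.

```shell
# Hypothetical helper (not shipped with Senti4SD): write each file given
# as an argument to standard output as exactly one line.
make_corpus() {
  for doc in "$@"; do
    # Replace carriage returns and newlines inside the document with spaces.
    tr '\r\n' '  ' < "$doc"
    # Terminate the document so the next one starts on a new line.
    printf '\n'
  done
}

# Usage (file names are placeholders):
#   make_corpus post1.txt post2.txt > inputCorpus.csv
```

A document that already ends with a newline will carry one trailing space after flattening.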

To see how the tool works, you can execute the following example:

$ cd ClassificationTask
$ sh Sample.csv

This produces a CSV file called predictions.csv.

Who do I talk to?