This library provides domain-specific dictionaries for sentiment analysis. Each dictionary consists of words that statistically exhibit a positive or negative polarity in movie reviews or financial filings. The dictionaries are extracted from two corpora: IMDb movie reviews and U.S. regulated Form 8-K filings. Details are available in the following reference.

  • Pröllochs, Feuerriegel and Neumann (2018): Statistical Inferences for Polarity Identification in Natural Language, Working Paper, Chair for Information Systems Research, University of Freiburg, Germany.


This library contains the following dictionary resources in CSV format.

  • Movie reviews dictionary: This dictionary contains words that feature a positive or negative connotation in IMDb movie reviews (DictionaryIMDB.csv).
  • Financial filings dictionary: This dictionary contains words that feature a positive or negative connotation in U.S. regulated 8-K filings (Dictionary8K.csv).

The individual columns of each dictionary are as follows:

  • Words: This column lists the individual dictionary entries. We provide stems instead of complete words as stemming is part of the document preprocessing.
  • Scores: This column denotes the polarity score of each entry.
  • Idf: This column denotes the inverse document frequency (idf) of each entry.
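To illustrate how the three columns might be used, the following is a minimal sketch in base R. It assumes the CSV files are comma-separated with a header row named as above. The stems and numbers in the small data frame are invented stand-ins, not values from the real dictionaries, and `score_document` is a hypothetical helper. Summing the scores of matched stems is only one simple aggregation and is not necessarily the scheme used in the reference.

```r
# Hypothetical stand-in for a few rows of DictionaryIMDB.csv.
# The stems and numbers are invented for illustration only.
dict <- data.frame(
  Words  = c("great", "bore", "wonder"),
  Scores = c(0.05, -0.07, 0.03),
  Idf    = c(1.2, 2.5, 1.9),
  stringsAsFactors = FALSE
)
# With the real file (assuming a comma-separated file with a header row):
# dict <- read.csv("DictionaryIMDB.csv", stringsAsFactors = FALSE)

# Hypothetical helper: sum the polarity scores of all stems in a document
# that appear in the dictionary. Stems without an entry contribute nothing.
score_document <- function(stems, dict) {
  idx <- match(stems, dict$Words)  # NA for stems not in the dictionary
  sum(dict$Scores[idx], na.rm = TRUE)
}

# A document that has already been tokenized and stemmed.
doc <- c("great", "film", "bore", "great")
doc_score <- score_document(doc, dict)
print(doc_score)
```

The sketch ignores the Idf column. Because the dictionaries contain stems, any text scored against them needs to be stemmed in the same way first.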

In addition, this library contains the following datasets that were used to generate the above dictionaries.

  • Movie reviews dataset: This dataset contains 5006 IMDb movie reviews together with their ratings (Dataset_IMDB.csv).
  • Financial filings dataset: This dataset contains daily stock market returns and filing paths for 76716 U.S. regulated 8-K filings (Dataset_8K.csv).

Usage in R

We also provide both dictionaries as a package for the statistical software R. You can install SentimentDictionaries from GitHub with:

# install.packages("devtools")
devtools::install_github("nproellochs/SentimentDictionaries", subdir = "R-package")

Both dictionaries can be easily used in combination with the SentimentAnalysis R package.
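As a rough sketch of such a combination, the snippet below builds a weighted dictionary with `SentimentDictionaryWeighted` and applies it through `analyzeSentiment` with the `ruleLinearModel` rule, all from the SentimentAnalysis package. How the SentimentDictionaries package exposes its dictionaries is not described here, so the sketch uses invented stems, scores and idf values in the layout of the CSV files. The rule name "Custom" and the example sentences are likewise assumptions.

```r
# Hypothetical stems, scores and idf values, invented for illustration only.
# In practice these would come from the Words, Scores and Idf columns.
words  <- c("great", "bore", "wonder")
scores <- c(0.05, -0.07, 0.03)
idf    <- c(1.2, 2.5, 1.9)

documents <- c("A great and wonderful film", "A boring plot")
sentiment <- NULL

# Run only if the SentimentAnalysis package is installed.
if (requireNamespace("SentimentAnalysis", quietly = TRUE)) {
  library(SentimentAnalysis)

  # Wrap the three columns in a weighted dictionary object.
  d <- SentimentDictionaryWeighted(words, scores, idf)

  # analyzeSentiment() stems the documents by default, which matches
  # a dictionary whose entries are stems.
  sentiment <- analyzeSentiment(
    documents,
    rules = list("Custom" = list(ruleLinearModel, d))
  )
  print(sentiment)
}
```

The result should hold one row per document and one column per rule. Consult the SentimentAnalysis documentation for how the rule combines scores and term weights.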


SentimentDictionaries is released under the MIT License

Copyright (c) 2018 Nicolas Pröllochs & Stefan Feuerriegel

