Sentiment Analysis

Ewan Klein edited this page Oct 20, 2015 · 15 revisions

Sentiment Analysis Overview

Sentiment analysis has wide appeal because it provides information about the subjective dimension of texts. It can be regarded as a classification task, either binary (polarity classification into positive/negative) or multi-class (e.g. positive/neutral/negative).

Most approaches use a sentiment lexicon as a component (sometimes the only component). Lexicons can be either general-purpose or extracted from a suitable corpus, such as movie reviews with explicit rating information.
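As a rough illustration of the case where a lexicon is the only component, the sketch below sums per-word polarity values and maps the total to positive/neutral/negative. The lexicon `TOY_LEXICON` and the helper names `tokenize`, `lexicon_score` and `classify` are hypothetical and invented for this example; they are not part of NLTK, and a real lexicon would be much larger and would typically handle negation and intensity.

```python
import re

# Hypothetical toy lexicon for illustration only (not an NLTK resource):
# +1 marks a positive word, -1 a negative word.
TOY_LEXICON = {
    "good": 1, "great": 1, "excellent": 1, "love": 1,
    "bad": -1, "poor": -1, "terrible": -1, "hate": -1,
}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the polarity values of all tokens that appear in the lexicon."""
    return sum(lexicon.get(token, 0) for token in tokenize(text))


def classify(text, lexicon=TOY_LEXICON):
    """Map the summed score to one of three classes."""
    score = lexicon_score(text, lexicon)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


if __name__ == "__main__":
    for sentence in ("A great film with excellent acting",
                     "Terrible plot and poor pacing",
                     "The film runs for two hours"):
        print(sentence, "->", classify(sentence))
```

Dropping the `neutral` branch (for example, by treating a score of zero as negative) would turn this into the binary polarity setting; which tie-breaking rule is appropriate depends on the application.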

NLTK now has much better support for sentiment analysis, with the following resources added:

Lexicons


Datasets

There is some documentation here:

Next Steps

  • Build a trained model of sentiment for a large-ish Tweet corpus
  • Add a module for feature-based classification (e.g. using the Customer Product Reviews)
  • Improve the documentation

We should also investigate the following resources:

Lexicons


Datasets