Course project for CS 221 Information Retrieval.

Milestone1
Milestone2
Milestone3
Milestone4
README

README

This is a sentiment analysis course project for CS 221 Information Retrieval at the University of California, Irvine. The main purpose of this project is to find a better approach to classifying tweets into three categories: positive, negative, and neutral. We tried a variety of methods and eventually achieved 79.6% precision with 52.4% recall (more details are available in the report for Milestone 4).
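
For readers unfamiliar with the two metrics quoted above, here is a minimal Python sketch of how precision and recall can be computed for one class of a three-way classifier. It is an illustration only, not the project's evaluation code: the function name `precision_recall` and the toy labels are assumptions made for this example, and how the reported figures were computed or averaged across classes is described in the Milestone 4 report, not here.

```python
from collections import Counter


def precision_recall(gold, predicted, label):
    """Return (precision, recall) for one label; 0.0 when a ratio is undefined."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    counts = Counter()
    for g, p in zip(gold, predicted):
        if p == label and g == label:
            counts["tp"] += 1  # predicted the label, and it was correct
        elif p == label:
            counts["fp"] += 1  # predicted the label, but gold says otherwise
        elif g == label:
            counts["fn"] += 1  # gold has the label, but it was not predicted
    tp, fp, fn = counts["tp"], counts["fp"], counts["fn"]
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy labels, made up for illustration only (not project data).
gold = ["positive", "positive", "negative", "neutral", "positive"]
predicted = ["positive", "neutral", "negative", "neutral", "negative"]

for label in ("positive", "negative", "neutral"):
    p, r = precision_recall(gold, predicted, label)
    print(f"{label}: precision={p:.3f} recall={r:.3f}")
```

High precision combined with lower recall, as in the figures above, means that the labels the classifier does assign are usually right, while a sizable share of the tweets that truly carry a given label are missed.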

References and relevant resources:

Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. 2008. -- http://nlp.stanford.edu/IR-book/information-retrieval-book.html
Sanders Analytics -- http://www.sananalytics.com/lab/twitter-sentiment/
Using the Twitter Search API -- https://dev.twitter.com/docs/using-search
The J.D. Power and Associates Sentiment Corpus -- http://verbs.colorado.edu/jdpacorpus/
WordNet @ Princeton University -- http://wordnet.princeton.edu/wordnet/
Apache Hadoop -- http://hadoop.apache.org/
MapReduce Tutorial -- http://hadoop.apache.org/common/docs/current/mapred_tutorial.html
WEKA @ University of Waikato -- http://weka.wikispaces.com/

Keywords:

Naive Bayes, mutual information, χ², SentiWordNet, emoticon, n-gram, Hadoop