GitHub - sp00/SentimentAnalysis: Course project for CS 221 Information Retrieval.


This is a sentiment analysis course project for CS 221 Information Retrieval at the University of California, Irvine. The main purpose of this project is to find a better approach to classifying tweets into three categories: positive, negative, and neutral. We tried a variety of methods and eventually achieved 79.6% precision with 52.4% recall (more details are available in the report for Milestone 4).
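To make the task and the reported metrics concrete, here is a minimal sketch of a three-class tweet classifier with a per-class precision/recall helper. It is an illustration only, not the project's implementation: it assumes a multinomial Naive Bayes model with add-one smoothing (Naive Bayes is listed in the keywords below, but the final method is described in the Milestone 4 report), a whitespace tokenizer, and hypothetical names (`tokenize`, `train`, `predict`, `precision_recall`). How the project aggregated precision and recall across the three classes is not stated here, so the helper scores one class at a time.

```python
import math
from collections import Counter, defaultdict

LABELS = ("positive", "negative", "neutral")


def tokenize(text):
    # Hypothetical tokenizer: lowercase and split on whitespace.
    return text.lower().split()


def train(examples):
    """Count documents per class and tokens per class.

    examples: iterable of (text, label) pairs.
    Returns (class_counts, word_counts, vocab).
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log-posterior score."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in LABELS:
        if class_counts[label] == 0:
            continue  # class never seen in training
        score = math.log(class_counts[label] / total_docs)
        # Add-one (Laplace) smoothing over the training vocabulary.
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens unseen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision_recall(gold, predicted, label):
    """Precision and recall for a single class label."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == label and p == label for g, p in pairs)
    fp = sum(g != label and p == label for g, p in pairs)
    fn = sum(g == label and p != label for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data, invented for illustration only.
TOY_EXAMPLES = [
    ("i love this phone", "positive"),
    ("great great service", "positive"),
    ("i hate this phone", "negative"),
    ("terrible awful service", "negative"),
    ("the phone is on the table", "neutral"),
    ("service opens at nine", "neutral"),
]
```

A typical use would be `model = train(TOY_EXAMPLES)` followed by `predict(model, "some tweet text")`, then `precision_recall(gold_labels, predicted_labels, "positive")` on a held-out set. High precision with lower recall, as reported above, means that most tweets assigned to a class were correct, while a sizeable share of the tweets belonging to that class were missed.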

References and relevant resources:

Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press. 2008. -- http://nlp.stanford.edu/IR-book/information-retrieval-book.html
Sanders Analytics -- http://www.sananalytics.com/lab/twitter-sentiment/
Using the Twitter Search API -- https://dev.twitter.com/docs/using-search
The J.D. Power and Associates Sentiment Corpus -- http://verbs.colorado.edu/jdpacorpus/
WordNet @ Princeton University -- http://wordnet.princeton.edu/wordnet/
Hadoop -- http://hadoop.apache.org/
MapReduce Tutorial -- http://hadoop.apache.org/common/docs/current/mapred_tutorial.html
WEKA @ University of Waikato -- http://weka.wikispaces.com/

Keywords:

Naive Bayes, mutual information, χ², SentiWordNet, emoticon, n-gram, Hadoop