Course project for CS 221 Information Retrieval.

sp00/SentimentAnalysis

This is a sentiment analysis course project for CS 221 Information Retrieval at the University of California, Irvine. The main purpose of this project is to find a better approach to classifying tweets into three categories: positive, negative, and neutral. We tried a variety of methods and eventually achieved 79.6% precision with 52.4% recall (more details are available in the report for Milestone 4).
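
For readers unfamiliar with the two metrics quoted above, the following is a minimal sketch of how precision and recall can be computed per class for a three-way tweet classifier. It uses the standard textbook definitions; how the project itself aggregated its figures is not stated here, so the function name `precision_recall` and the toy labels are illustrative assumptions, not the project's evaluation code.

```python
# Illustrative sketch only: standard per-class precision and recall.
# The function name and the toy data are assumptions for this example.

LABELS = ("positive", "negative", "neutral")


def precision_recall(gold, predicted, label):
    """Return (precision, recall) for one label.

    precision = correct predictions of `label` / all predictions of `label`
    recall    = correct predictions of `label` / all gold items with `label`
    """
    true_positives = sum(
        1 for g, p in zip(gold, predicted) if g == label and p == label
    )
    predicted_count = sum(1 for p in predicted if p == label)
    gold_count = sum(1 for g in gold if g == label)
    precision = true_positives / predicted_count if predicted_count else 0.0
    recall = true_positives / gold_count if gold_count else 0.0
    return precision, recall


# Hypothetical gold labels and classifier output for five tweets.
gold = ["positive", "negative", "neutral", "positive", "negative"]
predicted = ["positive", "neutral", "neutral", "positive", "positive"]

for label in LABELS:
    p, r = precision_recall(gold, predicted, label)
    print(f"{label}: precision={p:.3f} recall={r:.3f}")
```

A classifier that labels a tweet as positive or negative only when it is confident tends to show this kind of pattern: high precision on the tweets it does label, at the cost of lower recall.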

References and relevant resources:

Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Introduction to Information Retrieval, Cambridge University Press, 2008 -- http://nlp.stanford.edu/IR-book/information-retrieval-book.html
Sanders Analytics -- http://www.sananalytics.com/lab/twitter-sentiment/
Using the Twitter Search API -- https://dev.twitter.com/docs/using-search
The J.D. Power and Associates Sentiment Corpus -- http://verbs.colorado.edu/jdpacorpus/
WordNet @ Princeton University -- http://wordnet.princeton.edu/wordnet/
Apache Hadoop -- http://hadoop.apache.org/
MapReduce Tutorial -- http://hadoop.apache.org/common/docs/current/mapred_tutorial.html
WEKA @ University of Waikato -- http://weka.wikispaces.com/

Keywords:

Naive Bayes, mutual information, χ², SentiWordNet, emoticon, n-gram, Hadoop
