
It is Kaggle Competition-Sentiment Analysis Of Movie Reviews(SAMR)


shivamsharma/Kaggle-SAMR


KAGGLE-Sentiment Analysis Of Movie Reviews

### Author: Shivam Sharma (28shivamsharma@gmail.com)

This is a Kaggle competition about mining the sentiment of the given test data (link of the competition). My program is a basic implementation of Naive Bayes.

How do I Run this program?

  1. First install R and the packages listed below:
  • stringr
  • openNLP
  • NLP
  2. Run the programs in the given sequence:

First download the test and train datasets from the Kaggle link. If the data cannot be found, please let me know by email and I will provide it myself. Place both test.csv and train.csv in the data folder of the repository. Then run the following programs:

  • Adjectives.R
  • train.R
  • Model.R
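The setup and run order above can be sketched in R as follows. This is only an illustration: `missing_packages` and `run_in_order` are hypothetical helper names, not part of the repository, and the sketch assumes the three scripts sit in the current working directory.

```r
# Packages named in the list above.
required_packages <- c("stringr", "openNLP", "NLP")

# Return the packages from `required` that are not yet installed.
missing_packages <- function(required,
                             installed = rownames(installed.packages())) {
  setdiff(required, installed)
}

# Source each script in the given order; stop before running anything
# if one of the files is absent.
run_in_order <- function(scripts, dir = ".") {
  paths <- file.path(dir, scripts)
  absent <- paths[!file.exists(paths)]
  if (length(absent) > 0) {
    stop("Missing script(s): ", paste(absent, collapse = ", "))
  }
  for (path in paths) {
    source(path)
  }
  invisible(paths)
}

# Typical use from the repository root (not run automatically here):
# to_install <- missing_packages(required_packages)
# if (length(to_install) > 0) install.packages(to_install)
# run_in_order(c("Adjectives.R", "train.R", "Model.R"))
```

Checking that every file exists before sourcing the first one avoids a half-finished run when a script is missing.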

### Algorithm

The algorithm contains 3 major steps:

  1. Find features
  2. Find all probabilities required for calculating the posterior probability.
  3. Train: apply the model to the test data.

1. Find Features:

I took adjectives (JJ/JJS/JJR) as features for my model. This approach was introduced by B. Pang et al. [1] at the Conference on Empirical Methods in Natural Language Processing. I found all adjectives by combining all Phrase entries into one corpus and applying POS tagging to it. This gave more than 5000 adjectives.
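As a rough sketch of this feature step, the code below tags a combined corpus with the openNLP maximum-entropy annotators and keeps the words tagged JJ, JJR or JJS. The function names `tag_phrases` and `extract_adjectives` are hypothetical, and lower-casing and de-duplicating the adjectives is an assumption; the repository's Adjectives.R may do this differently.

```r
# Penn Treebank tags for adjectives, as named in the text above.
adjective_tags <- c("JJ", "JJR", "JJS")

# Keep the unique, lower-cased words whose POS tag is an adjective tag.
extract_adjectives <- function(words, tags) {
  stopifnot(length(words) == length(tags))
  unique(tolower(words[tags %in% adjective_tags]))
}

# Tag a character vector of phrases with openNLP's maximum-entropy models.
# Returns a data frame with one row per token (columns: word, tag).
tag_phrases <- function(phrases) {
  corpus <- NLP::as.String(paste(phrases, collapse = "\n"))
  tokens <- NLP::annotate(
    corpus,
    list(openNLP::Maxent_Sent_Token_Annotator(),
         openNLP::Maxent_Word_Token_Annotator())
  )
  tagged <- NLP::annotate(corpus, openNLP::Maxent_POS_Tag_Annotator(), tokens)
  word_ann <- subset(tagged, type == "word")
  data.frame(
    word = corpus[word_ann],
    tag = sapply(word_ann$features, `[[`, "POS"),
    stringsAsFactors = FALSE
  )
}

# Example use (requires openNLP, NLP and a working Java installation):
# tagged <- tag_phrases(c("A good movie .", "The worst plot ever ."))
# extract_adjectives(tagged$word, tagged$tag)
```

Keeping the tag filter separate from the tagger makes it possible to check the filtering logic without loading the openNLP models.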

2. Building Naive Bayes:

Naive Bayes is one of the best approaches for text classification. It is a probabilistic classifier based on Bayes' theorem that assumes the features are independent of each other.
(a) I calculated the prior probability of each class, P(class).
(b) Then I calculated the probability of each feature with respect to each class, P(feature | class).
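Steps (a) and (b) might look like the sketch below on a toy data set. The data, the two-class labels, the function names and the add-one (Laplace) smoothing are all assumptions made for illustration; the source does not say how train.R counts features or whether it smooths.

```r
# Toy training data (hypothetical phrases, two classes for brevity).
train <- data.frame(
  Phrase = c("a good film", "good and funny", "a bad film", "bad boring plot"),
  Sentiment = c(1, 1, 0, 0),
  stringsAsFactors = FALSE
)
features <- c("good", "funny", "bad", "boring")

# (a) Prior probability of each class: share of training phrases per class.
class_priors <- function(labels) {
  counts <- table(labels)
  counts / sum(counts)
}

# (b) P(feature | class), estimated from feature counts per class with
# add-one smoothing (an assumption) so unseen features never get probability 0.
feature_likelihoods <- function(phrases, labels, features) {
  classes <- as.character(sort(unique(labels)))
  counts <- matrix(0, nrow = length(classes), ncol = length(features),
                   dimnames = list(classes, features))
  tokens <- strsplit(tolower(phrases), "\\s+")
  for (i in seq_along(tokens)) {
    cls <- as.character(labels[i])
    for (w in tokens[[i]][tokens[[i]] %in% features]) {
      counts[cls, w] <- counts[cls, w] + 1
    }
  }
  # Each row is divided by its own total, so every row sums to 1.
  (counts + 1) / (rowSums(counts) + length(features))
}

priors <- class_priors(train$Sentiment)
likelihoods <- feature_likelihoods(train$Phrase, train$Sentiment, features)
```

Here `likelihoods` is a matrix with one row per class and one column per adjective.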

3. Train:

The model above is applied to the test data by taking each sentence in turn and calculating its sentiment.
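A minimal sketch of this last step is shown below, using a small hand-written model. The probabilities, the two-class setup and the name `predict_sentiment` are hypothetical. Summing log probabilities instead of multiplying raw probabilities is an assumption made to avoid numerical underflow, and splitting on whitespace is a simplification.

```r
# Hand-written toy model (hypothetical numbers, two classes for brevity).
priors <- c("0" = 0.5, "1" = 0.5)
likelihoods <- matrix(
  c(0.1, 0.1, 0.5, 0.3,
    0.5, 0.3, 0.1, 0.1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("0", "1"), c("good", "funny", "bad", "boring"))
)

# Score a sentence for every class in log space and return the best class.
# Words that are not known features are ignored; if no feature matches,
# the decision falls back to the priors.
predict_sentiment <- function(sentence, priors, likelihoods) {
  tokens <- strsplit(tolower(sentence), "\\s+")[[1]]
  hits <- tokens[tokens %in% colnames(likelihoods)]
  scores <- log(priors[rownames(likelihoods)])
  for (w in hits) {
    scores <- scores + log(likelihoods[, w])
  }
  names(scores)[which.max(scores)]
}

# Apply the model to each test sentence in turn.
test_sentences <- c("A good and funny film", "bad boring plot")
predicted <- sapply(test_sentences, predict_sentiment,
                    priors = priors, likelihoods = likelihoods)
```

`predicted` is a character vector with one class label per test sentence.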

### References

[1] B. Pang, L. Lee, and S. Vaithyanathan. Thumbs up? Sentiment classification using machine learning techniques. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 79-86, 2002.

[2] R. Feldman. Techniques and applications for sentiment analysis. Communications of the ACM, Volume 56, Issue 4, pages 82-89, April 2013.
