Sentiment Analysis on Amazon Fine Food Reviews Data in Python


The Amazon Fine Food Reviews dataset consists of 568,454 food reviews that Amazon users left up to October 2012.

The purpose of this analysis is to build a prediction model that can predict whether a recommendation is positive or negative. In this analysis, we will not focus on the Score itself, but only on the positive/negative sentiment of the recommendation.
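
The text above does not say how the positive/negative label is derived. A minimal sketch, assuming the common convention of treating a Score above 3 as positive, below 3 as negative, and dropping neutral 3-star reviews. The threshold, the helper name `score_to_sentiment`, and the toy rows are assumptions for illustration, not taken from the notebook:

```python
def score_to_sentiment(score):
    """Map a 1-5 star Score to a sentiment label (assumed rule, see lead-in)."""
    if score > 3:
        return "positive"
    if score < 3:
        return "negative"
    return None  # neutral 3-star reviews carry no clear sentiment


# Hypothetical (Score, Text) rows standing in for the real dataset.
toy_reviews = [
    (5, "Great taste, my dogs love these treats"),
    (1, "Stale chips and an awful smell"),
    (3, "It was okay, nothing special"),
    (4, "Good coffee for the price"),
]

# Keep only reviews with a clear positive or negative label.
labeled = [
    (text, score_to_sentiment(score))
    for score, text in toy_reviews
    if score_to_sentiment(score) is not None
]

for text, label in labeled:
    print(f"{label:8s} {text}")
```

With a rule like this, the model only ever sees a binary target, which matches the stated goal of ignoring the Score itself.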

Procedure involved

The project performs sentiment analysis on the review text using:

  1. The nltk library, including PorterStemmer() and word_tokenize(), to convert the unstructured text data into a structured form
  2. CountVectorizer (converts a collection of text documents to a matrix of token counts) and TfidfTransformer (scales down the impact of tokens that occur very frequently in a given corpus and are hence empirically less informative than features that occur in a small fraction of the training corpus) from the sklearn library for feature extraction
  3. Naive Bayes classifiers (MultinomialNB, BernoulliNB)
  4. Logistic regression
  5. Metrics such as the ROC curve, confusion matrix, and classification report to evaluate the machine learning models
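
The steps above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the notebook's actual code: the toy reviews, the helper names `stem_tokens` and `make_pipeline`, and all default hyperparameters are hypothetical. `word_tokenize()` needs NLTK's tokenizer data to be downloaded, so the sketch falls back to a simple regex tokenizer when that data is missing.

```python
import re

from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, classification_report, confusion_matrix, roc_curve
from sklearn.naive_bayes import BernoulliNB, MultinomialNB
from sklearn.pipeline import Pipeline

try:
    from nltk.tokenize import word_tokenize
    word_tokenize("probe")  # raises LookupError if the tokenizer data is absent
except LookupError:
    def word_tokenize(text):
        # Fallback for this sketch only: split on runs of letters/apostrophes.
        return re.findall(r"[A-Za-z']+", text)

stemmer = PorterStemmer()


def stem_tokens(text):
    """Tokenize a review, keep alphabetic tokens, and stem each one."""
    return [stemmer.stem(tok) for tok in word_tokenize(text) if tok.isalpha()]


def make_pipeline(classifier):
    """Token counts -> TF-IDF weighting -> classifier."""
    return Pipeline([
        # token_pattern=None because the custom tokenizer replaces it.
        ("counts", CountVectorizer(tokenizer=stem_tokens, token_pattern=None)),
        ("tfidf", TfidfTransformer()),
        ("clf", classifier),
    ])


# Hypothetical toy data: 1 = positive, 0 = negative.
train_texts = [
    "Great taste, my dogs love these treats",
    "Delicious coffee and fast delivery",
    "Best tea I have ever tasted",
    "Wonderful flavor, will buy again",
    "Terrible taste, a waste of money",
    "Stale chips and awful smell",
    "The product arrived damaged and tasted bad",
    "Horrible flavor, would not buy again",
]
train_labels = [1, 1, 1, 1, 0, 0, 0, 0]
test_texts = ["Delicious treats with great flavor", "Awful stale coffee, terrible"]
test_labels = [1, 0]

models = {
    "MultinomialNB": MultinomialNB(),
    "BernoulliNB": BernoulliNB(),
    "LogisticRegression": LogisticRegression(),
}

results = {}
for name, clf in models.items():
    pipe = make_pipeline(clf).fit(train_texts, train_labels)
    predicted = pipe.predict(test_texts)
    prob_positive = pipe.predict_proba(test_texts)[:, 1]

    # ROC curve from the predicted probability of the positive class.
    fpr, tpr, _ = roc_curve(test_labels, prob_positive)
    results[name] = {
        "confusion": confusion_matrix(test_labels, predicted, labels=[0, 1]),
        "auc": auc(fpr, tpr),
    }

    print(name)
    print(results[name]["confusion"])
    print(classification_report(test_labels, predicted, zero_division=0))
```

Wrapping the vectorizer, the TF-IDF step, and the classifier in one `Pipeline` keeps the feature extraction fitted only on training text, so the same three evaluation metrics can be compared across models on held-out reviews.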