Predict helpful reviews and star ratings with a new helpful rating metric and Logistic Regression.
Helpful_Reviews_Data_Wrangling.ipynb
Helpful_Reviews_Exploratory_Data_Analysis.ipynb
Helpful_Reviews_Final_Report.pdf
Helpful_Reviews_Machine_Learning_Tests.ipynb
Helpful_Reviews_Milestone_Report.pdf
Helpful_Reviews_Preliminary_Data_Wrangling.ipynb
Helpful_Reviews_Slidedeck.pdf
README.md

Helpful_Reviews

Can a computer predict whether a review is helpful from only the text? The answer is yes.

The Jupyter notebooks above cover data wrangling, exploratory data analysis (EDA), and machine learning for predicting helpful book reviews. The machine learning methods include Naive Bayes, Decision Trees, Random Forests, and Logistic Regression. Text is vectorized with CountVectorizer and TfidfVectorizer over various n-gram ranges.
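
As a rough illustration of one such combination, the sketch below pairs a TfidfVectorizer with Logistic Regression using scikit-learn. The toy reviews, the labels, the `ngram_range=(1, 2)` setting, and the variable names are assumptions made up for this example; they are not taken from the notebooks or the Amazon dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy data invented for illustration only (1 = helpful, 0 = not helpful).
reviews = [
    "Detailed summary of the plot, the characters, and the pacing of each chapter.",
    "Compares this edition with the earlier one and explains what changed.",
    "Explains who the book is for and which chapters are worth skipping.",
    "Good.",
    "Bad book.",
    "Did not like it.",
]
labels = [1, 1, 1, 0, 0, 0]

# Unigrams and bigrams are an assumed n-gram range, one of many possible choices.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),
    LogisticRegression(max_iter=1000),
)
model.fit(reviews, labels)

new_review = ["Explains the plot and compares the chapters in detail."]
predictions = model.predict(new_review)          # predicted class label
probabilities = model.predict_proba(new_review)  # columns follow model.classes_
print(predictions, probabilities)
```

Swapping `TfidfVectorizer` for `CountVectorizer`, or `LogisticRegression` for another scikit-learn classifier, leaves the rest of the pipeline unchanged, which makes it easy to compare the combinations listed above.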

The dataset includes 8.9 million rows of Amazon Book Reviews, made available by "Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering," R. He, J. McAuley, WWW, 2016.