Can a computer predict whether a review is helpful from its text alone? The answer is yes.
The Jupyter Notebooks above cover data wrangling, exploratory data analysis (EDA), and machine learning for predicting helpful book reviews. The machine learning methods include Naive Bayes, Decision Trees, Random Forests, and Logistic Regression. The text vectorizers include CountVectorizer and TfidfVectorizer with various n-gram ranges.
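As a rough illustration of how a vectorizer and a classifier from the list above fit together, here is a minimal scikit-learn sketch. It is not taken from the notebooks: the toy reviews, the labels, the `build_model` helper, and the specific n-gram ranges are all assumptions made for the example.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical toy data, not from the Amazon dataset: 1 = helpful, 0 = not helpful.
reviews = [
    "Detailed summary of the plot and the writing style, with clear examples.",
    "Explains who the book is for and compares it to the author's earlier work.",
    "Thorough discussion of the characters, pacing, and the ending.",
    "Good.",
    "Bad book.",
    "Did not like it.",
]
labels = [1, 1, 1, 0, 0, 0]


def build_model(vectorizer, classifier):
    """Chain a text vectorizer and a classifier into one estimator (hypothetical helper)."""
    return make_pipeline(vectorizer, classifier)


# Two of the many possible vectorizer/classifier pairings; the n-gram ranges are assumed.
models = {
    "count_nb": build_model(CountVectorizer(ngram_range=(1, 1)), MultinomialNB()),
    "tfidf_logreg": build_model(
        TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression(max_iter=1000)
    ),
}

new_reviews = [
    "Clear examples and a detailed discussion of the writing style.",
    "Bad.",
]

predictions = {}
for name, model in models.items():
    model.fit(reviews, labels)  # learn vocabulary and classifier weights together
    predictions[name] = model.predict(new_reviews)
    print(name, predictions[name])
```

Wrapping the vectorizer and classifier in one pipeline means the vocabulary is learned only from the training text, which keeps evaluation on held-out reviews honest. A real run would use a train/test split over the full dataset rather than six sentences.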
The dataset includes 8.9 million rows of Amazon book reviews, made available by R. He and J. McAuley, "Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering," WWW, 2016.