Predicting stars on Kindle books' Reviews.

Data Scientist // Mathematician

For this project, I decided to predict how many stars a customer is most likely to give, based on the text of their written review.

Procedure

I pulled a data set of 2 million Kindle reviews, but due to computational limitations I was forced to work with only a random 10% sample of it.
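The sampling step above can be sketched roughly as follows. This is a minimal illustration, not the code from the notebooks: it assumes the reviews come as a gzipped file with one JSON object per line, and the function name `sample_reviews`, the seed, and the field names used in the test data are hypothetical. Each line is kept with probability 0.10, so the result is approximately (not exactly) 10% of the file, and the full data set never has to sit in memory.

```python
import gzip
import json
import random


def sample_reviews(path, fraction=0.10, seed=42):
    """Stream a gzipped JSON-lines file and keep roughly `fraction` of its records.

    `path`, `fraction` and `seed` are illustrative parameters, not names
    taken from the original notebooks.
    """
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    kept = []
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            # Decide before parsing, so skipped lines cost almost nothing.
            if rng.random() < fraction:
                kept.append(json.loads(line))
    return kept
```

The returned list of dictionaries could then be passed to `pandas.DataFrame` for the analysis.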

Although this project was challenging for me, it pushed me into topics I had not been taught, such as Natural Language Processing (NLP). As a result, the work focuses more on statistical tools than on NLP tools.

Deployment

As a stretch goal, I deployed an app on Heroku where you can try out how this works. You can see it here.

Result

Some highlights of the data analysis can be read in my blog post.

In this repository:

Here are the final notebooks and some pickles I had to make due to the size of the data set.
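For readers unfamiliar with pickles, the idea is to save the sampled data once and reload it later instead of re-reading the full data set. A minimal sketch, assuming the sample is any picklable Python object; the helper names `save_sample` and `load_sample` are hypothetical and do not come from the notebooks:

```python
import pickle


def save_sample(records, path):
    """Write the sampled records to disk so the sampling step runs only once."""
    with open(path, "wb") as handle:
        pickle.dump(records, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_sample(path):
    """Read the records back. Only unpickle files from a source you trust."""
    with open(path, "rb") as handle:
        return pickle.load(handle)
```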

Built With

gzip
json
pandas as pd
numpy as np
matplotlib.pyplot as plt
urllib.request.urlopen
string
seaborn
pickle
Heroku

Version

This is the very first version. At some point I'd like to use more powerful NLP tools and compare the new results.

Sources

The dataset can be found here: https://nijianmo.github.io/amazon/
Natural Language Processing in Python - an introductory talk by Alice Zhao
NLP in Python tutorial - repository by Alice Zhao
