CodingDuckmx/AmazonKindleReviews

coding duck MX

Data Scientist // Mathematician

Twitter | Blog | LinkedIn

Predicting stars on Kindle books' Reviews.

For this project, I set out to predict how many stars a customer is most likely to give, based on the text of their written review.

Procedure

I pulled a data set of about 2 million Kindle reviews, but due to computational limitations I was forced to work with only a random 10% sample of it.
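A minimal sketch of how that sampling step could look, using the gzip, json and pandas tools listed under Built With. It is not necessarily the code in the notebooks: the helper names `read_reviews` and `sample_reviews`, the column names `reviewText` and `overall`, the fixed seed, and the tiny in-memory file are all assumptions made for illustration.

```python
import gzip
import io
import json

import pandas as pd


def read_reviews(fileobj):
    """Parse a gzip-compressed file with one JSON review per line into a DataFrame."""
    with gzip.open(fileobj, mode="rt", encoding="utf-8") as lines:
        return pd.DataFrame([json.loads(line) for line in lines if line.strip()])


def sample_reviews(reviews, frac=0.10, seed=42):
    """Return a reproducible random sample of the reviews (10% by default)."""
    return reviews.sample(frac=frac, random_state=seed).reset_index(drop=True)


# Tiny in-memory stand-in for the real file (hypothetical records, not real data).
raw = "\n".join(
    json.dumps({"reviewText": f"review {i}", "overall": i % 5 + 1}) for i in range(50)
)
buffer = io.BytesIO(gzip.compress(raw.encode("utf-8")))

reviews = read_reviews(buffer)
sample = sample_reviews(reviews)
print(len(reviews), len(sample))
```

Fixing `random_state` keeps the sample identical between runs, which helps when later steps are cached to disk.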

This project was challenging for me, but it pushed me into topics I had not been taught, like Natural Language Processing (NLP). For that reason, the work leans more on statistical tools than on NLP tools.

Deployment

As a stretch goal, I deployed an app on Heroku where you can try out how this works. You can see it here.

Result

Some highlights of the data analysis can be read in my blog post.

In this repository:

Here are the final notebooks and some pickles I had to make due to the size of the data set.
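For readers unfamiliar with pickles, here is a small sketch of the general idea: an intermediate DataFrame is saved to disk once and reloaded later instead of being rebuilt from the full data set. The file name `reviews_sample.pkl`, the two example rows, and the column names are hypothetical and only illustrate the round trip.

```python
import os
import pickle
import tempfile

import pandas as pd

# Hypothetical stand-in for an expensive intermediate result.
sample = pd.DataFrame({"reviewText": ["loved it", "not for me"], "overall": [5, 2]})

# Write to a temporary directory so the sketch leaves the repository untouched.
path = os.path.join(tempfile.mkdtemp(), "reviews_sample.pkl")

with open(path, "wb") as fh:
    pickle.dump(sample, fh)

# Later (for example in another notebook) load the cached object back.
with open(path, "rb") as fh:
    restored = pickle.load(fh)

print(restored.equals(sample))
```

Pickle files should only be loaded from sources you trust, since unpickling can execute arbitrary code.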

Built With

  • gzip
  • json
  • pandas as pd
  • numpy as np
  • matplotlib.pyplot as plt
  • urllib.request.urlopen
  • string
  • seaborn
  • pickle
  • Heroku

Version

This is the very first version. At some point I'd like to use more powerful NLP tools and compare the new results.

Sources
