Exploratory data analysis and visualization on Yelp Dataset

gyhou/yelp_dataset

Analyzing Yelp Dataset



Data Source

Yelp Open Dataset Challenge (https://www.yelp.com/dataset/challenge)

Round 13 ran from January 15, 2019 to December 31, 2019.


I trained a model to predict a user's review rating based on the reviews in the Yelp dataset within each specific category.

My API takes a JSON string with two fields, "category" and "review", and responds with the predicted rating for that review.

When submitting a review, make sure to specify which category the review is for.

Example input:

{"category": "Auto Repair", 
 "review": "Service is the worst and the wait time is too long."}

The API returns a rating based on the category and the review. Example output:

{'Category': 'Auto_Repair',
 'Review': 'Service is the worst and the wait time is too long.',
 'Predict rating': 1}
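
As a rough sketch, a client could send the example input from Python using only the standard library. The README does not state the endpoint URL or the HTTP method, so `API_URL`, the use of POST, and the helper names `build_request` and `predict_rating` are all assumptions made for illustration.

```python
import json
import urllib.request

# Hypothetical endpoint: the README does not give the real URL.
API_URL = "http://localhost:5000/predict"


def build_request(category: str, review: str, url: str = API_URL) -> urllib.request.Request:
    """Build a request whose body is the JSON input described above."""
    payload = json.dumps({"category": category, "review": review})
    return urllib.request.Request(
        url,
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the API accepts POST
    )


def predict_rating(category: str, review: str, url: str = API_URL) -> str:
    """Send the review and return the raw response body as text."""
    with urllib.request.urlopen(build_request(category, review, url)) as response:
        return response.read().decode("utf-8")
```

`predict_rating` returns the response body unparsed because the example output above is printed with single quotes, so it may not be strict JSON.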

Below is the list of categories used in the Yelp dataset:

  • Active Life
  • Auto Repair
  • Automotive
  • Beauty Spas
  • Contractors
  • Doctors
  • Event Planning Services
  • Fashion
  • Fast Food
  • Hair Salons
  • Health Medical
  • Home Garden
  • Home Services
  • Local Services
  • Professional Services
  • Real Estate
  • Shopping
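
Since the category must match one of the names above, a client may want to check it before building the input. The sketch below does that. The helper names `make_payload` and `category_label` are hypothetical. The underscore form returned by `category_label` is inferred from the single example output ('Auto_Repair') and may not be how the API actually derives it.

```python
import json

# Categories copied from the list above.
CATEGORIES = (
    "Active Life", "Auto Repair", "Automotive", "Beauty Spas",
    "Contractors", "Doctors", "Event Planning Services", "Fashion",
    "Fast Food", "Hair Salons", "Health Medical", "Home Garden",
    "Home Services", "Local Services", "Professional Services",
    "Real Estate", "Shopping",
)


def make_payload(category: str, review: str) -> str:
    """Return the JSON input string, rejecting categories not in the list."""
    if category not in CATEGORIES:
        raise ValueError(f"Unknown category: {category!r}")
    return json.dumps({"category": category, "review": review})


def category_label(category: str) -> str:
    """Assumed output form of a category name: spaces become underscores."""
    return category.replace(" ", "_")


payload = make_payload("Auto Repair", "Service is the worst and the wait time is too long.")
print(payload)
```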

Scattertext Visualization

Examples based on Yelp reviews, grouped by category

RV Parks and Campgrounds

RV Repair, RV Dealers, RV Rental

RV Repair, RV Dealers, RV Rental, RV Parks and Campgrounds

