===========================================================================
Dataset Details:
Name: Yelp dataset
Source: Yelp website
These instructions will get you a copy of the project up and running on your local machine.
To obtain the raw data:
Download the Yelp Open Dataset from https://www.yelp.com/dataset. This project mainly uses business.json, review.json, and users.json.
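The Yelp Open Dataset files are distributed as newline-delimited JSON, with one record per line. The sketch below shows one possible way to stream such a file and keep only restaurants. The field names business_id, stars, and categories follow the published dataset schema but should be checked against the copy you download; the helper names iter_records and is_restaurant are hypothetical, and the sample records are made up.

```python
import io
import json


def iter_records(lines):
    """Yield one dict per non-empty line of newline-delimited JSON."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def is_restaurant(record):
    """Assumption: 'categories' is a comma-separated string, or null."""
    categories = record.get("categories") or ""
    return "Restaurants" in categories


# Tiny in-memory stand-in for business.json (made-up records).
sample = io.StringIO(
    '{"business_id": "b1", "stars": 4.0, "categories": "Restaurants, Pizza"}\n'
    '{"business_id": "b2", "stars": 3.5, "categories": "Hair Salons"}\n'
    '{"business_id": "b3", "stars": 2.5, "categories": null}\n'
)

restaurants = [r for r in iter_records(sample) if is_restaurant(r)]
restaurant_ids = [r["business_id"] for r in restaurants]
print(restaurant_ids)
```

With the real data, the same generator can be given an open file handle (for example, one opened on business.json with UTF-8 encoding) in place of the in-memory sample. Streaming line by line avoids loading the whole file at once.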
Yelp is one of the most popular sites in the United States for finding reviews of businesses in most categories. Yelp offers a massive collection of reviews that serve as an important source of information for learning about particular businesses. Surprisingly, Yelp does not take much advantage of these valuable user reviews to recommend restaurants or other businesses to users based on their interests. A review often gives multiple viewpoints on a business along with the reviewer's personal experience, which makes reviews a useful signal of user preferences.
Problem description: This project has the following objectives:
Project hypothesis: Is there a correlation between a restaurant's average rating and its price range?
Predictive model: A recommendation system to recommend restaurants to Yelp users.
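For the hypothesis above, one simple way to quantify the relationship is a Pearson correlation coefficient between price range and average rating. The sketch below is only an illustration: the numbers are made up, the pearson helper is hypothetical, and because price range is an ordinal 1 to 4 scale, a rank-based measure such as Spearman's may be a better fit for the real analysis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Made-up example values, not taken from the Yelp data.
price_range = [1, 2, 2, 3, 4]
avg_rating = [3.0, 3.5, 4.0, 4.0, 4.5]

r = pearson(price_range, avg_rating)
print(f"Pearson r = {r:.3f}")
```

A value near +1 or -1 would suggest a strong linear relationship, and a value near 0 would suggest little linear relationship. A significance test would still be needed before drawing a conclusion.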
Techniques employed: We plan to use the following techniques:
- Sentiment analysis using TfidfVectorizer for exploratory data analysis (EDA)
- Collaborative filtering, matrix factorization, and simple SVD for the recommendation system
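To illustrate the matrix factorization idea named in the list above, here is a minimal sketch that learns user and item factor vectors by stochastic gradient descent on observed ratings. It is not necessarily the method this project uses: the function names, hyperparameters, and the toy (user, item, rating) triples are all assumptions made for illustration.

```python
import random


def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02,
              epochs=500, seed=0):
    """Learn user factors P and item factors Q so that P[u] . Q[i] ~ rating."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q


def predict(P, Q, u, i):
    """Predicted rating of item i by user u."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))


def rmse(ratings, P, Q):
    """Root mean squared error over the observed ratings."""
    total = sum((r - predict(P, Q, u, i)) ** 2 for u, i, r in ratings)
    return (total / len(ratings)) ** 0.5


# Toy (user_index, restaurant_index, stars) triples, made up for illustration.
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 4), (1, 2, 1),
           (2, 1, 5), (2, 2, 2), (3, 0, 1), (3, 2, 5)]

P0, Q0 = factorize(ratings, n_users=4, n_items=3, epochs=0)  # untrained
P, Q = factorize(ratings, n_users=4, n_items=3)

print("RMSE before training:", rmse(ratings, P0, Q0))
print("RMSE after training: ", rmse(ratings, P, Q))
print("Predicted rating, user 0 on restaurant 2:", predict(P, Q, 0, 2))
```

Once trained, unseen (user, restaurant) pairs can be scored with predict and the highest-scoring restaurants recommended to that user. On real data, the error should be measured on held-out ratings, not on the training set as shown here.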
End result:
- Justification for the proposed hypothesis.
- Findings through exploratory data analysis.
- Word clouds of the most positive and most negative words in reviews.
- A web-based recommendation system to recommend restaurants to Yelp users based on various metrics.