Analysis and Classification of Restaurant Reviews

Brief background

A restaurant manager needs to know what customers think (for example, where service falls short or how people feel about the food) in order to improve customer satisfaction and the restaurant's overall business. The most important source of feedback is guests' reviews, especially negative ones. However, reading through thousands of reviews is time-consuming and frustrating. Moreover, third-party platforms like Yelp only give ratings, with no information about why a rating is low or a review is negative (the type of complaint). Wouldn't it be ideal to have an NLP classification model that sorts negative reviews by type before a restaurant manager looks into them?

In this project, we use a McDonald's review dataset and reviews we collected through the Yelp API to analyze negative reviews and feedback from McDonald's and KFC guests, to get an idea of how fast food restaurants perform in different cities, and finally to build NLP classification models that classify negative reviews for restaurant managers.

Goals of this project:

(1) analyze negative reviews and feedback from guests of fast food restaurants;

(2) get an idea of how various fast food restaurants perform in different cities (what the shortcomings are in service, food, etc.);

(3) develop NLP classification models that classify negative reviews for restaurant managers, so that managers spend less time digesting reviews and restaurants can improve their business;

(4) try various ML techniques, including deep learning and XGBoost, to build NLP models, and evaluate those models;

(5) try our model on reviews of a similar fast food restaurant (KFC).

Data sources:

McDonald's review dataset from data.world: https://data.world/crowdflower/mcdonalds-review-sentiment

KFC restaurant reviews we collected using the Yelp API.
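As a rough illustration of the collection step, the sketch below builds authenticated requests for the Yelp Fusion API (`/businesses/search` and `/businesses/{id}/reviews`) and flattens a reviews response into rows. It is not the code in review_API.ipynb: the helper names, the chosen row fields, and the `YOUR_API_KEY` placeholder are assumptions made for illustration.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.yelp.com/v3"  # Yelp Fusion API base URL


def build_request(path, api_key, params=None):
    """Build an authenticated GET request (hypothetical helper)."""
    query = "?" + urllib.parse.urlencode(params) if params else ""
    return urllib.request.Request(
        f"{API_BASE}{path}{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def fetch_json(request):
    """Send the request and decode the JSON body."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def extract_reviews(payload, business_name, city):
    """Flatten a reviews payload into rows ready to be written to a .csv file."""
    return [
        {
            "business": business_name,
            "city": city,
            "rating": review.get("rating"),
            "text": review.get("text", "").strip(),
        }
        for review in payload.get("reviews", [])
    ]


# Usage (needs a real API key and network access, so it is left commented out):
# search = fetch_json(build_request("/businesses/search", "YOUR_API_KEY",
#                                   {"term": "KFC", "location": "Seattle"}))
# for business in search.get("businesses", []):
#     payload = fetch_json(build_request(f"/businesses/{business['id']}/reviews",
#                                        "YOUR_API_KEY"))
#     rows = extract_reviews(payload, business["name"], "Seattle")
```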

This repository contains the following files:

AnalysisandClassificationofRestaurantReviews.ipynb, -- the main Jupyter notebook of this project, containing 6 parts:

Data collection, API and cleaning
Exploratory Data Analysis, descriptive statistics of reviews
Preprocessing & feature engineering
Various model fitting
Model evaluation
Business conclusion and discussion

review_API.ipynb, -- the Jupyter notebook that uses the Yelp API to get KFC restaurant reviews, cleans them up and saves the data to a .csv file.
McDonalds-Yelp-Sentiment-DFE.csv, -- the dataset that contains 1525 observations of reviews, cities, sentiments, etc. from customers who ate in McDonald's stores.
kfc.csv, -- the data file we saved after getting KFC restaurant reviews through the Yelp API. This is a clean dataset.
reviewclassifier.sav, -- the classification model that gives the best accuracy and F1 score.
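A minimal sketch of loading the saved model, assuming reviewclassifier.sav was written with Python's `pickle` module (a common convention for `.sav` files, but not confirmed here; if it was saved with joblib, `joblib.load` would be the counterpart). The `load_classifier` helper is hypothetical. Only unpickle files from a source you trust, since unpickling can execute arbitrary code.

```python
import pickle
from pathlib import Path


def load_classifier(path):
    """Load a pickled model from disk (assumes the file was saved with pickle)."""
    with open(path, "rb") as f:
        return pickle.load(f)


model_path = Path("reviewclassifier.sav")
if model_path.exists():
    classifier = load_classifier(model_path)
    # Inspect the object before use; its exact type depends on how it was saved.
    print(type(classifier))
```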

Some conclusions:

The distribution of negative review classes:

Among all negative review categories, 'RudeService' and 'SlowService' are the major classes. This is shown in the statistical analysis of our reviews and validated on the KFC reviews. Both deserve the attention of fast food restaurant managers.

You cannot treat guests poorly just because you are selling fast food! Everyone cares about how they are served, whether they stay for 1 minute in a fast food restaurant or 2 hours in a fancier sit-down establishment. Recommendation: McDonald's store managers should pay close attention to employees' attitude towards guests.

SlowService is the second major complaint among customers. One of the main reasons people choose fast food is to save time. Our SlowService corpus statistics show that people complain about too few staff, long wait times, etc. Recommendation: restaurant managers should schedule more staff during rush hours or establish a more efficient process for serving guests. The following plot shows the negative review categories.

Accuracy and F1 score metrics are appropriate in this project:

The metrics we used to evaluate our models are the accuracy score and the weighted F1 score, as shown in section 4 of the main Jupyter notebook of this project. We have 8 categories of negative reviews (RudeService, SlowService, OrderProblem, BadFood, BadNeighborhood, Filthy, MissingFood and Cost), and they are highly imbalanced: the 2 major classes, 'RudeService' and 'SlowService', account for 52% of all negative reviews, while minor classes such as 'MissingFood' or 'Cost' account for only around 2% each. After thorough discussion, we concluded that we do not favor any specific class. Although BadNeighborhood and Cost problems may not be easy for a restaurant manager to solve, all other classes are very important from a management point of view. Precision and recall for individual classes are therefore not what we prioritized as evaluation metrics. We used the weighted F1 score because we want to take the class imbalance into account: the F1 score is calculated for each label, and the per-label scores are averaged with weights given by support (the number of true instances of each class label).
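To make the weighted F1 definition concrete, here is a small pure-Python sketch on made-up labels (the toy data and function names are ours, not from the notebook). It should agree with scikit-learn's `f1_score(y_true, y_pred, average="weighted")`, which is the more usual way to compute this metric.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights equal to each class's support."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, n_true in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        n_pred = sum(p == label for p in y_pred)
        precision = tp / n_pred if n_pred else 0.0
        recall = tp / n_true
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        score += f1 * n_true / total
    return score


# Toy, imbalanced example (made-up labels, for illustration only).
y_true = ["RudeService"] * 3 + ["SlowService"] * 2 + ["Cost"]
y_pred = ["RudeService", "RudeService", "SlowService",
          "SlowService", "SlowService", "RudeService"]

print(round(accuracy(y_true, y_pred), 3))     # 0.667
print(round(weighted_f1(y_true, y_pred), 3))  # 0.6
```

In this toy case the missed 'Cost' review has an F1 of 0 but a weight of only 1/6, so it pulls the weighted score down less than a missed majority class would.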

Stemming works better than lemmatization for this review data:

The way the corpus is preprocessed has a real influence on prediction results. For example, in this project we found that stemming works better than lemmatization, while normally lemmatization does a better job than stemming. This is because we are dealing with Yelp reviews. When writing Yelp reviews, many people do not pay attention to proper grammar, sometimes do not even complete their words or sentences, and write as simply as possible as long as they can express their dissatisfaction. In this particular case, it makes sense that stemming works better. Thus, in NLP, data scientists should first look into the corpus, understand what the texts look like, and then determine which preprocessing steps make sense.

The classifier model developed in this project can be used to classify negative reviews of similar fast food restaurants (KFC, Burger King, etc.). When validating this model on the KFC reviews dataset, the accuracy is close to 70%, which suggests the model carries over reasonably well to a similar restaurant chain. Here is a sample classification of KFC reviews (data from the Yelp API):
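As one possible illustration of this effect (not the notebook's actual preprocessing code), the sketch below uses NLTK's `PorterStemmer` to show how stemming collapses inflected variants of a word into a single token without needing part-of-speech information. By contrast, NLTK's `WordNetLemmatizer` treats words as nouns unless given a POS tag, so a form like "waiting" is left unchanged by default. The `stem_tokens` helper and the whitespace tokenization are simplifying assumptions.

```python
from nltk.stem import PorterStemmer

stemmer = PorterStemmer()


def stem_tokens(text):
    """Lowercase, split on whitespace, and stem each token (simplified tokenization)."""
    return [stemmer.stem(token) for token in text.lower().split()]


# Inflected variants that appear in complaints about slow service.
variants = ["waiting", "waited", "waits", "wait"]
stems = {stemmer.stem(word) for word in variants}
print(stems)  # {'wait'}

print(stem_tokens("Waited and waiting"))  # ['wait', 'and', 'wait']
```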

Project presentation link:

https://docs.google.com/presentation/d/1lnyGBvrsaY2GQ7cxckzzuVZVGe3VINJ4v_QemdnUe_o/edit?usp=sharing

Analysis and Classification of Fast Food Restaurant Reviews.pdf
