# Yelp Review Analysis

I am a consultant for DelFalco's Italian Restaurant. The owner asked me to identify whether there are any foods on their menu that diners find disappointing.

In [1]:
import pandas as pd

The business owner suggested that I use diner reviews from the Yelp website to determine which dishes people liked and disliked. I pulled the data from Yelp. Before I get to the analysis, let's look at the data I have to work with.

In [2]:
# Load in the data from JSON file
data = pd.read_json("restaurant.json")
data.head()

| Unnamed: 0 | review_id | user_id | business_id | stars | useful | funny | cool | text | date |
|---|---|---|---|---|---|---|---|---|---|
| 109 | lDJIaF4eYRF4F7g6Zb9euw | lb0QUR5bc4O-Am4hNq9ZGg | r5PLDU-4mSbde5XekTXSCA | 4 | 2 | 0 | 0 | I used to work food service and my manager at ... | 2013-01-27 17:54:54 |
| 1013 | vvIzf3pr8lTqE_AOsxmgaA | MAmijW4ooUzujkufYYLMeQ | r5PLDU-4mSbde5XekTXSCA | 4 | 0 | 0 | 0 | We have been trying Eggplant sandwiches all ov... | 2015-04-15 04:50:56 |
| 1204 | UF-JqzMczZ8vvp_4tPK3bQ | slfi6gf_qEYTXy90Sw93sg | r5PLDU-4mSbde5XekTXSCA | 5 | 1 | 0 | 0 | Amazing Steak and Cheese... Better than any Ph... | 2011-03-20 00:57:45 |
| 1251 | geUJGrKhXynxDC2uvERsLw | N_-UepOzAsuDQwOUtfRFGw | r5PLDU-4mSbde5XekTXSCA | 1 | 0 | 0 | 0 | Although I have been going to DeFalco's for ye... | 2018-07-17 01:48:23 |
| 1354 | aPctXPeZW3kDq36TRm-CqA | 139hD7gkZVzSvSzDPwhNNw | r5PLDU-4mSbde5XekTXSCA | 2 | 0 | 0 | 0 | Highs: Ambience, value, pizza and deserts. Thi... | 2018-01-21 10:52:58 |


The owner also gave me this list of menu items and common alternate spellings.

In [3]:
menu = ["Cheese Steak", "Cheesesteak", "Steak and Cheese", "Italian Combo", "Tiramisu", "Cannoli",
        "Chicken Salad", "Chicken Spinach Salad", "Meatball", "Pizza", "Pizzas", "Spaghetti",
        "Bruchetta", "Eggplant", "Italian Beef", "Purista", "Pasta", "Calzones",  "Calzone",
        "Italian Sausage", "Chicken Cutlet", "Chicken Parm", "Chicken Parmesan", "Gnocchi",
        "Chicken Pesto", "Turkey Sandwich", "Turkey Breast", "Ziti", "Portobello", "Reuben",
        "Mozzarella Caprese",  "Corned Beef", "Garlic Bread", "Pastrami", "Roast Beef",
        "Tuna Salad", "Lasagna", "Artichoke Salad", "Fettuccini Alfredo", "Chicken Parmigiana",
        "Grilled Veggie", "Grilled Veggies", "Grilled Vegetable", "Mac and Cheese", "Macaroni",  
         "Prosciutto", "Salami"]

First, I plan my analysis.  
**Here is the idea for finding which menu items have disappointed diners**: I can group reviews by the menu items they mention, and then calculate the average rating of the reviews that mention each item. This shows which foods appear in reviews with low scores, so the restaurant can fix the recipe or remove those foods from the menu.

Second, I will find the items mentioned in one review.  
To start, I will write code to extract the foods mentioned in a single review. Since some menu items are multiple tokens long, I will use **PhraseMatcher**, which can match a series of tokens.
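To make "matching a series of tokens" concrete, here is a minimal plain-Python sketch of the idea. It is not how spaCy implements its matcher: the function name `find_phrases`, the sample review, and the whitespace tokenization are all assumptions made for illustration.

```python
def find_phrases(text, phrases):
    """Return (start_token, matched_phrase) pairs for phrases found in text.

    Matching is case-insensitive and works on whitespace-separated tokens,
    which is a much cruder tokenizer than spaCy's.
    """
    tokens = text.lower().split()
    # Each phrase becomes a list of lowercase tokens, e.g. ["cheese", "steak"].
    patterns = [phrase.lower().split() for phrase in phrases]
    found = []
    for start in range(len(tokens)):
        for pattern in patterns:
            end = start + len(pattern)
            # Compare a window of tokens against the whole pattern at once.
            if tokens[start:end] == pattern:
                found.append((start, " ".join(pattern)))
    return found


# Hypothetical review text and a small subset of the menu, for illustration only.
sample_review = "The Cheese Steak was great but the garlic bread was cold"
sample_menu = ["Cheese Steak", "Garlic Bread", "Pizza"]
print(find_phrases(sample_review, sample_menu))
```

A whitespace split leaves punctuation attached to words ("steak," would not match "steak"), which is one reason to rely on a real tokenizer for the actual analysis.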

In [4]:
import spacy
from spacy.matcher import PhraseMatcher

index_of_review_to_test_on = 627
text_to_test_on = data.text.iloc[index_of_review_to_test_on]

#Create a blank English pipeline (tokenizer only, no trained model)
nlp = spacy.blank('en')

#Create the tokenized version of text_to_test_on
review_doc = nlp(text_to_test_on)

#Create the PhraseMatcher object. The vocabulary is the first argument. Use attr='LOWER' to match case-insensitively
matcher = PhraseMatcher(nlp.vocab, attr='LOWER')

#Create a tokenized Doc for each item in the menu
menu_tokens_list = [nlp(item) for item in menu]

#Register the menu patterns under the key "MENU"
matcher.add("MENU", menu_tokens_list)

#Find matches in the review_doc
matches = matcher(review_doc)

Let's print the matches.

In [5]:
for match in matches:
    print(f"Token number {match[1]}: {review_doc[match[1]:match[2]]}")

Token number 1: meatball


Next, I will run this matcher over the whole dataset and collect ratings for each menu item. Each review has a rating, ***review.stars***. For each item that appears in the review text (***review.text***), I will append the review's rating to a list of ratings for that item. The lists are kept in a dictionary, ***item_ratings***.

In [6]:
from collections import defaultdict

#item_ratings is a dictionary of lists.
item_ratings = defaultdict(list)

for idx, review in data.iterrows():
    doc = nlp(review.text)
    matches = matcher(doc) 
    
    #Create a set of the items found in the review text
    #Each match is a (match_id, start, end) tuple; lowercase the matched text
    #so that different capitalizations of an item share one key
    found_items = {doc[start:end].text.lower() for _, start, end in matches}

    #Update item_ratings with the rating for each item in found_items
    for item in found_items:
        item_ratings[item].append(review.stars)

Using these item ratings, I will find the menu item with the worst average rating.

In [7]:
#Calculate the mean ratings for each menu item as a dictionary
mean_ratings = {item: sum(ratings)/len(ratings) for item, ratings in item_ratings.items()}

#Find the item with the lowest mean rating and store its name in worst_item
worst_item = sorted(mean_ratings, key=mean_ratings.get)[0]

print(worst_item)
print(mean_ratings[worst_item])

chicken cutlet
3.4


Similar to the mean ratings, I can calculate the number of reviews for each item.

In [8]:
counts = {item: len(ratings) for item, ratings in item_ratings.items()}

item_counts = sorted(counts, key=counts.get, reverse=True)
for item in item_counts:
    print(f"{item:>25}{counts[item]:>5}")

                    pizza  265
                    pasta  206
                 meatball  128
              cheesesteak   97
             cheese steak   76
                  cannoli   72
                  calzone   72
                 eggplant   69
                  purista   63
                  lasagna   59
          italian sausage   53
               prosciutto   50
             chicken parm   50
             garlic bread   39
                  gnocchi   37
                spaghetti   36
                 calzones   35
                   pizzas   32
                   salami   28
            chicken pesto   27
             italian beef   25
                 tiramisu   21
            italian combo   21
                     ziti   21
         chicken parmesan   19
       chicken parmigiana   17
               portobello   14
           mac and cheese   11
           chicken cutlet   10
         steak and cheese    9
                 pastrami    9
               roast beef    7
       f
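
The counts above list alternate spellings such as "cheesesteak", "cheese steak", and "steak and cheese" as separate items, which splits one dish's reviews across several keys. One possible refinement, sketched here under assumptions, is to map each matched string to a canonical name before appending the rating. The `CANONICAL` mapping, the helper `canonical_items`, and the toy reviews are hypothetical; which spellings really refer to the same dish would need to be confirmed with the owner.

```python
from collections import defaultdict

# Hypothetical mapping from alternate spellings to one canonical name.
# Which spellings belong together is an assumption to confirm with the owner.
CANONICAL = {
    "cheese steak": "cheesesteak",
    "steak and cheese": "cheesesteak",
    "pizzas": "pizza",
    "calzones": "calzone",
}


def canonical_items(found_items):
    """Map matched strings to canonical names, dropping duplicates."""
    return {CANONICAL.get(item, item) for item in found_items}


# Toy data standing in for (matched items, star rating) per review.
toy_reviews = [
    ({"cheese steak", "cheesesteak"}, 5),  # both spellings in one review
    ({"steak and cheese"}, 4),
    ({"pizzas", "calzone"}, 3),
]

merged_ratings = defaultdict(list)
for found_items, stars in toy_reviews:
    # Canonicalize before appending so a review counts once per dish.
    for item in canonical_items(found_items):
        merged_ratings[item].append(stars)

print(dict(merged_ratings))
```

Canonicalizing inside the per-review loop, rather than summing the per-spelling counts afterwards, avoids double counting a review that uses two spellings of the same dish.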

Here is code to print the 10 best and 10 worst rated items.

In [9]:
sorted_ratings = sorted(mean_ratings, key=mean_ratings.get)

print("Worst rated menu items:")
for item in sorted_ratings[:10]:
    print(f"{item:20} Ave rating: {mean_ratings[item]:.2f} \tcount: {counts[item]}")
    
print("\n\nBest rated menu items:")
for item in sorted_ratings[-10:]:
    print(f"{item:20} Ave rating: {mean_ratings[item]:.2f} \tcount: {counts[item]}")

Worst rated menu items:
chicken cutlet       Ave rating: 3.40 	count: 10
turkey sandwich      Ave rating: 3.80 	count: 5
spaghetti            Ave rating: 3.89 	count: 36
italian beef         Ave rating: 3.92 	count: 25
tuna salad           Ave rating: 4.00 	count: 5
macaroni             Ave rating: 4.00 	count: 5
italian combo        Ave rating: 4.05 	count: 21
garlic bread         Ave rating: 4.13 	count: 39
roast beef           Ave rating: 4.14 	count: 7
eggplant             Ave rating: 4.16 	count: 69


Best rated menu items:
chicken pesto        Ave rating: 4.56 	count: 27
chicken salad        Ave rating: 4.60 	count: 5
purista              Ave rating: 4.67 	count: 63
prosciutto           Ave rating: 4.68 	count: 50
reuben               Ave rating: 4.75 	count: 4
steak and cheese     Ave rating: 4.89 	count: 9
artichoke salad      Ave rating: 5.00 	count: 5
fettuccini alfredo   Ave rating: 5.00 	count: 6
turkey breast        Ave rating: 5.00 	count: 1
corned beef          Ave ratin

Looking at these results, it is important to consider the number of reviews when interpreting which items are best and worst. The less data we have for a specific item, the less we can trust that its average rating reflects the "real" sentiment of the customers. For example, turkey breast's perfect 5.00 comes from a single review, while pizza is mentioned in 265 reviews. This is fairly common sense: if more people tell us the same thing, we are more likely to believe it. As the number of data points n increases, the standard error of the mean decreases as 1 / sqrt(n).
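
The 1 / sqrt(n) relationship can be illustrated with a short sketch. The helper `mean_and_standard_error` and the toy rating lists below are assumptions for illustration and are not taken from the Yelp data; the standard error is estimated as the sample standard deviation divided by sqrt(n).

```python
import math
import statistics


def mean_and_standard_error(ratings):
    """Return (mean, standard error of the mean) for a list of ratings."""
    n = len(ratings)
    mean = statistics.mean(ratings)
    if n < 2:
        # The sample standard deviation is undefined for a single rating.
        return mean, float("nan")
    return mean, statistics.stdev(ratings) / math.sqrt(n)


# Toy ratings, not taken from the Yelp data: same mix of stars, different sizes.
few = [5, 3, 4, 2, 5]
many = few * 20  # 100 ratings with the same mix of stars

for label, ratings in [("few", few), ("many", many)]:
    mean, sem = mean_and_standard_error(ratings)
    print(f"{label:5} n={len(ratings):3}  mean={mean:.2f}  standard error={sem:.2f}")
```

Applied to the counts in this analysis, and assuming a similar spread of ratings for each item, the 10-review average for chicken cutlet would be roughly five times less certain than the 265-review average for pizza, since sqrt(265 / 10) is about 5.1.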