[image credit]: https://leitesculinaria.com/103294/writings-how-to-pair-wine-and-chinese-food.html
JSON file reading scripts:
read_yelp_data_business.R, read_yelp_data_checkin.R, read_yelp_data_photo.R, read_yelp_data_tip.R, read_yelp_data_user.R
[JSON file reading coding credit]: https://github.com/dpliublog/yelp_data_challenge_R10
SQL file reading script:
yelp_review_Chinese.R
[data files]: https://drive.google.com/drive/folders/1_4749ED32FJuprWEspjkIkFQJDiH1yAX?usp=sharing
R script:
yelp.analysis.R
Just out of curiosity, where are these restaurants located? Plotly link below 👇
[Restaurants Location Interactive Map]: https://plot.ly/~angelayuanyuan/1/
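For reference, a minimal sketch of how such a map could be drawn with the plotly R package; the data frame `chinese_biz` and its `latitude`/`longitude`/`name` columns are assumed stand-ins for the Yelp business data:

```r
library(plotly)

# Assumed data frame of Chinese restaurants with coordinates
# taken from the Yelp business dataset (hypothetical names)
plot_geo(chinese_biz, lat = ~latitude, lon = ~longitude) %>%
  add_markers(text = ~name, hoverinfo = "text") %>%
  layout(title = "Chinese restaurant locations",
         geo = list(scope = "usa"))
```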
CHICKEN!!🍗🍗🍗 Of course.... What else would we expect😂😂😂😂
Looks like dim sum and fried rice are popular dishes
YASSSSSSS👏👏👏
Okay, we see something weird here: some words with negative sentiment scores actually indicate pretty decent ratings.
Negative words like "die" and "disappoint" don't necessarily mean dissatisfaction; people say things like "the food is to die for!!" or "the food really doesn't disappoint us..."
- Are we putting sentiment words into context? Not yet!
Just as we mentioned above, sentiment words might precede or follow words that turn them into completely different meanings.
So let's take a look at the contexts.
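As a rough sketch, assuming a tidytext-style workflow and a `review` data frame with a `text` column (both assumptions), the bigrams could be pulled out like this:

```r
library(dplyr)
library(tidytext)

# Split review text into bigrams and count the most frequent ones
# (`review` with a `text` column is an assumed input)
bigrams <- review %>%
  unnest_tokens(bigram, text, token = "ngrams", n = 2) %>%
  count(bigram, sort = TRUE) %>%
  tidyr::separate(bigram, into = c("word1", "word2"), sep = " ")

# Keep only bigrams whose second word carries sentiment (Bing lexicon)
bigrams_sentiment <- bigrams %>%
  inner_join(get_sentiments("bing"), by = c("word2" = "word"))
```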
We can see from the graph that although some bigrams contain sentiment words (gluten free, egg drop, etc.), they are not used to describe anything related to restaurant quality.
Next, we take a step further and explore positive and negative sentiment words in their contexts separately
It seems that we interpret most of the positive sentiment words fine. However, bigrams like "overly sweet" and "pretty bad" are actually not expressing positive sentiments.
How about negative sentiments 👎👎👎?
What's wrong with hard boiled, earl grey, jerk chicken, and so on??
They are food names, but unfortunately contain negative sentiment words in their names!! :broken_heart:
Please keep in mind that these situations will definitely cause inaccuracies when we try to predict ratings using sentiment.
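One rough way to reduce this inaccuracy is a hand-made stop list of food-name bigrams like the ones spotted above; the list here is purely illustrative, not exhaustive:

```r
library(dplyr)

# Illustrative stop list of food names that happen to contain sentiment words
food_bigrams <- tibble::tibble(
  word1 = c("hard", "earl", "jerk", "egg"),
  word2 = c("boiled", "grey", "chicken", "drop")
)

# Drop these food-name bigrams before scoring sentiment
# (`bigrams_sentiment` comes from the earlier bigram sketch)
bigrams_clean <- bigrams_sentiment %>%
  anti_join(food_bigrams, by = c("word1", "word2"))
```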
Seems promising :sunglasses:
Before going into prediction, let's take a step back and look at what the users' rating data looks like
- How many reviews do users usually write?
Wow...we can't tell anything from it
Try removing the outliers so we can actually see something
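A sketch of that trimming step; the `user` data frame with a `review_count` column is an assumption, and the 99th-percentile cutoff is an arbitrary choice:

```r
library(dplyr)
library(ggplot2)

# Drop the heaviest reviewers so the bulk of the distribution is visible
cutoff <- quantile(user$review_count, 0.99, na.rm = TRUE)

user %>%
  filter(review_count <= cutoff) %>%
  ggplot(aes(review_count)) +
  geom_histogram(bins = 50) +
  labs(x = "Number of reviews per user", y = "Count of users")
```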
Most users don't give a lot of reviews
Users don't tend to give really low ratings
- sentiment polarity model
Regression models and results are in the yelp.analysis.R script
In this model, we use the [sentimentr](https://cran.r-project.org/web/packages/sentimentr/sentimentr.pdf) package to assign sentiment scores to each review text
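A minimal example of what that scoring step could look like (the `review` data frame and its `text` column are assumptions):

```r
library(sentimentr)

# Average sentiment score per review, aggregated over its sentences
scores <- sentiment_by(get_sentences(review$text))

# Attach the score back to the reviews for use as a predictor
review$sentiment_score <- scores$ave_sentiment
```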
- one more question: are ratings related to the text length of the review
It doesn't look like it.
Review length seems to be related to personal habits rather than restaurants' quality
- before fitting any model, how is our response variable distributed
The ratings are not normally distributed, so we might want to use logit or multinomial models
- logistic models
Try splitting the data into ratings higher than 3 stars and ratings lower than 3 stars
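A sketch of that split and the logistic fit (`stars` follows the Yelp review schema; `sentiment_score` is the assumed column from the scoring step above):

```r
# Binary outcome: 1 if the rating is above 3 stars, 0 otherwise
# (how to treat exactly-3-star reviews is a modeling choice)
review$high_rating <- as.integer(review$stars > 3)

fit_logit <- glm(high_rating ~ sentiment_score,
                 data = review, family = binomial(link = "logit"))
summary(fit_logit)
```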
In our regression output, the log odds are extremely large, which suggests either that we didn't include sufficient information in our model building process or that there are outliers in the data
- multinomial models
Using the same predictors as above to fit multinomial models causes the same problem, so we think about what other information we can add to our model.
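For reference, the baseline multinomial fit might look roughly like this with the nnet package (column names are the same assumptions as above):

```r
library(nnet)

# Treat the 1-5 star rating as an unordered categorical outcome
fit_multi <- multinom(factor(stars) ~ sentiment_score, data = review)
summary(fit_multi)
```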
From the users' perspective, different users have different standards when giving ratings. Some users have strict requirements for dining, so the ratings they give on Yelp will be generally low. Some users might be more tolerant: even when a restaurant's quality is not that satisfying, they still give quite decent ratings. So the underlying standard of each user is a factor that influences the outcome. Therefore, we go back to the Users dataset, calculate the average rating per user, and add that information to our regression models.
From the businesses' perspective, their ratings are definitely related to their own quality. Since we only have data from a limited period of time, we might not be able to get a full picture of how the businesses perform over the years. However, we do have their ratings on Yelp, which are cumulative results over a longer period, so we go ahead and add those in too.
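A sketch of deriving both features with dplyr (`user_id`, `business_id`, and `stars` mirror the Yelp schema, but the exact data frame names are assumptions):

```r
library(dplyr)

# Per-user average of all the ratings that user has given
user_avg <- review %>%
  group_by(user_id) %>%
  summarise(user_avg_rating = mean(stars))

# Join the user averages and the businesses' cumulative Yelp ratings
model_data <- review %>%
  left_join(user_avg, by = "user_id") %>%
  left_join(select(business, business_id, business_stars = stars),
            by = "business_id")
```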
The plots below show the relationship between users' average ratings and their rating for a particular restaurant, and the relationship between restaurants' Yelp ratings and their ratings in reviews.
- multilevel models
Still, linear multilevel models don't suit our data
Therefore we try fitting multilevel logit models
- multilevel logit models
1) model building
In this model, we split the data into two categories: reviews where the user's rating is lower than 3 stars, and reviews where it is equal to or higher than 3 stars.
Our main objective is to find out whether we can use the sentiment that users have expressed in their reviews to predict the ratings they might give to a certain restaurant. Besides the sentiment score of the review, our predictors also include an indicator for the restaurant's price range, parking availability, and the user's average rating (all the ratings they have given on Yelp divided by the number of reviews they have posted on Yelp). However, looking at our outcome data, or at the residual plots from a linear regression, we can see separate trends, since the data contains repeated measurements for the same restaurants. Therefore, the restaurant's public rating (the single rating shown on a restaurant's business page on Yelp) is our group-level predictor, which brings each restaurant's random effect into our model.
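A sketch of that specification with lme4 (`price_range` and `has_parking` are hypothetical column names; `high_rating`, `sentiment_score`, `user_avg_rating`, and `business_stars` follow the earlier sketches):

```r
library(lme4)

# Multilevel logit: a random intercept per restaurant, with the
# restaurant's public Yelp rating as the group-level predictor
fit_ml <- glmer(high_rating ~ sentiment_score + price_range + has_parking +
                  user_avg_rating + business_stars + (1 | business_id),
                data = model_data, family = binomial)
summary(fit_ml)
```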
2) regression output
After running the regression, we find two predictors that have a relatively big influence on users' ratings: the sentiment score of the review and the user's average rating on Yelp. On the other hand, whether the restaurant has parking and the restaurant's price range don't matter much in the rating process. The results are quite intuitive: people who express positive emotions in their reviews tend to give higher ratings, and people who have the habit, although we don't know exactly why, of giving decent ratings tend to give higher ratings. Of these two factors, the sentiment score plays the more important role in predicting ratings.
3) model checking
How does our model perform when used to predict ratings?
To find out, we run predictive checks. The results are shown below.
On the left side are the predicted values, on the right side is our original data. Honestly, the distributions are similar, but our model is definitely overestimating the difference between the two categories.
Hence, we run a chi-square test to see if the two distributions are really different.
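A sketch of that test, comparing predicted and observed counts in the two categories (the 0.5 threshold on predicted probabilities is an assumption):

```r
# Predicted category from the multilevel model, thresholded at 0.5
pred_class <- predict(fit_ml, type = "response") > 0.5

# Contingency table of predicted vs. observed category counts
counts <- rbind(
  predicted = table(factor(as.integer(pred_class), levels = 0:1)),
  observed  = table(factor(model_data$high_rating, levels = 0:1))
)
chisq.test(counts)
```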
Well, they're not! 💃
4) discussion and implication
Using sentiment scores to predict ratings can be fun, but it is not that accurate. While it can tell whether a restaurant's quality is above or below average, it is difficult to predict mild differences in ratings. After all, different people have different language habits and different rating habits. People tend to write reviews when the service they receive is remarkably pleasant or remarkably unpleasant. Some people swear when they really hate something and some people swear when they really love something. Some do both. These are all pieces of information that might affect our model but that we are not taking into account right now. Not to mention the circumstances we discussed before, where sentiment words are not adjectives but nouns.