angelayuanyuan/yelp-Angela-Yuan


Predict Ratings for Chinese Restaurants using Sentiment Analysis

[figure: screen shot 2017-12-02 at 11 45 55 pm]


[image credit]: https://leitesculinaria.com/103294/writings-how-to-pair-wine-and-chinese-food.html

1. Read Yelp Data 🍜


JSON file reading scripts:
       read_yelp_data_business.R, read_yelp_data_checkin.R, read_yelp_data_photo.R, read_yelp_data_tip.R, read_yelp_data_user.R
[JSON file reading code credit]: https://github.com/dpliublog/yelp_data_challenge_R10

SQL file reading script:
       yelp_review_Chinese.R
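
The reading scripts themselves are written in R and are not reproduced here. As a rough illustration of the idea only, the Python sketch below filters business records for Chinese restaurants. It assumes one JSON object per line and a `categories` field; the function name `iter_chinese_businesses` and the sample records are hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_chinese_businesses(lines: Iterable[str]) -> Iterator[dict]:
    """Yield business records whose categories include "Chinese".

    Assumes one JSON object per line. The `categories` field is handled
    both as a list of strings and as a comma-separated string, because
    its exact layout is an assumption in this sketch.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        categories = record.get("categories") or []
        if isinstance(categories, str):
            categories = [c.strip() for c in categories.split(",")]
        if "Chinese" in categories:
            yield record


# Hypothetical sample lines standing in for a business JSON file.
sample = [
    '{"business_id": "b1", "name": "Golden Wok", "categories": ["Chinese", "Restaurants"]}',
    '{"business_id": "b2", "name": "Pizza Place", "categories": "Pizza, Restaurants"}',
    '{"business_id": "b3", "name": "Dim Sum House", "categories": "Dim Sum, Chinese"}',
]
chinese = list(iter_chinese_businesses(sample))
print([b["business_id"] for b in chinese])
```

With a real file, the same function can be fed an open file handle, since iterating over a text file yields one line at a time.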

2. Load Data 🍚


[data files]: https://drive.google.com/drive/folders/1_4749ED32FJuprWEspjkIkFQJDiH1yAX?usp=sharing

3. Data Manipulation 🍲


R script:
       yelp.analysis.R

4. Restaurants EDA 📈

  • How are Chinese restaurants rated? [figure: general ratings on Chinese restaurants]

  • Just out of curiosity, where are these restaurants located? Plotly link below 👇

[Restaurants Location Interactive Map]: https://plot.ly/~angelayuanyuan/1/

5. Text Analysis 📚

  • What words are frequently used when reviewing a Chinese restaurant? [figure: word cloud 1]

CHICKEN!!🍗🍗🍗 Of course.... What else would we expect😂😂😂😂

  • What else? [figure: word cloud 2]

Looks like dim sum and fried rice are popular dishes.

6. Sentiment Analysis 😋😃😄😑😞

  • Is a restaurant's rating related to the average sentiment score of a person's review? [figure: average sentiment scores]

YASSSSSSS👏👏👏

  • How are sentiment words related to ratings? [figure: sentiment and rating]

Okay, we see something weird here: some words with negative sentiment scores actually go with pretty decent ratings.

How did that happen? [figure: score and rating]

Negative words like "die" and "disappoint" don't necessarily mean dissatisfaction. People say things like "the food is to die for!!" and "the food really doesn't disappoint us..."

  • Are we putting sentiment words into context? Not yet!

As we mentioned above, sentiment words might follow, or be followed by, words that give them a completely different meaning.

So let's take a look at the contexts. [figure: sentiment all]

We can see from the graph that although some bigrams contain sentiment words (gluten free, egg drop, etc.), they are not used to describe anything related to a restaurant's quality.

Next, we take a step further and explore positive and negative sentiment words in their contexts separately.

Positive sentiments first 👍👍👍 [figure: positive]

It seems that we interpret most of the positive sentiment words fine. However, bigrams like "overly sweet" and "pretty bad" are actually not expressing positive sentiments.

How about negative sentiments 👎👎👎? [figure: negative]

What's wrong with hard boiled, earl grey, jerk chicken and so on??
They are food names, but unfortunately they contain negative sentiment words!! 💔

Please keep in mind that these situations will cause inaccuracy when we try to predict ratings using sentiment.
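
To make the context problem concrete, here is a minimal Python sketch of word-level scoring with two simple corrections: a negator in front of a sentiment word flips its sign, and a small table of bigrams overrides the word-level scores. The lexicon, the negator list, and the override table are all toy values invented for illustration; they are not the sentimentr lexicon and not what the analysis script does.

```python
import re

# Toy word-level lexicon (hypothetical values, for illustration only).
LEXICON = {
    "die": -1.0, "disappoint": -1.0, "bad": -1.0, "hard": -1.0, "jerk": -1.0,
    "good": 1.0, "sweet": 1.0, "pretty": 1.0,
}
# Words that flip the sign of the sentiment word right after them.
NEGATORS = {"not", "no", "never", "doesn't", "don't", "didn't"}
# Bigrams whose meaning differs from the sum of their words:
# food names score 0, idioms and intensified negatives get their own score.
BIGRAM_OVERRIDES = {
    ("hard", "boiled"): 0.0,
    ("jerk", "chicken"): 0.0,
    ("die", "for"): 1.0,
    ("overly", "sweet"): -1.0,
    ("pretty", "bad"): -1.0,
}


def score(text: str) -> float:
    """Sum sentiment over tokens, checking bigram overrides before single words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    i = 0
    while i < len(tokens):
        pair = tuple(tokens[i:i + 2])
        if pair in BIGRAM_OVERRIDES:
            total += BIGRAM_OVERRIDES[pair]
            i += 2  # both words of the bigram are consumed
            continue
        value = LEXICON.get(tokens[i], 0.0)
        if value and i > 0 and tokens[i - 1] in NEGATORS:
            value = -value  # "doesn't disappoint" becomes positive
        total += value
        i += 1
    return total


for review in ["The food is to die for!!",
               "the food really doesn't disappoint us",
               "hard boiled egg",
               "pretty bad service"]:
    print(review, "->", score(review))
```

A hand-written override table obviously does not scale to a full review corpus; the point is only that looking at pairs of words changes the sign of exactly the cases discussed above.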

  • Can we predict ratings using sentiment scores? [figure: sentiment prediction]

Seems promising 😎

7. Users info EDA 📊

Before going into prediction, let's take a step back and look at what the users' rating data looks like.

  • How many reviews do users usually write?

[figure: number of reviews per user]

Wow... we can't tell anything from it.
Let's try removing the outliers so we can actually see something.

[figure: number of reviews per user without outliers]

Most users don't write a lot of reviews.
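
The README does not say which rule was used to drop the outliers. One common choice is the 1.5 × IQR fence, sketched below in Python; the rule, the function name `trim_outliers`, and the review counts are all assumptions made for illustration.

```python
import statistics


def trim_outliers(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR] (the usual boxplot fences)."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical review counts per user: most are small, one is extreme.
review_counts = [1, 1, 2, 2, 3, 3, 4, 5, 900]
trimmed = trim_outliers(review_counts)
print(len(review_counts), "users before,", len(trimmed), "after trimming")
```

Trimming like this only changes what the histogram shows; whether the heavy reviewers should also be excluded from the models is a separate decision.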

  • What are the average ratings given by users? [figure: average ratings by users]

Users don't tend to give really low ratings.

8. Regression Analysis 💡

  • sentiment polarity model
    Regression models and results in yelp.analysis.R script

In this model, we use the sentimentr package (https://cran.r-project.org/web/packages/sentimentr/sentimentr.pdf) to assign a sentiment score to each review text.

  • one more question: are ratings related to the text length of the review?

[figure: text length and ratings]

It doesn't look like it.
Review length seems to be related to personal habits rather than to a restaurant's quality.

  • before fitting any model, how is our response variable distributed?

[figure: emoji 1]

The ratings are not normally distributed, so we might want to use logit or multinomial models.

  • logistic models

Try splitting the data into ratings higher than 3 stars and ratings lower than 3 stars.

[figure: emoji 2]

In our regression output, the estimated log odds are extremely large, which suggests that either we didn't include sufficient information when building the model or there are outliers in the data.
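
For reference, a binary logit model of this kind can be written as follows. The symbols are assumptions introduced here, not taken from the script: $p_i$ is the probability that review $i$ carries a rating above 3 stars and $s_i$ is its sentiment score.

```latex
\log\frac{p_i}{1 - p_i} = \beta_0 + \beta_1 s_i ,
\qquad
p_i = \frac{1}{1 + e^{-(\beta_0 + \beta_1 s_i)}}
```

The left-hand side is the log odds. When the estimate of $\beta_1$ is very large in absolute value, even a small change in $s_i$ pushes $p_i$ close to 0 or 1, which is why such output is a warning sign rather than a strong result.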

  • multinomial models

Fitting multinomial models with the same predictors causes the same problem, so we think about what other information we can add to our model.

From the users' perspective, different users have different standards when giving ratings. Some users have strict requirements for dining, so the ratings they give on Yelp are generally low. Other users are more tolerant: even when the quality of a restaurant is not that satisfying, they still give quite decent ratings. So each user's underlying standard is a factor that influences the outcome. Therefore, we go back to the Users dataset, calculate the average rating per user, and add that information to our regression models.

From the businesses' perspective, their ratings are definitely related to their own quality. Since we only have data for a limited period of time, we might not be able to get a full picture of how the businesses perform over the years. However, we do have their ratings on Yelp, which are a cumulative result over a longer period. So we go ahead and add that in too.
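
The two additions described above amount to attaching a per-user average and a per-business Yelp rating to every review row. The Python sketch below shows that step on toy data; the field names (`user_id`, `business_id`, `stars`) and all values are assumptions for illustration, and the actual work is done in the R script.

```python
from collections import defaultdict

# Hypothetical review rows and public business ratings.
reviews = [
    {"user_id": "u1", "business_id": "b1", "stars": 5},
    {"user_id": "u1", "business_id": "b2", "stars": 3},
    {"user_id": "u2", "business_id": "b1", "stars": 2},
]
business_stars = {"b1": 3.5, "b2": 4.0}


def user_average_ratings(rows):
    """Return {user_id: sum of stars / number of reviews} for each user."""
    totals = defaultdict(lambda: [0.0, 0])  # user_id -> [sum of stars, count]
    for row in rows:
        totals[row["user_id"]][0] += row["stars"]
        totals[row["user_id"]][1] += 1
    return {user: total / count for user, (total, count) in totals.items()}


user_avg = user_average_ratings(reviews)

# One feature row per review: the review itself plus the two new predictors.
features = [
    {**row,
     "user_avg_stars": user_avg[row["user_id"]],
     "business_stars": business_stars[row["business_id"]]}
    for row in reviews
]
for row in features:
    print(row)
```

Note that in this toy version a user's average includes the very review being predicted; with real data it would be worth checking how much that leaks the outcome into the predictor.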

The plots below show the relationship between users' average ratings and their ratings for a particular restaurant, and the relationship between restaurants' Yelp ratings and their ratings in reviews.

[figure: users rating]

[figure: yelp rating]

  • multilevel models

Still, linear multilevel models don't suit our data.
Therefore, we try fitting multilevel logit models.

  • multilevel logit models

1) model building

In this model, we split the restaurants into two categories: those that received user ratings lower than 3 stars, and those that received user ratings equal to or higher than 3 stars.

Our main objective is to find out whether we can use the sentiment that users show in their reviews to predict the ratings they might give to a certain restaurant. Besides the sentiment score of the review, our predictors include an indicator for the restaurant's price range, parking availability, and the user's average rating (the sum of all the ratings they have given on Yelp divided by the number of reviews they have posted on Yelp). However, both the outcome data and the residual plots from a linear regression show separate trends, because the data contains repeated measurements for each restaurant. Therefore, we use the restaurant's public rating (the single rating shown on the restaurant's business page on Yelp) as our group-level predictor, which captures each restaurant's random effect on the model outcome.
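
One plausible way to write down a model of this shape is given below. This is a sketch under assumptions, not the specification in yelp.analysis.R: the notation is introduced here, with $y_{ij} = 1$ when the rating in review $i$ of restaurant $j$ is 3 stars or higher, and a varying intercept $\alpha_j$ per restaurant.

```latex
\Pr(y_{ij} = 1) = \operatorname{logit}^{-1}\!\bigl(
    \alpha_j
    + \beta_1\,\text{sentiment}_{ij}
    + \beta_2\,\text{price}_{j}
    + \beta_3\,\text{parking}_{j}
    + \beta_4\,\text{useravg}_{i}
\bigr),
\qquad
\alpha_j \sim \mathcal{N}\!\bigl(\gamma_0 + \gamma_1\,\text{yelp}_j,\ \sigma_\alpha^2\bigr)
```

Here $\text{yelp}_j$ is the restaurant's public Yelp rating acting as the group-level predictor, and $\sigma_\alpha$ measures how much restaurants still differ after accounting for it.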

2) regression output

After running the regression, we find two predictors that have a relatively big influence on users' ratings: the sentiment score of the review and the user's average rating on Yelp. On the other hand, whether the restaurant has parking and what its price range is don't matter much in the rating process. The results are quite intuitive. People who express positive emotions in their reviews tend to give higher ratings, and people who have the habit of giving decent ratings (although we don't know exactly why) tend to give higher ratings. Of these two factors, the sentiment score plays the more important role when predicting ratings.

3) model checking

How does our model perform when used to predict ratings?

To find out, we run predictive checks. The results are shown below.

[figure: predictive]

The predicted values are on the left, and our original data is on the right. Honestly, the distributions are similar, but our model definitely overestimates the difference between the two categories.

Hence, we run a chi-square test to see if the two distributions are really different.

Well, they're not! 💃
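
As an illustration of what such a test involves, the Python sketch below computes a Pearson chi-square statistic for a 2 × 2 table of predicted versus observed category counts. The counts are made up, the function name `chi_square_2x2` is hypothetical, and no continuity correction is applied, so the numbers will not match the output of the R script.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test for a 2x2 table of counts.

    Returns (statistic, p_value). With 1 degree of freedom the p-value is
    P(Z^2 > x) for a standard normal Z, which equals erfc(sqrt(x / 2)).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, math.erfc(math.sqrt(stat / 2))


# Hypothetical counts: rows = predicted / observed, columns = low / high ratings.
counts = [[180, 820],
          [200, 800]]
chi2, p_value = chi_square_2x2(counts)
print(f"chi-square = {chi2:.3f}, p-value = {p_value:.3f}")
```

A p-value above the chosen threshold means the test finds no evidence that the two distributions differ, which is the reading used in the paragraph above.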

4) discussion and implication

Using sentiment scores to predict ratings can be fun, but it is not that accurate. While it can tell whether a restaurant's quality is above or below average, it struggles to predict small differences in ratings. After all, different people have different language habits and rating habits. People tend to write reviews when the service they receive is remarkably pleasant or remarkably unpleasant. Some people swear when they really hate something, and some people swear when they really love something. Some do both. All of this is information that might affect our model but that we are not taking into account right now. Not to mention the cases we discussed before, where sentiment words are nouns rather than adjectives.
