# Import Libraries

In [1]:
import pandas as pd
import seaborn as sns

# Read Data

In [2]:
# Load the New York reviews data
NY_reviews = pd.read_csv('./data/New_York_reviews.csv', index_col=0)
NY_reviews.head()



Unnamed: 0,parse_count,restaurant_name,rating_review,sample,review_id,title_review,review_preview,review_full,date,city,url_restaurant,author_id
1,2,Lido,5,Positive,review_773559838,A Regular Treat,My wife and I have been eating dinner frequent...,My wife and I have been eating dinner frequent...,"October 8, 2020",New_York_City_New_York,https://www.tripadvisor.com/Restaurant_Review-...,UID_0
2,3,Lido,4,Positive,review_769429529,Good neighborhood spot!,Came with family for Labor Day weekend brunch ...,Came with family for Labor Day weekend brunch ...,"September 8, 2020",New_York_City_New_York,https://www.tripadvisor.com/Restaurant_Review-...,UID_1
3,4,Lido,1,Negative,review_745700258,Disappointing,Food was mediocre at best. The lamb chops are...,Food was mediocre at best. The lamb chops are ...,"February 17, 2020",New_York_City_New_York,https://www.tripadvisor.com/Restaurant_Review-...,UID_2
4,5,Lido,5,Positive,review_728859349,What a find in Harlem,My co-workers were volunteering at a foodbank ...,My co-workers were volunteering at a foodbank ...,"November 25, 2019",New_York_City_New_York,https://www.tripadvisor.com/Restaurant_Review-...,UID_3
5,6,Lido,5,Positive,review_728429643,Lunch,Lido is an intimate boutique style restaurant....,Lido is an intimate boutique style restaurant....,"November 23, 2019",New_York_City_New_York,https://www.tripadvisor.com/Restaurant_Review-...,UID_4


**Features**

- parse_count: numerical (integer), sequence number of the review as extracted by the web scraper (auto-incremented)
- author_id: categorical (string), unique, incremental and anonymous identifier of the user (UID_XXXXXXXXXX)
- restaurant_name: categorical (string), name of the restaurant the review refers to
- rating_review: numerical (integer), review score in the range 1-5
- sample: categorical (string), "Positive" for scores 4-5 and "Negative" for scores 1-3
- review_id: categorical (string), unique internal identifier of the review (review_XXXXXXXXX)
- title_review: text, review title
- review_preview: text, preview of the review, truncated on the website when the text is very long
- review_full: text, complete review
- date: timestamp, publication date of the review; in the raw file it is a string such as "October 8, 2020" (month day, year)
- city: categorical (string), city of the restaurant the review was written for
- url_restaurant: text, restaurant URL
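
The rule behind the `sample` column (scores 4-5 are positive, scores 1-3 are negative) can be checked against `rating_review`. The snippet below is a minimal sketch on toy rows that mirror the preview above. The helper `expected_sample` and the frame `toy` are hypothetical names, not part of the notebook. On the real data, `rating_review` would first need to be cast to integers, since the raw column mixes integers and strings.

```python
import pandas as pd

def expected_sample(rating: int) -> str:
    """Label implied by the feature description: 4-5 is Positive, 1-3 is Negative."""
    return "Positive" if rating >= 4 else "Negative"

# Toy rows mirroring the preview above (not the real data set)
toy = pd.DataFrame({
    "rating_review": [5, 4, 1, 3],
    "sample": ["Positive", "Positive", "Negative", "Negative"],
})

# Keep only the rows whose stored label disagrees with the rule
mismatch = toy[toy["rating_review"].map(expected_sample) != toy["sample"]]
print(len(mismatch))
```

An empty `mismatch` frame would suggest that `sample` carries no information beyond `rating_review`.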

In [3]:
print("The shape of review data for New York is ", NY_reviews.shape)

The shape of review data for New York is  (510463, 12)


# Data Cleaning

## 1. Remove redundant columns: review_id, city, url_restaurant, parse_count

- The row index already serves as a unique identifier for each review in our data set, so we remove the 'review_id' column.
- The data set contains only restaurants in New York, so we drop the 'city' column.
- The 'restaurant_name' column is sufficient to identify the restaurant each review refers to, so we remove the 'url_restaurant' column.
- Lastly, the 'parse_count' column only records the order in which the web scraper extracted the reviews, so we drop it as well.

In [4]:
NY_reviews = NY_reviews.drop(['parse_count','review_id','city','url_restaurant'], axis = 1)
NY_reviews.shape

(510463, 8)

## 2. Remove rows with missing values

In [5]:
# Check if any missing values
NY_reviews.isna().sum()

restaurant_name    0
rating_review      0
sample             0
title_review       1
review_preview     1
review_full        2
date               2
author_id          2
dtype: int64

In [6]:
# Very few values are missing and there is no sensible way to impute them, so drop the affected rows
NY_reviews = NY_reviews.dropna(axis = 0)
NY_reviews.shape # 2 rows are dropped

(510461, 8)

## 3. Clean data type

In [7]:
# Check the data types
NY_reviews.dtypes

restaurant_name    object
rating_review      object
sample             object
title_review       object
review_preview     object
review_full        object
date               object
author_id          object
dtype: object

In [8]:
# Find unique values in rating_review column
NY_reviews['rating_review'].unique()

array([5, 4, 1, 3, 2, '4', '5', '2', '1', '3'], dtype=object)

We will cast 'date', which is stored as a string, to datetime, and 'rating_review', which is a mix of strings and integers, to integers.

In [9]:
# Correct the data types in 'rating_review' and 'date' column
NY_reviews['rating_review'] = NY_reviews['rating_review'].astype(int)
NY_reviews['date'] = pd.to_datetime(NY_reviews['date'])
NY_reviews.dtypes

restaurant_name            object
rating_review               int32
sample                     object
title_review               object
review_preview             object
review_full                object
date               datetime64[ns]
author_id                  object
dtype: object
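
As a hedged alternative to `astype(int)`, the sketch below uses `pd.to_numeric` on a toy column with the same mix of integers and digit strings. With `errors="coerce"`, any entry that cannot be parsed becomes NaN instead of raising, so bad values can be counted before the final cast. The names `ratings`, `as_numbers`, `n_bad` and `clean` are illustrative only. Casting to `"int64"` explicitly is an assumption about what is wanted here; the `int32` shown above is most likely a platform-dependent default.

```python
import pandas as pd

# Toy column with the same mix of integers and digit strings seen in rating_review
ratings = pd.Series([5, 4, "4", "5", "1"], dtype=object)

# to_numeric parses digit strings; errors="coerce" turns unparseable entries into NaN
as_numbers = pd.to_numeric(ratings, errors="coerce")

# Count values that failed to parse before committing to an integer dtype
n_bad = int(as_numbers.isna().sum())

# Safe only because n_bad is 0 here; NaN cannot be stored in a plain int64 column
clean = as_numbers.astype("int64")
print(clean.tolist(), n_bad)
```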

## 4. Convert all text to lowercase

Later in the project we will analyze the review text. For keyword analysis, "Good" and "good" should count as the same word, so we convert all text columns to lowercase up front.

In [None]:
NY_reviews['title_review'] = NY_reviews['title_review'].str.lower()
NY_reviews['review_preview'] = NY_reviews['review_preview'].str.lower()
NY_reviews['review_full'] = NY_reviews['review_full'].str.lower()
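
To illustrate why lowercasing matters for keyword counts, here is a small sketch on three made-up titles (the Series `titles` is a toy example, not taken from the data set). A case-sensitive search for "good" misses the capitalized spelling, while the lowercased text matches both.

```python
import pandas as pd

# Toy titles; only the capitalization of "good" differs between the first two
titles = pd.Series(["Good pizza", "good, but overrated.", "Great food"])

# Case-sensitive substring match misses the capitalized "Good"
hits_raw = int(titles.str.contains("good", regex=False).sum())

# After lowercasing, both spellings count as the same keyword
hits_lower = int(titles.str.lower().str.contains("good", regex=False).sum())
print(hits_raw, hits_lower)
```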

## 5. Scan authors with too many reviews

We count how many reviews each user wrote on a single day. Unusually high counts may point to fake reviews.

In [10]:
# Count the number of reviews each reviewer wrote on the same day, keeping the date in the result
# Group by author and date
grouped = NY_reviews.groupby(['author_id', 'date'])

# Count the number of reviews per author and date
review_counts = grouped.size().reset_index(name='review_count')

# Sort in descending order by review count
sorted_counts = review_counts.sort_values(by='review_count', ascending=False)

# Copy the author IDs, dates and per-day review counts into a new DataFrame (same content as sorted_counts)
sorted_counts_df = pd.DataFrame({'author_id':sorted_counts['author_id'], 'date':sorted_counts['date'], 'review_count':sorted_counts['review_count'] })

# Show the author/date pairs with more than 15 reviews in one day
display(sorted_counts[sorted_counts['review_count']>15])

Unnamed: 0,author_id,date,review_count
275205,UID_3874,2012-05-30,29
389689,UID_868,2012-01-10,24
124396,UID_17188,2016-05-20,22
35785,UID_11909,2015-11-03,18
248031,UID_29050,2012-01-20,17
246438,UID_28478,2011-12-22,16
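
Each lookup in this section applies the same filter to a different row of the sorted counts. A minimal sketch of a reusable helper is shown below. `reviews_for_rank` is a hypothetical function, not part of the original notebook, and the toy frames only imitate the column names used here. It reads both the author and the date from the ranked row, so neither needs to be hard-coded.

```python
import pandas as pd

def reviews_for_rank(reviews: pd.DataFrame, counts: pd.DataFrame, rank: int) -> pd.DataFrame:
    """Return the reviews behind the rank-th busiest (author, date) pair (0 = busiest)."""
    row = counts.iloc[rank]
    mask = (reviews["author_id"] == row["author_id"]) & (reviews["date"] == row["date"])
    return reviews[mask]

# Toy data with made-up author IDs and restaurant names
toy = pd.DataFrame({
    "author_id": ["UID_a", "UID_a", "UID_a", "UID_b", "UID_b", "UID_c"],
    "date": pd.to_datetime(["2012-05-30"] * 3 + ["2012-01-10"] * 2 + ["2016-05-20"]),
    "restaurant_name": ["R1", "R2", "R3", "R4", "R5", "R6"],
})

# Same counting steps as above: group, count, sort in descending order
toy_counts = (
    toy.groupby(["author_id", "date"]).size()
    .reset_index(name="review_count")
    .sort_values(by="review_count", ascending=False)
)

top = reviews_for_rank(toy, toy_counts, 0)
print(top["restaurant_name"].tolist())
```

With the real data, a call such as `reviews_for_rank(NY_reviews, sorted_counts, 0)` would be expected to select the same rows as the first lookup in this section, assuming the top row of `sorted_counts` belongs to that author.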


In [11]:
# This reviewer left 29 reviews for different restaurants on the same day
pd.set_option('display.max_colwidth', None)

# Select the date with the highest review count
top_date = sorted_counts.iloc[0]['date']

# Filter the original reviews data frame for the selected author and date
selected_reviews_3874 = NY_reviews[(NY_reviews['author_id'] == 'UID_3874') & (NY_reviews['date'] == top_date)]
selected_reviews_3874

Unnamed: 0,restaurant_name,rating_review,sample,title_review,review_preview,review_full,date,author_id
4107,Cafe_Mogador,4,Positive,great deal,"a moroccan place in the heart of east village. often confused with cafe orlin: the size and space and food is very similar… it’s great for people on a budget…got the yogurt tangine [served with pita]- so good we got two orders of it,...","a moroccan place in the heart of east village. often confused with cafe orlin: the size and space and food is very similar… it’s great for people on a budget… got the yogurt tangine [served with pita]- so good we got two orders of it, merguez sandwich, the chicken tangine [w/charmoulla] and halumi eggs [poached eggs with roasted tomato, halumi cheese, salad.] I would definitely order the yogurt and the sandwich again. they’re standard and delicious. the tangine was merely acceptable. not great at all. they’re best known for their halumi eggs, but it was wayyy overcooked this time around. they need to be consistent with their food quality. i’m never sure if it’ll be good or not. overall, i think the food experience at cafe orlin is better, though the service at both is really slow and you shouldn’t be in a hurry if you decide to eat here.",2012-05-30,UID_3874
14667,Hakata_Ippudo_NY,4,Positive,"good, but overrated.","the king of pork buns. but that is all. it isn’t worth getting a ramen with sodium overload. some people may love it, but i’ve had better. the ramen [i’ve sampled them all over the years, and my conclusion is that the ramen is too...","the king of pork buns. but that is all. it isn’t worth getting a ramen with sodium overload. some people may love it, but i’ve had better. the ramen [i’ve sampled them all over the years, and my conclusion is that the ramen is too greasy for me] the pork buns are the perfect blend of smooth, spicy, and flavorful for every bite. the best of the best. yum. too bad they don’t allow you to get them to go!",2012-05-30,UID_3874
21884,Little_Owl,3,Negative,don't expect too much,"although the little owl might get rave reviews, i find the food very average. the best thing about it is that the service is down to earth and there is the cutest red door.outside of that, the restaurant is small, the food is mediocre,...","although the little owl might get rave reviews, i find the food very average. the best thing about it is that the service is down to earth and there is the cutest red door. outside of that, the restaurant is small, the food is mediocre, and i have been multiple times and have never experienced why/how people think they have the best burgers in town. not somewhere i'd go out of the way for, but decent.",2012-05-30,UID_3874
41244,Russ_Daughters,5,Positive,best smoked salmon,"as much as i love sandwiches, i prefer eating my bagels open-faced: with all the cream cheese, smoked salmon, tomatoes, onions, chives, etc. then continuing onto the other side. i do NOT like eating them sandwich-style.i had a pumpernickle bagel, tofu scallion cream cheese,...","as much as i love sandwiches, i prefer eating my bagels open-faced: with all the cream cheese, smoked salmon, tomatoes, onions, chives, etc. then continuing onto the other side. i do NOT like eating them sandwich-style. i had a pumpernickle bagel, tofu scallion cream cheese, & scottish salmon loin. the scottish salmon loin cut is the filet mignon of the fish. i love it so much that i usually order a little extra so i can eat it on the side [in addition to what is already in my bagel]~yes, THAT good. the smoked fish here is great; do not opt for the seafood salad weak stuff. you gotta go real deal. the pumpernickle is great here, & the bagel is the right balance between chewy and soft. my only qualm is that i’m eating two thick pieces of the bagel in every bite. i can imagine it being even better if i was eating it open faced! and with a friend!! sitting down!!! anyway, this meal costs about $15 for 1 bagel sandwich [without the extra order of fish]. its a pretty price to pay, so i only have it when i have a craving.",2012-05-30,UID_3874
80817,Danji,4,Positive,fun korean legit!,a twist on korean food and it is actually very good. the ambience is fun and very fitting for hells kitchen. i'll definitely re-visit once i get to all the other restaurants i'm dying to try!,a twist on korean food and it is actually very good. the ambience is fun and very fitting for hells kitchen. i'll definitely re-visit once i get to all the other restaurants i'm dying to try!,2012-05-30,UID_3874
85812,Jean_Georges,4,Positive,fun lunch,"if you have a few spare hours.... lunch is always fun to try the seasonal dishes here.no matter what is had, it is delicious. there is a reason why it is visited over and over again. service is not perfect, but it is still...","if you have a few spare hours.... lunch is always fun to try the seasonal dishes here. no matter what is had, it is delicious. there is a reason why it is visited over and over again. service is not perfect, but it is still great.",2012-05-30,UID_3874
90869,Caracas_Arepa_Bar,4,Positive,chips!,"its your average nyc-sized joint. yes. its 20-25 people, long lines, and cramped space.what are arepas you ask? they’re little round flat little cornmeal disks. you can fill them with any goodies you like. sometimes the arepas are great; at other times the combinations...","its your average nyc-sized joint. yes. its 20-25 people, long lines, and cramped space. what are arepas you ask? they’re little round flat little cornmeal disks. you can fill them with any goodies you like. sometimes the arepas are great; at other times the combinations are not… i tried a couple of them, like la mulata [added chorizo] but the cheese gets cold and rubbery, which is never a good thing. i think i’ll go with the plates next time. and now… onto what IS fantastic! the guasacaca [guac] & chips! are a must get here… despite the weird name that can potentially be unappetizing, it is soooooo delicious! instead of using corn tortilla chips, they make plantain & taro chips…. which can only mean its amazing~ and of course, it does not fail to deliver. i can eat those chips every day…",2012-05-30,UID_3874
100819,The_Meatball_Shop,4,Positive,post alcohol,the meatball shop is amazing after a drink or seven.as long as there isn't a long line it is totally and completely worth trying!,the meatball shop is amazing after a drink or seven.as long as there isn't a long line it is totally and completely worth trying!,2012-05-30,UID_3874
105833,Balthazar,4,Positive,fries are a must!,balthazar’s fries….. so delicious. they are the best fries. consistently delicious!!!steak frites & sole. so flaky and delicious. smallest tables ever and perpetual lines make this a place i do not go out of my way for; its filled with turistos galore! but the...,balthazar’s fries….. so delicious. they are the best fries. consistently delicious!!! steak frites & sole. so flaky and delicious. smallest tables ever and perpetual lines make this a place i do not go out of my way for; its filled with turistos galore! but the best part is the balthazar to-go next door where u can pick up yummy iced teas and baked goodies. if i lived in soho i’d def def be a regular there.,2012-05-30,UID_3874
115167,Clinton_St_Baking_Company_Restaurant,5,Positive,breakfast all day,"its another one of those famed places for brunch that have 2.5 hour waits. like all things, it is definitely not worth the wait. but if you decide to come at an off hour, or early in the morning. it is delicious!the wild maine...","its another one of those famed places for brunch that have 2.5 hour waits. like all things, it is definitely not worth the wait. but if you decide to come at an off hour, or early in the morning. it is delicious! the wild maine blueberry pancakes are great, southern breakfast [sans grits] is amazing. who doesn’t love fried green tomatoes? eggs benedict, biscuits, mixed berry scone, and their famed muffins. what is not to love here?",2012-05-30,UID_3874


In [12]:
# This reviewer left 24 reviews for different restaurants on the same day

# Select the date with the second-highest review count
top_date = sorted_counts.iloc[1]['date']

# Filter the original reviews data frame for the selected author and date
selected_reviews_868 = NY_reviews[(NY_reviews['author_id'] == 'UID_868') & (NY_reviews['date'] == top_date)]
selected_reviews_868

Unnamed: 0,restaurant_name,rating_review,sample,title_review,review_preview,review_full,date,author_id
5728,Marea,4,Positive,Precious,"Very good and elegant, event dinner and extremely thought about menu but really just a little ""precious"". Very Euro. Very Importante","Very good and elegant, event dinner and extremely thought about menu but really just a little ""precious"". Very Euro. Very Importante",2012-01-10,UID_868
22496,Del_Posto,3,Negative,Disappointed,"Thought this would be the absolute best after hearing rave reviews BUT it just wasn't special. Everything was ok, formal, considerate service but not that memorable. Maybe I was in a bad mood. Bring all your credit cards.","Thought this would be the absolute best after hearing rave reviews BUT it just wasn't special. Everything was ok, formal, considerate service but not that memorable. Maybe I was in a bad mood. Bring all your credit cards.",2012-01-10,UID_868
33786,John_s_of_Bleecker_Street,5,Positive,New York Classic,"Is it still there? Are the wooden booths still filled with scribbles. Is the pizza still fresh, thin and perfecto? Veddy good.","Is it still there? Are the wooden booths still filled with scribbles. Is the pizza still fresh, thin and perfecto? Veddy good.",2012-01-10,UID_868
82785,BLT_Steak,5,Positive,Mmmmmmm,"Yum. Big juicy steaks, onion rings yum lotsa guys. comfortable , great staff, definitely a treat. I deserve it, don't you?","Yum. Big juicy steaks, onion rings yum lotsa guys. comfortable , great staff, definitely a treat. I deserve it, don't you?",2012-01-10,UID_868
85116,Jean_Georges,4,Positive,Ah yes,"Jean Georges is what it sets out to be - quietly elegant, gourmet food, great location, interesting clientelle. You might be seated next to a Baron, or a Donald or a hedge fund guy or a great pianist, who knows?","Jean Georges is what it sets out to be - quietly elegant, gourmet food, great location, interesting clientelle. You might be seated next to a Baron, or a Donald or a hedge fund guy or a great pianist, who knows?",2012-01-10,UID_868
103930,Balthazar,5,Positive,"Balthazar , C'est Si Bon","Friendly staff although cramped when you enter. Always fun, always good. A Parisian paradise in SOHO.","Friendly staff although cramped when you enter. Always fun, always good. A Parisian paradise in SOHO.",2012-01-10,UID_868
111495,Eataly,4,Positive,Square Market,"Fun in a market sort of way. A little hectic to the restaurants. Overall a casual, enjoyable New York experience, a delightful addition to the square.","Fun in a market sort of way. A little hectic to the restaurants. Overall a casual, enjoyable New York experience, a delightful addition to the square.",2012-01-10,UID_868
120109,Cafe_Boulud,4,Positive,consistent,quiet neighborhood bistro. excellent chef. well done. It will not hit you over the head but you will return for the feeling of a relaxed dinner at home ( with a wonderful chef of course),quiet neighborhood bistro. excellent chef. well done. It will not hit you over the head but you will return for the feeling of a relaxed dinner at home ( with a wonderful chef of course),2012-01-10,UID_868
123653,Gotham_Bar_Grill,5,Positive,A refined experience,low key excellence from decor to menu selection. You will be well taken care of and feel superb at the end. Been back many times and always a winner.,low key excellence from decor to menu selection. You will be well taken care of and feel superb at the end. Been back many times and always a winner.,2012-01-10,UID_868
144732,Eleven_Madison_Park,5,Positive,Top Choice,"Expensive- yes, crowded seating-maybe, GREAT dining experience, highly recommended for a wow and you will want to return.","Expensive- yes, crowded seating-maybe, GREAT dining experience, highly recommended for a wow and you will want to return.",2012-01-10,UID_868


In [13]:
# This reviewer left 22 reviews for different restaurants on the same day
# Select the date with the third-highest review count
top_date = sorted_counts.iloc[2]['date']

# Filter the original reviews data frame for the selected author and date
selected_reviews_17188 = NY_reviews[(NY_reviews['author_id'] == 'UID_17188') & (NY_reviews['date'] == top_date)]
selected_reviews_17188

Unnamed: 0,restaurant_name,rating_review,sample,title_review,review_preview,review_full,date,author_id
26497,Margon,5,Positive,Great food,Easy laid back vibe. Casual dining. Take out or eat in.Great tasting food with yummy shakes.Near time sq.,Easy laid back vibe. Casual dining. Take out or eat in.Great tasting food with yummy shakes.Near time sq.,2016-05-20,UID_17188
81833,Birdland,4,Positive,Great performances,Django reinhardt festival annually is a great event.Great performances here small venue food and drinks available.,Django reinhardt festival annually is a great event.Great performances here small venue food and drinks available.,2016-05-20,UID_17188
96139,Arturo_s,4,Positive,Good pizza,Been here for ages good village spot.Good brick oven pizza.Gets busy on the weekend. Don't think they take reservations,Been here for ages good village spot.Good brick oven pizza.Gets busy on the weekend. Don't think they take reservations,2016-05-20,UID_17188
111402,Artichoke_Basille_s_Pizza,5,Positive,Creamy pizza,"Great pizza, yummy cheesy, 1 slice is enough. This is not thin crust.heavy duty thick pizza. Lines can be long.","Great pizza, yummy cheesy, 1 slice is enough. This is not thin crust.heavy duty thick pizza. Lines can be long.",2016-05-20,UID_17188
137342,Pret_A_Manger,4,Positive,Decent,"I will say, how do they keep their bread crispy. The mini sandwiches are always fresh with a crispy baguette never spongy.","I will say, how do they keep their bread crispy. The mini sandwiches are always fresh with a crispy baguette never spongy.",2016-05-20,UID_17188
142892,Ear_Inn,3,Negative,Old school pub,dive bar with locals.Small place with history.Great for drinks with friends. not much else. Out of the area a bit,dive bar with locals.Small place with history.Great for drinks with friends. not much else. Out of the area a bit,2016-05-20,UID_17188
159019,Riverpark,3,Negative,Not bad,Came here for restaurant week was a decent pre fix meal on the east river but very noisy with the fdr right outside.,Came here for restaurant week was a decent pre fix meal on the east river but very noisy with the fdr right outside.,2016-05-20,UID_17188
177129,Strip_House,4,Positive,Great steak,Great steak house with a lounge feel. Dark and mysterious inside with comfy red velvet couches. Porterhouse is great also Filet Mignon.,Great steak house with a lounge feel. Dark and mysterious inside with comfy red velvet couches. Porterhouse is great also Filet Mignon.,2016-05-20,UID_17188
229125,Untitled_at_the_Whitney,4,Positive,Great service,"Stopped here for drinks, our waiter was the best.Suggested a really good drink had some conversation. Friendly staff good crowd.","Stopped here for drinks, our waiter was the best.Suggested a really good drink had some conversation. Friendly staff good crowd.",2016-05-20,UID_17188
247744,Bread_Butter,3,Negative,Many choices,Good salad bar with many choices. Quick abrupt service.Sorta want you in and out. Next Next Next is all you here but good for a fresh squeezed ginger shot when you under the weaher.,Good salad bar with many choices. Quick abrupt service.Sorta want you in and out. Next Next Next is all you here but good for a fresh squeezed ginger shot when you under the weaher.,2016-05-20,UID_17188


In [14]:
# This reviewer left 18 reviews for different restaurants on the same day

# Select the date with the fourth-highest review count
top_date = sorted_counts.iloc[3]['date']

# Filter the original reviews data frame for the selected author and date
selected_reviews_11909 = NY_reviews[(NY_reviews['author_id'] == 'UID_11909') & (NY_reviews['date'] == top_date)]
selected_reviews_11909

Unnamed: 0,restaurant_name,rating_review,sample,title_review,review_preview,review_full,date,author_id
13108,Marea,5,Positive,The best,"This was our second visit to Marea, and everything changed for good. Despite the boring and intrusive company, and the bad weather, everybody got their alerts in their smartphone about the terrible rain outside. It was the best. The red wine, ""Tyler"" made me happy....","This was our second visit to Marea, and everything changed for good. Despite the boring and intrusive company, and the bad weather, everybody got their alerts in their smartphone about the terrible rain outside. It was the best. The red wine, ""Tyler"" made me happy. It is a power scene lunch. High end and really fine. The service has the best vibe. We loved the five course chef tasting. The best Italian restaurant in New York with a Michelin star.",2015-11-03,UID_11909
41536,The_Polo_Bar,5,Positive,Sterling,"To get in The Polo Bar you have to tell your reservation name outside in the street to the staff, all dressed in Ralph Lauren. You walk into the bar, wait, then you take an elevator and go to the main salon, decorated by Ralph...","To get in The Polo Bar you have to tell your reservation name outside in the street to the staff, all dressed in Ralph Lauren. You walk into the bar, wait, then you take an elevator and go to the main salon, decorated by Ralph Lauren. The space is lavish. The waiters are very friendly, their treat is gracious. The menu is small but with good recommendations. Their steak is from the ranch of Mr. Lauren. One of the best steaks I've eaten. Deluxe. We loved the polo bar. A High level restaurant. Palatial, one of the best night outs. The chocolate soufflé was sterling.",2015-11-03,UID_11909
75394,Il_Gattopardo,3,Negative,Refined Italian,"Italian Nouvelle cuisine in a cold, hidden and desolate location. You can be in front and not find it, you have to walk down a rounded stairs. The food is satisfying and the quality valuable. Refined and a bit depressing.","Italian Nouvelle cuisine in a cold, hidden and desolate location. You can be in front and not find it, you have to walk down a rounded stairs. The food is satisfying and the quality valuable. Refined and a bit depressing.",2015-11-03,UID_11909
79215,5_Napkin_Burger_Hell_s_Kitchen,4,Positive,My burger was delicious.,"My burger was delicious. Always crowded and the tables very close. Post theater, they have sushi and red wine. The staff is cool. New Yorkers love this place. Delicious and yummy experience. It has various locations and they are growing. It is the modern concept...","My burger was delicious. Always crowded and the tables very close. Post theater, they have sushi and red wine. The staff is cool. New Yorkers love this place. Delicious and yummy experience. It has various locations and they are growing. It is the modern concept of a cafeteria. If I'm starving and walking in New York and find a 5NB I don't think it and walk inside.",2015-11-03,UID_11909
127043,Le_Parisien,4,Positive,Great bistro,"It is a small place, with the bonsoir monsieur attitude s'il vous plaît. You feel in the French Riviera. Informal. Went for dinner, reserved through Open Table. The best steak tartare, the best escargots, the red wine warm like a bistro. Good looking serveuse. The...","It is a small place, with the bonsoir monsieur attitude s'il vous plaît. You feel in the French Riviera. Informal. Went for dinner, reserved through Open Table. The best steak tartare, the best escargots, the red wine warm like a bistro. Good looking serveuse. The best french bistro recently.",2015-11-03,UID_11909
141473,ABC_Kitchen,1,Negative,A total failure.,"Oh deception, waiting so much for this? The service was rushed, th food was bad, you can eat better anywhere. A disaster. We were excited to try the popular and famous ABC Kitchen. We've been to ABC Cocina and it was way better. The plates...","Oh deception, waiting so much for this? The service was rushed, th food was bad, you can eat better anywhere. A disaster. We were excited to try the popular and famous ABC Kitchen. We've been to ABC Cocina and it was way better. The plates were tiny, the flavors and quality of the fish a blow. The crowd isn't the trendiest... The hosts neither. The only thing i liked were the black and white photographs in the walls. Very hard place to get a reservation and a total failure.",2015-11-03,UID_11909
151520,Aretsky_s_Patroon,4,Positive,Classy,"The hostess is welcoming. The space is nice, a classy dining room. The ambiance is corporative midtown, a power lunch. Everybody is dressed in suits. The shrimp cocktail is ok, I like the little Tabasco bottle in the ice plate. Impeccable service, terrace, a great...","The hostess is welcoming. The space is nice, a classy dining room. The ambiance is corporative midtown, a power lunch. Everybody is dressed in suits. The shrimp cocktail is ok, I like the little Tabasco bottle in the ice plate. Impeccable service, terrace, a great variety of dishes.",2015-11-03,UID_11909
159312,Riverpark,1,Negative,A mediocre experience.,"Hard to find, in a modern building that looks like a hospital. It is supposed to have a great view, but we were seated at the worst table with the worst view. Worst brunch ever, not yummy, tasteless. They say that Tom Coliccio's s never...","Hard to find, in a modern building that looks like a hospital. It is supposed to have a great view, but we were seated at the worst table with the worst view. Worst brunch ever, not yummy, tasteless. They say that Tom Coliccio's s never been there, so its just his name/brand? It looks so. It needs a visit from the real Coliccio and put some order. A mediocre experience.",2015-11-03,UID_11909
161864,Fogo_de_Chao_Brazilian_Steakhouse,4,Positive,Tasty,"The location is super, it is outside the MoMa, that is a big plus. The decor is beseeching and modern. It has various levels, huge and spacious. It emerged as an appetising place to eat different cuts of meat. It has an incredible salad bar....","The location is super, it is outside the MoMa, that is a big plus. The decor is beseeching and modern. It has various levels, huge and spacious. It emerged as an appetising place to eat different cuts of meat. It has an incredible salad bar. Their wine list is very good. It is worth it, and it is not expensive. You can eat all what you want. A noble joint.",2015-11-03,UID_11909
228584,Sparks_Steak_House,1,Negative,Stay Away,Unfavourable steakhouse. There are many better options in the neighbourhood. The service was mediocre. The steak was dry. Worst restaurant experience ever. Shun. Evade.,Unfavourable steakhouse. There are many better options in the neighbourhood. The service was mediocre. The steak was dry. Worst restaurant experience ever. Shun. Evade.,2015-11-03,UID_11909


In [15]:
# This reviewer left 17 reviews for different restaurants on the same day
# Select the date with the fifth-highest review count
top_date = sorted_counts.iloc[4]['date']

# Filter the original reviews data frame for the selected author and date
selected_reviews_29050 = NY_reviews[(NY_reviews['author_id'] == 'UID_29050') & (NY_reviews['date'] == top_date)]
selected_reviews_29050

Unnamed: 0,restaurant_name,rating_review,sample,title_review,review_preview,review_full,date,author_id
34456,Tony_s_Di_Napoli_Midtown,5,Positive,THE BEST OF THE BEST,I love this place with all of my heart. They will always fit you in...The eggplant parm is the best I have ever had... the waitstaff are so good at what they do... We wanted for NOTHING during our meal. I eat here about...,I love this place with all of my heart. They will always fit you in... The eggplant parm is the best I have ever had... the waitstaff are so good at what they do... We wanted for NOTHING during our meal. I eat here about six times a year and whenever I see a Broadway show... convenient.. they get you in and out quickly if need be... not to be MISSED!,2012-01-20,UID_29050
48880,Carmine_s_Italian_Restaurant_Times_Square,4,Positive,Good fun,Not as good as Tony DiNapoli's and if you want to pay more go to Carmine's. but I suspect you will seek out Tony's over Carmine's.,Not as good as Tony DiNapoli's and if you want to pay more go to Carmine's. but I suspect you will seek out Tony's over Carmine's.,2012-01-20,UID_29050
79816,The_Perfect_Pint,4,Positive,After Work Destination,"Fun atmosphere, good staff, plentiful bar... great margarita!","Fun atmosphere, good staff, plentiful bar... great margarita!",2012-01-20,UID_29050
102710,Aquavit,5,Positive,Excellent,Another diamond in the rough.... you will enjoy the food... go go and go soon.,Another diamond in the rough.... you will enjoy the food... go go and go soon.,2012-01-20,UID_29050
111644,Eataly,5,Positive,Go to EATALY,What a neat place this is... it is a marketplace for everything Italian.. especially the Olive Bread... I could not stop eating it...I didn't actually eat at any of the restaurants yet... but I hear nothing but good things... It's a great way to...,What a neat place this is... it is a marketplace for everything Italian.. especially the Olive Bread... I could not stop eating it... I didn't actually eat at any of the restaurants yet... but I hear nothing but good things... It's a great way to enjoy a couple hours shopping for fresh and hard to find ingredients...,2012-01-20,UID_29050
138376,Nobu,2,Negative,Skip it,Too expensive for what you get... they must have great marketing people because you would think this place was the bomb.. but it definitely is not.,Too expensive for what you get... they must have great marketing people because you would think this place was the bomb.. but it definitely is not.,2012-01-20,UID_29050
152376,21_Club,5,Positive,SUPER,Was taken here for lunch a couple of weeks ago. Had the Salmon... OMG! And the dessert was truly spectacular... This place has an interesting history and I love love love the original Frederic Remington artwork all around the place... I also like the renovation...,Was taken here for lunch a couple of weeks ago. Had the Salmon... OMG! And the dessert was truly spectacular... This place has an interesting history and I love love love the original Frederic Remington artwork all around the place... I also like the renovation they made to the front of the house... they added an additional bar and sitting area... well done...,2012-01-20,UID_29050
161709,Churrascaria_Plataforma,4,Positive,Vegetarians love this place too,I do not eat meat but when with some colleagues... there were so many options for me to eat... i could barely walk when I left the place... really neat concept...,I do not eat meat but when with some colleagues... there were so many options for me to eat... i could barely walk when I left the place... really neat concept...,2012-01-20,UID_29050
179751,Loeb_Boathouse_Central_Park,5,Positive,Beautiful Location,Waiters were excellent. Food was tasty...and plentiful. Include this restaurant in your trip into Central Park... It's closer to the east side....,Waiters were excellent. Food was tasty...and plentiful. Include this restaurant in your trip into Central Park... It's closer to the east side....,2012-01-20,UID_29050
187536,Nobu,2,Negative,SO EXPENSIVE,$25 for a large saki... give me a break... what is all the hoopla about? The place was not so nicely adorned that I should have to pay so much for so little... SKIP it and go to MONSTER SUSHI.,$25 for a large saki... give me a break... what is all the hoopla about? The place was not so nicely adorned that I should have to pay so much for so little... SKIP it and go to MONSTER SUSHI.,2012-01-20,UID_29050


In [16]:
# This reviewer left 16 reviews on different restaurants on the same day
# Select the date from the sixth row of sorted_counts (sixth-highest same-day review count)
top_date = sorted_counts.iloc[5]['date']

# Filter the original reviews data frame for the selected author and date
selected_reviews_28478 = NY_reviews[(NY_reviews['author_id'] == 'UID_28478') & (NY_reviews['date'] == top_date)]
selected_reviews_28478

Unnamed: 0,restaurant_name,rating_review,sample,title_review,review_preview,review_full,date,author_id
33749,John_s_of_Bleecker_Street,4,Positive,Ol Timer,"It is true what they say about John's. It is the best pizza in New York,","It is true what they say about John's. It is the best pizza in New York,",2011-12-22,UID_28478
50255,5_Napkin_Burger_Hell_s_Kitchen,3,Negative,Local Hangout,"The regulars outnumber the first timers. This is practically a club, a well fed club.","The regulars outnumber the first timers. This is practically a club, a well fed club.",2011-12-22,UID_28478
96316,Bubby_s,3,Negative,Local Favorite,This place is a Tribecca standby. Ordinary food in plain setting. But you'll grow to like it if you live nearby.,This place is a Tribecca standby. Ordinary food in plain setting. But you'll grow to like it if you live nearby.,2011-12-22,UID_28478
103767,Balthazar,4,Positive,Always Crowded,"I'd like this place better if the crowds were missing. It is a good place for oysters, for coffee, for looking around Soho.","I'd like this place better if the crowds were missing. It is a good place for oysters, for coffee, for looking around Soho.",2011-12-22,UID_28478
107078,Union_Square_Cafe,3,Negative,Let's Have Lunch,This place is always a treat. Sometimes better than others. You won't be sorry.,This place is always a treat. Sometimes better than others. You won't be sorry.,2011-12-22,UID_28478
111258,Eataly,2,Negative,Too Many People,"Chaos. If that is what you want with an Italian accent, then welcome.","Chaos. If that is what you want with an Italian accent, then welcome.",2011-12-22,UID_28478
174405,Katz_s_Deli,2,Negative,Want to Eat on a Crowded Subway?,"Hustle and bustle and slam bang, too. But the corned beef sandwich is almost worth it.","Hustle and bustle and slam bang, too. But the corned beef sandwich is almost worth it.",2011-12-22,UID_28478
178782,Loeb_Boathouse_Central_Park,2,Negative,Everything Depends on the Service,"Lovely venue. OK food. But you go for the occasion, and service is important.","Lovely venue. OK food. But you go for the occasion, and service is important.",2011-12-22,UID_28478
195140,Molyvos,3,Negative,Good Looking Greek,"Old reliabel Greek. Nothing special, though. Don't order the mousakka.","Old reliabel Greek. Nothing special, though. Don't order the mousakka.",2011-12-22,UID_28478
275219,Blue_Smoke_Flatiron,2,Negative,Like Barbeque?,"It is pleasant enough if you love this style of cooking. If not, grin and stay with the good drinks.","It is pleasant enough if you love this style of cooking. If not, grin and stay with the good drinks.",2011-12-22,UID_28478
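
The per-author cells above repeat the same filter with a hard-coded `author_id` and a row position in `sorted_counts`. A minimal sketch of a loop that takes both the author and the date from the same row is shown below. It assumes `sorted_counts` has `author_id` and `date` columns; the helper name `reviews_for_flagged` and the toy frames are hypothetical and stand in for the real data.

```python
import pandas as pd


def reviews_for_flagged(reviews: pd.DataFrame, flagged: pd.DataFrame) -> dict:
    """Return {(author_id, date): matching reviews} for each row of `flagged`.

    Hypothetical helper: assumes both frames have `author_id` and `date`
    columns, and that `date` has the same dtype in both.
    """
    selections = {}
    for row in flagged.itertuples(index=False):
        mask = (reviews["author_id"] == row.author_id) & (reviews["date"] == row.date)
        selections[(row.author_id, row.date)] = reviews.loc[mask]
    return selections


# Toy stand-ins for NY_reviews and sorted_counts (made-up values, illustration only)
toy_reviews = pd.DataFrame({
    "author_id": ["UID_A", "UID_A", "UID_B", "UID_B", "UID_B"],
    "date": ["2012-01-20", "2012-01-20", "2011-12-22", "2011-12-22", "2011-12-23"],
    "rating_review": [5, 4, 3, 2, 4],
})
toy_flagged = pd.DataFrame({
    "author_id": ["UID_A", "UID_B"],
    "date": ["2012-01-20", "2011-12-22"],
    "n_reviews": [2, 2],
})

selections = reviews_for_flagged(toy_reviews, toy_flagged)
for (author, date), frame in selections.items():
    print(author, date, len(frame))
```

Reading the author and the date from the same row avoids pairing one author's ID with another author's date when a cell is copied and edited.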


<!--  Comment on the users with > 15 reviews on one day -->
- For 3 of the 6 flagged users, the day with more than 15 reviews falls in January 2012.
- Author `UID_3874` posted 29 reviews on one day. The reviews appear genuine, so they are kept.
- Author `UID_868` posted 24 reviews on one day (2012-01-10). These reviews are shorter than the ones the author wrote later, but their content appears genuine. It could be that 2012-01-10 was the day the author created the account. The reviews are kept.
- Author `UID_17188` posted 22 reviews on one day. The reviews are short but descriptive and appear genuine, so they are kept.
- Author `UID_11909` posted 18 reviews on one day. The reviews appear genuine and show no anomaly, so they are kept.
- Author `UID_29050` posted 17 reviews on one day. The reviews appear genuine, so they are kept.
- Author `UID_28478` posted 16 reviews on one day. The reviews appear genuine, so they are kept.
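
The judgments above come from reading the reviews by hand. As a possible complement, a few simple statistics per author and day can be computed, as sketched below. This is not the procedure used in this notebook: the helper `summarize_author_day` and the toy data are hypothetical, and only the column names (`author_id`, `date`, `restaurant_name`, `rating_review`, `review_full`) are taken from the data set.

```python
import pandas as pd


def summarize_author_day(reviews: pd.DataFrame, author_id: str, date) -> pd.Series:
    """Summarize one author's reviews on one date (hypothetical helper)."""
    mask = (reviews["author_id"] == author_id) & (reviews["date"] == date)
    subset = reviews.loc[mask]
    return pd.Series({
        "n_reviews": len(subset),
        "n_restaurants": subset["restaurant_name"].nunique(),
        "mean_rating": subset["rating_review"].mean(),
        # Character length of the full review text
        "median_length": subset["review_full"].str.len().median(),
        # Texts repeated word for word within the same day
        "n_duplicate_texts": int(subset["review_full"].duplicated().sum()),
    })


# Toy stand-in for NY_reviews (made-up values, illustration only)
toy = pd.DataFrame({
    "author_id": ["UID_A", "UID_A", "UID_A", "UID_B"],
    "date": ["2012-01-20", "2012-01-20", "2012-01-20", "2012-01-21"],
    "restaurant_name": ["R1", "R2", "R2", "R3"],
    "rating_review": [5, 4, 2, 3],
    "review_full": ["Great food.", "Good fun", "Good fun", "Fine."],
})

summary = summarize_author_day(toy, "UID_A", "2012-01-20")
print(summary)
```

Many verbatim duplicate texts or a single repeated rating would be a reason to look closer. Varied ratings and texts across different restaurants are consistent with the decision to keep the reviews.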


Thus, we keep all reviews from these six authors and move forward.

## 6. Save the cleaned data set as a separate file

In [18]:
NY_reviews.to_csv('./data/New_York_reviews_cleaned.csv', sep=',')
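
After saving, it can be worth checking that the file reads back as expected. The sketch below does this on a toy frame written to a temporary directory, so it does not touch `./data/`. The toy values are made up. It also assumes `date` was converted to a datetime earlier in the notebook. If `date` is still a plain string column, `parse_dates` is not needed.

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for the cleaned NY_reviews frame (made-up values)
toy = pd.DataFrame(
    {
        "restaurant_name": ["R1", "R2"],
        "rating_review": [5, 2],
        "date": pd.to_datetime(["2012-01-20", "2011-12-22"]),
        "author_id": ["UID_A", "UID_B"],
    },
    index=[1, 2],
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "reviews_cleaned.csv")
    toy.to_csv(path)  # the index is written as the first, unnamed column
    # index_col=0 restores the index; parse_dates restores the datetime column
    reloaded = pd.read_csv(path, index_col=0, parse_dates=["date"])

print(reloaded.shape == toy.shape)
print(reloaded.dtypes)
```

Without `index_col=0`, the saved index comes back as an extra `Unnamed: 0` column, and without `parse_dates` the dates come back as strings.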