## Let's use Pandas and Plotly to start exploring the dataset

In [1]:
import pandas as pd
import plotly.express as px

df_train = pd.read_csv("../data/arm-english-train.csv")
df_train

,review_id,product_id,reviewer_id,stars,review_body,review_title,language,product_category
0,en_0964290,product_en_0740675,reviewer_en_0342986,1,Arrived broken. Manufacturer defect. Two of th...,I'll spend twice the amount of time boxing up ...,en,furniture
1,en_0690095,product_en_0440378,reviewer_en_0133349,1,the cabinet dot were all detached from backing...,Not use able,en,home_improvement
2,en_0311558,product_en_0399702,reviewer_en_0152034,1,I received my first order of this product and ...,The product is junk.,en,home
3,en_0044972,product_en_0444063,reviewer_en_0656967,1,This product is a piece of shit. Do not buy. D...,Fucking waste of money,en,wireless
4,en_0784379,product_en_0139353,reviewer_en_0757638,1,went through 3 in one day doesn't fit correct ...,bubble,en,pc
...,...,...,...,...,...,...,...,...
199995,en_0046316,product_en_0980158,reviewer_en_0629807,5,"Cute slippers, my MIL loved them.",Nice and fit as advertised,en,shoes
199996,en_0956024,product_en_0954574,reviewer_en_0459072,5,My 6 year old likes this and keeps him engaged...,good to keep the kids engaged,en,toy
199997,en_0589358,product_en_0402982,reviewer_en_0199163,5,Replaced my battery with it. Works like new.,This works,en,wireless
199998,en_0970602,product_en_0873374,reviewer_en_0590563,5,"I like them, holding up well.",Well made.,en,industrial_supplies
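
Before narrowing down to one category, it can help to see how reviews are spread across `product_category` and `stars`. The sketch below is only an illustration: `toy` is a hypothetical stand-in for `df_train` with made-up rows, and only the two column names come from the table above.

```python
import pandas as pd

# Hypothetical stand-in for df_train; only the two columns used here.
toy = pd.DataFrame({
    "product_category": ["camera", "camera", "home", "home", "home", "toy"],
    "stars": [1, 5, 1, 1, 3, 5],
})

# Rows: categories, columns: star ratings, cells: number of reviews.
counts = pd.crosstab(toy["product_category"], toy["stars"])
print(counts)

# Total reviews per category, largest first.
per_category = counts.sum(axis=1).sort_values(ascending=False)
print(per_category)
```

On the real data, the same two calls would presumably be applied to `df_train["product_category"]` and `df_train["stars"]`.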


In [2]:
camera_reviews = df_train.query("product_category == 'camera'")["review_body"]
camera_reviews

11        It doesn’t work after less than a month. The c...
16        Doesn’t work and tried contacting via email ba...
126       returned . it would not recognize internet . s...
171       Price is expensive and very hard to use and al...
205       Charges with no issue but will not record. Lig...
                                ...                        
199484    I spent a bit of time looking into the big bra...
199669                              Easy to use and sturdy.
199749    I got these for an old camcorder that will be ...
199752    The motion detection is great on this camera. ...
199823    I got these for my nephews and they love them....
Name: review_body, Length: 2139, dtype: object

In [3]:
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer

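# Bag-of-words embedding: one row per review, one column per vocabulary token
X_embedding = CountVectorizer().fit_transform(camera_reviews.to_list())

# KMeans clustering with an arbitrary choice of 10 clusters.
# No random_state is set, so cluster labels can differ between runs.
model = KMeans(n_clusters=10)
model.fit(X_embedding)

# Cluster label (0-9) assigned to each review
Y_clusters = model.predict(X_embedding)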
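
Since the number of clusters above is picked arbitrarily, one possible sanity check is to compare a few values of `k` with the silhouette score. This is a minimal sketch, not part of the original notebook: `toy_reviews` is made-up data, the range of `k` is chosen only to fit the toy size, and `random_state=0` and `n_init=10` are assumptions added so the run is repeatable.

```python
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import silhouette_score

# Hypothetical reviews standing in for camera_reviews.
toy_reviews = [
    "battery died after a week",
    "battery would not charge",
    "battery life is short",
    "tripod is sturdy and light",
    "sturdy tripod easy to fold",
    "tripod legs feel sturdy",
    "lens cap fits well",
    "lens cap came cracked",
]

# Same bag-of-words embedding as in the cell above.
X_toy = CountVectorizer().fit_transform(toy_reviews)

# Fit KMeans for each candidate k and record the silhouette score
# (ranges from -1 to 1; higher means tighter, better separated clusters).
scores = {}
for k in range(2, 5):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_toy)
    scores[k] = silhouette_score(X_toy, labels)

best_k = max(scores, key=scores.get)
print(scores)
print("best k on the toy data:", best_k)
```

On raw word counts the score tends to be dominated by review length, so it is best read as a rough guide rather than a definitive answer.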

In [4]:
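# Pair each review with its cluster label. reset_index(drop=True) replaces the
# original df_train row numbers with 0..n-1 so both Series align row by row.
df_clusters = pd.concat([
    camera_reviews.reset_index(drop=True),
    pd.Series(Y_clusters).rename("Cluster")
], axis=1)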
df_clusters

,review_body,Cluster
0,It doesn’t work after less than a month. The c...,8
1,Doesn’t work and tried contacting via email ba...,8
2,returned . it would not recognize internet . s...,8
3,Price is expensive and very hard to use and al...,8
4,Charges with no issue but will not record. Lig...,8
...,...,...
2134,I spent a bit of time looking into the big bra...,0
2135,Easy to use and sturdy.,8
2136,I got these for an old camcorder that will be ...,8
2137,The motion detection is great on this camera. ...,8
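
Most of the rows previewed above carry the label 8, which hints that the clusters may be very uneven in size. A quick way to check is to count reviews per cluster. The sketch below uses a hypothetical `toy_clusters` frame with made-up labels; only the `review_body` and `Cluster` column names are taken from `df_clusters`.

```python
import pandas as pd

# Hypothetical labels standing in for the Cluster column of df_clusters.
toy_clusters = pd.DataFrame({
    "review_body": ["a", "b", "c", "d", "e", "f"],
    "Cluster": [8, 8, 8, 0, 8, 1],
})

# Number of reviews in each cluster, largest cluster first.
sizes = toy_clusters["Cluster"].value_counts()
print(sizes)

# The same counts as a fraction of all reviews.
shares = toy_clusters["Cluster"].value_counts(normalize=True)
print(shares.round(2))
```

If one cluster holds most of the reviews, the remaining labels describe only a small slice of the data, which is worth keeping in mind when reading the samples.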


In [6]:
# Show up to 200 characters per cell so more of each review is visible
pd.set_option('display.max_colwidth', 200)

df_clusters.query("Cluster == 0")[:20]

,review_body,Cluster
13,"I can't imagine how anybody would give this thing a 5 star review, it was terrible. Even at the lowest resolution setting the refresh rate of the video is about every three seconds. It's very fidd...",0
27,The quality is low. It falls down easily and there's no way to fix the issue. Assembling and dissembling is all hassle. First time I used the bulb base stuck and the bulb itself came out. So I hav...,0
38,"THIS THING IS DANGEROUS! So, I had mine for 5 months, plugged in full time. Today, the video went blurry. I checked EVERYTHING (and I'm an electronic wiz) but found no issues. I finally resorted t...",0
68,I purchased this bag to take on a 1 week vacation. I liked the slim design and the sling for easy access to my camera while still wearing the bag. Unfortunately I had to return it because the fabr...,0
72,I hate leaving bad reviews but this one is deserved! I decided to give these a try and order a dozen of these for my d750. I shoot professionally and these have stopped working mid shoot and mid w...,0
76,"Cameras were clear unless is was dark good pictures for the most part but had to keep my phone on the screen to record, couldn’t get the motion detection to record automatically at all called cust...",0
81,Tripod was serviceable while it worked but after a month the plastic on the phone holder started to break apart from VERY light use. So I have the tripod but I can't use it with my phone because t...,0
96,"Reluctantly, I bought this as a low-cost option to take macro pictures with my D5600. The first thing I noticed was how difficult it was to twist onto the camera body. It was almost as if the prod...",0
111,"I was very happy when I recieved the item & so excited about it. But after one day studying of the camera, i was so disappointed because the Mode (top of the camera in the right side) is defective...",0
121,"If you are looking to shoot a selfie off the top of your car, or something relatively flat, this works like a charm, and is light enough to fit into your backpack. However, if you are looking to u...",0


In [7]:
df_clusters.query("Cluster == 1")[:20]

,review_body,Cluster
5,Definitely buy the camera. Just the camera. Then get accessories that are worthy of the camera. Most of the accessories in this package are cheaply made and embarrassing to have been included the ...,1
29,The shell is flimsy and brittle with flaps that don't sit flat and a shape that doesn't sit square. The design relies on single snap-button connections on each side and pop-out hangers for the fab...,1
32,"the picture clearly shows a labeled item and what i received appears to be a cheap knockoff. the stand will barely ""stand"" by itself much less with a camera on it. i have another one that is clear...",1
33,The picture I took from this camera is worse than my Iphone 4. It's so blurred. I tried to adjust to the highest resolution and with the macro len but still worse than Iphone 4. I can't even read ...,1
34,Very disappointed in this product. First off the setup is that of nightmares and only works about half of the time. Also the camera never stays online and constantly disconnects from the Wi-Fi. I ...,1
35,The screen protector kit I received was in total disarray. The screen protectors were out of their protective shields with no adhesive to attach them to the camera. The wiping clothes when opened ...,1
43,"Didn’t workout, the suction cups with the Arlo Pro cameras couldn’t be used as they have the screw in the back and I wanted to mounted in the inside window facing out onto the street, also not lon...",1
47,"Firstly, I do like the backdrop. But in the product overview, it says that there are 3 velcro mounting loops: One at the top (i.e. portrait) and two along one side (i.e. landscape). I only have th...",1
60,"The product was easy too install but within 2 months the iOS app stopped working. It opens and then crashes right away. Completely useless. There are plenty of cheap choices, try a different one b...",1
62,I didn’t receive any of the party box only the camera Spent 25 min on the phone What a waste of time amazon !,1
