In [2]:
import json 
import numpy as np
import pandas as pd
import seaborn as sns

In [3]:
with open('Cell_Phones_and_Accessories_5.json', 'r') as file:
    data = [json.loads(line) for line in file]

# json_normalize already returns a DataFrame, so no extra wrapping is needed
df = pd.json_normalize(data)
df.head()

Unnamed: 0,reviewerID,asin,reviewerName,helpful,reviewText,overall,summary,unixReviewTime,reviewTime
0,A30TL5EWN6DFXT,120401325X,christina,"[0, 0]",They look good and stick good! I just don't li...,4.0,Looks Good,1400630400,"05 21, 2014"
1,ASY55RVNIL0UD,120401325X,emily l.,"[0, 0]",These stickers work like the review says they ...,5.0,Really great product.,1389657600,"01 14, 2014"
2,A2TMXE2AFO7ONB,120401325X,Erica,"[0, 0]",These are awesome and make my phone look so st...,5.0,LOVE LOVE LOVE,1403740800,"06 26, 2014"
3,AWJ0WZQYMYFQ4,120401325X,JM,"[4, 4]",Item arrived in great time and was in perfect ...,4.0,Cute!,1382313600,"10 21, 2013"
4,ATX7CZYFXI1KW,120401325X,patrice m rogoza,"[2, 3]","awesome! stays on, and looks great. can be use...",5.0,leopard home button sticker for iphone 4s,1359849600,"02 3, 2013"


In [4]:
print(f"{df.shape[0]} reviews")
print(f"{df['asin'].nunique()} unique products")
print(f"{df['reviewerID'].nunique()} unique reviewers")

194439 reviews
10429 unique products
27879 unique reviewers


In [5]:
print(df["asin"].value_counts())
print(df["reviewerID"].value_counts())

asin
B005SUHPO6    837
B0042FV2SI    694
B008OHNZI0    657
B009RXU59C    636
B000S5Q9CA    628
             ... 
B00LH1QQH2      5
B000056PYW      5
B00001W0ET      5
9983798883      5
B00KJ0QV9K      5
Name: count, Length: 10429, dtype: int64
reviewerID
A2NYK9KWFMJV4Y    152
A1EVV74UQYVKRY    138
A22CW0ZHY3NJH8    138
A1ODOGXEYECQQ8    134
A2NOW4U7W3F7RI    132
                 ... 
A1ORQPX3LA1WV5      5
A2U5NF3IH4YVKH      5
A2044GMMSXXXEF      5
AP5NLK40Y9AG6       5
A4N8QJCW9EY1A       5
Name: count, Length: 27879, dtype: int64


The smallest count in both listings is 5, so the dataset keeps only reviewers who have written at least 5 reviews and products that have received at least 5 reviews. The cell below shows all the reviews submitted by user "A4N8QJCW9EY1A".

In [6]:
print(df.loc[df["reviewerID"] == "A4N8QJCW9EY1A", "reviewText"])

186013    I don't use apple products, so when someone wi...
190481    Adoption of the USB standard for chargers has ...
193880    I use my laptop or desktop to charge most devi...
194066    My truck already has three 12 volt sockets (on...
194099    After using this for about a week, I can't thi...
Name: reviewText, dtype: object
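
The 5-review minimum noted above can also be checked directly instead of being read off the tail of value_counts(). The following is a minimal sketch on a small made-up list of (reviewerID, asin) pairs, not the real dataset, and min_group_size is a hypothetical helper. On the actual DataFrame, df["asin"].value_counts().min() and df["reviewerID"].value_counts().min() should give the same information.

```python
from collections import Counter

def min_group_size(keys):
    """Return the size of the smallest group when counting repeated keys."""
    counts = Counter(keys)
    return min(counts.values())

# Toy (reviewerID, asin) pairs, invented for illustration only
toy_reviews = [
    ("u1", "p1"), ("u1", "p2"), ("u1", "p1"),
    ("u2", "p1"), ("u2", "p2"),
]

# Smallest number of reviews written by any reviewer / received by any product
min_per_reviewer = min_group_size(r for r, _ in toy_reviews)
min_per_product = min_group_size(p for _, p in toy_reviews)
print(min_per_reviewer, min_per_product)
```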


My next goal is to take 100 reviews of a single product and identify the aspects that reviewers mention most often. I will use product "B005SUHPO6", since it is the most-reviewed product in the dataset.

In [7]:
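# Take the first 100 reviews of the most-reviewed product
df_filtered = df[df["asin"] == "B005SUHPO6"][:100].reset_index(drop=True)
df_filtered_reviews = df_filtered["reviewText"]
# Print the first ten reviews to get a feel for their content
for review in df_filtered_reviews.head(10):
    print(review)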

excellent product at 1/2 the price as sale at electronic store, wow fit perfect on my iphone
Sometimes the flap over the charging place is hard to stay locked in, I have to keep trying and trying to lock it in there, it drives me crazy!!!!  I love the  colors that I bought, the blue one I have not used yet, maybe next year.  I like a change once in awhile......other than the locking in flap, I am happy with them.
Great case.  Fits like every other Otterbox Defender case I have own.  It does a great job of protecting your phone from drops onto the ground.  I feel that the Defender case is a bit bulky sometimes, but the holster is a major plus for my dad.
Use these for our technicians and anyone that is hard on a phone.  In a business environment you need a touch case to avoid costly repairs.
It's very strong and protects my 4S phone! I think this was a great value! I will buy another one in a newer color!
you know what. It has three layers, and for what? It does protect your phone again

Reading through these reviews, ten topics seem to come up frequently: ["price", "protection", "fit", "aesthetics", "durability", "weight", "ease of use", "quality", "repurchase intent", "bulkyness"]. The next step is to run zero-shot classification to find the main topic of each review.

In [23]:
from transformers import pipeline

# Candidate topics that each review is scored against
aspects = ["price", "protection", "fit", "aesthetics", "durability", "weight", "ease of use", "quality", "repurchase intent", "bulkyness"]
review_list = df_filtered_reviews.tolist()

# NLI-based zero-shot classifier; the task is named explicitly for readability
pipe = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
score_table = pipe(review_list, candidate_labels=aspects)

Device set to use cuda:0
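
Some background on what the scores mean: facebook/bart-large-mnli is a natural language inference model. The Hugging Face zero-shot pipeline turns each candidate label into a hypothesis (by default "This example is {label}.") and, in its default single-label mode, applies a softmax over the entailment logits of all labels, so the scores for one review sum to 1. The sketch below reproduces only that normalisation step. The softmax helper and the logit values are made up for illustration and are not taken from the pipeline.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical entailment logits for three candidate labels (made-up numbers)
labels = ["protection", "price", "fit"]
entailment_logits = [2.0, 0.5, 1.0]

scores = dict(zip(labels, softmax(entailment_logits)))
best_label = max(scores, key=scores.get)
print(scores, best_label)
```

Because the labels compete for the same probability mass, a review that touches several aspects splits its score among them. Passing multi_label=True to the pipeline scores each label independently instead.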


In [24]:
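# One row per review, with list-valued "labels" and "scores" columns
score_table = pd.DataFrame(score_table)
# Unpack the lists so that each (review, label, score) triple gets its own row
score_table = score_table.explode(["labels", "scores"])
# Reshape to one row per review text and one column per label
score_table = score_table.pivot_table(index="sequence",
                                      columns="labels",
                                      values="scores",
                                      aggfunc="mean")
score_table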

labels,aesthetics,bulkyness,durability,ease of use,fit,price,protection,quality,repurchase intent,weight
sequence,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
"A great phone case wow I dropped my phone so many times and still no damage. However, do not throw your phone with force as this product, though great, will not save your phone from your own stupidity.",0.002801,0.003635,0.190034,0.017657,0.136014,0.007503,0.16717,0.453536,0.01383,0.00782
A must have to protect your iphone. I will say that it adds a little weight to the phone as expected but the protection far outweighs the inconvenience.,0.001791,0.238908,0.016436,0.003797,0.071566,0.00748,0.343809,0.122131,0.005291,0.188792
"After two years, I needed to replace my Otterbox Defender case because the silicone rubber became loose and was ripping where charger plugs in. I decided to replace the entire case instead of just buying an aftermarket silicone rubber piece. The case is very sturdy and my iPhone is protected from damage. I really like the belt clip/holster that comes with the Defender model. I will continue to use Otterbox cases in the future.",0.008107,0.014931,0.166029,0.01994,0.13756,0.013406,0.269721,0.303464,0.042167,0.024676
Bulky and annoying but does what it's supposed to. Keeps the phone protected. Bulkiness got in the way so I stopped using it. Seller got it to me on time though.,0.002496,0.629344,0.02071,0.004351,0.049713,0.004883,0.207284,0.032134,0.006761,0.042324
"Did a nice job. Makes my son's I-phone secure, have no complaints about it. The colors make it cool to look at and easy to find. Don&#8217;t have to worry about her dropping it. Well not as much anyway.",0.061514,0.000933,0.036877,0.0706,0.231695,0.003503,0.130643,0.456038,0.002669,0.005527
...,...,...,...,...,...,...,...,...,...,...
purchased as a gift loved it and a great buy to protect your iphone def recommend this item to anyone,0.001495,0.001253,0.028826,0.014357,0.213083,0.046399,0.280193,0.401557,0.009769,0.003069
this case offers great protection for your iphone. as long as you don't mind the bulkiness of the case (which you get used) it's worth the money. you can literally through your phone against the wall without worry (not that i'm telling you to do it but you could if you wanted to).,0.002047,0.15348,0.060528,0.00412,0.141291,0.057199,0.311474,0.209551,0.003598,0.056712
"this is a great case, fits the phone nicely, and protects it from damage. I like the protectant for the charger. Im told by others that after a while it will stretch out and be hard to close (the charger part) but the people I know who have otterbox covers just cut it off when that happened. this case is worth the cost.",0.007978,0.007504,0.030605,0.032251,0.304783,0.058049,0.180706,0.366108,0.003489,0.008526
very good protection but is quite bulky recommend to anyone who doesnt mind the bulk and drops their phone often enough,0.001601,0.31436,0.054946,0.001597,0.051783,0.002498,0.200599,0.219288,0.004422,0.148908
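
The pipeline returns one dictionary per review with the keys sequence, labels, and scores, where the labels are sorted by descending score, so their order differs from review to review. The reshaping above lines them up into one column per label. Here is a plain-Python sketch of the same idea, using two made-up results (the texts and scores are invented, and to_wide is a hypothetical helper):

```python
# Mock results in the shape returned by the zero-shot pipeline (values invented)
mock_results = [
    {"sequence": "Great case, a bit bulky.",
     "labels": ["bulkyness", "quality", "price"],
     "scores": [0.6, 0.3, 0.1]},
    {"sequence": "Cheap and fits well.",
     "labels": ["price", "quality", "bulkyness"],
     "scores": [0.7, 0.2, 0.1]},
]

def to_wide(results):
    """Map each review text to a {label: score} dict with labels in a fixed order."""
    wide = {}
    for item in results:
        pairs = dict(zip(item["labels"], item["scores"]))
        # Sort the labels so every row uses the same column order
        wide[item["sequence"]] = {label: pairs[label] for label in sorted(pairs)}
    return wide

wide_scores = to_wide(mock_results)
for text, row in wide_scores.items():
    print(text, row)
```

One caveat of keying on the review text, as pivot_table(index="sequence") does: reviews with identical text collapse into a single row, and the rows come out sorted by text rather than in their original order.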


In [None]:
for col in score_table.columns:
    score_table[col] = pd.to_numeric(score_table[col], errors="coerce")
score_table["main topic"] = score_table.idxmax(axis=1, numeric_only=True)


<class 'pandas.core.frame.DataFrame'>
Index: 100 entries, A great phone case wow I dropped my phone so many times and still no damage.  However, do not throw your phone with force as this product, though great, will not save your phone from your own stupidity. to you know what. It has three layers, and for what? It does protect your phone against falls (that's why I gave it 2 stars instead of 1) but that's the best that can be said about it.  The silicone gasket that wraps around the phone never stays in place, as well as the port covers. This product lets in a lot of dust and then traps it.  Look for another product to protect your iPhone.
Data columns (total 11 columns):
 #   Column             Non-Null Count  Dtype  
---  ------             --------------  -----  
 0   aesthetics         100 non-null    float64
 1   bulkyness          100 non-null    float64
 2   durability         100 non-null    float64
 3   ease of use        100 non-null    float64
 4   fit                100 no

labels,aesthetics,bulkyness,durability,ease of use,fit,price,protection,quality,repurchase intent,weight,main topic
sequence,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1,Unnamed: 11_level_1
"A great phone case wow I dropped my phone so many times and still no damage. However, do not throw your phone with force as this product, though great, will not save your phone from your own stupidity.",0.002801,0.003635,0.190034,0.017657,0.136014,0.007503,0.167170,0.453536,0.013830,0.007820,quality
A must have to protect your iphone. I will say that it adds a little weight to the phone as expected but the protection far outweighs the inconvenience.,0.001791,0.238908,0.016436,0.003797,0.071566,0.007480,0.343809,0.122131,0.005291,0.188792,protection
"After two years, I needed to replace my Otterbox Defender case because the silicone rubber became loose and was ripping where charger plugs in. I decided to replace the entire case instead of just buying an aftermarket silicone rubber piece. The case is very sturdy and my iPhone is protected from damage. I really like the belt clip/holster that comes with the Defender model. I will continue to use Otterbox cases in the future.",0.008107,0.014931,0.166029,0.019940,0.137560,0.013406,0.269721,0.303464,0.042167,0.024676,quality
Bulky and annoying but does what it's supposed to. Keeps the phone protected. Bulkiness got in the way so I stopped using it. Seller got it to me on time though.,0.002496,0.629344,0.020710,0.004351,0.049713,0.004883,0.207284,0.032134,0.006761,0.042324,bulkyness
"Did a nice job. Makes my son's I-phone secure, have no complaints about it. The colors make it cool to look at and easy to find. Don&#8217;t have to worry about her dropping it. Well not as much anyway.",0.061514,0.000933,0.036877,0.070600,0.231695,0.003503,0.130643,0.456038,0.002669,0.005527,quality
...,...,...,...,...,...,...,...,...,...,...,...
purchased as a gift loved it and a great buy to protect your iphone def recommend this item to anyone,0.001495,0.001253,0.028826,0.014357,0.213083,0.046399,0.280193,0.401557,0.009769,0.003069,quality
this case offers great protection for your iphone. as long as you don't mind the bulkiness of the case (which you get used) it's worth the money. you can literally through your phone against the wall without worry (not that i'm telling you to do it but you could if you wanted to).,0.002047,0.153480,0.060528,0.004120,0.141291,0.057199,0.311474,0.209551,0.003598,0.056712,protection
"this is a great case, fits the phone nicely, and protects it from damage. I like the protectant for the charger. Im told by others that after a while it will stretch out and be hard to close (the charger part) but the people I know who have otterbox covers just cut it off when that happened. this case is worth the cost.",0.007978,0.007504,0.030605,0.032251,0.304783,0.058049,0.180706,0.366108,0.003489,0.008526,quality
very good protection but is quite bulky recommend to anyone who doesnt mind the bulk and drops their phone often enough,0.001601,0.314360,0.054946,0.001597,0.051783,0.002498,0.200599,0.219288,0.004422,0.148908,bulkyness
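
With a main topic assigned to each review, a natural next step is to count how often each topic comes out on top. A minimal sketch on made-up score rows follows; main_topic is a hypothetical helper, and on the real table score_table["main topic"].value_counts() would give the equivalent tally.

```python
from collections import Counter

def main_topic(row):
    """Return the label with the highest score in a {label: score} dict."""
    return max(row, key=row.get)

# Made-up score rows, one per review
rows = [
    {"quality": 0.45, "protection": 0.17, "bulkyness": 0.01},
    {"quality": 0.12, "protection": 0.34, "bulkyness": 0.24},
    {"quality": 0.03, "protection": 0.21, "bulkyness": 0.63},
    {"quality": 0.40, "protection": 0.28, "bulkyness": 0.01},
]

# Tally how many reviews have each label as their top-scoring topic
topic_counts = Counter(main_topic(row) for row in rows)
print(topic_counts.most_common())
```

The top label alone hides close calls: in the second row, "bulkyness" is not far behind "protection", so it may be worth looking at the runner-up score as well.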
