<a href="https://colab.research.google.com/github/Manishkatel/Ecommerce_Product_Recommendation_Using_Reviews/blob/main/Ecommerce_Product_Recommendation_Ratings_from_Reviews.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Predicting E-Commerce Product Recommendation Ratings from Reviews**

This is a classic NLP problem using data from an e-commerce store that sells women's clothing. Each record in the dataset is a customer review consisting of a review title, a text description and a rating (from 1 to 5) for a product, among other features.

We convert this into a binary classification problem: a customer recommends a product (label 1) if the rating is > 3; otherwise they do not recommend the product (label 0).

# Main Objective: To leverage the review text attributes to predict the recommendation rating (classification)

# **Loading up basic dependencies**

In [None]:
import numpy as np
import pandas as pd

from sklearn.metrics import confusion_matrix, classification_report

# **Load and view the dataset**

The dataset used is a Kaggle dataset: https://www.kaggle.com/nicapotato/womens-ecommerce-clothing-reviews


In [None]:
df = pd.read_csv('https://raw.githubusercontent.com/dipanjanS/feature_engineering_session_dhs18/master/ecommerce_product_ratings_prediction/Womens%20Clothing%20E-Commerce%20Reviews.csv', keep_default_na=False)
df.head()

Unnamed: 0.1,Unnamed: 0,Clothing ID,Age,Title,Review Text,Rating,Recommended IND,Positive Feedback Count,Division Name,Department Name,Class Name
0,0,767,33,,Absolutely wonderful - silky and sexy and comf...,4,1,0,Initmates,Intimate,Intimates
1,1,1080,34,,Love this dress! it's sooo pretty. i happene...,5,1,4,General,Dresses,Dresses
2,2,1077,60,Some major design flaws,I had such high hopes for this dress and reall...,3,0,0,General,Dresses,Dresses
3,3,1049,50,My favorite buy!,"I love, love, love this jumpsuit. it's fun, fl...",5,1,0,General Petite,Bottoms,Pants
4,4,847,47,Flattering shirt,This shirt is very flattering to all due to th...,5,1,6,General,Tops,Blouses


# **Basic Data Processing**

In [None]:
df['Review'] = (df['Title'].map(str) +' '+ df['Review Text']).apply(lambda row: row.strip())
df['Rating'] = [1 if rating > 3 else 0 for rating in df['Rating']]
df = df[['Review', 'Rating']]
df.head()

Unnamed: 0,Review,Rating
0,Absolutely wonderful - silky and sexy and comf...,1
1,Love this dress! it's sooo pretty. i happene...,1
2,Some major design flaws I had such high hopes ...,0
3,"My favorite buy! I love, love, love this jumps...",1
4,Flattering shirt This shirt is very flattering...,1


# **Removing all records with no review text**

In [None]:
df = df[df['Review'] != '']
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 22642 entries, 0 to 23485
Data columns (total 2 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   Review  22642 non-null  object
 1   Rating  22642 non-null  int64 
dtypes: int64(1), object(1)
memory usage: 530.7+ KB


# There is some imbalance in the data based on product ratings

In [None]:
df['Rating'].value_counts()

Rating,count
1,17449
0,5193
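
As a rough sanity check, the class counts above imply a majority-class baseline: a model that always predicts label 1 would already be right on roughly 77% of the reviews. A minimal sketch, using only the two counts from the output above (the variable names are illustrative, not part of the original notebook):

```python
# Class counts taken from the value_counts() output above
counts = {1: 17449, 0: 5193}

total = sum(counts.values())
majority_label = max(counts, key=counts.get)

# Accuracy of a trivial model that always predicts the majority class
baseline_accuracy = counts[majority_label] / total

print(f"majority label: {majority_label}")
print(f"baseline accuracy: {baseline_accuracy:.3f}")
```

Any useful model should beat this number, which is one reason accuracy on its own is a weak yardstick for this dataset.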


# **Build train and test datasets**

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(df[['Review']], df['Rating'], random_state=42)
X_train.shape, X_test.shape

((16981, 1), (5661, 1))

In [None]:
from collections import Counter
Counter(y_train), Counter(y_test)

(Counter({1: 13059, 0: 3922}), Counter({1: 4390, 0: 1271}))

# **Experiment 1: Basic NLP Count based Features**

A number of basic text-based features can also be created, which are sometimes helpful for improving text classification models. Some examples are:

*    Word Count: total number of words in the documents
*    Character Count: total number of characters in the documents
*    Average Word Density: average length of the words used in the documents
*    Punctuation Count: total number of punctuation marks in the documents
*    Upper Case Count: total number of upper case words in the documents
*    Title Word Count: total number of proper case (title) words in the documents

In [None]:
import string

X_train['char_count'] = X_train['Review'].apply(len)
X_train['word_count'] = X_train['Review'].apply(lambda x: len(x.split()))
X_train['word_density'] = X_train['char_count'] / (X_train['word_count']+1)
X_train['punctuation_count'] = X_train['Review'].apply(lambda x: len("".join(_ for _ in x if _ in string.punctuation)))
X_train['title_word_count'] = X_train['Review'].apply(lambda x: len([wrd for wrd in x.split() if wrd.istitle()]))
X_train['upper_case_word_count'] = X_train['Review'].apply(lambda x: len([wrd for wrd in x.split() if wrd.isupper()]))


X_test['char_count'] = X_test['Review'].apply(len)
X_test['word_count'] = X_test['Review'].apply(lambda x: len(x.split()))
X_test['word_density'] = X_test['char_count'] / (X_test['word_count']+1)
X_test['punctuation_count'] = X_test['Review'].apply(lambda x: len("".join(_ for _ in x if _ in string.punctuation)))
X_test['title_word_count'] = X_test['Review'].apply(lambda x: len([wrd for wrd in x.split() if wrd.istitle()]))
X_test['upper_case_word_count'] = X_test['Review'].apply(lambda x: len([wrd for wrd in x.split() if wrd.isupper()]))
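
The same features are computed twice above, once for the training set and once for the test set. One way to avoid that duplication is to wrap the logic in a single function that works on one review string. The sketch below is an illustrative refactoring, not the notebook's own code, and `count_features` is a hypothetical helper name:

```python
import string

def count_features(text):
    """Return the basic count-based features for a single review string."""
    words = text.split()
    char_count = len(text)
    word_count = len(words)
    return {
        'char_count': char_count,
        'word_count': word_count,
        # +1 in the denominator mirrors the cell above and avoids division by zero
        'word_density': char_count / (word_count + 1),
        'punctuation_count': sum(ch in string.punctuation for ch in text),
        'title_word_count': sum(w.istitle() for w in words),
        'upper_case_word_count': sum(w.isupper() for w in words),
    }

print(count_features('Love this DRESS!'))
```

With pandas, the dictionaries could then be expanded into columns with something like `pd.DataFrame(X_train['Review'].map(count_features).tolist(), index=X_train.index)`, applied identically to `X_test`.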

In [None]:

X_train.head()

Unnamed: 0,Review,char_count,word_count,word_density,punctuation_count,title_word_count,upper_case_word_count
12896,Soooo soft! This is a delightfully soft and fl...,268,52,5.056604,8,2,0
13183,"Had my eye on this, but dind't get I finally v...",399,84,4.694118,20,2,1
1496,I wanted to like this... I wanted to like this...,525,104,5.0,19,2,2
5205,Beautiful blouse Bought this for my daughter i...,203,35,5.638889,10,2,0
13366,"Boxy. large. Boxy, unflattering, and large.\n\...",295,51,5.673077,22,2,0


# **Training a Logistic Regression Model**

A logistic regression model is easy to train and interpret, and it works well on a wide variety of NLP problems.

In [None]:
from sklearn.linear_model import LogisticRegression

lr = LogisticRegression(C=1, random_state=42, solver='liblinear')

# **Model Evaluation Metrics**

Accuracy alone is not enough on datasets with a rare class problem.

Precision: The positive predictive power of a model. Out of all the predictions made by a model for a class, how many are actually correct.

Recall: The coverage or hit-rate of a model. Out of all the test data samples belonging to a class, how many the model was able to predict (hit or cover) correctly.

F1-score: The harmonic mean of the precision and recall.
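
To make these definitions concrete, here is a minimal sketch that computes all three metrics for one class from raw counts. The counts are made up for illustration, and returning 0.0 when a denominator is zero is an assumption of this sketch (a convention, not a mathematical necessity):

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 for one class from raw counts.

    tp: samples of the class predicted correctly
    fp: samples predicted as the class that belong to another class
    fn: samples of the class that the model missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Hypothetical counts for one class: 8 correct hits, 2 false alarms, 4 misses
p, r, f1 = precision_recall_f1(tp=8, fp=2, fn=4)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, it stays low whenever either precision or recall is low, which makes it a stricter summary than a simple average of the two.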

In [None]:
lr.fit(X_train.drop(['Review'], axis=1), y_train)
predictions = lr.predict(X_test.drop(['Review'], axis=1))

print(classification_report(y_test, predictions))
pd.DataFrame(confusion_matrix(y_test, predictions))

              precision    recall  f1-score   support

           0       0.00      0.00      0.00      1271
           1       0.78      1.00      0.87      4390

    accuracy                           0.78      5661
   macro avg       0.39      0.50      0.44      5661
weighted avg       0.60      0.78      0.68      5661





Unnamed: 0,0,1
0,0,1271
1,0,4390


# **Leveraging Text Sentiment**

Reviews are subjective and opinionated, and people often express strong emotions and feelings in them. This makes the text documents here a good candidate for extracting sentiment as a feature.

The general expectation is that highly rated and recommended products (label 1) should have a positive sentiment and products which are not recommended (label 0) should have a negative sentiment.

TextBlob is an excellent open-source library for performing NLP tasks with ease, including sentiment analysis. It also has a sentiment lexicon (in the form of an XML file) which it leverages to give both polarity and subjectivity scores.


*   The polarity score is a float within the range [-1.0, 1.0].
*   The subjectivity is a float within the range [0.0, 1.0] where 0.0 is very objective and 1.0 is very subjective.




In [None]:
import textblob

textblob.TextBlob('This is an AMAZING pair of Jeans!').sentiment

Sentiment(polarity=0.7500000000000001, subjectivity=0.9)

In [None]:
textblob.TextBlob('I really hated this UGLY T-shirt!!').sentiment

Sentiment(polarity=-0.95, subjectivity=0.85)
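
To illustrate the general idea behind lexicon-based scoring, the sketch below averages the polarity of the words it recognises. It is a deliberately simplified toy: the four-word lexicon and its scores are invented for this example, and this is not TextBlob's actual algorithm or lexicon.

```python
import re

# Hypothetical mini-lexicon: word -> polarity in [-1.0, 1.0]
TOY_LEXICON = {'amazing': 1.0, 'love': 0.5, 'ugly': -0.7, 'hated': -0.9}

def toy_polarity(text):
    """Average the lexicon scores of the known words; 0.0 if none match."""
    words = re.findall(r'[a-z]+', text.lower())
    scores = [TOY_LEXICON[w] for w in words if w in TOY_LEXICON]
    return sum(scores) / len(scores) if scores else 0.0

print(toy_polarity('This is an AMAZING pair of Jeans!'))
print(toy_polarity('I really hated this UGLY T-shirt!!'))
print(toy_polarity('It arrived on Tuesday.'))
```

A text with no opinion words scores 0.0, which is why neutral, factual sentences carry little signal for this kind of feature.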

# **Experiment 2: Features from Sentiment Analysis**

This is unsupervised, lexicon-based sentiment analysis: we have no pre-labeled data saying which review has a positive or negative sentiment, so we use the lexicon to determine it.

In [None]:
x_train_snt_obj = X_train['Review'].apply(lambda row: textblob.TextBlob(row).sentiment)
X_train['Polarity'] = [obj.polarity for obj in x_train_snt_obj.values]
X_train['Subjectivity'] = [obj.subjectivity for obj in x_train_snt_obj.values]

x_test_snt_obj = X_test['Review'].apply(lambda row: textblob.TextBlob(row).sentiment)
X_test['Polarity'] = [obj.polarity for obj in x_test_snt_obj.values]
X_test['Subjectivity'] = [obj.subjectivity for obj in x_test_snt_obj.values]

In [None]:
X_train.head()

Unnamed: 0,Review,char_count,word_count,word_density,punctuation_count,title_word_count,upper_case_word_count,Polarity,Subjectivity
12896,Soooo soft! This is a delightfully soft and fl...,268,52,5.056604,8,2,0,0.170455,0.490909
13183,"Had my eye on this, but dind't get I finally v...",399,84,4.694118,20,2,1,0.101944,0.719537
1496,I wanted to like this... I wanted to like this...,525,104,5.0,19,2,2,0.186538,0.458761
5205,Beautiful blouse Bought this for my daughter i...,203,35,5.638889,10,2,0,0.625,0.825
13366,"Boxy. large. Boxy, unflattering, and large.\n\...",295,51,5.673077,22,2,0,0.329613,0.510268


# **Model Training and Evaluation**

In [None]:
lr.fit(X_train.drop(['Review'], axis=1), y_train)
predictions = lr.predict(X_test.drop(['Review'], axis=1))

print(classification_report(y_test, predictions))
pd.DataFrame(confusion_matrix(y_test, predictions))

              precision    recall  f1-score   support

           0       0.69      0.27      0.38      1271
           1       0.82      0.97      0.89      4390

    accuracy                           0.81      5661
   macro avg       0.75      0.62      0.64      5661
weighted avg       0.79      0.81      0.77      5661



Unnamed: 0,0,1
0,338,933
1,153,4237


Interesting! The F1-score for bad reviews is now 38% and for good reviews 89%, which brings our overall (weighted average) F1-score to 77%, up from 68% in Experiment 1.

But we can still improve the model, since the recall for bad reviews is pretty low at 27%.

# **Text Pre-processing and Wrangling**

Next, we extract features based on standard NLP feature engineering models such as the classic Bag of Words model. Before that, we clean and pre-process the text data with a simple text pre-processor, since the main intent here is to look at feature engineering strategies.

The focus will be on:

*  Text lowercasing
*  Expansion of contractions
*  Removing unnecessary characters, numbers and symbols
*  Stemming
*  Stopword removal



In [None]:
!pip install contractions
!pip install textsearch
!pip install tqdm
import nltk
nltk.download('punkt')
nltk.download('stopwords')



[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

In [None]:
import contractions

contractions.fix('I didn\'t like this t-shirt')

'I did not like this t-shirt'

In [None]:

import nltk
import contractions
import re

# remove some stopwords to capture negation in n-grams if possible
stop_words = nltk.corpus.stopwords.words('english')
stop_words.remove('no')
stop_words.remove('not')
stop_words.remove('but')

# load up a simple porter stemmer - nothing fancy
ps = nltk.porter.PorterStemmer()

def simple_text_preprocessor(document):
    # lower case
    document = str(document).lower()

    # expand contractions
    document = contractions.fix(document)

    # remove unnecessary characters
    document = re.sub(r'[^a-zA-Z]',r' ', document)
    document = re.sub(r'nbsp', r'', document)
    document = re.sub(' +', ' ', document)

    # simple porter stemming
    document = ' '.join([ps.stem(word) for word in document.split()])

    # stopwords removal
    document = ' '.join([word for word in document.split() if word not in stop_words])

    return document

stp = np.vectorize(simple_text_preprocessor)


In [None]:
X_train['Clean Review'] = stp(X_train['Review'].values)
X_test['Clean Review'] = stp(X_test['Review'].values)

X_train.head()

Unnamed: 0,Review,char_count,word_count,word_density,punctuation_count,title_word_count,upper_case_word_count,Polarity,Subjectivity,Clean Review
12896,Soooo soft! This is a delightfully soft and fl...,268,52,5.056604,8,2,0,0.170455,0.490909,soooo soft thi delight soft fluffi sweater mig...
13183,"Had my eye on this, but dind't get I finally v...",399,84,4.694118,20,2,1,0.101944,0.719537,eye thi but dind get final visit store petit t...
1496,I wanted to like this... I wanted to like this...,525,104,5.0,19,2,2,0.186538,0.458761,want like thi want like thi top badli badli fa...
5205,Beautiful blouse Bought this for my daughter i...,203,35,5.638889,10,2,0,0.625,0.825,beauti blous bought thi daughter law birthday ...
13366,"Boxy. large. Boxy, unflattering, and large.\n\...",295,51,5.673077,22,2,0,0.329613,0.510268,boxi larg boxi unflatt larg curvi pound thi to...


# **Extracting out the structured features from previous experiments**

In [None]:
X_train_metadata = X_train.drop(['Review', 'Clean Review'], axis=1).reset_index(drop=True)
X_test_metadata = X_test.drop(['Review', 'Clean Review'], axis=1).reset_index(drop=True)

X_train_metadata.head()

Unnamed: 0,char_count,word_count,word_density,punctuation_count,title_word_count,upper_case_word_count,Polarity,Subjectivity
0,268,52,5.056604,8,2,0,0.170455,0.490909
1,399,84,4.694118,20,2,1,0.101944,0.719537
2,525,104,5.0,19,2,2,0.186538,0.458761
3,203,35,5.638889,10,2,0,0.625,0.825
4,295,51,5.673077,22,2,0,0.329613,0.510268


# **Experiment 3: Adding Bag of Words based Features - 1-grams**


This is perhaps the simplest vector space representational model for unstructured text. A vector space model is simply a mathematical model to represent unstructured text (or any other data) as numeric vectors, such that each dimension of the vector is a specific feature/attribute.

The bag of words model represents each text document as a numeric vector where each dimension is a specific word from the corpus and the value could be its frequency in the document, occurrence (denoted by 1 or 0) or even weighted values.

The model’s name is such because each document is represented literally as a ‘bag’ of its own words, disregarding word orders, sequences and grammar.
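
The idea can be sketched in a few lines of plain Python before reaching for a library. The two toy documents below are made up for illustration; the vocabulary is every distinct word in the corpus, and each document becomes a vector of word counts aligned with that vocabulary:

```python
from collections import Counter

# Two tiny, already-cleaned toy documents (illustrative only)
docs = ['soft sweater soft', 'boxi larg sweater']

# Vocabulary: every distinct word in the corpus, in sorted order
vocab = sorted({word for doc in docs for word in doc.split()})

def bag_of_words(doc):
    """Map a document to a list of word counts aligned with vocab."""
    counts = Counter(doc.split())
    return [counts[word] for word in vocab]

vectors = [bag_of_words(doc) for doc in docs]
print(vocab)
print(vectors)
```

Note that `'soft sweater soft'` and `'soft soft sweater'` would map to the same vector, which is exactly the loss of word order described above.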

In [None]:
from sklearn.feature_extraction.text import CountVectorizer

# Count every unigram; with min_df=0.0 and max_df=1.0 no term is filtered out
cv = CountVectorizer(min_df=0.0, max_df=1.0, ngram_range=(1, 1))
X_traincv = cv.fit_transform(X_train['Clean Review']).toarray()
X_traincv = pd.DataFrame(X_traincv, columns=cv.get_feature_names_out())

# Reuse the vocabulary learned on the training set for the test set
X_testcv = cv.transform(X_test['Clean Review']).toarray()
X_testcv = pd.DataFrame(X_testcv, columns=cv.get_feature_names_out())

X_traincv.head()


Unnamed: 0,aa,aaaaandidon,aaaaannnnnnd,aaaah,aaaahmaz,aaah,ab,abbey,abbi,abck,...,zing,zip,zipper,zipperi,zippi,zone,zooland,zoom,zowi,zuma
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
3,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [None]:
X_train_comb = pd.concat([X_train_metadata, X_traincv], axis=1)
X_test_comb = pd.concat([X_test_metadata, X_testcv], axis=1)

X_train_comb.head()

Unnamed: 0,char_count,word_count,word_density,punctuation_count,title_word_count,upper_case_word_count,Polarity,Subjectivity,aa,aaaaandidon,...,zing,zip,zipper,zipperi,zippi,zone,zooland,zoom,zowi,zuma
0,268,52,5.056604,8,2,0,0.170455,0.490909,0,0,...,0,0,0,0,0,0,0,0,0,0
1,399,84,4.694118,20,2,1,0.101944,0.719537,0,0,...,0,0,0,0,0,0,0,0,0,0
2,525,104,5.0,19,2,2,0.186538,0.458761,0,0,...,0,0,0,0,0,0,0,0,0,0
3,203,35,5.638889,10,2,0,0.625,0.825,0,0,...,0,0,0,0,0,0,0,0,0,0
4,295,51,5.673077,22,2,0,0.329613,0.510268,0,0,...,0,0,0,0,0,0,0,0,0,0



# **Model Training and Evaluation**

In [None]:
lr.fit(X_train_comb, y_train)
predictions = lr.predict(X_test_comb)

print(classification_report(y_test, predictions))
pd.DataFrame(confusion_matrix(y_test, predictions))

              precision    recall  f1-score   support

           0       0.76      0.70      0.73      1271
           1       0.92      0.93      0.93      4390

    accuracy                           0.88      5661
   macro avg       0.84      0.82      0.83      5661
weighted avg       0.88      0.88      0.88      5661



Unnamed: 0,0,1
0,896,375
1,286,4104


Now,

Precision for bad reviews is quite good at 76%.

The F1-score for bad reviews is now 73% and for good reviews 93%.

This brings our overall (weighted average) F1-score to 88%, which is quite good.
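
The "overall" figure quoted here is the weighted average row of the report, which weights each class's F1-score by its support. The sketch below recomputes the macro and weighted averages directly from the confusion matrix above; the helper name `f1_from_counts` is illustrative and not part of the original notebook:

```python
# Confusion matrix counts from the output above (rows: actual, columns: predicted)
tn, fp = 896, 375    # actual label 0: predicted 0, predicted 1
fn, tp = 286, 4104   # actual label 1: predicted 0, predicted 1

def f1_from_counts(hits, false_alarms, misses):
    """F1 for one class, written as 2*hits / (2*hits + false_alarms + misses)."""
    return 2 * hits / (2 * hits + false_alarms + misses)

# Seen from label 0, the hits are tn, the false alarms are fn and the misses are fp
f1_bad = f1_from_counts(tn, fn, fp)
f1_good = f1_from_counts(tp, fp, fn)

support_bad, support_good = tn + fp, fn + tp

# Macro: plain mean of the class scores; weighted: mean weighted by class support
macro_f1 = (f1_bad + f1_good) / 2
weighted_f1 = (f1_bad * support_bad + f1_good * support_good) / (support_bad + support_good)

print(f"F1 (label 0): {f1_bad:.2f}, F1 (label 1): {f1_good:.2f}")
print(f"macro F1: {macro_f1:.2f}, weighted F1: {weighted_f1:.2f}")
```

Because label 1 makes up most of the test set, the weighted average sits closer to the good-review score, while the macro average treats both classes equally and is therefore the more conservative of the two on imbalanced data.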