In [1]:
!pip install textblob



In [2]:
# for Importing the Dataset
import pandas as pd

# for calculating Polarity and Subjectivity
from textblob import TextBlob

In [3]:
# lets read the dataset
data = pd.read_csv('amazon.csv')

# lets check the shape of the dataset
data.shape

(3150, 5)

In [4]:
# lets check the head of the dataset
data.head()

Unnamed: 0,rating,date,variation,verified_reviews,feedback
0,5,31-Jul-18,Charcoal Fabric,Love my Echo!,1
1,5,31-Jul-18,Charcoal Fabric,Loved it!,1
2,4,31-Jul-18,Walnut Finish,"Sometimes while playing a game, you can answer...",1
3,5,31-Jul-18,Charcoal Fabric,I have had a lot of fun with this thing. My 4 ...,1
4,5,31-Jul-18,Charcoal Fabric,Music,1


In [5]:
len(data['verified_reviews'])

3150

In [6]:
# Lets calculate the length of the Reviews
data['length'] = data['verified_reviews'].apply(len)

In [8]:
data['length'].shape

(3150,)

### Text Polarity

It is the expression that determines the sentimental aspect of an opinion. In textual data, the result of sentiment analysis can be determined for each entity in the sentence, document or sentence. The sentiment polarity can be determined as positive, negative and neutral.

In [9]:
data['verified_reviews']

0                                           Love my Echo!
1                                               Loved it!
2       Sometimes while playing a game, you can answer...
3       I have had a lot of fun with this thing. My 4 ...
4                                                   Music
                              ...                        
3145    Perfect for kids, adults and everyone in betwe...
3146    Listening to music, searching locations, check...
3147    I do love these things, i have them running my...
3148    Only complaint I have is that the sound qualit...
3149                                                 Good
Name: verified_reviews, Length: 3150, dtype: object

In [10]:
# Lets calculate the Polarity of the Reviews
def get_polarity(text):
    textblob = TextBlob(str(text.encode('utf-8')))
    pol = textblob.sentiment.polarity
    return pol

# lets apply the function
data['polarity'] = data['verified_reviews'].apply(get_polarity)

In [11]:
data['polarity']

0       0.625000
1       0.875000
2      -0.100000
3       0.350000
4       0.000000
          ...   
3145    1.000000
3146    0.333333
3147    0.237662
3148    0.316667
3149    0.700000
Name: polarity, Length: 3150, dtype: float64

### Text Subjectivity

In natural language, subjectivity refers to expression of opinions, evaluations, feelings, and speculations and thus incorporates sentiment. Subjective text is further classified with sentiment or polarity.

In [12]:
data['verified_reviews']

0                                           Love my Echo!
1                                               Loved it!
2       Sometimes while playing a game, you can answer...
3       I have had a lot of fun with this thing. My 4 ...
4                                                   Music
                              ...                        
3145    Perfect for kids, adults and everyone in betwe...
3146    Listening to music, searching locations, check...
3147    I do love these things, i have them running my...
3148    Only complaint I have is that the sound qualit...
3149                                                 Good
Name: verified_reviews, Length: 3150, dtype: object

In [13]:
# Lets calculate the Subjectvity of the Reviews
def get_subjectivity(text):
    textblob = TextBlob(str(text.encode('utf-8')))
    subj = textblob.sentiment.subjectivity
    return subj

# lets apply the Function
data['subjectivity'] = data['verified_reviews'].apply(get_subjectivity)

In [14]:
## lets summarize the Newly Created Features
data[['length','polarity','subjectivity']].describe()

Unnamed: 0,length,polarity,subjectivity
count,3150.0,3150.0,3150.0
mean,132.049524,0.349792,0.528922
std,182.099952,0.303362,0.256324
min,1.0,-1.0,0.0
25%,30.0,0.123852,0.419196
50%,74.0,0.35,0.585
75%,165.0,0.533333,0.695486
max,2851.0,1.0,1.0
