# NLP - Counting Nouns and Verbs

In [1]:
import pandas as pd
from textblob import TextBlob
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\vixen\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\vixen\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping taggers\averaged_perceptron_tagger.zip.


True

In [2]:
data = pd.read_csv('amazon_alexa.tsv', delimiter='\t')

In [3]:
# create part of speech dictionary
pos_dic = {
    'noun' : ['NN', 'NNS', 'NNP', 'NNPS'],
    'pron' : ['PRP', 'PRP$', 'WP', 'WP$'],
    'verb' : ['VB', 'VBD', 'VBG', 'VBN', 'VBP', 'VBZ'],
    'adj' : ['JJ', 'JJR', 'JJS'],
    'adv' : ['RB', 'RBR', 'RBS', 'WRB']
}

# Count the words in a text whose part-of-speech tag belongs to the category `flag`
def pos_check(x, flag):
    cnt = 0
    try:
        wiki = TextBlob(x)
        # wiki.tags yields (word, tag) tuples
        for _, tag in wiki.tags:
            if tag in pos_dic[flag]:
                cnt += 1
    except Exception:
        # any failure (e.g. a non-string value such as a missing review) leaves the count as is
        pass
    return cnt
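
Because `pos_check` tags the text again on every call, counting five categories means tagging each review five times. A minimal sketch of a single-pass alternative follows, assuming the text has already been tagged into `(word, tag)` pairs. The names `tag_to_category` and `count_categories` are hypothetical, and the example tags are assigned by hand for illustration, not produced by TextBlob.

```python
# Same tag groups as pos_dic above, repeated so the sketch is self-contained.
pos_dic = {
    'noun': ['NN', 'NNS', 'NNP', 'NNPS'],
    'pron': ['PRP', 'PRP$', 'WP', 'WP$'],
    'verb': ['VB', 'VBD', 'VBG', 'VBN', 'VBP', 'VBZ'],
    'adj': ['JJ', 'JJR', 'JJS'],
    'adv': ['RB', 'RBR', 'RBS', 'WRB'],
}

# Invert the dictionary so each tag maps directly to its category.
tag_to_category = {tag: cat for cat, tags in pos_dic.items() for tag in tags}

def count_categories(tagged_tokens):
    """Count tokens per category in one pass over (word, tag) pairs."""
    counts = dict.fromkeys(pos_dic, 0)
    for _, tag in tagged_tokens:
        cat = tag_to_category.get(tag)
        if cat is not None:  # tags outside pos_dic (e.g. 'DT') are ignored
            counts[cat] += 1
    return counts

# Hand-tagged example sentence (hypothetical, for illustration only).
example = [('I', 'PRP'), ('really', 'RB'), ('love', 'VBP'),
           ('my', 'PRP$'), ('new', 'JJ'), ('Echo', 'NNP')]
print(count_categories(example))
```

In the notebook, the tagged pairs could come from `TextBlob(x).tags`, so that each review is tagged once and all five counts are read from the same result.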

Nouns

A noun is a word that functions as the name of a specific object or set of objects, such as living creatures, places, actions, quantities, states of existence, or ideas. However, "noun" is not a semantic category, so it cannot be characterized in terms of its meaning.

In [4]:
# Let's calculate the count of nouns in the text
data['noun_count'] = data['verified_reviews'].apply(lambda x: pos_check(x, 'noun'))

Verbs

A verb (from the Latin verbum, meaning "word") is a word that in syntax conveys an action, an occurrence, or a state of being. In the usual description of English, the basic form, with or without the particle to, is the infinitive. In many languages, verbs are inflected to encode tense, aspect, mood, and voice.
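
The six verb tags listed in `pos_dic` each mark a different inflected form. The sketch below spells them out using the standard Penn Treebank descriptions. The helper `describe_verbs` is a hypothetical name, and the example tokens are tagged by hand rather than by TextBlob.

```python
# Penn Treebank verb tags, i.e. the tags listed under 'verb' in pos_dic.
verb_tag_meaning = {
    'VB': 'base form',
    'VBD': 'past tense',
    'VBG': 'gerund or present participle',
    'VBN': 'past participle',
    'VBP': 'non-3rd person singular present',
    'VBZ': '3rd person singular present',
}

def describe_verbs(tagged_tokens):
    """Return (word, description) for every token that carries a verb tag."""
    return [(word, verb_tag_meaning[tag])
            for word, tag in tagged_tokens
            if tag in verb_tag_meaning]

# Hand-tagged example sentence (hypothetical, for illustration only).
example = [('It', 'PRP'), ('plays', 'VBZ'), ('music', 'NN'), ('and', 'CC'),
           ('answered', 'VBD'), ('my', 'PRP$'), ('questions', 'NNS')]
print(describe_verbs(example))
```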

In [5]:
# Let's calculate the count of verbs in the text
data['verb_count'] = data['verified_reviews'].apply(lambda x: pos_check(x, 'verb'))

In [6]:
# Let's summarize the newly created features
data[['noun_count', 'verb_count']].describe()

| | noun_count | verb_count |
|---|---|---|
| count | 3150.0 | 3150.0 |
| mean | 5.945397 | 5.155873 |
| std | 8.222776 | 7.223565 |
| min | 0.0 | 0.0 |
| 25% | 1.0 | 1.0 |
| 50% | 3.0 | 3.0 |
| 75% | 7.0 | 7.0 |
| max | 137.0 | 102.0 |


# Adjectives, Adverbs and Pronouns

Adjectives

In linguistics, an adjective is a word that modifies a noun phrase or describes its referent. Its semantic role is to change information given by the noun. Adjectives are one of the main parts of speech of the English language, although historically they were classed together with nouns.

In [7]:
# Let's calculate the count of adjectives in the text
data['adj_count'] = data['verified_reviews'].apply(lambda x: pos_check(x, 'adj'))

Adverbs

An adverb is a word or an expression that modifies a verb, adjective, another adverb, determiner, clause, preposition, or sentence. Adverbs typically express manner, place, time, frequency, degree, level of certainty, etc., answering questions such as how?, in what way?, when?, where?, and to what extent?


In [8]:
# Let's calculate the count of adverbs in the text
data['adv_count'] = data['verified_reviews'].apply(lambda x: pos_check(x, 'adv'))

Pronouns

A pronoun (I, me, he, she, herself, you, it, that, they, each, few, many, who, whoever, whose, someone, everybody, etc.) is a word that takes the place of a noun. In the sentence "Joe saw Jill, and he waved at her", the pronouns he and her take the place of Joe and Jill, respectively.

In [9]:
# Let's calculate the count of pronouns in the text
data['pron_count'] = data['verified_reviews'].apply(lambda x: pos_check(x, 'pron'))

In [10]:
# Let's summarize the newly created features
data[['adj_count', 'adv_count', 'pron_count']].describe()

| | adj_count | adv_count | pron_count |
|---|---|---|---|
| count | 3150.0 | 3150.0 | 3150.0 |
| mean | 2.172381 | 2.00254 | 3.242222 |
| std | 2.93539 | 3.277083 | 4.627235 |
| min | 0.0 | 0.0 | 0.0 |
| 25% | 0.0 | 0.0 | 0.0 |
| 50% | 1.0 | 1.0 | 2.0 |
| 75% | 3.0 | 3.0 | 4.0 |
| max | 39.0 | 54.0 | 70.0 |
