* Aspect-based opinion mining focuses on the extraction of aspects (or product features) from opinionated text

* **Explicit** aspects explicitly denote targets 
  - e.g. I love the *touchscreen* of my phone but the *battery life* is so short

* An aspect can also be expressed indirectly through an **implicit aspect clue** (IAC)
    - e.g. This is the best phone one could have. It is *lightweight*, *sleek* and *attractive*. I found it very *user-friendly* and *easy to manipulate*
    - `lightweight` -> weight
    - `sleek` and `attractive` -> appearance
    - `user-friendly` -> interface
    - `easy to manipulate` -> functionality

* Detect explicit aspects and IACs from opinionated documents.
* Map IACs to their respective aspect **categories**.
* IACs can be single words (`sleek`) or multi-word expressions (`easy to manipulate`), and they span different parts of speech (POS): adjectives, nouns and verbs.
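
The mapping step can be illustrated with a plain lookup table. This is only a minimal sketch: the dictionary holds just the five clue-to-category pairs listed above, and the name `map_iacs` is hypothetical. A real system would draw on an implicit aspect lexicon rather than a hand-written table.

```python
import re

# Clue -> category pairs taken from the example above (not a real lexicon)
IAC_TO_CATEGORY = {
    "lightweight": "weight",
    "sleek": "appearance",
    "attractive": "appearance",
    "user-friendly": "interface",
    "easy to manipulate": "functionality",
}


def map_iacs(text):
    """Return (clue, category) pairs for every known IAC found in text."""
    lowered = text.lower()
    found = []
    for clue, category in IAC_TO_CATEGORY.items():
        # (?<!\w) and (?!\w) keep the match on whole words or expressions
        if re.search(r"(?<!\w)" + re.escape(clue) + r"(?!\w)", lowered):
            found.append((clue, category))
    return found


review = ("It is lightweight, sleek and attractive. "
          "I found it very user-friendly and easy to manipulate")
for clue, category in map_iacs(review):
    print(f"{clue} -> {category}")
```

Single words and multi-word expressions go through the same lookup, which is why the match is done on the raw text rather than token by token.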

The proposed aspect parser is based on two general rules:
1. Rules for sentences that have a subject verb.
2. Rules for sentences that do not have a subject verb.
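
One way to choose between the two rule families is to check whether any token carries a subject dependency label. The sketch below assumes the labels `nsubj` and `nsubjpass` used in the rule functions of this notebook; `rule_family` is a hypothetical helper and the label lists are hand-written for illustration. With spaCy, the input could be built as `[t.dep_ for t in doc]`.

```python
SUBJECT_DEPS = {"nsubj", "nsubjpass"}


def rule_family(dep_labels):
    """Return which rule family applies, given a sentence's dependency labels."""
    if SUBJECT_DEPS & set(dep_labels):
        return "subject"
    return "no subject"


# Hand-written label sequences, assumed for illustration only
print(rule_family(["nsubj", "ROOT", "det", "compound", "dobj", "punct"]))
print(rule_family(["advmod", "ROOT", "aux", "xcomp", "punct"]))
```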

In [29]:
import spacy
nlp = spacy.load('en_core_web_md')

In [136]:
def explore(doc):
    '''Print each token with its dependency label, POS, tag, children, head and index.'''
    for t in doc:
        print(t, t.dep_, t.pos_, t.tag_, [c for c in t.children], t.head, t.i)

### Subject Noun Rule:
* **Trigger**: token is a syntactic subject
* **Behavior**: if the active token *h* is in a subject-noun (`nsubj`) relationship with a word *t*:
    - if *t* has any adverbial or adjective modifier and the modifier exists in SenticNet, then *t* is extracted as an aspect.

In [227]:
def rule1(doc):
    '''
    Aspect extraction following Subject Noun Rule 1.
    If a token h is the subject of a word t and t has an adverbial or
    adjective modifier, then t is the aspect.
    '''
    for token in doc:
        if token.dep_ in ["nsubj", "nsubjpass"]:
            for child in token.head.children:
                if child.dep_ in ["amod", "advmod"]:
                    # the SenticNet lookup for child is not implemented here
                    print(token.head)


In [228]:
doc = nlp("This mp3 player also costs a lot less than the ipod.")
rule1(doc)


costs
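
The lexicon condition of this rule is not implemented in `rule1` above. The sketch below shows one way the check could look. It makes several assumptions: the parse is a hand-written list of `(text, dep, head_index)` rows for *The battery lasts little.* (the label `advmod` for *little* is assumed), the lexicon is a small Python set standing in for SenticNet, and `rule1_filtered` is a hypothetical name. With spaCy, the rows could be built as `[(t.text, t.dep_, t.head.i) for t in doc]`.

```python
SUBJECT_DEPS = {"nsubj", "nsubjpass"}
MODIFIER_DEPS = {"amod", "advmod"}

# Hand-written parse rows: (text, dependency label, index of the head token)
PARSE = [
    ("The", "det", 1),
    ("battery", "nsubj", 2),
    ("lasts", "ROOT", 2),
    ("little", "advmod", 2),
    (".", "punct", 2),
]


def rule1_filtered(parse, lexicon):
    """Return heads of subjects that have a modifier present in the lexicon."""
    aspects = []
    for _, dep, head in parse:
        if dep not in SUBJECT_DEPS:
            continue
        # modifiers attached to the subject's head word t
        modifiers = [text for text, d, h in parse
                     if d in MODIFIER_DEPS and h == head]
        if any(m.lower() in lexicon for m in modifiers):
            aspects.append(parse[head][0])
    return aspects


print(rule1_filtered(PARSE, {"little"}))  # ['lasts']
print(rule1_filtered(PARSE, {"great"}))   # []
```

Returning a list instead of printing also makes the result usable by later steps.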


* If the sentence does not have an auxiliary verb, such as `is`, `was`, `would`, `should`, `could`, then:
    - if the verb t is modified by an adjective or an adverb, or is in an adverbial clause modifier relation with another token, then both h and t are extracted as aspects. In the example below, battery is in a subject relation with lasts, and lasts is modified by the adjective modifier little, so both lasts and battery are extracted.
      - *The battery lasts little.*
    - if t has a direct object relation with a token n, the POS of n is `Noun`, and n exists in SenticNet, then n is extracted as an aspect. In the example below, like is in a direct object relation with lens, so the aspect lens is extracted.
      - *I like the lens of this camera.*
    - if t has a direct object relation with a token n, the POS of n is `Noun`, and n is not in SenticNet, then n is extracted as an aspect. In addition, if another token n1 is connected to n by any dependency relation in the parse tree and the POS of n1 is `Noun`, then n1 is also extracted as an aspect. In the example below, like is in a direct object relation with beauty, which is connected to screen via a preposition relation, so the aspects screen and beauty are extracted.
      - *I like the beauty of the screen.*
    - if t is in an open clausal complement relation with a token t1, then the aspect t-t1 is extracted if t-t1 exists in the opinion lexicon. If t1 is connected to a token t2 whose POS is `Noun`, then t2 is extracted as an aspect. In the example below, like and comment are in an open clausal complement relation, and comment is connected to camera via a preposition relation. The POS of camera is `Noun`, so camera is extracted as an aspect.
      - *I would like to comment on the camera of this phone.*

In [231]:
def rule2(doc):
    '''
    Aspect extraction following Subject Noun Rule 2
    Sentence without auxiliary verbs:
    - t has an adjective, adverb or adverbial clause modifier -> h and t are aspects
    - t has a direct object NOUN n that is in SenticNet -> n is the aspect
    - t has a direct object NOUN n that is not in SenticNet -> n and the nouns connected to n are aspects
    - t has an open clausal complement with another token -> the pair, or the nouns connected to the complement, are aspects
    '''
    for token in doc:
        if token.dep_ in ["nsubj", "nsubjpass"]:
            #check if an AUX is present
            aux_presence = [t for t in doc if t.pos_ == "AUX"]
            for child in token.head.children:
                if child.dep_ in ["amod", "advmod", "advcl"] and not aux_presence:
                    print(token, token.head)
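                if child.dep_ == "dobj" and child.pos_ == "NOUN" and not aux_presence:
                    # the lexicon lookups are not implemented, so both cases are printed
                    # case 1 (child is in SenticNet): the object itself is the aspect
                    print(child)
                    # case 2 (child is not in SenticNet): also collect the nouns
                    # linked to the object through a preposition
                    print([child] + [noun for prep in child.children if prep.pos_ == "ADP"
                                     for noun in prep.children if noun.pos_ == "NOUN"])
                if child.dep_ == "xcomp":
                    # case 1 ([child, coc] is in the lexicon): the pair is the aspect
                    print([[child, coc] for coc in child.children])
                    # case 2 (otherwise): collect the nouns two levels below the complement
                    print([noun for coc in child.children
                           for noun in coc.children if noun.pos_ == "NOUN"])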


In [232]:
doc = nlp("The battery lasts little.")
print(f"EXAMPLE 1: {doc}")
rule2(doc)
doc = nlp("I like the lens of this camera.")
print(f"EXAMPLE 2: {doc}")
rule2(doc)
doc = nlp("I like the beauty of the screen.")
print(f"EXAMPLE 3: {doc}")
rule2(doc)
doc = nlp("I would like to comment on the camera of this phone.")
print(f"EXAMPLE 4: {doc}")
rule2(doc)


EXAMPLE 1: The battery lasts little.
battery lasts
EXAMPLE 2: I like the lens of this camera.
lens
[lens, camera]
EXAMPLE 3: I like the beauty of the screen.
beauty
[beauty, screen]
EXAMPLE 4: I would like to comment on the camera of this phone.
[[comment, to], [comment, on]]
[camera]
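
The third case of this rule (an object noun plus the nouns linked to it through a preposition) can be isolated into a function that returns its result. This is a sketch under stated assumptions: the rows are a hand-written parse of *I like the beauty of the screen.*, written to agree with the relations described above, and `children` and `object_with_linked_nouns` are hypothetical helpers. The subject and auxiliary checks of `rule2` are left out to keep the traversal visible.

```python
# Hand-written rows: (text, dependency label, POS, index of the head token)
PARSE = [
    ("I", "nsubj", "PRON", 1),
    ("like", "ROOT", "VERB", 1),
    ("the", "det", "DET", 3),
    ("beauty", "dobj", "NOUN", 1),
    ("of", "prep", "ADP", 3),
    ("the", "det", "DET", 6),
    ("screen", "pobj", "NOUN", 4),
    (".", "punct", "PUNCT", 1),
]


def children(parse, i):
    """Indices of the tokens whose head is token i (the root heads itself)."""
    return [j for j, row in enumerate(parse) if row[3] == i and j != i]


def object_with_linked_nouns(parse):
    """Return each direct-object noun, followed by nouns reached via a preposition."""
    aspects = []
    for i, (text, dep, pos, _) in enumerate(parse):
        if dep == "dobj" and pos == "NOUN":
            aspects.append(text)
            for p in children(parse, i):
                if parse[p][2] == "ADP":
                    aspects.extend(parse[n][0] for n in children(parse, p)
                                   if parse[n][2] == "NOUN")
    return aspects


print(object_with_linked_nouns(PARSE))  # ['beauty', 'screen']
```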


- A copula is the relation between the complement of a copular verb and the copular verb. If the token t is in a copula relation with a copular verb and the copular verb exists in the implicit aspect lexicon, then t is extracted as an aspect term. In the example below, expensive is extracted as an aspect.
  - *The car is expensive.*

In [235]:
def rule3(doc):
    '''
    Subject Noun Rule
    Sentence with auxiliary verb (copula) and token as complement -> token is aspect
    '''
    for token in doc:
        if token.dep_ in ["nsubj", "nsubjpass"]:
            for child in token.head.children:
                if child.dep_ in ["acomp"] and token.head.pos_=="AUX":
                    #check if child exists in the implicit aspect lexicon
                    print(child)


In [234]:
doc = nlp("The car is expensive.")
rule3(doc)

expensive

- If the token t is in a copula relation with a copular verb and the POS of the subject h is Noun, then h is extracted as an explicit aspect. In the example below, camera is extracted as an aspect.
  - *The camera is nice.*

In [236]:
def rule4(doc):
    '''
    Subject Noun Rule
    Sentence with auxiliary verb (copula) and token as complement and a Noun -> noun is aspect
    '''
    for token in doc:
        if token.dep_ in ["nsubj", "nsubjpass"]:
            for child in token.head.children:
                if child.dep_ in ["acomp"] and token.head.pos_ == "AUX" and token.pos_=="NOUN":
                    print(token)


In [237]:
doc = nlp("The camera is nice.")
rule4(doc)

camera


- If the token t is in a copula relation with a copular verb and the copular verb is connected to a token t1 by any dependency relation and t1 is a verb, then both t1 and t are extracted as implicit aspect terms, as long as they exist in the implicit aspect lexicon. In the example below, lightweight is in a copula relation with is, and lightweight is connected to the word carry by an open clausal complement relation. Here, both lightweight and carry are extracted as aspects.
  - *The phone is very lightweight to carry.*

In [238]:
def rule5(doc):
    '''
    Subject Noun Rule
    Sentence with auxiliary verb (copula), token as complement and a verb connected to it -> complement and verb are aspects
    '''
    for token in doc:
        if token.dep_ in ["nsubj", "nsubjpass"]:
            for child in token.head.children:
                if child.dep_ in ["acomp"] and token.head.pos_ == "AUX":
                    # check if  child and coc exists in the implicit aspect lexicon
                    print(
                        [child]+[coc for coc in child.children if coc.pos_ == "VERB"])


In [239]:
doc=nlp("The phone is very lightweight to carry.")
rule5(doc)

[lightweight, carry]


### Non-subject noun rules

- If an `adjective` or `adverb` h is in an `infinitival` or `open clausal complement` (ccomp, xcomp) relation with a token t and h exists in the implicit aspect lexicon, then h is extracted as an aspect. In the example below, big is extracted as an aspect because it is connected to hold by a clausal complement relation.
    - *Very big to hold.*

In [240]:
def rule6(doc):
    '''
    NO Subject Noun Rule
    Sentence with adjective or adverb h in infinitival or open clausal complement -> if h in IAC lexicon -> h aspect
    '''
    for token in doc:
        if token.pos_ in ["ADJ", "ADV"]:
            for child in token.children:
                if child.dep_ in ["ccomp", "xcomp"]:
                    # if token is in IAC lexicon
                    print(token)


In [241]:
doc = nlp("Very big to hold.")
rule6(doc)

big


- If a token h is connected to a noun t by a prepositional relation, then both h and t are extracted as aspects. In the example below, sleekness is extracted as an aspect, together with player, the noun it is connected to.
    - *Love the sleekness of the player.*

In [242]:
def rule7(doc):
    '''
    NO Subject Noun Rule
    h token connected to noun t through preposition -> h+t aspect
    '''
    for token in doc:
        for child in token.children:
            if child.dep_ == "prep":
                for child_of_child in child.children:
                    if child_of_child.pos_=="NOUN":
                        print(token,child_of_child)


In [243]:
doc = nlp("Love the sleekness of the player.")
rule7(doc)

sleekness player


- If a token h is in a direct object relation (`dobj`) with a token t, then t is extracted as an aspect. In the example below, mention is in a direct object relation with price, so price is extracted as an aspect.
    - *Not to mention the price of the phone.*

In [244]:
def rule8(doc):
    '''
    NO Subject Noun Rule
    h token connected with direct object with t -> t aspect
    '''
    for token in doc:
        for child in token.children:
            if child.dep_ == "dobj":
                print(child)


In [245]:
doc = nlp("Not to mention the price of the phone.")
rule8(doc)

price


### Additional rules

- For each aspect term extracted above, if an aspect term h is in a co-ordination or conjunct relation with another token t, then t is also extracted as an aspect. In the example below, amazing is extracted first as an aspect term. Since amazing is in a conjunct relation with easy, use is also extracted as an aspect.
    - *The camera is amazing and easy to use.*

In [246]:
doc = nlp("The camera is amazing or easy to use.")


In [208]:
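# rule3 prints its match instead of returning it, so select the complement token directly
test = next(t for t in doc if t.dep_ == "acomp")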

In [210]:
[coc for c in test.children if c.dep_ in ["conj"]
    for coc in c.children if coc.dep_ == "xcomp"]


[use]
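
The conjunct step can also be written as a function over already-extracted aspects. The sketch below is one reading of the rule, not a definitive implementation: it keeps the conjunct itself as well as the `xcomp` child that the cell above collects. The parse rows for *The camera is amazing and easy to use.* are hand-written and assumed, and `expand_conjuncts` is a hypothetical name.

```python
# Hand-written rows: (text, dependency label, index of the head token)
PARSE = [
    ("The", "det", 1),
    ("camera", "nsubj", 2),
    ("is", "ROOT", 2),
    ("amazing", "acomp", 2),
    ("and", "cc", 3),
    ("easy", "conj", 3),
    ("to", "aux", 7),
    ("use", "xcomp", 5),
    (".", "punct", 2),
]


def expand_conjuncts(parse, aspect_indices):
    """Add the conjuncts of each aspect, plus any xcomp child of those conjuncts."""
    found = list(aspect_indices)
    for i in aspect_indices:
        for j, (_, dep, head) in enumerate(parse):
            if dep == "conj" and head == i:
                found.append(j)
                found.extend(k for k, (_, d, h) in enumerate(parse)
                             if d == "xcomp" and h == j)
    return [parse[i][0] for i in found]


# Index 3 is "amazing", the aspect found by the copula rule
print(expand_conjuncts(PARSE, [3]))  # ['amazing', 'easy', 'use']
```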

- A noun compound modifier of an NP is any noun that serves to modify the head noun. If t is extracted as an aspect and t has a noun compound modifier h, then the aspect h-t is extracted and t is removed from the aspect list. In the example below, chicken and casserole are in a noun compound modifier relation, so only chicken casserole is extracted as an aspect.
  - *We ordered the chicken casserole, but what we got were a few small pieces of chicken, all dark meat and on the bone.*

In [215]:
doc = nlp("We loved the chicken casserole.")
explore(doc)

We nsubj PRON PRP [] loved 0
loved ROOT VERB VBD [We, casserole, .] loved 1
the det DET DT [] casserole 2
chicken compound NOUN NN [] casserole 3
casserole dobj NOUN NN [the, chicken] loved 4
. punct PUNCT . [] loved 5


In [180]:
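# rule2 prints its matches instead of returning them, so select the object token directly
test = next(t for t in doc if t.dep_ == "dobj")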

In [183]:
[c for c in test.children if c.dep_=="compound"]+[test]

[chicken, casserole]
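
To replace t with the merged aspect h-t, as the rule describes, the compound modifiers can be joined to their head. In the sketch below, the rows copy the dependency labels and heads from the `explore` output above; `merge_compounds` is a hypothetical name. With spaCy, the rows could be built as `[(t.text, t.dep_, t.head.i) for t in doc]`.

```python
# Rows copied from the explore() output: (text, dependency label, head index)
PARSE = [
    ("We", "nsubj", 1),
    ("loved", "ROOT", 1),
    ("the", "det", 4),
    ("chicken", "compound", 4),
    ("casserole", "dobj", 1),
    (".", "punct", 1),
]


def merge_compounds(parse, aspect_indices):
    """Replace each aspect with 'modifier(s) + head' when it has compound modifiers."""
    merged = []
    for i in aspect_indices:
        modifiers = [text for text, dep, head in parse
                     if dep == "compound" and head == i]
        merged.append(" ".join(modifiers + [parse[i][0]]))
    return merged


# Index 4 is "casserole", the direct object
print(merge_compounds(PARSE, [4]))  # ['chicken casserole']
```

An aspect without compound modifiers is returned unchanged, so the function can be applied to the whole aspect list.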