## Frequent Aspect Extraction in ReLi
This baseline is an adaptation of the baseline used for opinion target extraction (OTE) in the 2014, 2015 and 2016 editions of the SemEval Aspect-Based Sentiment Analysis task.

Sources:

Pontiki, M., Galanis, D., Pavlopoulos, J., Papageorgiou, H., Androutsopoulos, I., & Manandhar, S. (2014, August). Semeval-2014 task 4: Aspect based sentiment analysis. In Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014) (pp. 27-35).

Pontiki, M., Galanis, D., Papageorgiou, H., Manandhar, S., & Androutsopoulos, I. (2015, June). Semeval-2015 task 12: Aspect based sentiment analysis. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), Association for Computational Linguistics, Denver, Colorado (pp. 486-495).

In [1]:
from __future__ import unicode_literals
from __future__ import division
from __future__ import print_function

In [22]:
from lxml import etree
parser = etree.XMLParser(remove_blank_text=True)
trainset = etree.parse('../corpus/ReLi_train.xml', parser)
testset = etree.parse('../corpus/ReLi_test.xml', parser)

## 1. All aspects in trainset

In [3]:
from collections import Counter
targets = Counter([opinion_node.get('target').lower()
                    for opinion_node in trainset.iter('Opinion')
                    if opinion_node.get('target') != 'NULL'
                   ])

In [4]:
import ipy_table

data = [['freq', '%freq', 'target']]
for target, freq in targets.most_common(20):
    ratio = freq / sum(targets.values()) *100
    data.append([freq, '{:.1f}%'.format(ratio), target])

ipy_table.make_table(data)
ipy_table.apply_theme('basic')

| freq | %freq | target |
|---|---|---|
| 681 | 32.9% | livro |
| 150 | 7.2% | história |
| 88 | 4.2% | leitura |
| 59 | 2.8% | personagens |
| 47 | 2.3% | crepúsculo |
| 44 | 2.1% | narrativa |
| 42 | 2.0% | romance |
| 39 | 1.9% | obra |
| 35 | 1.7% | final |


In [5]:
# Build a regex to match the targets in the text
import re
targets_list = sorted(list(targets))
targets_list.sort(key=len, reverse=True)
targets_pattern = r'\b(' + '|'.join([re.escape(t) for t in targets_list]) + r')\b'
len(targets_list)

436
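
The pattern above sorts the targets by length, longest first, before joining them. Python's `re` alternation takes the first alternative that matches at a position, so the ordering decides whether a multi-word target or one of its prefixes is returned. The sketch below illustrates this with a small, hypothetical target list (`toy_targets` and `build_pattern` are names made up for this example, not part of the notebook):

```python
import re

# Hypothetical target lexicon; the real one is built from the ReLi training set.
toy_targets = ['o apanhador', 'o apanhador em o campo de centeio', 'livro']

def build_pattern(targets):
    """Join escaped targets into one alternation, longest target first."""
    ordered = sorted(targets, key=lambda t: (-len(t), t))
    return r'\b(' + '|'.join(re.escape(t) for t in ordered) + r')\b'

text = 'Assim começa "O Apanhador em o Campo de Centeio", um livro clássico.'

# Longest-first ordering: the full title is matched as a single target.
longest_first = [m.group()
                 for m in re.finditer(build_pattern(toy_targets), text, flags=re.I)]

# Same alternation in the original list order: the shorter prefix wins.
naive_pattern = r'\b(' + '|'.join(re.escape(t) for t in toy_targets) + r')\b'
unordered = [m.group() for m in re.finditer(naive_pattern, text, flags=re.I)]

print(longest_first)
print(unordered)
```

With the length ordering the whole book title is extracted. Without it, only the prefix `O Apanhador` is returned and the span offsets no longer agree with a gold annotation of the full title.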

In [8]:
test_gold = list()
prediction = list()

for sentence_node in testset.iter('sentence'):    
    sentence_opinions = []
    for opinion_node in sentence_node.iter('Opinion'):
        target = opinion_node.get('target')
        start = int(opinion_node.get('from'))
        end = int(opinion_node.get('to'))
        # the evaluation explicitly says to discard NULL targets
        if target != 'NULL':
            sentence_opinions.append((target, start, end))
    test_gold.append(sentence_opinions)
    
    text = sentence_node.xpath('./text/text()')[0]
    text_opinions = []
    
    for m in re.finditer(targets_pattern, text, flags=re.I):
        text_opinions.append( (m.group(), m.start(), m.end()) )
    prediction.append(text_opinions)

In [10]:
data = [['Gold Standard', 'Predicted', 'Sentence']]
# collect the sentence nodes once instead of re-iterating the tree for every row
sentence_nodes = list(testset.iter('sentence'))
for sentence_node, gold, pred in list(zip(sentence_nodes, test_gold, prediction))[:100]:
    sentence = sentence_node.xpath('./text/text()')[0]
    data.append([gold, pred, sentence])

ipy_table.make_table(data)
ipy_table.set_global_style(wrap=True)
ipy_table.apply_theme('basic')

| Gold Standard | Predicted | Sentence |
|---|---|---|
| [] | [('livro', 30, 35), ('história', 48, 56)] | Está provado: Pode existir um livro bom sem uma história boa |
| [] | [('a', 40, 41), ('a', 107, 108), ('a', 121, 122)] | "Se querem mesmo ouvir o que aconteceu, a primeira coisa que vão querer saber é onde eu nasci, como passei a porcaria de a minha infância (...) e toda essa lenga-lenga...". |
| [('O Apanhador em o Campo de Centeio', 14, 47)] | [('O Apanhador em o Campo de Centeio', 14, 47), ('a', 76, 77), ('literatura', 78, 88)] | Assim começa "O Apanhador em o Campo de Centeio", um verdadeiro clássico de a literatura mundial. |
| [] | [('a', 116, 117), ('a', 148, 149), ('a', 183, 184), ('AUTOR', 198, 203)] | E essas três primeiras linhas são suficientes para termos um norte bem contextualizado de tudo quanto encontraremos a o longo de as 207 páginas que a versão brasileira, publicada por a EDITORA DE O AUTOR, trará. |
| [] | [('Apanhador', 6, 15), ('a', 39, 40), ('a', 48, 49), ('Holden Caulfield', 84, 100), ('anos', 128, 132), ('esse', 141, 145)] | Em "O Apanhador", o leitor é convidado a vestir a pele de um adolescente revoltado, Holden Caulfield, que, em o auge de seus 17 anos, conta "esse negócio doido que aconteceu em o último Natal". |
| [] | [('A', 0, 1)] | A verdade é que não tem negócio doido coisa nenhuma. |
| [] | [('A', 0, 1), ('história', 2, 10), ('a', 55, 56), ('história', 57, 65), ('literatura', 160, 170), ('este', 179, 183), ('a', 205, 206), ('autor', 246, 251), ('a', 307, 308), ('a', 353, 354)] | A história em si não tem nada de mirabolante, se é que a história despretensiosa de um adolescente rebelde que bombou em quase todas as matérias (exceto Inglês/literatura, e é em este ponto que identifico a primeira marquinha de o alter ego de o autor, J. D. Salinger), que tinha um olhar crítico para tudo a o seu redor, e ainda parecia bem perdido em a vida, soaria mirabolante ou diferente de o cotidiano de boa parte de os adolescentes de os dias de hoje. |
| [] | [('a', 33, 34)] | Mas vamos devagar com o andar de a carruagem, chegaremos lá. |
| [] | [('este', 23, 27), ('ser', 56, 59)] | Há de se ressaltar que este clássico não é tão fácil de ser encontrado. |


### Aspect-extraction Evaluation methodology

Pontiki, M., Galanis, D., Papageorgiou, H., Manandhar, S., & Androutsopoulos, I. (2015, June). Semeval-2015 task 12: Aspect based sentiment analysis. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), Association for Computational Linguistics, Denver, Colorado (pp. 486-495).

http://www.anthology.aclweb.org/S/S15/S15-2082.pdf

From 4.1 Evaluation Measures, page 491:

> Slot 2: F-1 scores are calculated by comparing the targets that a system returned (for all the sentences) to the corresponding gold targets (using micro-averaging). The targets are extracted using their starting and ending offsets. The calculation for each sentence considers only distinct targets and discards NULL targets, since they do not correspond to explicit mentions.
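
The measure described in the quote can be sketched as follows. This is a minimal illustration rather than the official SemEval scorer. It assumes each sentence is represented as a list of `(start, end)` offsets, collapses duplicates with `set()` so that only distinct targets count, and pools the counts over all sentences before dividing (micro-averaging). The function name `micro_prf` and the toy offsets are made up for this example.

```python
def micro_prf(gold, predicted):
    """Micro-averaged precision, recall and F1 over per-sentence span lists.

    Each element of `gold` and `predicted` holds the (start, end) offsets
    of one sentence. Duplicates inside a sentence are collapsed with set().
    """
    correct = n_pred = n_gold = 0
    for gold_spans, pred_spans in zip(gold, predicted):
        gold_set, pred_set = set(gold_spans), set(pred_spans)
        correct += len(gold_set & pred_set)
        n_pred += len(pred_set)
        n_gold += len(gold_set)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1

# Toy data (hypothetical offsets, not taken from ReLi).
gold = [[(30, 35)], [], [(14, 47), (14, 47)]]
predicted = [[(30, 35), (48, 56)], [(40, 41)], [(14, 47)]]

p, r, f = micro_prf(gold, predicted)
print('P={:.2f} R={:.2f} F1={:.2f}'.format(p, r, f))
```

Unlike this sketch, the cells in this notebook compare full `(target, start, end)` tuples and do not deduplicate them, so their numbers can differ from a strict reading of the quoted definition.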

In [12]:
# Micro-averaged Precision
correct = 0
total = 0
for index in range(len(list(testset.iter('sentence')))):
    correct += len([x for x in test_gold[index] if x in prediction[index]])
    total += len(prediction[index])

precision = 100 * correct / total
print('Precision: {:.2f}%'.format(precision))

Precision: 7.14%


In [13]:
# Micro-averaged Recall
correct = 0
total = 0
for index in range(len(list(testset.iter('sentence')))):
    correct += len([x for x in test_gold[index] if x in prediction[index]])
    total += len(test_gold[index])

recall = 100* correct / total
print('Recall: {:.2f}%'.format(recall))

Recall: 82.26%


In [14]:
print('F-measure: {:.2f}%'.format((2 * precision * recall) / (precision + recall)))

F-measure: 13.13%


In [16]:
# Save the prediction (optional).
# Note: this replaces the gold Opinion nodes of `testset` in place,
# so re-parse the test file before running any further evaluation.
import re
for sentence_node in testset.iter('sentence'):
    opinions_node = sentence_node.xpath('./Opinions')
    if opinions_node:
        opinions_node = opinions_node[0]
    else:
        opinions_node = etree.SubElement(sentence_node, 'Opinions')

    # drop the gold annotations before writing the predicted ones
    for opinion_node in sentence_node.xpath('./Opinions/Opinion'):
        opinions_node.remove(opinion_node)

    text = sentence_node.xpath('./text/text()')[0]
    # flags=re.I keeps the saved spans consistent with the evaluated prediction
    for m in re.finditer(targets_pattern, text, flags=re.I):
        opinion_node = etree.SubElement(opinions_node, 'Opinion')
        opinion_node.set('target', m.group())
        opinion_node.set('from', str(m.start()))
        opinion_node.set('to', str(m.end()))

etree.ElementTree(testset.getroot()).write('../corpus/pred.xml', encoding='utf8', xml_declaration=True, pretty_print=True)

## 2. All aspects in trainset removing stopwords

In [23]:
from collections import Counter
from nltk.corpus import stopwords
# build the targets, skipping Portuguese stopwords plus a few extra function words
# (a separate name avoids shadowing the imported nltk `stopwords` module)
stopword_set = set(stopwords.words('portuguese') + ['esse', 'ser', 'ele', 'isso'])
targets = Counter([opinion_node.get('target').lower()
                    for opinion_node in trainset.iter('Opinion')
                    if opinion_node.get('target') != 'NULL' and
                       opinion_node.get('target').lower() not in stopword_set
                   ])

# print the targets
import ipy_table
data = [['freq', '%freq', 'target']]
for target, freq in targets.most_common(20):
    ratio = freq / sum(targets.values()) *100
    data.append([freq, '{:.1f}%'.format(ratio), target])

ipy_table.make_table(data)
ipy_table.apply_theme('basic')

| freq | %freq | target |
|---|---|---|
| 681 | 33.9% | livro |
| 150 | 7.5% | história |
| 88 | 4.4% | leitura |
| 59 | 2.9% | personagens |
| 47 | 2.3% | crepúsculo |
| 44 | 2.2% | narrativa |
| 42 | 2.1% | romance |
| 39 | 1.9% | obra |
| 35 | 1.7% | final |
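
The stopword filter used in this section can be illustrated in isolation. The sketch below uses a tiny hand-written stopword set as a stand-in for the NLTK Portuguese list, and a hypothetical list of annotated targets, so that it runs without downloading any corpus. Both `toy_stopwords` and `annotated` are assumptions made for this example.

```python
from collections import Counter

# Tiny hand-written stopword set, standing in for nltk's Portuguese list.
toy_stopwords = {'a', 'o', 'de', 'este', 'esse', 'ser'}

# Hypothetical annotated targets, already lowercased.
annotated = ['livro', 'a', 'história', 'livro', 'este', 'a', 'leitura']

# Keep only targets that are not function words.
targets = Counter(t for t in annotated if t not in toy_stopwords)

# Targets that were dropped by the filter.
removed = sorted(set(annotated) - set(targets))

print(targets.most_common())
print(removed)
```

Function words such as `a` are the ones that produce most of the spurious matches in the prediction table of section 1, which is the motivation for filtering them out of the lexicon.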


In [49]:
import re

# function to evaluate targets in testset and return precision, recall and f-measure
def evaluate(targets):

    # Build a regex to match the targets in the text
    targets_list = sorted(list(targets))
    targets_list.sort(key=len, reverse=True)
    targets_pattern = r'\b(' + '|'.join([re.escape(t) for t in targets_list]) + r')\b'

    test_gold = list()
    prediction = list()

    for sentence_node in testset.iter('sentence'):    
        sentence_opinions = []
        for opinion_node in sentence_node.iter('Opinion'):
            target = opinion_node.get('target')
            start = int(opinion_node.get('from'))
            end = int(opinion_node.get('to'))
            # the evaluation explicitly says to discard NULL targets
            if target != 'NULL':
                sentence_opinions.append((target, start, end))
        test_gold.append(sentence_opinions)

        text = sentence_node.xpath('./text/text()')[0]
        text_opinions = []

        for m in re.finditer(targets_pattern, text, flags=re.I):
            text_opinions.append( (m.group(), m.start(), m.end()) )
        prediction.append(text_opinions)
        
    # Micro-averaged Precision
    correct = 0
    total = 0
    for index in range(len(list(testset.iter('sentence')))):
        correct += len([x for x in test_gold[index] if x in prediction[index]])
        total += len(prediction[index])

    precision = 100 * correct / total    

    # Micro-averaged Recall
    correct = 0
    total = 0
    for index in range(len(list(testset.iter('sentence')))):
        correct += len([x for x in test_gold[index] if x in prediction[index]])
        total += len(test_gold[index])

    recall = 100* correct / total
    
    # F-measure
    if precision + recall != 0:
        fmeasure = (2 * precision * recall) / (precision + recall)
    else:
        fmeasure = 0
    
    return (precision, recall, fmeasure)
    
  

In [47]:
precision, recall, fmeasure = evaluate(targets)
print('Precision: {:.2f}%'.format(precision))
print('Recall: {:.2f}%'.format(recall))
print('F-measure: {:.2f}%'.format(fmeasure))

Precision: 7.14%
Recall: 82.26%
F-measure: 13.13%


## 3. All aspects in trainset with a cut in frequency

In [60]:
# build the targets
targets = Counter([opinion_node.get('target').lower()
                    for opinion_node in trainset.iter('Opinion')
                    if opinion_node.get('target') != 'NULL'])

data = [['cut', 'number of targets', 'precision', 'recall', 'f-measure']]
for min_freq in range(0,11,1):
    target_list = [target for target, freq in targets.items() if freq/sum(targets.values()) >= min_freq/100]
    precision, recall, fmeasure = evaluate(target_list)
    data.append(['{:.1f}%'.format(min_freq), 
                 len(target_list),
                 '{:.2f}%'.format(precision), 
                 '{:.2f}%'.format(recall), 
                 '{:.2f}%'.format(fmeasure)])

ipy_table.make_table(data)
ipy_table.apply_theme('basic')

| cut | number of targets | precision | recall | f-measure |
|---|---|---|---|---|
| 0.0% | 436 | 7.14% | 82.26% | 13.13% |
| 1.0% | 15 | 25.66% | 64.95% | 36.79% |
| 2.0% | 7 | 30.27% | 55.51% | 39.17% |
| 3.0% | 3 | 31.39% | 45.35% | 37.10% |
| 4.0% | 3 | 31.39% | 45.35% | 37.10% |
| 5.0% | 2 | 31.85% | 41.92% | 36.20% |
| 6.0% | 2 | 31.85% | 41.92% | 36.20% |
| 7.0% | 2 | 31.85% | 41.92% | 36.20% |
| 8.0% | 1 | 33.57% | 33.62% | 33.60% |
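
The frequency cut in this section keeps a target only if its share of all annotated targets reaches a minimum percentage. A minimal sketch of that filter, using hypothetical counts (`toy_counts`) and a helper name (`frequent_targets`) that does not appear in the notebook:

```python
from collections import Counter

def frequent_targets(counts, min_percent):
    """Keep targets whose share of all annotations is at least min_percent."""
    total = sum(counts.values())
    # Cross-multiplied integer comparison, equivalent to freq / total >= min_percent / 100.
    return sorted(t for t, freq in counts.items()
                  if freq * 100 >= min_percent * total)

# Hypothetical counts, not the real ReLi distribution.
toy_counts = Counter({'livro': 60, 'história': 25, 'leitura': 10, 'capa': 5})

for cut in (0, 10, 30):
    print(cut, frequent_targets(toy_counts, cut))
```

Raising the cut shrinks the lexicon quickly, which matches the trade-off in the table: precision rises as rare targets are dropped, while recall falls.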


## 4. All aspects with relative frequency
 
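A token is kept as a target only if the ratio between the number of times it is annotated as an aspect and the number of times it occurs in the text reaches a given percentage.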
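
The idea can be sketched on toy data as follows. The sketch assumes single-token targets and a pre-tokenized text, whereas the cell below counts occurrences with a substring search over the space-joined words so that multi-word targets are also covered. The helper `aspect_ratio` and all data here are hypothetical.

```python
from collections import Counter

def aspect_ratio(annotated_counts, tokens):
    """Share of each token's occurrences that are annotated as an aspect."""
    occurrences = Counter(tokens)
    return {t: annotated_counts[t] / occurrences[t]
            for t in annotated_counts if occurrences[t]}

# Hypothetical tokenized text and aspect annotation counts.
tokens = 'a história de o livro é a melhor parte de o livro'.split()
annotated = Counter({'livro': 2, 'história': 1, 'a': 1})

ratios = aspect_ratio(annotated, tokens)

# Keep tokens annotated as an aspect in at least 75% of their occurrences.
kept = sorted(t for t, ratio in ratios.items() if ratio >= 0.75)

print(ratios)
print(kept)
```

Here `a` occurs twice but is annotated only once, so its ratio of 0.5 falls below the threshold, while `livro` and `história` are always annotated and stay in the lexicon.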

In [64]:
sentences = ' '.join([node.get('form').lower() for node in testset.iter('word')])
freqlist =  Counter([node.get('target').lower() for node in testset.iter('Opinion')])

# build the targets
targets = Counter([opinion_node.get('target').lower()
                    for opinion_node in trainset.iter('Opinion')
                    if opinion_node.get('target') != 'NULL'])

data = [['cut', 'number of targets', 'precision', 'recall', 'f-measure']]
for min_freq in range(0,100,5):
    
    target_list = [target for target, freq in targets.items() if freq/max(sentences.count(' ' + target + ' '),0.00001) >= min_freq/100]
        
    precision, recall, fmeasure = evaluate(target_list)
    data.append(['{:.1f}%'.format(min_freq), 
                 len(target_list),
                 '{:.2f}%'.format(precision), 
                 '{:.2f}%'.format(recall), 
                 '{:.2f}%'.format(fmeasure)])

ipy_table.make_table(data)
ipy_table.apply_theme('basic')

| cut | number of targets | precision | recall | f-measure |
|---|---|---|---|---|
| 0.0% | 436 | 7.14% | 82.26% | 13.13% |
| 5.0% | 406 | 16.65% | 80.11% | 27.57% |
| 10.0% | 390 | 19.03% | 78.25% | 30.61% |
| 15.0% | 373 | 21.94% | 74.82% | 33.93% |
| 20.0% | 367 | 22.74% | 74.11% | 34.80% |
| 25.0% | 344 | 25.34% | 70.24% | 37.24% |
| 30.0% | 340 | 25.71% | 70.24% | 37.64% |
| 35.0% | 320 | 26.25% | 68.53% | 37.96% |
| 40.0% | 318 | 26.57% | 67.95% | 38.20% |
