## Baseline - Frequent Aspect Extraction
This baseline is an adaptation of the baseline used for opinion target extraction (OTE) in the 2014, 2015 and 2016 editions of the SemEval Aspect-Based Sentiment Analysis task.

Sources:

Pontiki, M., Galanis, D., Pavlopoulos, J., Papageorgiou, H., Androutsopoulos, I., & Manandhar, S. (2014, August). Semeval-2014 task 4: Aspect based sentiment analysis. In Proceedings of the 8th international workshop on semantic evaluation (SemEval 2014) (pp. 27-35).

Pontiki, M., Galanis, D., Papageorgiou, H., Manandhar, S., & Androutsopoulos, I. (2015, June). Semeval-2015 task 12: Aspect based sentiment analysis. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), Association for Computational Linguistics, Denver, Colorado (pp. 486-495).

In [1]:
from __future__ import unicode_literals
from __future__ import division
from __future__ import print_function

In [2]:
from lxml import etree
parser = etree.XMLParser(remove_blank_text=True)
reviews = etree.parse('../corpus/SemEval_ABSA2015/SemEvalABSA2015EnglishRestaurants_train.xml', parser)

In [3]:
from collections import Counter
targets = Counter([opinion_node.get('target')
                    for opinion_node in reviews.iter('Opinion') 
                    if opinion_node.get('target') != 'NULL'
                   ])

In [4]:
import ipy_table

data = [['freq', '%freq', 'target']]
for target, freq in targets.most_common(15):
    ratio = freq / sum(targets.values()) *100
    data.append([freq, '{:.1f}%'.format(ratio), target])

ipy_table.make_table(data)
ipy_table.apply_theme('basic')

| freq | %freq | target |
|---|---|---|
| 146 | 11.4% | food |
| 96 | 7.5% | service |
| 80 | 6.3% | place |
| 29 | 2.3% | restaurant |
| 26 | 2.0% | staff |
| 21 | 1.6% | Service |
| 20 | 1.6% | pizza |
| 20 | 1.6% | atmosphere |
| 19 | 1.5% | sushi |


In [5]:
# Build a regex to match the targets in the text
import re
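# Sort alphabetically, then longest first (the sort is stable), so that in the
# regex alternation a multi-word target is tried before any shorter target it starts with
targets_list = sorted(targets)
targets_list.sort(key=len, reverse=True)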
targets_pattern = r'\b(' + '|'.join([re.escape(t) for t in targets_list]) + r')\b'
len(targets_list)

526
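
The longest-first ordering matters because Python's `re` tries the alternatives of a group from left to right and keeps the first one that succeeds. The following is a minimal sketch of that effect on a toy target list; `toy_targets` and `build_pattern` are hypothetical names introduced for illustration and are not part of the notebook.

```python
import re

# Hypothetical toy targets, not taken from the corpus
toy_targets = ['sushi', 'sushi rolls', 'service']

def build_pattern(targets, longest_first=True):
    """Build a word-bounded alternation, ordered by target length."""
    ordered = sorted(targets, key=len, reverse=longest_first)
    return r'\b(' + '|'.join(re.escape(t) for t in ordered) + r')\b'

text = 'The sushi rolls were great.'

# Longest first: the multi-word target is tried before its shorter prefix
longest = [m.group() for m in re.finditer(build_pattern(toy_targets), text)]

# Shortest first: 'sushi' succeeds first, so 'sushi rolls' is never matched
shortest = [m.group() for m in re.finditer(build_pattern(toy_targets, longest_first=False), text)]

print(longest)
print(shortest)
```

With the shortest-first pattern the extracted span ends too early, so its offsets would not match a gold `sushi rolls` annotation.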

In [6]:
reviews = etree.parse('../corpus/SemEval_ABSA2015/SemEvalABSA2015EnglishRestaurants_test.xml', parser)

In [7]:
test_gold = list()
prediction = list()

for sentence_node in reviews.iter('sentence'):    
    sentence_opinions = []
    for opinion_node in sentence_node.xpath('./Opinions/Opinion'):
        target = opinion_node.get('target')
        start = int(opinion_node.get('from'))
        end = int(opinion_node.get('to'))
        # the evaluation explicitly says to discard NULL targets
        if target != 'NULL':
            sentence_opinions.append((target, start, end))
    test_gold.append(sentence_opinions)
    
    text = sentence_node.xpath('./text/text()')[0]
    text_opinions = []
    
    for m in re.finditer(targets_pattern, text):
        text_opinions.append( (m.group(), m.start(), m.end()) )
    prediction.append(text_opinions)

In [8]:
len(test_gold)

685

In [9]:
len(prediction)

685

In [10]:
list(zip(test_gold, prediction))[1]

([('place', 17, 22)], [('place', 17, 22)])

### Evaluation methodology

Pontiki, M., Galanis, D., Papageorgiou, H., Manandhar, S., & Androutsopoulos, I. (2015, June). Semeval-2015 task 12: Aspect based sentiment analysis. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), Association for Computational Linguistics, Denver, Colorado (pp. 486-495).

http://www.anthology.aclweb.org/S/S15/S15-2082.pdf

From 4.1 Evaluation Measures, page 491:

> Slot 2: F-1 scores are calculated by comparing the targets that a system returned (for all the sentences) to the corresponding gold targets (using micro-averaging). The targets are extracted using their starting and ending offsets. The calculation for each sentence considers only distinct targets and discards NULL targets, since they do not correspond to explicit mentions.

Task baseline F-1 score (Table 5): **48.06**
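
One way to read the quoted methodology is sketched below: each sentence contributes the set of its distinct `(start, end)` spans, and the counts are summed over all sentences before computing precision, recall and F-1 (micro-averaging). This is an illustrative sketch, not the official SemEval evaluation script; `micro_prf` and the toy data are hypothetical. Because it deduplicates spans per sentence, it is not guaranteed to reproduce the figures computed in the cells of this notebook, which count tuples in the gold lists directly.

```python
from __future__ import division

def micro_prf(gold, predicted):
    """Micro-averaged precision, recall and F-1 over per-sentence span lists.

    gold, predicted: lists with one entry per sentence, each entry a list
    of (target, start, end) tuples with NULL targets already removed.
    """
    correct = n_pred = n_gold = 0
    for gold_sent, pred_sent in zip(gold, predicted):
        # Compare by offsets and keep only distinct spans in each sentence
        gold_spans = {(start, end) for _, start, end in gold_sent}
        pred_spans = {(start, end) for _, start, end in pred_sent}
        correct += len(gold_spans & pred_spans)
        n_pred += len(pred_spans)
        n_gold += len(gold_spans)
    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

# Toy data: the first gold sentence repeats the same span twice
toy_gold = [[('place', 17, 22), ('place', 17, 22)], [('food', 4, 8)]]
toy_pred = [[('place', 17, 22)], [('food', 4, 8), ('service', 20, 27)]]
p, r, f = micro_prf(toy_gold, toy_pred)
print('P={:.2f} R={:.2f} F1={:.2f}'.format(p, r, f))
```

In the toy data the repeated gold span counts once, so two of three predicted spans are correct and both distinct gold spans are found.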

In [11]:
# Micro-averaged Precision
correct = 0
total = 0
for index in range(len(test_gold)):
    correct += len([x for x in test_gold[index] if x in prediction[index]])
    total += len(prediction[index])

precision = 100 * correct / total
print('Precision: {:.2f}%'.format(precision))

Precision: 51.93%


In [12]:
# Micro-averaged Recall
correct = 0
total = 0
for index in range(len(test_gold)):
    correct += len([x for x in test_gold[index] if x in prediction[index]])
    total += len(test_gold[index])

recall = 100* correct / total
print('Recall: {:.2f}%'.format(recall))

Recall: 56.28%


In [13]:
print('F-measure: {:.2f}%'.format((2 * precision * recall) / (precision + recall)))

F-measure: 54.02%
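
As a check on the arithmetic, the F-measure is the harmonic mean of precision $P$ and recall $R$. Plugging in the rounded percentages printed above (so the result is approximate):

$$
F_1 = \frac{2 \cdot P \cdot R}{P + R} = \frac{2 \cdot 51.93 \cdot 56.28}{51.93 + 56.28} = \frac{5845.24}{108.21} \approx 54.02
$$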


In [14]:
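# Replace the gold opinions of each test sentence with the predicted targets
# and write the resulting tree to disk
import re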
for sentence_node in reviews.iter('sentence'):
    sentence_opinions = []
    opinions_node = sentence_node.xpath('./Opinions')
    if opinions_node:
        opinions_node = opinions_node[0]
    else:
        opinions_node = etree.SubElement(sentence_node, 'Opinions')
        
    for opinion_node in sentence_node.xpath('./Opinions/Opinion'):
        opinions_node.remove(opinion_node)
    
    text = sentence_node.xpath('./text/text()')[0]
    for m in re.finditer(targets_pattern, text):
        opinion_node = etree.SubElement(opinions_node, 'Opinion')
        opinion_node.set('target', m.group())
        opinion_node.set('from', str(m.start()))
        opinion_node.set('to', str(m.end()))
        
etree.ElementTree(reviews.getroot()).write('../corpus/pred.xml', encoding='utf8', xml_declaration=True, pretty_print=True)