# 12. Semantics 2: common tasks in computational semantics

### 12.1 [Overview](#12.1)

### 12.2 [Relation extraction](#12.2)

### 12.3 [Sentiment analysis](#12.3)

### 12.4 [Question answering](#12.4)

## 12.1 Overview
<a id='12.1'></a>

Three common tasks in computational semantics:

 - Relation extraction (RE) - obtaining structured information from language data. E.g. parsing CVs to build a database for recruitment

 - Sentiment analysis (SA) - detecting attitudes and opinions in text, e.g. user reviews of movies, products, etc.

 - Question answering (QA) - detecting a particular information need in user input and fulfilling it based on some text

## 12.2 Relation extraction
<a id='12.2'></a>

Relation extraction means obtaining structured information (entities and the relations between them) from language data.

RE systems typically target a particular domain, e.g.:

   - professional profile information based on CVs, LinkedIn pages

   - product specifications based on e-commerce sites (webshops), manufacturers' websites

   - stock price information based on news articles

   - etc.

### Example

_"Gryffindor values courage, bravery, nerve, and chivalry. Gryffindor's mascot is the lion, and its colours are scarlet and gold. The Head of this house is the Transfiguration teacher and Deputy Headmistress, Minerva McGonagall until she becomes headmistress, and the house ghost is Sir Nicholas de Mimsy-Porpington, more commonly known as Nearly Headless Nick. According to Rowling, Gryffindor corresponds roughly to the element of fire. The founder of the house is Godric Gryffindor."_

- values(Gryffindor, courage)
- mascot(Gryffindor, lion)
- color(Gryffindor, scarlet)
- head(Gryffindor, Minerva_McGonagall)
- house_ghost(Gryffindor, Sir_Nicholas_de_Mimsy-Porpington)
- founder(Gryffindor, Godric_Gryffindor)

### Rule-based approaches

Templates:
- _X dropped by Y points_ -> drop_by(X, Y)
- _X, CEO of Y_ -> ceo_of(X, Y)
- _X was born in Y_ -> born_in(X, Y)

If parsers, NER taggers, or chunkers are available, templates can refer to their output:

- X_NP _dropped by_ Y_NUM _points_ -> drop_by(X, Y)

or

- X_PERSON, CEO _of_ Y_ORGANIZATION -> ceo_of(X, Y)
- X_PERSON _was born in_ Y_LOCATION -> born_in(X, Y)
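
The templates above can be sketched as regular expressions. This is a minimal illustration, not a production system: the `NAME` pattern (a run of capitalised words) is a crude, assumed stand-in for real NER output, and `TEMPLATES` and `extract_relations` are hypothetical names.

```python
import re

# Crude stand-in for an entity mention: one or more capitalised words.
# A real system would use NER/chunker output instead (assumption).
NAME = r"[A-Z]\w*(?: [A-Z]\w*)*"

# Each template pairs a surface pattern with the relation it signals.
TEMPLATES = [
    (re.compile(rf"({NAME}), CEO of ({NAME})"), "ceo_of"),
    (re.compile(rf"({NAME}) was born in ({NAME})"), "born_in"),
    (re.compile(rf"({NAME}) dropped by (\d+(?:\.\d+)?) points"), "drop_by"),
]


def extract_relations(text):
    """Return (relation, X, Y) triples for every template match in text."""
    triples = []
    for pattern, relation in TEMPLATES:
        for match in pattern.finditer(text):
            triples.append((relation, match.group(1), match.group(2)))
    return triples


text = "Tim Cook, CEO of Apple, spoke on Monday. Ada Lovelace was born in London."
print(extract_relations(text))
# [('ceo_of', 'Tim Cook', 'Apple'), ('born_in', 'Ada Lovelace', 'London')]
```

Any paraphrase that no template covers (e.g. _Apple's chief executive, Tim Cook_) is silently missed, which is the low-recall problem listed below.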

#### Pros:

- simple and effective, yields fast results

- high precision

#### Cons:

- low recall

- limited, no capacity for generalization

- real-life systems may contain thousands of templates, and require continuous development by experts

- many companies still depend on such systems

### Supervised learning

Use parsed and annotated text to train text classifiers

E.g. decide for each pair of named entities (PERSON and ORGANIZATION) whether they are in the "ceo_of" relationship, based on context features

Features typically include:

- headwords

- word/POS ngrams with position information

- NER/Chunk tags, Chunk sequences

- Paths in parse trees among candidates (e.g. N - NP - S - VP - NP - N)

- gazetteer features: whether words/phrases appear on an external list of known entities
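
As a rough sketch of how some of these features might be computed for one candidate entity pair: the function name, the feature names, the hand-written POS tags and the tiny gazetteer are all assumptions made for illustration, and parse-tree paths are left out because they need a parser.

```python
def pair_features(tokens, tags, e1, e2, gazetteer):
    """Context features for a candidate entity pair.

    tokens and tags are parallel lists; e1 and e2 are (start, end) token
    spans with e1 before e2. Feature names are illustrative only.
    """
    between = list(range(e1[1], e2[0]))
    feats = {
        # headword approximated as the last token of each mention
        "head1=" + tokens[e1[1] - 1].lower(): 1,
        "head2=" + tokens[e2[1] - 1].lower(): 1,
        # number of tokens separating the two mentions
        "token_distance": len(between),
    }
    # bag of words between the two mentions
    for i in between:
        feats["between_word=" + tokens[i].lower()] = 1
    # POS n-gram covering the whole span between the mentions
    feats["between_pos=" + "_".join(tags[i] for i in between)] = 1
    # gazetteer feature: is the second mention on a list of known entities?
    feats["e2_in_gazetteer"] = int(" ".join(tokens[e2[0]:e2[1]]) in gazetteer)
    return feats


tokens = ["Jane", "Doe", ",", "CEO", "of", "Acme", "Corp", ",", "resigned", "."]
tags = ["NNP", "NNP", ",", "NN", "IN", "NNP", "NNP", ",", "VBD", "."]
features = pair_features(tokens, tags, (0, 2), (5, 7), {"Acme Corp"})
print(features)
```

The resulting dictionary can be fed to any standard classifier that accepts sparse feature vectors.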

#### Pros:

- effective if a large training sample is available and the target texts are similar to the training data

#### Cons: 

- requires a fair amount of annotated data (costly to produce)

- doesn't generalize well across genres, domains

### Semi-supervised / unsupervised approaches

When little or no training data is available, we must use what we have to generalize:

- a few annotated examples -> some patterns -> more examples -> more patterns

- a few patterns -> some examples -> more patterns -> more examples

### Example

seed tuple: author(William_Shakespeare, Hamlet) 

found instances:
- _William Shakespeare's Hamlet_
- _the William Shakespeare play Hamlet_
- _Hamlet by William Shakespeare_
- _Hamlet is a tragedy written by William Shakespeare_

extracted patterns:
- X's Y
- the X play Y
- Y by X
- Y is a tragedy written by X

Finally, use these patterns to find new seeds
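
The loop can be sketched in a few lines. This is a toy version under strong assumptions: patterns are whole sentences with the seed arguments replaced by `X` and `Y`, entity mentions are approximated by runs of capitalised words, and the extra corpus sentences and all function names are invented for the example.

```python
import re

NAME = r"[A-Z]\w*(?: [A-Z]\w*)*"  # crude stand-in for an entity mention


def induce_patterns(seed, sentences):
    """Turn each sentence containing both seed arguments into an X/Y pattern."""
    x, y = seed
    return {s.replace(x, "X").replace(y, "Y")
            for s in sentences if x in s and y in s}


def pattern_to_regex(pattern):
    """Compile an X/Y pattern into a regex with named groups X and Y."""
    regex = ""
    for part in re.split(r"\b([XY])\b", pattern):
        regex += f"(?P<{part}>{NAME})" if part in ("X", "Y") else re.escape(part)
    return re.compile(regex)


def bootstrap(seeds, sentences, rounds=2):
    """Alternate between inducing patterns from tuples and tuples from patterns."""
    tuples, patterns = set(seeds), set()
    for _ in range(rounds):
        for seed in list(tuples):
            patterns |= induce_patterns(seed, sentences)
        for pattern in patterns:
            regex = pattern_to_regex(pattern)
            for sentence in sentences:
                match = regex.search(sentence)
                if match:
                    tuples.add((match.group("X"), match.group("Y")))
    return tuples, patterns


corpus = [
    "William Shakespeare's Hamlet",
    "the William Shakespeare play Hamlet",
    "Hamlet by William Shakespeare",
    "Hamlet is a tragedy written by William Shakespeare",
    "Emma by Jane Austen",
    "Jane Austen's Persuasion",
    "the Arthur Miller play The Crucible",
]
tuples, patterns = bootstrap({("William Shakespeare", "Hamlet")}, corpus)
print(sorted(tuples))
```

Very general patterns such as _X's Y_ also match unrelated pairs in real text, so practical systems usually score patterns and tuples and keep only the confident ones.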

## 12.3 Sentiment analysis
<a id='12.3'></a>

Also called __opinion mining__ - the task of extracting opinions, emotions, attitudes from user-generated text, e.g. about __products__, __movies__, or __politics__

- Simplest version: is the attitude of a text positive or negative?

- More complex: measure attitude on a scale (e.g. from 1 to 5)

- Most complex: detect __target__ of opinion (e.g. what product is it about) or __aspect__ (e.g. is it about the price, looks, or quality of a product)

### Baseline approach:

Use training data, extract standard features such as:

- bag-of-words

- ngrams

- emoticons

- numbers, dates

- gazetteer features (based on _Sentiment lexicons_)

Use these to train standard classifiers, e.g. Naive Bayes, SVM, MaxEnt, etc.
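
A minimal sketch of such a baseline, assuming bag-of-words features and a multinomial Naive Bayes classifier with add-one smoothing. The four training reviews are made up, and `train_nb` / `predict_nb` are hypothetical names, not a library API.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs):
    """Collect class and per-class word counts from (text, label) pairs."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        class_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the label with the highest log posterior for the text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    scores = {}
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


train = [
    ("great movie loved it", "pos"),
    ("wonderful acting great plot", "pos"),
    ("terrible movie hated it", "neg"),
    ("boring plot awful acting", "neg"),
]
model = train_nb(train)
print(predict_nb(model, "great acting"))        # pos
print(predict_nb(model, "awful boring movie"))  # neg
```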

### Example

![title](media/sa.jpg)

### Advanced techniques

- use semi-supervised methods to learn sentiment lexicons

- model negation explicitly
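
One common heuristic for modelling negation is to prefix every token between a negation word and the next punctuation mark with `NOT_`, so that _like_ and _NOT_like_ become different features. The sketch below assumes a small hand-picked negator list and a simple regex tokenizer; both are illustrative choices.

```python
import re

NEGATORS = {"not", "no", "never", "cannot"}  # illustrative, not exhaustive
PUNCTUATION = set(".,!?;:")


def mark_negation(text):
    """Prefix tokens in the scope of a negation with NOT_."""
    marked, negated = [], False
    for token in re.findall(r"[\w']+|[.,!?;:]", text.lower()):
        if token in PUNCTUATION:
            negated = False          # punctuation closes the negation scope
            marked.append(token)
        elif token in NEGATORS or token.endswith("n't"):
            negated = True           # negation word opens a scope
            marked.append(token)
        else:
            marked.append("NOT_" + token if negated else token)
    return marked


print(mark_negation("I didn't like this movie, but the music was good."))
# ['i', "didn't", 'NOT_like', 'NOT_this', 'NOT_movie', ',', 'but', 'the',
#  'music', 'was', 'good', '.']
```

The marked tokens can then be used in place of the plain bag-of-words features.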

## 12.4 Question answering
<a id='12.4'></a>

One of the oldest and most popular tasks in AI

Recent products include Apple's Siri, Amazon's Alexa, and IBM's Watson

[Watch IBM's Watson win the game show Jeopardy](https://www.youtube.com/watch?v=WFR3lOm_xhE)

## Major approaches:

   - IR-based: handle questions as search queries (e.g. Watson, Google)

![title](media/google_qa.jpg)

   - Knowledge-based: convert question into a Relation Extraction task (e.g. Watson, Siri, Wolfram Alpha)

![title](media/wolfram_qa.jpg)

## IR-based approaches:

- detect question type

- generate search queries from questions

- retrieve ranked documents

- extract relevant passages, rerank

- extract answer candidates

- rank answers
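
The retrieval step of this pipeline can be illustrated with a toy ranker. It is only a sketch of the "retrieve ranked documents" step, assuming a simple TF-IDF keyword-overlap score with smoothed IDF; the three passages and the function names are invented for the example, and real systems use a full search engine.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"\w+", text.lower())


def rank_passages(query, passages):
    """Rank passages by the summed TF-IDF weight of the query terms."""
    docs = [Counter(tokenize(p)) for p in passages]
    n = len(docs)

    def idf(term):
        df = sum(1 for d in docs if term in d)
        return math.log((n + 1) / (df + 1)) + 1  # smoothed IDF

    terms = tokenize(query)
    scores = [sum(d[t] * idf(t) for t in terms) for d in docs]
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    return [(passages[i], scores[i]) for i in order]


passages = [
    "Florida is bordered to the north by Alabama and Georgia.",
    "Florida is known for its beaches and theme parks.",
    "Texas borders four other states.",
]
ranking = rank_passages("states border Florida north", passages)
for passage, score in ranking:
    print(f"{score:.2f}  {passage}")
```

Note that _border_ matches neither _bordered_ nor _borders_ here; stemming or lemmatization would normally be applied first.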

### Question processing:

_They’re the two states you could be reentering if you’re crossing
Florida’s northern border_

- Answer Type: US state
- Query: two states, border, Florida, north
- Focus: the two states
- Relations: borders(Florida, ?x, north)

### Answer type detection

![title](media/answer_types.jpg)

Supervised learning can be used to train a classifier on annotated data. See also [Li & Roth 2002](https://dl.acm.org/citation.cfm?id=1072378)

### Keyword extraction

![title](media/keyword_extraction.jpg)

From a slide by [Mihai Surdeanu](http://www.surdeanu.info/mihai/)

### Answer extraction

- matching question and answer type

- this yields several answer candidates that need to be ranked

### Answer ranking

Some features for learning to rank:

- Answer type match:  Candidate contains a phrase with the correct answer type.
- Pattern match: Regular expression pattern matches the candidate.
- Question keywords: # of question keywords in the candidate.
- Keyword distance: Distance in words between the candidate and query keywords.
- Novelty factor: A word in the candidate is not in the query.
- ...

Source: [Jurafsky-Manning slides](http://spark-public.s3.amazonaws.com/nlp/slides/qa.pdf)
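
A rough sketch of how a few of these features might be computed for one candidate passage. It assumes the answer type is checked against a small single-word gazetteer (so multi-word names such as _New York_ would be missed); the function name, feature names, and example data are illustrative.

```python
import re


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"\w+", text.lower())


def candidate_features(candidate, keywords, type_gazetteer, pattern):
    """Compute simple ranking features for one candidate passage."""
    tokens = tokenize(candidate)
    keyword_set = {k.lower() for k in keywords}
    # positions of tokens that have the expected answer type
    type_pos = [i for i, t in enumerate(tokens) if t in type_gazetteer]
    # positions of tokens that are query keywords
    kw_pos = [i for i, t in enumerate(tokens) if t in keyword_set]
    distances = [abs(i - j) for i in type_pos for j in kw_pos]
    return {
        "answer_type_match": bool(type_pos),
        "pattern_match": bool(re.search(pattern, candidate, re.IGNORECASE)),
        "keyword_count": len({tokens[i] for i in kw_pos}),
        "keyword_distance": min(distances) if distances else None,
        "novelty": any(t not in keyword_set for t in tokens),
    }


keywords = ["states", "border", "Florida", "north"]
us_states = {"georgia", "alabama"}  # tiny illustrative gazetteer
good = candidate_features(
    "Georgia and Alabama border Florida to the north",
    keywords, us_states, r"border\w* Florida")
weak = candidate_features(
    "Florida is a state in the south",
    keywords, us_states, r"border\w* Florida")
print(good)
print(weak)
```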

Approaches to ranking can be:
- pointwise
- pairwise
- listwise

See [Agarwal et al. 2012](http://www.prem-melville.com/publications/ranking-Watson-QA-cikm2012.pdf) for a short survey of algorithms.
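
To make the difference between the first two concrete: a pointwise ranker scores each candidate on its own, while a pairwise ranker is trained on pairs of candidates for the same question and learns which of the two should rank higher. The sketch below shows one usual way to build pairwise training examples from feature-difference vectors; the feature values and weights are made up, and no particular learning algorithm is implied.

```python
def pointwise_rank(candidates, weights):
    """Score each candidate independently with a linear model and sort."""
    def score(features):
        return sum(w * f for w, f in zip(weights, features))
    return sorted(candidates, key=lambda c: score(c[0]), reverse=True)


def pairwise_examples(candidates):
    """Build (difference_vector, label) examples for a binary classifier.

    candidates is a list of (feature_vector, is_correct) for ONE question.
    The label is 1 if the first candidate of the pair should rank higher.
    """
    examples = []
    for feats_a, label_a in candidates:
        for feats_b, label_b in candidates:
            if label_a != label_b:  # only pairs with a clear preference
                diff = [a - b for a, b in zip(feats_a, feats_b)]
                examples.append((diff, 1 if label_a > label_b else 0))
    return examples


# (feature_vector, is_correct) with invented feature values
candidates = [([1.0, 3.0], 1), ([0.0, 1.0], 0), ([1.0, 0.0], 0)]
pairs = pairwise_examples(candidates)
print(pairs)
print(pointwise_rank(candidates, weights=[0.5, 1.0])[0])
```

Listwise methods go one step further and optimise a loss defined over the whole ranked list of candidates for a question.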

## Knowledge-based approaches:

- Create semantic representation of query (understand what is being asked!)

_Whose granddaughter starred in E.T.?_

  (acted-in ?x “E.T.”)
  
  (granddaughter-of ?x ?y)

- query relevant databases and ontologies
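
The query above can be answered against a small fact store by matching each pattern in turn and carrying variable bindings along. This is a minimal sketch: the three-fact knowledge base and the function names are assumptions for illustration, and a real system would query a large database or ontology (for example via SPARQL).

```python
# Tiny illustrative knowledge base of (relation, subject, object) facts.
FACTS = [
    ("acted-in", "Drew Barrymore", "E.T."),
    ("acted-in", "Henry Thomas", "E.T."),
    ("granddaughter-of", "Drew Barrymore", "John Barrymore"),
]


def match(pattern, fact, bindings):
    """Unify one query pattern with one fact; return new bindings or None."""
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # bind the variable, or check it agrees with an earlier binding
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def solve(patterns, bindings=None):
    """Yield every set of bindings that satisfies all patterns."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    first, rest = patterns[0], patterns[1:]
    for fact in FACTS:
        extended = match(first, fact, bindings)
        if extended is not None:
            yield from solve(rest, extended)


# Whose granddaughter starred in E.T.?
query = [("acted-in", "?x", "E.T."), ("granddaughter-of", "?x", "?y")]
answers = [b["?y"] for b in solve(query)]
print(answers)  # ['John Barrymore']
```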

## Hybrid systems

- candidate answers are generated with IR-based methods, using shallow semantic representations

- answers are reranked using knowledge-based methods