# Prologue: formal semantics and natural languages

The traditional list of linguistic analysis steps does not stop at syntactic analysis: we are typically interested in the syntactic structure of a text's sentences because it can be used to get to their __semantic content__ (meaning).

## Formal semantics for formal languages

Starting with Gottlob Frege, the analytic philosophical tradition assumed a strong connection between formal logic and semantics. The central assumption was that

+ to know the meaning of a proposition is to know its truth conditions (the circumstances under which it would be true).

The work of Frege, Tarski and others demonstrated that, for the "well-behaved" formal languages studied by mathematical logic and model theory, these truth conditions can be calculated recursively on the basis of the syntactic structure of a well-formed sentence or formula. Again for well-behaved formal languages, precise mathematical definitions can be given for the semantic and logical notions of
+ truth,
+ consequence,
+ consistency,
+ inconsistency.
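
The recursive calculation of truth conditions can be illustrated with a minimal Python sketch for a toy propositional language. The tuple encoding of formulas and the function names (`evaluate`, `is_consistent`, `is_consequence`) are assumptions made for this illustration only; consequence and consistency are checked by brute-force enumeration of valuations, which works only because the language is propositional.

```python
from itertools import product

# Formulas are nested tuples (an encoding assumed for this sketch):
#   ("atom", name), ("not", f), ("and", f, g), ("or", f, g)

def evaluate(formula, valuation):
    """Compute the truth value recursively from the syntactic structure."""
    op = formula[0]
    if op == "atom":
        return valuation[formula[1]]
    if op == "not":
        return not evaluate(formula[1], valuation)
    if op == "and":
        return evaluate(formula[1], valuation) and evaluate(formula[2], valuation)
    if op == "or":
        return evaluate(formula[1], valuation) or evaluate(formula[2], valuation)
    raise ValueError(f"unknown operator: {op}")

def atoms(formula):
    """Collect the names of all atomic propositions in a formula."""
    if formula[0] == "atom":
        return {formula[1]}
    return set().union(*(atoms(sub) for sub in formula[1:]))

def valuations(formulas):
    """Enumerate every truth-value assignment to the atoms of the formulas."""
    names = sorted(set().union(*(atoms(f) for f in formulas)))
    for values in product([True, False], repeat=len(names)):
        yield dict(zip(names, values))

def is_consistent(formulas):
    """A set is consistent if some valuation makes all its members true."""
    return any(all(evaluate(f, v) for f in formulas) for v in valuations(formulas))

def is_consequence(premises, conclusion):
    """The conclusion must be true in every valuation that verifies the premises."""
    return all(
        evaluate(conclusion, v)
        for v in valuations(premises + [conclusion])
        if all(evaluate(f, v) for f in premises)
    )

p, q = ("atom", "p"), ("atom", "q")
print(is_consequence([("or", p, q), ("not", p)], q))  # disjunctive syllogism
print(is_consistent([p, ("not", p)]))                 # a contradiction
```

Inconsistency is simply the negation of `is_consistent` here; for first-order languages no such exhaustive enumeration is available.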

The philosophers and logicians listed above did not think that the same methods could be used to analyze the semantic content and relations of everyday natural language texts and utterances, because __they saw natural language as too paradoxical, ambiguous__ etc. for that. Their proposal (especially Frege's and Russell's) was rather to use the well-understood formal languages for scientific purposes, e.g. for discussing mathematics and mathematical concepts:

<img src="https://writings.stephenwolfram.com/data/uploads/2010/11/principia_1b1.jpg">

(A page from Russell's [Principia Mathematica](https://en.wikipedia.org/wiki/Principia_Mathematica). Image source: [Stephen Wolfram's blog: 100 Years since Principia Mathematica](https://writings.stephenwolfram.com/2010/11/100-years-since-principia-mathematica/comment-page-1/))

## First-order languages and logic

Although there is a plethora of formal languages, so-called first-order languages became the industry standard in many areas (e.g., in mathematics), because they have some nice and unique properties:

+ they are expressive enough to formalize most of our mathematical and physical theories,
+ their semantics is simple and well understood,
+ there are simple rules of derivation (the first-order calculus) with which any logically valid argument can be derived in a finite number of steps,
+ consequently, the __inconsistency__ of a set of sentences can also be detected in a finite number of steps.

The syntax is probably familiar:

<img src="http://drive.google.com/uc?export=view&id=1BDUx0556g4EgVlBjDT67ogWPN0BASxGA" width="500px">

## Formal semantics for natural languages

In the early 1970s Richard Montague, David Lewis and others -- building on the development of modern syntactic theories for English and other natural languages by Chomsky and others -- proposed (pace Frege and others) that the methods of formal semantics that were so fruitful in the case of formal languages can be applied to natural languages as well. The two most important approaches were

+ __treating natural languages as (rather complicated) formal languages__ and directly providing a formal semantics for them,
+ __"translating" natural language texts into formal language formulas__ with the same (or at least similar) content, and relying on the formal semantics of the formal language in question (e.g., on the semantics of first-order languages).

In AI, the second solution was more influential, so the following discussion will concentrate on semantic representations as formulas of a formal language with well defined formal semantics.

# Semantic representations in AI

## What is a semantic representation?
+ Formal structure representing the meaning of a text (fragment)
+ Represents literal meaning
+ Disambiguated (as far as possible)
+ Canonical (as far as possible) in the sense that a text meaning has a unique representation
+ There are efficient algorithms to determine their logical and semantic relationship to other semantic and knowledge representations

## Applications

-   Information retrieval ("semantic search")
-   Information extraction
-   Question answering
-   Any knowledge-based "intelligent" systems, in which there is a need to "mediate" between the internal knowledge representation and natural language input/output

## Typical semantic representations

-   __Logical formulas__

    -   First-order logic

<img src="http://languagelog.ldc.upenn.edu/myl/llog/HobbesLogic.gif">([Image source](http://itre.cis.upenn.edu/~myl/languagelog/archives/005380.html))

    -   Modal/temporal logics

    -   Higher-order logics (type-theory)

    -   Description logics (ontologies)
    
<img src="https://www.researchgate.net/profile/Andras_Simonyi/publication/220123382/figure/fig1/AS:305910397325314@1449946130603/Ontology-fragment-and-semantic-representation-corresponding-to-the-sentence-Tegnap-fajt-a.png" width="500px">

(Image source: [How to represent meanings in an ontology](https://www.researchgate.net/publication/220123382_How_to_Represent_Meanings_in_an_Ontology))

-   __[Feature structures/Attribute-value matrices (AVMs)](https://en.wikipedia.org/wiki/Feature_structure)__
<img src="http://drive.google.com/uc?export=view&id=1W238entkBqpOCO-gNakBx2Y2ImzUrXYQ" width="250">(Image source: [Kallmeyer and Osswald: Syntax-driven
semantic frame composition
in Lexicalized Tree Adjoining Grammars](https://www.researchgate.net/publication/270496937_Syntax-driven_semantic_frame_composition_in_Lexicalized_Tree_Adjoining_Grammars))

-   __[Conceptual graphs](http://www.jfsowa.com/cg/)__
<img src="http://www.jfsowa.com/figs/suethink.gif" width="300">([Image source](http://www.jfsowa.com/cg/cgexampw.htm))

-   __Frame-representations__:
<img src="https://www.researchgate.net/profile/Christian_Chiarcos/publication/284548915/figure/fig2/AS:696747286360069@1543128906623/Representing-and-integrating-annotations-for-syntax-and-frame-semantics-in-a-directed.ppm" width="700px">(Image source: [Towards open data for linguistics: Lexical Linked Data](https://www.researchgate.net/publication/284548915_Towards_open_data_for_linguistics_Lexical_Linked_Data))

-   [__Discourse representation structures (DRS)__](https://plato.stanford.edu/entries/discourse-representation-theory/):<img src="http://drive.google.com/uc?export=view&id=1lfv5qiBKlbOKHaqFQahiHuxUAUXwq2iD" width="400">
(Image from [Blackburn and Bos (2004): Working with Discourse
Representation Theory](http://www.let.rug.nl/bos/comsem/book2.html))

(these categories are not necessarily mutually exclusive)


## Basic tasks for logic-based semantic representations

-    __Model checking__: Is the formula $\varphi$ true in the model $\mathcal M$? Use case: e.g., querying a knowledge base to answer questions.
   (Decidable in first-order logic for finite models.)

-    __Consistency checking__: Is the set of formulas $\Phi$ consistent?
    Use case: e.g., detecting incorrect semantic representations. (Only semi-decidable in first-order logic: inconsistency can be detected in finitely many steps, but there is no general procedure that confirms consistency.)

-    __Checking informativity__: Is the formula $\varphi$ a tautology (and therefore uninformative)?
    Use case: e.g., detecting problematic semantic representations.
    (Only semi-decidable in first-order logic: validity can be confirmed in finitely many steps, but non-validity cannot in general.)

-   __Information content__: Is the formula $\varphi$ a logical consequence of the formula $\psi$?
    Use case: e.g., detecting problematic semantic representations, question answering. (Only semi-decidable in first-order logic, for the same reason.)
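
Model checking over a finite model can be sketched in a few lines of Python. The tuple encoding of formulas, the dictionary layout of the model and the function name `holds` are all assumptions made for this illustration; the toy model contains a single fact, that Tom visits Eve.

```python
# Formulas are nested tuples (an encoding assumed for this sketch):
#   ("pred", name, term, ...), ("not", f), ("and", f, g),
#   ("exists", var, f), ("forall", var, f)
# A term is a string: a variable if it is bound in the assignment,
# otherwise a constant interpreted by the model.

def holds(formula, model, assignment=None):
    """Return True if the formula is true in the finite model."""
    assignment = assignment or {}
    op = formula[0]
    if op == "pred":
        name, terms = formula[1], formula[2:]
        values = tuple(
            assignment[t] if t in assignment else model["constants"][t]
            for t in terms
        )
        return values in model["relations"][name]
    if op == "not":
        return not holds(formula[1], model, assignment)
    if op == "and":
        return holds(formula[1], model, assignment) and holds(formula[2], model, assignment)
    if op == "exists":
        # Quantifiers range over the finite domain, so the check terminates.
        return any(
            holds(formula[2], model, {**assignment, formula[1]: d})
            for d in model["domain"]
        )
    if op == "forall":
        return all(
            holds(formula[2], model, {**assignment, formula[1]: d})
            for d in model["domain"]
        )
    raise ValueError(f"unknown operator: {op}")

# A hypothetical toy knowledge base.
model = {
    "domain": {"tom", "eve", "john"},
    "constants": {"Tom": "tom", "Eve": "eve", "John": "john"},
    "relations": {"Visits": {("tom", "eve")}},
}

print(holds(("pred", "Visits", "Tom", "Eve"), model))                 # Does Tom visit Eve?
print(holds(("exists", "x", ("pred", "Visits", "x", "Eve")), model))  # Does anyone visit Eve?
```

The procedure terminates because each quantifier is evaluated by iterating over the finite domain; the same strategy is not available for the other three tasks, which quantify over all models.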

## Challenges for logic-based representations

-   *Tom visits Eve.* $\Longrightarrow$
    Visits$(\mathrm{Tom},\mathrm{Eve})$

-   **Time:** *Tom visited Eve.*

-   **Events:** *Tom visited Eve and this caused her divorce from John.*

-   **Modality:** *It is possible that Tom visited Eve.*

-   **Uncertainty:** *Tom probably visits Eve.*

-   **Indirect speech:** *John said that Tom visited Eve.*

-   **Propositional attitudes:** *John thinks that Tom visited Eve.*

-   **Facts:** *The fact that Tom visited Eve surprised John.*

-   **Groups and sets:** *The Smiths visited Eve.*

# Event semantics

##  Davidson's event semantics (1967)

Let's treat events as first-class members of the quantification domain!

-   *John  visits Eve.* $\Longrightarrow$
    $\exists e~$ Visits$(e,\mathrm{John},\mathrm{Eve})$

Certain problems are solved:

-   *John visited  Eve.* $\Longrightarrow$
    $\exists e($Visits$(e,\mathrm{John},\mathrm{Eve})~\land~$Past$(e))$

-   *Tom visited Eve and this caused her divorce from John.* $\Longrightarrow$
    $\exists e_1,e_2($Visits$(e_1,\mathrm{Tom},\mathrm{Eve})~\land~$Past$(e_1)~\land~$Divorces$(e_2,\mathrm{Eve},\mathrm{John})~\land~$Past$(e_2)~\land~\mathrm{Causes}(e_1,e_2))$
    
## Event semantics with semantic roles: Parsons's neo-Davidsonian program (1990)

Verbs correspond to one-argument predicates in the logical representation, and verb arguments are connected to the event via semantic role relations.

-   *Joe visits Eve.* $\Longrightarrow$
    $\exists e~($Visit$(e)~\land~\mathrm{Agent}(e,\mathrm{Joe})~\land~$Co-agent$(e,\mathrm{Eve}))$

In contrast to the original Davidsonian approach, different frames belonging to the same event type can be represented by the same predicate:

-   *Pete eats.* $\Longrightarrow$
    $\exists e ($Eating$(e)~\land~\mathrm{Agent}(e, \mathrm{Pete}))$

-   *Pete eats an apple.* $\Longrightarrow$
    $\exists e ($Eating$(e)~\land~\mathrm{Agent}(e, \mathrm{Pete})~\land~$Object$(e, \mathrm{apple}))$
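
One practical consequence of the neo-Davidsonian format is that dropping a role conjunct yields a logically weaker formula, so *Pete eats an apple* entails *Pete eats*. The following Python sketch illustrates this under simplifying assumptions: a representation is a set of atomic conjuncts under a single existential quantifier over `e`, and the helper names (`neo_davidsonian`, `entails`, `to_formula`) are hypothetical.

```python
def neo_davidsonian(event_type, **roles):
    """Build the conjuncts of: exists e. Type(e) & Role1(e, x1) & ..."""
    conjuncts = {(event_type, "e")}
    conjuncts |= {(role, "e", filler) for role, filler in roles.items()}
    return frozenset(conjuncts)

def entails(stronger, weaker):
    """With one shared event variable, having all conjuncts of the weaker
    representation among those of the stronger one suffices for entailment."""
    return weaker <= stronger

def to_formula(conjuncts):
    """Render a representation as a readable ASCII formula."""
    ordered = sorted(conjuncts, key=lambda atom: (len(atom), atom))
    body = " & ".join(f"{atom[0]}({', '.join(atom[1:])})" for atom in ordered)
    return f"exists e. {body}"

pete_eats = neo_davidsonian("Eating", Agent="Pete")
pete_eats_apple = neo_davidsonian("Eating", Agent="Pete", Object="apple")

print(to_formula(pete_eats_apple))
print(entails(pete_eats_apple, pete_eats))  # dropping a conjunct is valid
print(entails(pete_eats, pete_eats_apple))  # adding one is not
```

With a two-place `Eats` and a one-place `Eats` predicate, as in the original Davidsonian format extended with an event argument, the same inference would need an extra meaning postulate connecting the two predicates.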
    

## Semantic roles 

+ __Thematic roles__: A small set of domain-independent roles (e.g.,
Agent, Theme, Goal, Benefactive, Instrument and Patient in Parsons (1990)).
There are more elaborate role sets, e.g., the [VerbNet](https://verbs.colorado.edu/verbnet/) verb-role lexicon project uses the following roles:

<img src="https://image2.slideserve.com/4567763/verbnet-thematic-roles-l.jpg" width="500px">

([Image source](https://www.slideserve.com/idalia/propbank-verbnet-semlink))

+ __Proto-roles__: A *very* small set of "bundle concepts", e.g., in Dowty (1989),
Proto-agent (intentional, feels/perceives, causes the event, moves etc.)
and Proto-patient (its state changes, is causally affected etc.).

+ __Frame elements__: A large number of frame-specific
roles. E.g., in the [FrameNet lexicon](https://framenet.icsi.berkeley.edu/fndrupal/) (1998--) the Studying frame is characterized as follows:



<img src="http://drive.google.com/uc?export=view&id=1OuyS0Ka-8E396n0ALCd-TB_f98rjl8zM" width="100%">

+ __Hybrid solutions__: E.g., in the Proposition Bank (PropBank, 2005) corpus, which is a semantic role-annotated corpus based on the Penn Treebank, every verb meaning is mapped to specific, numbered semantic roles
(V.Arg0, V.Arg1 etc.), but Arg0 typically corresponds to proto-agent and Arg1 to proto-patient.

<img src="https://orbifold.net/default/wp-content/uploads/2018/06/Screen-Shot-2018-06-18-at-06.38.06.png" width="500">

# How to generate semantic representations?

## Compositional rules

+ __Compositionality principle (Frege)__: The meaning of an expression is determined by the meaning of its constituent expressions and their mode of composition.

+ __Rule-to-rule principle:__ For every syntactic construction rule there is a corresponding semantic rule which specifies the meaning of the output of the syntactic rule as a function of the inputs' meanings.

For instance, in a context-free grammar,
the semantic rule corresponding to the $$\alpha \rightarrow  \beta_1\dots\beta_n$$ syntactic rule will have the form 
$$\mathrm{Sem}(\alpha) = \Phi(\mathrm{Sem}(\beta_1),\dots,\mathrm{Sem}(\beta_n))$$.

Most rule-to-rule semantic systems are __homogeneous__: the semantic rules are all instances of one rule type or of a very small number of types.

Two popular choices:

-  __Function application__: The semantic value of the output is calculated by __functionally applying__ the semantic value of one of the inputs to the semantic values of the other inputs. This is typically implemented using lambda expressions and lambda calculus.

- __Unification__: The semantic value of the output is the __unification__ (union) of the semantic values of the inputs. Typical implementation: attribute-value matrices (AVMs) and AVM unification.
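
The function-application variant can be sketched with Python lambdas standing in for lambda-calculus terms. The tiny grammar, the lexicon and the names `LEXICON`, `RULES` and `sem` are assumptions made for this illustration; each syntactic rule is paired with exactly one semantic rule, following the rule-to-rule principle.

```python
# Lexical semantics: proper names denote constants, verbs denote curried functions.
LEXICON = {
    "John": "John",
    "Eve": "Eve",
    "visits": lambda obj: lambda subj: f"Visits({subj}, {obj})",
    "sleeps": lambda subj: f"Sleeps({subj})",
}

# One semantic rule per syntactic rule (rule-to-rule principle).
RULES = {
    ("S", ("NP", "VP")): lambda np, vp: vp(np),  # apply the VP meaning to the subject
    ("VP", ("V", "NP")): lambda v, np: v(np),    # apply the verb meaning to the object
    ("VP", ("V",)): lambda v: v,
    ("NP", ("PN",)): lambda pn: pn,
}

def sem(tree):
    """Compute Sem(alpha) from the meanings of the daughters of alpha."""
    label, children = tree[0], tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        return LEXICON[children[0]]  # preterminal node: look the word up
    rule = RULES[(label, tuple(child[0] for child in children))]
    return rule(*(sem(child) for child in children))

tree = ("S",
        ("NP", ("PN", "John")),
        ("VP", ("V", "visits"), ("NP", ("PN", "Eve"))))
print(sem(tree))
```

The output formula is built as a plain string to keep the sketch short; a fuller implementation would construct formula objects and perform beta-reduction on explicit lambda terms.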
   
__Examples of rule-to-rule representation generation:__

1. __Function application with lambda expressions__

Syntax:

<img src="http://drive.google.com/uc?export=view&id=1_4b9DSJF-xOxzlMYsGEMMjeAmyih3-50" width="300px">

Semantics:

<img src="http://drive.google.com/uc?export=view&id=1c9OvmFcU6UAWLRakZ7vD4CoVW0oNtMSO" width="300px">

2. __AVM unification__

<img src="https://www.researchgate.net/publication/270496937/figure/fig4/AS:668909032194060@1536491749861/Syntactic-and-semantic-composition-for-John-eats-pizza.png" width="600px">(Image source: [Kallmeyer and Osswald: Syntax-driven
semantic frame composition
in Lexicalized Tree Adjoining Grammars](https://www.researchgate.net/publication/270496937_Syntax-driven_semantic_frame_composition_in_Lexicalized_Tree_Adjoining_Grammars))
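
A much simplified version of AVM unification can be sketched in Python with nested dictionaries. This is only an illustration under stated assumptions: the function name `unify` and the feature names are hypothetical, and structure sharing (reentrancy), which real AVM formalisms rely on, is not modeled.

```python
def unify(a, b):
    """Unify two feature structures; return None if they clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None  # incompatible values under the same attribute
                result[key] = merged
            else:
                result[key] = value
        return result
    # Atomic values unify only if they are identical.
    return a if a == b else None

# Hypothetical frames loosely modeled on "John eats pizza".
verb_frame = {"type": "eating", "actor": {}, "theme": {}}
subject = {"actor": {"type": "person", "name": "John"}}
obj = {"theme": {"type": "pizza"}}

sentence_frame = unify(unify(verb_frame, subject), obj)
print(sentence_frame)
print(unify({"type": "eating"}, {"type": "sleeping"}))  # clash
```

Because unification is order-independent for compatible structures, the subject and object contributions can be combined with the verb frame in either order.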

## ML-based approaches: supervised semantic representation tasks

- __Word Sense Disambiguation__: Find the (fine-grained) dictionary/lexicon meaning in which a token is used. This task is obviously dictionary/lexicon dependent; a common choice is labeling words with WordNet synset ids.

- __Semantic role labeling__: Identify the events/actions and the participants and their roles ("who did what to whom"). Again, the details of the task depend on which notion of event type and semantic role is used. There are separate tasks for labeling with PropBank roles and with FrameNet frames and frame elements.

<img src="https://www.researchgate.net/profile/Sue_Kase/publication/268520798/figure/fig1/AS:295298170671121@1447415978508/A-FrameNet-based-Semantic-Role-Labeling-process-uncovers-three-frames-in-the-target.png" width="600px">

(Image from the paper [Kase: Accelerating Exploitation of Low-grade Intelligence Through Semantic Text Processing of Social Media](https://www.researchgate.net/figure/A-FrameNet-based-Semantic-Role-Labeling-process-uncovers-three-frames-in-the-target_fig1_268520798))

- __"SemBanks"__: In the last few years there have been projects to create supervised data sets with full semantic representations for natural language texts. The two most important are

    1. [__Abstract Meaning Representation Bank__](https://amr.isi.edu/) (USA, event semantic representations with PropBank semantic roles)
    
        <img src="http://drive.google.com/uc?export=view&id=1UvzH3QyC05jdYfH2FreRoked3ZjFYMfi" width="400px">(Image from the paper [Konstas et al: Neural AMR: Sequence-to-Sequence Models for Parsing and Generation](https://arxiv.org/pdf/1704.08381.pdf))
    
    2. [__Groningen Meaning Bank__](https://gmb.let.rug.nl/) (DRT, VerbNet and WordNet based)
    
<img src="https://d3i71xaburhd42.cloudfront.net/83c11b732c2521bf240853f241097521be9fdb63/7-Figure15-1.png" width="300">(Image source: [From Discourse Representation Structure to event semantics: A simple conversion?
](https://www.semanticscholar.org/paper/From-Discourse-Representation-Structure-to-event-A-Dakota-K%C3%BCbler/83c11b732c2521bf240853f241097521be9fdb63))

# Tasks and problems beyond sentence semantics

+ __Anaphora resolution__: 

> __John Smith__ started working in 1970. __He__ and his colleagues are forced into retirement now.

+ __Coreference resolution__:

>There was a huge interest in the trial of __Uli Hoeness__.
The __president of Bayern München__ is accused of tax evasion.

+ __"Bridging"__:

> Yesterday I bought a __bicycle__. Unfortunately, the __handlebar__ is unusable.

+ __Discourse relations__:

>The man burst into the kitchen. He tore open the fridge door. He had not eaten for 24 hours.


## Further readings

+ [Representation and Inference by Blackburn and Bos](http://www.sfs.uni-tuebingen.de/~keberle/Lit/Blackburn_1997_RIN.pdf) is a classic introduction to logic-based computational semantics.
+ For a more recent, functional programming based alternative see: Van Eijck, Jan, and Christina Unger. Computational semantics with functional programming. Cambridge University Press, 2010.
+ Good shorter overview papers:
    + [Blackburn and Bos: Computational semantics (2003)](https://dialnet.unirioja.es/descarga/articulo/499187.pdf)
    + [Bos (2011): A Survey of Computational Semantics: Representation, Inference and Knowledge in Wide-Coverage Text Understanding](http://www.let.rug.nl/bos/pubs/Bos2011Compass.pdf)