# Semantics in general

"Semantics ... is the linguistic and philosophical **study of meaning**, in language, programming languages, formal logics, and semiotics. It is concerned with the relationship between **signifiers** — like words, phrases, signs, and symbols — and what they stand for, their **denotation**." [Wikipedia](https://en.wikipedia.org/wiki/Semantics)

In semantics, we go beyond the study of linguistic signifiers and their structural properties, and try to establish a **connection between the signs and the original entities and events they are intended to represent**. As such, semantics has strong connections with philosophy, since it tries to capture the inner processes of meaning creation (e.g. why and how do we get to the meaning of utterances?) and their "truth value" (e.g. is this a true proposition?).

How we can establish any kind of connection between words and "meanings" is in itself a huge problem in the philosophy of language. Around the turn of the 20th century the rising school of analytic philosophy, and its founding father **Gottlob Frege**, elaborated a system in which all meanings have to be **connected to "senses", which are in themselves atomic and factual**, thus enabling an analytic discussion of true propositions (see [Frege's Theory of Sense and Denotation](https://plato.stanford.edu/entries/frege/#FreTheSenDen)). This view heavily influenced early semantic technologies, especially the "ontological" paradigm.

In short, this paradigm tells us: **a word's meaning is what it "points to"**.

On the other hand, in the **later works of Wittgenstein** we witness a shift, even inside the analytic philosophy tradition, towards a strong **emphasis on the conventional, contextual mechanisms** that underlie the creation of meaning (see the concept of [language games](https://en.wikipedia.org/wiki/Language_game_(philosophy))). This position emphasizes that context and (interpersonal) usage are the foundational aspects of meaning.

In short, this paradigm tells us: **a word's meaning is how we use it (in context)**.

This dichotomy of approaches will strongly inform the different modeling solutions for semantics.


# Lexical semantics

## Early days 

In its simplest, naive form semantics presupposes that, based on the compositional structure of language, we can **trace back the meaning of complex propositions to individual signifiers and some production rules** that govern the construction of meaning from these simple elements.

Thus semantics in this simple sense is: **meaningful words + production rules**.
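A minimal sketch of this "words + rules" view, assuming a toy world with two entities. The lexicon contents and the helper `sentence_meaning` are hypothetical, chosen only for illustration: proper nouns denote entities, intransitive verbs denote sets of entities, and a single production rule (`S -> NP VP`) composes them into a truth value.

```python
# Toy lexicon (an assumption for illustration, not a real resource):
# proper nouns denote entities, intransitive verbs denote sets of entities.
LEXICON = {
    "Rex": "rex",
    "Tom": "tom",
    "barks": {"rex"},
    "sleeps": {"rex", "tom"},
}


def sentence_meaning(sentence: str) -> bool:
    """Production rule S -> NP VP: the sentence is true iff the entity
    denoted by the NP is a member of the set denoted by the VP."""
    subject, verb = sentence.split()
    return LEXICON[subject] in LEXICON[verb]


print(sentence_meaning("Rex barks"))  # True
print(sentence_meaning("Tom barks"))  # False
```

The meaning of the whole sentence is computed purely from the meanings of its parts and the rule that combines them, which is exactly the assumption the naive view makes.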

If we take this view, it is natural to propose that there is a fixed set of meaningful words that form the **lexicon**; thus, if we can capture the **definitions and relations** of lexical words, we have created a base description of knowledge. This view can be traced back at least to the work of the great **encyclopedists** of the Enlightenment, and MUCH earlier.

<a href="https://upload.wikimedia.org/wikipedia/commons/1/18/Pavao_Skali%C4%87%3B_Enciklopedija_ili_znanje_svijeta_svetih_i_svjetovnih_struka_%281559%29.jpg"><img src="https://drive.google.com/uc?export=view&id=1CAMS8bYmniJyRnAXJE6qrQDZgDfPT6Jf" width=40%></a>

(First known mention of the word "encyclopedia" in a book's title from 1559.)

Though, as [Wikipedia](https://en.wikipedia.org/wiki/Encyclopedia), itself THE encyclopedia, notes:
"Encyclopedia entries are longer and more detailed than those in most dictionaries. Generally speaking, unlike dictionary entries — which focus on linguistic information about words, such as their etymology, meaning, pronunciation, use, and grammatical forms — encyclopedia articles focus on factual information concerning the subject named in the article's title."

The boundary between such linguistic and lexicographic resources is murky: it depends on how we understand "define", and it touches on some deeper paradoxes about the relationship between a word and its definition (see the **[Paradox of Analysis](https://en.wikipedia.org/wiki/Paradox_of_analysis)**).

## Lexical relations

Lexical semantic resources, besides the possible definitions of the "lexemes" themselves, importantly contain information about **relations** between individual units of meaning.

Some of these are:

### Synonymy

Defines a relationship between two lexemes that mean (roughly) the same (in a given context).

Example:

<div class="alert alert-block alert-warning">
Love


Noun
    
S: (n) love (a strong positive emotion of regard and affection) "his love for his work"; "children need a lot of love"
S: (n) love, passion (any object of warm affection or devotion) "the theater was her first love"; "he has a passion for cock fighting"

...

S: (n) love (a score of zero in tennis or squash) "it was 40 love"
</div>

Please observe that there is also one really problematic entry, which behaves quite differently from the others, yet the quoted semantic resource ([WordNet](http://wordnetweb.princeton.edu/perl/webwn?s=love&sub=Search+WordNet&o2=&o0=1&o8=1&o1=1&o7=&o5=&o9=&o6=&o3=&o4=&h=000000000000000000)) marks it rather misleadingly.

### Homonymy

It is a quite frequent and annoying habit of us humans to denote quite different entities and phenomena with the same name:

<div class="alert alert-block alert-warning">
Bank

Noun
    
S: (n) bank (sloping land (especially the slope beside a body of water)) "they pulled the canoe up on the bank"; "he sat on the bank of the river and watched the currents"
S: (n) depository financial institution, bank, banking concern, banking company (a financial institution that accepts deposits and channels the money into lending activities) "he cashed a check at the bank"; "that bank holds the mortgage on my home"

...

S: (n) bank (a flight maneuver; aircraft tips laterally about its longitudinal axis (especially in turning)) "the plane went into a steep bank"
</div>

This is a great source of humor, and an annoyance for those of us doing NLP. Lexicons therefore usually include this relationship, often in the form of different "meanings", e.g. **bank(1), bank(2)**, or as separate **"synonym sets"**.
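How senses and synonym sets can live in one structure is sketched below. This is a hand-written toy, not the WordNet API: the `Synset` class, the sense identifiers and the shortened glosses are assumptions made for illustration, loosely based on the entries quoted above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    sense_id: str   # hypothetical identifier, e.g. "bank(1)"
    lemmas: tuple   # words that share this one meaning (synonyms)
    gloss: str      # short definition


SYNSETS = [
    Synset("bank(1)", ("bank",), "sloping land beside a body of water"),
    Synset("bank(2)",
           ("depository financial institution", "bank",
            "banking concern", "banking company"),
           "a financial institution that accepts deposits"),
    Synset("love(2)", ("love", "passion"),
           "any object of warm affection or devotion"),
]


def senses(word):
    """Homonymy/polysemy: every synset in which the word occurs."""
    return [s for s in SYNSETS if word in s.lemmas]


def synonyms(word):
    """Synonymy: all other lemmas sharing at least one synset with the word."""
    return {lemma for s in senses(word) for lemma in s.lemmas} - {word}


print([s.sense_id for s in senses("bank")])  # ['bank(1)', 'bank(2)']
print(synonyms("love"))                      # {'passion'}
```

Note that a synonym is only valid relative to a sense: "banking concern" is a synonym of bank(2), not of bank(1), which is why disambiguation has to come first.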


### Antonymy

Though we have a common sense understanding of dichotomies "good - bad" and so forth, the antonymy ("opposite of") relation does entail contradictions:

<div class="alert alert-block alert-warning">

"Is the opposite of a good laugh a bad laugh or a good cry?"

</div>


### [Hyponymy and hypernymy](https://en.wikipedia.org/wiki/Hyponymy_and_hypernymy)

In a strict philosophical sense the hypernymy - hyponymy relation should represent the "genus-species" relation, meaning: 

<div class="alert alert-block alert-warning">

dog (noun) -- is_a --> animal (noun)

</div>

"...Genus is that part of a definition which is also predicable of other things different from the definiendum. A triangle is a rectilinear figure; i.e. in fixing the genus of a thing, we subsume it under a higher universal, of which it is a species." [Wikipedia](https://en.wikipedia.org/wiki/Genus_(philosophy))

Usually we conceptualize this relation as **transferring all (???) the properties of the higher category** to the lower "instance". ("Inheritance relation")
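The "inheritance" reading can be sketched as follows. The concepts, properties and helper names are invented for illustration; the penguin case shows why inheriting *all* properties deserves the question marks.

```python
# Toy is_a hierarchy: each concept has at most one direct hypernym (an assumption).
IS_A = {"poodle": "dog", "dog": "animal", "penguin": "bird", "bird": "animal"}

# Properties attached directly to a concept (illustrative values only).
PROPERTIES = {
    "animal": {"alive"},
    "dog": {"barks"},
    "bird": {"can_fly"},
}


def hypernyms(concept):
    """Walk the is_a chain upwards and return all ancestors, nearest first."""
    chain = []
    while concept in IS_A:
        concept = IS_A[concept]
        chain.append(concept)
    return chain


def inherited_properties(concept):
    """Naive inheritance: collect the properties of the concept and all hypernyms."""
    props = set(PROPERTIES.get(concept, set()))
    for ancestor in hypernyms(concept):
        props |= PROPERTIES.get(ancestor, set())
    return props


print(hypernyms("poodle"))              # ['dog', 'animal']
print(inherited_properties("penguin"))  # includes 'can_fly', which is wrong for penguins
```

Strict inheritance is simple to implement but breaks on exceptions, so practical systems need some way of overriding or qualifying inherited properties.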


### Meronymy

The "part of" relation should represent an exhaustive listing of the constituents of an entity. It is itself a distinct topic of philosophy called ["mereology"](https://en.wikipedia.org/wiki/Mereology), which struggles with the problem of how the attributes of parts and wholes relate.

The definition of "part" is problematic, usually presupposing some kind of practical decomposability, thus we are not inclined to say:

<div class="alert alert-block alert-warning">
atom (noun) -- part_of --> dog (noun)
</div>

Much more likely:

<div class="alert alert-block alert-warning">
tail (noun) -- part_of --> dog (noun)
    
    
leg (noun) -- part_of --> dog (noun)

</div>

It is also unclear what constitutes the essential parts of an object: is it always true that a (poor) dog has to have legs?
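To see why "part" is slippery, here is a small sketch (with invented facts and hypothetical helper names) that stores `part_of` as pairs and compares direct parts with the transitive closure. Treating the relation as fully transitive immediately yields the "atom is part of dog" statement we were reluctant to make.

```python
# (part, whole) pairs; the facts are illustrative assumptions.
PART_OF = {("tail", "dog"), ("leg", "dog"), ("cell", "tail"), ("atom", "cell")}


def direct_parts(whole):
    """Parts that are explicitly listed for this whole."""
    return {part for part, w in PART_OF if w == whole}


def all_parts(whole):
    """Transitive closure: parts, parts of parts, and so on."""
    found = set()
    frontier = [whole]
    while frontier:
        current = frontier.pop()
        for part in direct_parts(current):
            if part not in found:
                found.add(part)
                frontier.append(part)
    return found


print(sorted(direct_parts("dog")))  # ['leg', 'tail']
print(sorted(all_parts("dog")))     # ['atom', 'cell', 'leg', 'tail']
```

Whether a lexicon should store only the "practically decomposable" direct parts or allow the closure is a modeling decision, not something the data structure can answer.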


## Ontologies

Aka. semantic networks - aiming to be complete (???) and systematic (???) descriptions of a domain of knowledge. 

<a href="https://upload.wikimedia.org/wikipedia/commons/6/67/Semantic_Net.svg"><img src="https://drive.google.com/uc?export=view&id=1A2BoMZygXISOE9llXZ2FMgM1DA4s5LJ6" width=50%></a>

They typically utilize some additional, custom relations besides the ones listed above, which make sense in a given domain.  


## Rise and shine of ontologies

### Vision of the semantic web

Though from the perspective of the recent statistical revolution the ontological approach to semantics may seem outdated, we have to draw attention to a **"small side-effect"** of this research area, which we call **"the World Wide Web"**.

#### Sidenote: How the "web" came to be

The "HyperText Markup Language", better known as **HTML**, which is the definitive document format enabling the "Web" (transported via the HyperText Transfer Protocol, aka. HTTP), was an early product of the "knowledge description" approaches.

As such, HTML documents were created to use **"semantic tags"** to mark meaningful **structural parts** and **metadata** of documents, as well as connections between documents in the form of **"links"**. The goal was to capture the meaning and relatedness of pieces of knowledge in **machine readable form**.

No one could foretell that the original intended purpose, to codify the knowledge of physics at the laboratories of CERN and beyond, would be quickly eclipsed by the publication of "kitties", thus forming the web we know today.

<a href="https://d2908q01vomqb2.cloudfront.net/ca3512f4dfa95a03169c5a670a4c91a19b3077b4/2018/10/18/w3c_logo-800x400.jpg"><img src="https://drive.google.com/uc?export=view&id=16Akuz8OdYkftyyMA6r7ZPoFN0gPEmbd2" width=30%></a>

It is also worth mentioning that the original **semantic** use of HTML was utterly corrupted by the early web's content creators. For example, the use of **table tags** ("td", "tr"...) for visually positioning text for **display** purposes (instead of encoding the meaning of "an inserted table starts here") spiraled out of control, so the entity defining the standards, the World Wide Web Consortium (**"W3C"**), had to intervene. This marks the birth of [**"tableless design"**](https://en.wikipedia.org/wiki/Tableless_web_design), whereby the visual presentation properties of a document are clearly separated by "cascading style sheets" **(CSS)** from the semantic structure marked up in HTML (and from the executable logic represented by embedded JavaScript).

<a href="https://upload.wikimedia.org/wikipedia/commons/thumb/9/93/CSS-shade.svg/275px-CSS-shade.svg.png"><img src="https://drive.google.com/uc?export=view&id=1UafyU4stv-xBIVI0drMu-xGnDdHUmyn0" width=25%></a>

### Extension of the web

The original concept of capturing and describing knowledge in the form of documents was greatly extended by the **"semantic web"** movement. The ["Dublin Core"](https://en.wikipedia.org/wiki/Dublin_Core) metadata initiative extended the description and mapping efforts to books, films and various other media (the notion of "multimedia" became prevalent).

The semantic web effort, aiming to "link" and "describe" all knowledge in a machine readable, processable form that lends itself to **"machine reasoning"**, culminated in the creation of the [**"Resource Description Framework" (RDF)**](https://en.wikipedia.org/wiki/Resource_Description_Framework).

<a href="http://www.michelepasin.org/upload/researcher/software/CohereOnto.png"><img src="https://drive.google.com/uc?export=view&id=1QUweM5HgJU_pQa6YIc8Qxp4sCpfMYdjW" width=25%></a>

RDF is a flexible, general purpose model, originally serialized in the [**"Extensible Markup Language" (XML)**](https://en.wikipedia.org/wiki/XML), for describing conceptual relations in the form of computer readable **triples**, like "dog, is_a, animal". The grand vision of the semantic web was to thoroughly annotate each online content element and computationally accessible **resource** with **"Uniform Resource Identifiers" (URI)**, and to build an all encompassing knowledge network, so that machines can easily reason over it. If the propositions and commands coming from "natural human language" could be mapped to structured queries over RDF stores [(**SPARQL Queries**)](https://en.wikipedia.org/wiki/SPARQL), retrieval of relevant content and execution of commands would be solved.
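The idea of querying a store of (subject, predicate, object) statements can be sketched in plain Python. This is not an RDF library and not SPARQL itself: the `ex:` identifiers, the `TRIPLES` list and the `match` helper are assumptions for illustration, mimicking a single SPARQL-like triple pattern in which terms starting with `?` are variables.

```python
# A tiny in-memory "triple store"; identifiers are made-up examples.
TRIPLES = [
    ("ex:dog", "ex:is_a", "ex:animal"),
    ("ex:cat", "ex:is_a", "ex:animal"),
    ("ex:tail", "ex:part_of", "ex:dog"),
]


def match(pattern, triples):
    """Return one variable binding (a dict) per triple that fits the pattern."""
    results = []
    for triple in triples:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must bind consistently if it occurs more than once.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# Roughly: "which ?x is an animal?"
print(match(("?x", "ex:is_a", "ex:animal"), TRIPLES))
# [{'?x': 'ex:dog'}, {'?x': 'ex:cat'}]
```

Real SPARQL engines join many such patterns and add filtering, but the core operation is this kind of pattern matching over triples.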


<a href="https://www.stardog.com/docs/img/starwars.png"><img src="https://drive.google.com/uc?export=view&id=1TtbOd0SydYYm5DzHKLhtCdl8axfgoiks" width=55%></a>

<a href="http://drive.google.com/uc?export=view&id=1An-sRb1o32nBcycqtBBO4Q4eZA6Hd3LZ"><img src="https://drive.google.com/uc?export=view&id=1r20AkxlxZPZ5v-69KAdkjQOb1TfOCQWp" width=55%></a>

For this to be achievable, we would need to define an overarching common sense **ontology** of things' meaning (as well as all subdomains). The original idea was to create a specific language that can serve as a basis for building up such knowledge base ontologies in form of the ["Web Ontology Language" (OWL)](https://en.wikipedia.org/wiki/Web_Ontology_Language). 

<a href="https://galaxyconsulting.weebly.com/uploads/3/9/7/8/3978170/42433.png?250"><img src="https://drive.google.com/uc?export=view&id=1F3spLnFNWgN-ZUMPHsilMPdycxXxvgJ6" width=10%></a>


Though OWL strove to become a standard for knowledge representation, it did not become as widespread as hoped for, partially because of technical problems (see below), but partially because of a lack of incentives (or the presence of **destructive ones**). (For some illustration take the example of **"meta" tags** mentioned [here](https://twobithistory.org/2018/05/27/semantic-web.html).)

In short, the semantic web brings with it the stories of human failure.  

"In a 2004 paper, Ben Munat, then an academic at Evergreen State College, explains how search engines once experimented with using keywords supplied via the `<meta>` tag to index results, but soon discovered that unscrupulous webpage authors were including tags unrelated to the actual content of their webpage. As a result, search engines came to ignore the `<meta>` tag in favor of using complex algorithms to analyze the actual content of a webpage. Munat concludes that a general-purpose Semantic Web is unworkable, and that the focus should be on specific domains within medicine and science."

### Some widely used ontologies

Despite the "grand visions" not becoming reality, some ontology resources are seeing fairly widespread use.

#### [WordNet](https://wordnet.princeton.edu/)

<a href="https://byelka.files.wordpress.com/2013/07/wordnet.png"><img src="https://drive.google.com/uc?export=view&id=1VcvdQ6LfBoWlPr-JMjEVdJy2Idhd4biQ" width=30%></a>

"WordNet® is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts can be navigated with the browser. WordNet is also freely and publicly available for download. WordNet's structure makes it a useful tool for computational linguistics and natural language processing."

One of the most widely used semantic resources, it has default interfaces in many libraries, eg. in [NLTK](https://www.nltk.org/). 

Many of the illustrations above came from WordNet queries.

#### [(Open)Cyc](https://en.wikipedia.org/wiki/Cyc)

<img src="https://www.cyc.com/wp-content/uploads/2019/06/Cycorp_Logo_New.png" width=30%>

A gargantuan effort, mainly by the group of Douglas Lenat, who are firm proponents of "Good Old-Fashioned AI" (GOFAI), like for example [Gary Marcus](https://en.wikipedia.org/wiki/Gary_Marcus).

"Cyc (/ˈsaɪk/) is the world's longest-lived artificial intelligence project, attempting to assemble a comprehensive ontology and knowledge base that spans the basic concepts and "rules of thumb" about how the world works (think common sense knowledge but focusing more on things that rarely get written down or said, in contrast with facts one might find somewhere on the internet or retrieve via a search engine or Wikipedia), with the goal of enabling AI applications to perform human-like reasoning and be less "brittle" when confronted with novel situations that were not preconceived."

Cyc has an open version, but the full version is a property of Cycorp Inc.


#### [DBPedia](https://wiki.dbpedia.org/about)

<a href="https://wiki.dbpedia.org/sites/default/files/DBpediaLogoFull.png"><img src="https://drive.google.com/uc?export=view&id=1mmG98du21wEcll9abnRURVGmXq2quq-B" width=25%></a>

The computer accessible version of Wikipedia in queryable database format.

"In addition, we provide localized versions of DBpedia in 125 languages. All these versions together describe 38.3 million things, out of which 23.8 million are localized descriptions of things that also exist in the English version of DBpedia. The full DBpedia data set features 38 million labels and abstracts in 125 different languages, 25.2 million links to images and 29.8 million links to external web pages; 80.9 million links to Wikipedia categories, and 41.2 million links to YAGO categories. DBpedia is connected with other Linked Datasets by around 50 million RDF links. Altogether the DBpedia 2014 release consists of 3 billion pieces of information (RDF triples) out of which 580 million were extracted from the English edition of Wikipedia, 2.46 billion were extracted from other language editions. Detailed statistics about the DBpedia datasets in 24 popular languages are provided at Dataset Statistics."

#### More

Many more ontologies are accessible, see for example:
[here](https://www.w3.org/wiki/Ontology_repositories), [here](http://www.obofoundry.org/) or [here](http://www.oor.net/)

## Limitations of ontologies

Though the grandiose promises of the above mentioned knowledge representation technologies were abundant, in due time (around 2006) severe limitations of the ontological paradigm became apparent.  

### Development costs

The development of detailed domain ontologies usually requires intense work from a committee of highly skilled experts who - when they are able to agree upon the exact terminology - will have to commit substantial resources just to cover even partially a domain.

In machine learning parlance, ontologies have a **huge recall / coverage problem**.

See more on ontology development below.

### Context dependence

Since the presence or absence of "edges" in a semantic network is general in nature, it is very difficult to denote relations which are only true in a definite context.

See for example the proposition below

<div class="alert alert-block alert-warning">
All fruits are healthy.


In the context of: 
- nutrition? maybe
- bio market? definitely
- trip to South-America? definitely not
</div>


And since nearly everything we would like to communicate has a certain context, this marks a deeper problem that we first have to acknowledge.

### "Rigidity"

#### "Binariness"

The customary way of representing semantic networks is as directed graphs whose edges are binary: either we draw a link or we don't. This is often too rigid. Consider:

<div class="alert alert-block alert-warning">

(Spiderman) -- (enemy_of) --> (green-goblin)

Or in RDF Turtle syntax:

<http://example.org/#spiderman> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/#green-goblin> .

</div>

This means, that this relationship either holds or does not hold, we have no possibility to quantify "how much" (without additional notation), as well as to show probabilistic relations.

Some modern knowledge graph implementations, like Microsoft's Knowledge Graph, store a "confidence" or probability metric in the form of an edge property so as to signify relationship "strength". This presupposes a graph representation that can handle such information (see [property graph](http://graphdatamodeling.com/Graph%20Data%20Modeling/GraphDataModeling/page/PropertyGraphs.html)).
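A minimal sketch of an edge that carries properties, assuming a hand-rolled `Edge` class rather than any particular graph database. The second edge and both confidence values are invented purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Edge:
    source: str
    relation: str
    target: str
    # Arbitrary key/value annotations on the edge, e.g. a confidence score.
    properties: dict = field(default_factory=dict)


edges = [
    Edge("spiderman", "enemy_of", "green-goblin", {"confidence": 0.95}),
    Edge("spiderman", "enemy_of", "venom", {"confidence": 0.6}),  # made-up example
]


def confident_edges(edges, threshold):
    """Keep only edges whose confidence reaches the threshold."""
    return [e for e in edges if e.properties.get("confidence", 0.0) >= threshold]


print([e.target for e in confident_edges(edges, 0.9)])  # ['green-goblin']
```

Choosing the threshold turns the weighted graph back into a binary one, so the "strength" information only helps if downstream consumers actually use it.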


This "relaxation" of the model does not aid **interpretability**, since what is a 61.1% company anyhow?
On the other hand, if we consider that every graph can be represented in matrix form, a graph can be understood as a "discretization" of some association matrix describing a space. We will come back to this in force in the next lecture.
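The "discretization" idea can be made concrete with a small sketch. The association values below are invented, and the threshold of 0.5 is an arbitrary assumption: a continuous association matrix is turned into a binary adjacency matrix by keeping only sufficiently strong associations.

```python
nodes = ["dog", "cat", "car"]

# Symmetric association strengths between the nodes (illustrative numbers).
association = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]


def discretize(matrix, threshold):
    """Binary adjacency matrix: 1 where the association reaches the threshold.
    The diagonal is zeroed so that nodes are not linked to themselves."""
    n = len(matrix)
    return [
        [1 if i != j and matrix[i][j] >= threshold else 0 for j in range(n)]
        for i in range(n)
    ]


adjacency = discretize(association, 0.5)
print(adjacency)  # [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
```

Only the dog-cat link survives; everything below the threshold is lost, which is the information a binary graph throws away.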

#### "Resistance to change"

Updating a knowledge graph is in itself a considerable challenge: if meanings change, we may require a non-trivial restructuring of the whole representation, which cannot always be carried out automatically. (We may have to define a complex ruleset of what happens with the graph when, for example, a node gets split into two...)

Because of this, **knowledge graphs are difficult to keep up to date** without major manual effort.

For some discussion of these problems see [this](http://www.cs.ox.ac.uk/ian.horrocks/Publications/download/2013/Horr13a.pdf) paper.

These problems, combined with the breakthrough results achieved by deep learning methods, have drawn away the attention of the research community, so we are currently experiencing an **"ontology winter"**.


# Sidenote: Cognitive frames

Building on the contribution of [Charles Fillmore](https://en.wikipedia.org/wiki/Charles_J._Fillmore), one of the founders of cognitive linguistics, a strand of research concerns itself with the study of cognitive structures underlying linguistic phenomena, especially the usage of ["semantic frames"](https://en.wikipedia.org/wiki/Frame_semantics_(linguistics)).

"The basic idea is that one cannot understand the meaning of a single word without access to all the essential knowledge that relates to that word. For example, **one would not be able to understand the word "sell" without knowing anything about the situation of commercial transfer, which also involves, among other things, a seller, a buyer, goods, money, the relation between the money and the goods, the relations between the seller and the goods and the money, the relation between the buyer and the goods and the money and so on.**

Thus, a word activates, or evokes, a frame of semantic knowledge relating to the specific concept to which it refers (or highlights, in frame semantic terminology).

A semantic frame is a collection of facts that specify "characteristic features, attributes, and functions of a denotatum, and its characteristic interactions with things necessarily or typically associated with it." A semantic frame can also be defined as a coherent structure of related concepts that are related such that without knowledge of all of them, one does not have complete knowledge of any one; they are in that sense types of gestalt. Frames are based on recurring experiences. So the commercial transaction frame is based on recurring experiences of commercial transactions."

This view proposes the "groundedness" of linguistic phenomena in cognitive structures acquired by repeated experience and reflected in the linguistic properties of words, especially verbs. The **["valency"](https://en.wikipedia.org/wiki/Valency_(linguistics))** of a verb is in direct relationship with the set of cognitive frames it can evoke, so an "ontology" can be elaborated **on a linguistic as well as a cognitive basis**.
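As a rough sketch of what a frame could look like as a data structure, the "sell" example from the quote can be written down as follows. The `Frame` class, the frame name and the role names are assumptions for illustration and do not follow any particular resource's schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    name: str
    roles: frozenset          # participants the frame brings with it
    evoking_words: frozenset  # words that activate ("evoke") the frame


COMMERCIAL_TRANSACTION = Frame(
    name="commercial_transaction",
    roles=frozenset({"seller", "buyer", "goods", "money"}),
    evoking_words=frozenset({"sell", "buy", "pay"}),
)


def missing_roles(frame, annotation):
    """Roles of the frame that an annotated sentence leaves unexpressed."""
    unknown = set(annotation) - frame.roles
    if unknown:
        raise ValueError(f"roles not in frame {frame.name}: {sorted(unknown)}")
    return frame.roles - set(annotation)


# Hand annotation of "Anna sold her bike to Ben".
annotation = {"seller": "Anna", "goods": "her bike", "buyer": "Ben"}
print(sorted(missing_roles(COMMERCIAL_TRANSACTION, annotation)))  # ['money']
```

Even though the sentence never mentions money, the frame tells us that a price is implied, which is the kind of background knowledge the frame-semantic view wants to capture.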

FrameNet represents a big effort in this direction.

## [FrameNet](https://framenet.icsi.berkeley.edu/fndrupal/about)


<img src="https://framenet.icsi.berkeley.edu/fndrupal/sites/default/files/logo.jpg" width=25%>

"The FrameNet project is building a lexical database of English that is both human- and machine-readable, based on annotating examples of how words are used in actual texts. From the student's point of view, it is a dictionary of more than 13,000 word senses, most of them with annotated examples that show the meaning and usage. For the researcher in Natural Language Processing, the more than 200,000 manually annotated sentences linked to more than 1,200 semantic frames provide a unique training dataset for **semantic role labeling**, used in applications such as information extraction, machine translation, event recognition, sentiment analysis, etc."


## Wider outlook

It is worth mentioning in brief, that inside cognitive linguistics there is a strong strand of research that argues for the "groundedness" of language and cognition in bodily experience itself, thus maintaining, that the primitives of understanding, the basic "schemas" as well as the "frames" on top of them are in strong connection with body sensation and perception. 

This **"embodied cognitive science"** approach argues for the study of cognitive processes not as abstract representations, but as situated in bodies and environments. (For further reference see the seminal paper of George Lakoff on ["Cognitive linguistics"](https://cloudfront.escholarship.org/dist/prd/content/qt04086580/qt04086580.pdf) as well as the in-depth book of [Varela, Thompson and Rosch](https://mitpress.mit.edu/books/embodied-mind-revised-edition).)

Though some of these ideas are having an influence in robotics, widespread operationalization in NLP is still lacking.


# Usage of ontologies

We still can find much use for ontologies (for example) in the domains of:

- Knowledge retrieval ("Semantic search")
- Expert systems / Reasoning engines

## Semantic search

It is all the more natural that humans would expect to get relevant results for their queries over a given knowledge base, thus they expect:

- Disambiguation of **homonymy** cases ("I was searching for the financial institution, not the river part.")
- Inclusion of **synonyms** into the search ("I meant not just poodle, but doggy.")
- Possibility to move along **hypernyms and hyponyms** ("I just care about banks, not financial institutions in general, or vice-versa.")
- If feasible, also some information about **meronyms** ("What are the needed constituents for...")

<a href="http://drive.google.com/uc?export=view&id=1R_to0toK00vIWddcGqanhwXuLJRQUhMO"><img src="https://drive.google.com/uc?export=view&id=18tBGyh9q1P_S3S5BVAbxQuDYoG31rub4" width=45%></a>

Semantic search was and still is the eminent application area of ontologies, and in a well defined context of expert knowledge, it can represent a very strong baseline.
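A small sketch of ontology-backed query expansion, under heavy simplifying assumptions: the synonym and hyponym tables, the three documents and the helper names are all invented, and matching is done on whitespace-separated tokens only.

```python
# Toy lexical relations (illustrative assumptions).
SYNONYMS = {"dog": {"doggy", "hound"}}
HYPONYMS = {"dog": {"poodle"}}

DOCUMENTS = {
    "d1": "my poodle loves long walks",
    "d2": "the doggy chased a ball",
    "d3": "the bank raised its rates",
}


def expand_query(term, include_hyponyms=False):
    """Add synonyms, and optionally more specific terms, to the query term."""
    terms = {term} | SYNONYMS.get(term, set())
    if include_hyponyms:
        terms |= HYPONYMS.get(term, set())
    return terms


def search(term, include_hyponyms=False):
    """Return ids of documents containing any of the expanded terms."""
    terms = expand_query(term, include_hyponyms)
    return sorted(
        doc_id for doc_id, text in DOCUMENTS.items()
        if terms & set(text.split())
    )


print(search("dog"))                         # ['d2']
print(search("dog", include_hyponyms=True))  # ['d1', 'd2']
```

None of the documents contains the literal string "dog", so a purely lexical match would return nothing; the expansion is what makes the results relevant.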

They also lend themselves well to **combination with statistical approaches**:

<a href="https://images.squarespace-cdn.com/content/v1/572d25ecd210b899879359a5/1462580290875-051VMEF868LARA4SKBY7/ke17ZwdGBToddI8pDm48kIKkglr3Q70cknLL-o2wz2YUqsxRUqqbr1mOJYKfIPR7-cSjRlYkxRIwAGDKGa5ishx9yxala7xiss0F540IdaVCRW4BPu10St3TBAUQYVKcDThydu1sUwopiAKzSF1MZjJ67NtUs71Nfluqs6LYzabBMcXV3oZspfyUDp2ywvph/Screen-Shot-2016-04-07-at-10.26.15-am.png"><img src="https://drive.google.com/uc?export=view&id=1WCapFIkqELvzO3_5DQGtj-bkPELqox5O" width=65%></a>


The main challenges in case of ontology based semantic search (besides the buildup of the ontology itself - elaborated later) are:

- Mapping of the natural language query to a formal representation
- Mapping of document (parts or elements) into the graph (aka. "tagging" some parts of texts)
- Weighting mechanism for presenting the results



## Expert systems

["Expert systems"](https://www.tutorialspoint.com/artificial_intelligence/artificial_intelligence_expert_systems.htm) have been considered prime forms of artificial intelligence (after the first AI winter).

The core notion of these systems was the capture and formal codification of expert knowledge in a specific field, say chemical engineering, and creation of systems capable of "reasoning" in the given domain. The knowledge representation could be a set of rules, but many times was a kind of ontology, describing the relationships between elements and entities of a domain. 

<a href="https://www.researchgate.net/profile/Md_Habib6/publication/308174425/figure/fig2/AS:407051659431937@1474060086324/Global-architecture-of-an-expert-system.png"><img src="https://drive.google.com/uc?export=view&id=1PCHuWOB4wV8eaR7eODUN6S-3cJx50-dE" width=50%></a>

A good example would be a medical self-diagnosis system, where the domain knowledge of doctors has been represented in a knowledge base: the user enters a query with her symptoms, the system "traverses" possible avenues of reasoning, asks follow-up questions if needed, then comes up with a possible diagnosis or suggestion for the user.

(The ethical dilemmas for this use case are significant.)
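The "traversal of avenues of reasoning" can be illustrated with a tiny forward-chaining rule engine. This is only a sketch: the two rules are invented toy examples with no medical validity, and `forward_chain` is a hypothetical helper, not the design of any specific expert system.

```python
# Each rule: (set of premises, conclusion). Toy rules for illustration only.
RULES = [
    ({"fever", "cough"}, "possible_flu"),
    ({"possible_flu", "short_of_breath"}, "see_doctor"),
]


def forward_chain(facts, rules=RULES):
    """Repeatedly apply rules whose premises are all known,
    until no new conclusion can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= derived and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


print(sorted(forward_chain({"fever", "cough", "short_of_breath"})))
# ['cough', 'fever', 'possible_flu', 'see_doctor', 'short_of_breath']
```

The second rule only fires because the first one added an intermediate conclusion, which is the chaining behaviour that lets such systems explain how they arrived at a suggestion.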



# Creation of ontologies

## The manual way

The traditional way of building knowledge bases presupposed that experts, though they have the required domain knowledge, lack the skills needed to formalize it into a well structured knowledge base with strict format and content requirements. Thus the role of the **"knowledge engineer"** was born.

<a href="https://www.tutorialspoint.com/artificial_intelligence/images/expert_system.jpg"><img src="https://drive.google.com/uc?export=view&id=1xC6SQPHKISkKQehyWA-9xlY14MZIZ2bO" width=60%></a>

The knowledge engineer, one of the "rising" occupations of the late 90s and early 2000s, was responsible for **facilitating** the knowledge extraction from domain experts, as well as for the **codification of knowledge** into e.g. OWL format.

Two major problems arise with this approach:
- The problem of consistency / methodological soundness
- The problem of scalability

### Trying to formalize a method: OntoClean

"[OntoClean](https://en.wikipedia.org/wiki/OntoClean) was the first attempt to formalize notions of ontological analysis for information systems. The idea was to justify the kinds of decisions that experienced ontology builders make, and explain the common mistakes of the inexperienced. Alan Rector, during a debate at the KR-2002 conference in Toulouse, said, "What you have done is reduce the amount of time I spend arguing with medics.""

OntoClean tried to establish the standards of methodology guiding the manual buildup of ontologies, as well as a set of criteria that define, **what is a good ontology**. 

There were strong discussions among the experts about some key notions, which - due to the "winter" of ontologies - were never settled. See for example [this](https://dl.acm.org/citation.cfm?id=2351611) paper. 

The main problem proved to be the scalability issue.

### Trying to use the power of the Crowd

One of the approaches trying to mitigate the scalability issues is "crowdsourcing", that is, breaking down the task of ontology building into elementary steps (like confirming or denying some relations) and trying to delegate these "micro tasks" to a crowd of people - usually not domain experts.

See for example [this](https://pdfs.semanticscholar.org/49d5/bfaebb9b5b0612bdbe85b6279e27593f1bdb.pdf) or [this](https://www.isi.edu/~gil/papers/gil-etal-iswc17.pdf) approach.

A tool set has been developed to integrate such crowdsourced workflows with the most widespread tool for ontology building, [Protégé](https://protege.stanford.edu/).

<a href="http://drive.google.com/uc?export=view&id=1Uf0fhcMSONRZThx3Jl9VNR8c45lOgD2-"><img src="https://drive.google.com/uc?export=view&id=1WQcBDgneUhCSnT7Gj6T5U6-W0N5bDKsj" width=60%></a>

This approach, though, has the drawback of noise coming from the non-expert users' inputs, so a considerable amount of effort has to be dedicated to quality assurance regimes that prevent noise from being codified into the knowledge base.
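One simple quality assurance regime is to ask several workers the same yes/no micro task and accept a relation only on clear agreement. The sketch below assumes exactly that; the function name and both thresholds are arbitrary choices for illustration, not taken from the cited approaches.

```python
def aggregate_votes(votes, min_votes=3, min_agreement=0.7):
    """Aggregate boolean crowd answers for one candidate relation.

    Returns True (accept), False (reject) or None (undecided: too few
    votes or too little agreement, so the task should go to an expert)."""
    if len(votes) < min_votes:
        return None
    share_yes = sum(votes) / len(votes)
    if share_yes >= min_agreement:
        return True
    if 1 - share_yes >= min_agreement:
        return False
    return None


# "Is a poodle a kind of dog?" answered by four workers.
print(aggregate_votes([True, True, True, False]))   # True
# A contested relation stays undecided.
print(aggregate_votes([True, False, True, False]))  # None
```

Undecided cases are the interesting ones: they are either genuinely ambiguous relations or badly phrased tasks, and both are worth an expert's attention.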

A better solution would be to mine the behavior and content created "naturally" by the users to come up with ontology suggestions.

## The machine (aided) way

Recently (2015-16) the big commercial players started to again look for ways of using machine learning techniques to construct knowledge graphs, trying to capitalize on vast amounts of available content and user data.

This approach can be regarded as **"concept mining"** or **"knowledge extraction"** (and various other names are in use).

For the approach to be successful, the following questions have to be answered by models:


### What are the "nodes"? - Terminology (Keyword) extraction

This task is equivalent to semantic tagging, that is, we have to predict, given a corpus of text, **what are the most relevant keywords and word combinations**. These keywords will be the core constituents of the ontology that we will try to build up. 

The main approaches for detecting keywords are:
- Part-of-Speech / dependency based
- Frequency based (**TFIDF** and its modifications)
- Co-occurrence graph based (**TextRank** and its many variants)
- "Vector space" based (like [this](https://arxiv.org/abs/1801.04470) paper)

This task strongly overlaps with the methods that are used for **frequency based semantics**, to be discussed in a separate lecture. (see: "normalization" approaches)
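As an illustration of the frequency based family, here is a minimal TF-IDF sketch in plain Python. It assumes one common variant (term frequency relative to document length, unsmoothed `log(N / df)` as inverse document frequency) and a made-up three-document corpus; library implementations typically differ in smoothing and normalization.

```python
import math
from collections import Counter


def tfidf(corpus):
    """corpus: list of token lists -> list of {term: score} dicts, one per document."""
    n_docs = len(corpus)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for doc in corpus for term in set(doc))
    scores = []
    for doc in corpus:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


corpus = [
    "the dog chased the cat".split(),
    "the cat sat on the mat".split(),
    "the ontology describes the domain".split(),
]
scores = tfidf(corpus)

# "the" occurs in every document, so it carries no weight as a keyword.
print(scores[2]["the"])                      # 0.0
# "dog" is rarer across the corpus than "cat", so it scores higher in document 0.
print(scores[0]["dog"] > scores[0]["cat"])   # True
```

Terms that are frequent in one document but rare in the corpus rise to the top, which makes them natural candidates for ontology nodes.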


### What are the "relations"?

The task of building up ontologies is in most cases framed as a problem of "link prediction" or "knowledge graph completion". In these cases a partially built graph is usually used as a starting point, and the "reasoning" task is to use a predictive model to come up with suggestions for new edges.

<a href="https://player.slideplayer.com/90/14651913/slides/slide_4.jpg"><img src="https://drive.google.com/uc?export=view&id=174K_jBe9xNOVNUVJl1OPX9qogOPOovCD" width=50%></a>

The input can be:
- A pair of "nodes"
    - Their textual representation
    - Externally learned "embedding" of them
    - Graph derived features
- The graph
    - A subgraph of it
    - The whole graph

The output can be:
- A candidate edge for given two nodes
- A set of candidate relationships
- A new graph 
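A very simple baseline using only graph derived features is sketched below: two unconnected nodes become a candidate edge if their neighbourhoods overlap strongly (Jaccard similarity). The toy graph, the threshold and the helper names are assumptions for illustration; the heuristic ignores relation types and direction, so it only says *that* two nodes may be related, not *how*.

```python
from itertools import combinations

# Undirected view of a small, partially built graph (illustrative data).
EDGES = {
    ("dog", "animal"), ("cat", "animal"),
    ("dog", "pet"), ("cat", "pet"),
    ("car", "vehicle"),
}


def neighbours(node):
    return {b for a, b in EDGES if a == node} | {a for a, b in EDGES if b == node}


def jaccard(u, v):
    """Share of common neighbours among all neighbours of u and v."""
    nu, nv = neighbours(u), neighbours(v)
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0


def candidate_edges(threshold=0.5):
    """Suggest unconnected node pairs whose neighbourhood overlap is high."""
    nodes = sorted({n for edge in EDGES for n in edge})
    return [
        (u, v, jaccard(u, v))
        for u, v in combinations(nodes, 2)
        if (u, v) not in EDGES and (v, u) not in EDGES
        and jaccard(u, v) >= threshold
    ]


print(candidate_edges())  # [('animal', 'pet', 1.0), ('cat', 'dog', 1.0)]
```

Learned models replace the hand-written similarity with a trained scoring function over node pairs (and relation types), but the input/output contract stays the same: a partial graph in, ranked candidate edges out.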

The topic of knowledge graph completion has recently been gaining some traction again. One reason for this is the large scale effort of big companies (Google, Microsoft, IBM) to build up their general purpose graphs; the other is the potential coming from new embedding techniques as well as neural network models operating directly on graphs as inputs.

Some overview of this field can be gained from the papers [here](https://arxiv.org/pdf/1503.00759.pdf), [here](https://academic.oup.com/database/article/doi/10.1093/database/bay101/5116160), [here](https://usc-isi-i2.github.io/slides/part-3.pdf) and [here](http://ranger.uta.edu/~cli/pubs/2018/kgcompletion-cikm18short-akrami.pdf).

Google even released an open source semantic frame tagger called [SLING](https://ai.googleblog.com/2017/11/sling-natural-language-frame-semantic.html).

And a quite recent and elegant solution can be found [here](https://arxiv.org/pdf/1901.09590.pdf).

It is important to note that the recent advancement of neural architectures operating directly on graphs (graph convolutional / graph recurrent networks) can potentially give a huge boost to the efforts in link prediction. This space is worth watching. (More on the merger of graphs and neural models later.)

## When DOES it work well?

Despite all the above mentioned criticism, there are definitely **some cases when ontologies work exceptionally well**.

Prerequisites for this are (amongst others):

- Well scoped domain (narrow and defined enough, has experts)
- Defined, well codified domain knowledge (experts can agree)
- Usage of terminology is consistent, can be easily mapped to eg. documents
- Fundamental relations do not change too frequently 


(remember: "There are only two hard things in Computer Science: cache invalidation and naming things." -- Phil Karlton)



