Brief Description of World Modelers Readers
This document summarizes in plain language the information that is produced by (most) World Modelers readers. Please note that a formal description of the corresponding JSON format is listed in this document.
In general, most readers produce concepts as well as the relations that connect them. We describe both types below.
A concept is either an entity (a thing that exists as itself, as a subject or as an object, actually or potentially, concretely or abstractly, physically or not) or an event (entities in time or instantiations of properties in objects). For example, in the sentence "The decrease in rainfall caused significantly increased poverty in South Sudan.", the phrases rainfall, poverty, and South Sudan might be identified by a reader as relevant concepts.
Readers may rely on various strategies to identify relevant concepts. For example, Eidos, the University of Arizona reader, relies mostly on syntactic hints to identify concepts, e.g., any noun phrase can potentially be a concept. Eidos makes no particular distinction between the concept subtypes (entity and event). The BBN reader, on the other hand, does make a distinction between entities and events. It relies on named entity recognition, which processes text as a sequence of words, to identify entities and TODO: BBN to identify events.
However, independent of the strategy used to identify candidate concepts, all readers ground the recognized concepts to one or more taxonomies, i.e., they assign one or more labels that explain the type of each concept. For example, the type assigned to the concept South Sudan in the example above would be Geopolitical Entity (GPE), and the type assigned to rainfall would be Precipitation.
Currently, the readers in the World Modelers program use different taxonomies for grounding. For example, Eidos uses the following two taxonomies: a taxonomy of high-level concepts constructed semi-automatically from a collection of documents from the United Nations, available here, as well as a taxonomy of mid-level indicators adapted from the World Development Indicators, available here. Both taxonomies have the same YAML format, where the tree structure is captured through indentation, i.e., each node is the child of a node above that is indented less. Further, each of the terminal nodes (which are what Eidos uses for grounding) is listed as an OntologyNode. For example, in Eidos's high-level ontology, this block:
- weather:
  - OntologyNode:
    name: precipitation
    examples:
      - precipitation
      - rain
      - rainfall
      - snow
indicates that precipitation is a child of (i.e., a type of) weather. The examples block under precipitation lists examples of concepts that would be grounded as precipitation. Note that this list does not have to be comprehensive. Eidos uses a statistical algorithm for grounding, which allows it to ground concepts it has not seen before, such as hail, because they are semantically similar to the examples it knows. Optionally, an OntologyNode includes a description field that provides a natural language description of the corresponding node. For example:
- OntologyNode:
  name: Urban_land_area_where_elevation_is_below_5_meters_(sq._km)
  description:
    - "Urban land area below 5m is the total urban land area in square kilometers where the elevation is 5 meters or less."
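To make the tree structure concrete, here is a minimal sketch of how the terminal nodes and their parents could be collected from such a taxonomy. It is not Eidos code. It assumes the YAML has already been parsed into nested Python lists and dictionaries (the shape a typical YAML parser would produce for the blocks above), and the function name `leaf_nodes` is hypothetical.

```python
# Assumed in-memory form of the "weather" block after YAML parsing.
# The "OntologyNode" key carries no value of its own; it only marks a terminal node.
taxonomy = [
    {
        "weather": [
            {
                "OntologyNode": None,
                "name": "precipitation",
                "examples": ["precipitation", "rain", "rainfall", "snow"],
            },
        ],
    },
]


def leaf_nodes(nodes, path=()):
    """Yield (path, examples, description) for every terminal node.

    `path` is the tuple of ancestor names followed by the node's own name,
    so the parent/child relation expressed by indentation is kept.
    """
    for node in nodes:
        if "OntologyNode" in node:
            # Terminal node: these are the nodes used for grounding.
            yield (
                path + (node["name"],),
                node.get("examples", []),
                node.get("description", []),
            )
        else:
            # Inner node: a single-key mapping from a name to its children.
            for name, children in node.items():
                yield from leaf_nodes(children, path + (name,))


for node_path, examples, description in leaf_nodes(taxonomy):
    print("/".join(node_path), examples, description)
```

Each result carries the full path from the root, so precipitation is reported under weather, together with its examples and an empty description list because that field is optional.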
Lastly, Eidos concepts have states, which aim to capture some quantification information about a concept. For example, in the sentence above, the state of rainfall is decreased, and the state of poverty is significantly increased. Please see our paper on grounding gradable adjectives to see how we further transform adjectives such as significant into numeric values, given an input distribution.
The BBN reader TODO: add content here
The Pitt/CMU reader TODO: add content here
Note: all of these readers are under very active development, so the content of the taxonomies listed above is very likely to change.
Relations are relationships that connect two or more concepts or other relations. For example, in the sentence above, the Eidos reader identifies a causal relation between rainfall and poverty. In general, we subclassify the relations extracted by World Modelers readers into several categories:
Eidos extracts three relation types in this category:
- Causation: this relation captures causal dependencies between two concepts or relations. The example sentence listed above contains one such relation. Note that the polarity of the causation relation (i.e., promotion vs. inhibition) may be captured directly from the relation predicate, e.g., through verbs such as promotes or advances, or indirectly from the states of the participating concepts, e.g., when the increase of one concept causes the increase of another, the polarity of the relation is positive.
- Influence: this relation is a weaker form of the above Causation relation, used when the polarity of the causal relationship cannot be inferred, e.g., from texts such as A influences B.
- Correlation: this relation captures correlations between two concepts or relations, e.g., from statements such as A correlates with B, or A grows with B.
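The indirect way of deriving polarity from concept states can be sketched as follows. This is an illustration, not the Eidos implementation. The text only states the increase/increase case; treating polarity as the product of the two state directions is an assumption made here, and the names `STATE_SIGN` and `polarity_from_states` are hypothetical.

```python
# Direction of each state, encoded as a sign (assumed encoding).
STATE_SIGN = {"increase": +1, "decrease": -1}


def polarity_from_states(cause_state, effect_state):
    """Infer relation polarity from the states of cause and effect.

    Assumption: states moving in the same direction give a positive
    (promoting) relation, opposite directions a negative (inhibiting) one.
    """
    product = STATE_SIGN[cause_state] * STATE_SIGN[effect_state]
    return "positive" if product > 0 else "negative"


# The increase of one concept causes the increase of another.
print(polarity_from_states("increase", "increase"))
# Decreased rainfall causing increased poverty, as in the example sentence.
print(polarity_from_states("decrease", "increase"))
```

Under this assumption the example sentence yields a negative polarity, since rainfall and poverty move in opposite directions.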
As mentioned above, Eidos extracts these three types of relations when they occur between concepts or relations. When the participants in these relations are other relations, Eidos aims to "flatten" the extracted representation. For example, from a statement such as the increase of A due to B causes the decrease of C, Eidos extracts two causal relations conceptually represented as, first: B promotes A, and second: (B promotes A) inhibits C. The latter relation is flattened by using the effect of the inner relation (A) as the cause of the second: A inhibits C.
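The flattening step described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the Eidos implementation: the `Concept` and `Relation` classes and the `flatten` function are hypothetical, and only nesting on the cause side is handled, since that is the case the text describes.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass(frozen=True)
class Concept:
    text: str


@dataclass(frozen=True)
class Relation:
    cause: Union["Concept", "Relation"]
    predicate: str  # e.g. "promotes" or "inhibits"
    effect: Concept  # assumed to always be a concept in this sketch


def flatten(relation: Relation) -> List[Relation]:
    """Return relations whose participants are all concepts.

    A nested cause is replaced by the effect of the inner relation,
    and the inner relation is kept as a separate entry.
    """
    flat: List[Relation] = []
    cause = relation.cause
    if isinstance(cause, Relation):
        inner = flatten(cause)
        flat.extend(inner)
        # The last entry is the flattened inner relation; reuse its effect.
        cause = inner[-1].effect
    flat.append(Relation(cause, relation.predicate, relation.effect))
    return flat


a, b, c = Concept("A"), Concept("B"), Concept("C")
# "the increase of A due to B causes the decrease of C"
nested = Relation(Relation(b, "promotes", a), "inhibits", c)
for r in flatten(nested):
    print(r.cause.text, r.predicate, r.effect.text)
```

For the statement in the text, this produces the two relations B promotes A and A inhibits C.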
The BBN reader TODO: add content here
The Pitt/CMU reader TODO: add content here
The BBN reader TODO: add content here
The Pitt/CMU reader TODO: add content here
The University of Arizona reader currently does not extract explicit similarity or coreference relations. However, some of this information is captured implicitly through grounding. For example, both rain and snow are grounded to the precipitation node in the Eidos taxonomy, which indicates that they are similar, according to this taxonomy.
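As a rough illustration of similarity captured implicitly through grounding, the sketch below treats two mentions as similar when they ground to the same taxonomy node. It deliberately replaces the statistical grounding used by Eidos with an exact lookup against the listed examples, so it is a simplification. The names `EXAMPLES`, `build_index`, and `implicitly_similar` are hypothetical.

```python
# Hypothetical miniature of the taxonomy: node name -> example phrases.
EXAMPLES = {
    "precipitation": ["precipitation", "rain", "rainfall", "snow"],
}


def build_index(examples_by_node):
    """Map each example phrase (lowercased) to the node it is listed under."""
    index = {}
    for node, examples in examples_by_node.items():
        for example in examples:
            index[example.lower()] = node
    return index


def implicitly_similar(mention_a, mention_b, index):
    """True when both mentions ground to the same node via exact lookup."""
    node_a = index.get(mention_a.lower())
    node_b = index.get(mention_b.lower())
    return node_a is not None and node_a == node_b


INDEX = build_index(EXAMPLES)
print(implicitly_similar("rain", "snow", INDEX))
print(implicitly_similar("rain", "hail", INDEX))
```

With exact lookup, rain and snow share the precipitation node, while hail is not found because it is not among the listed examples. A statistical grounder of the kind described earlier could still ground hail to precipitation.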
The BBN reader TODO: add content here
The Pitt/CMU reader TODO: add content here
Eidos does not currently extract any relations in this category.