Keith Alcock edited this page Sep 28, 2018 · 75 revisions

The revised JSON-LD output format for Eidos is documented in the two tables below. An example follows to illustrate some of the notation, which just mimics JSON. [] indicates an array, {} indicates a map, and @id(Word) expands to something like "@id" : "_:Word_1".

Note: Recent changes are indicated with bold text. While many values are optional, they are written in italics where it is particularly important to keep that in mind. The example further below does not yet match the tables.
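As a rough illustration of the @id(...) notation described above, the following Python sketch builds the small reference map that the tables abbreviate. The helper name `id_ref` is hypothetical, and the `_:Type_N` naming pattern is only inferred from the "_:Word_1" example; it is not a guaranteed scheme.

```python
import json


def id_ref(type_name, number):
    """Build the map abbreviated as @id(Type), e.g. {"@id": "_:Word_1"}.

    The "_:Type_N" pattern is an assumption based on the example ID above.
    """
    return {"@id": f"_:{type_name}_{number}"}


# A Dependency refers to its two Words by ID rather than embedding them.
dependency = {
    "@type": "Dependency",
    "source": id_ref("Word", 3),
    "destination": id_ref("Word", 1),
    "relation": "discourse",
}

print(json.dumps(dependency, indent=2))
```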

Name Property Type Description
Corpus @type "Corpus" A corpus is typed.
documents [Document] It has a list of documents
extractions [Extraction] and a set of mixed extractions.
Document @type "Document" A document is typed
@id IRI and provided an ID.
title string It has a title,
text string some text,
location string a human-interpretable indication of where the text was found,
sentences [Sentence] and a list of sentences.
Sentence @type "Sentence" A sentence is typed
@id IRI and provided an ID.
text string It has a text,
words [Word] a list of words,
dependencies [Dependency] a set of universal enhanced dependencies,
timexes [TimeExpression] and a list of time expressions.
Word @type "Word" A word is typed
@id IRI and provided an ID.
text string It has a text,
tag string a tag from the Penn Treebank tag set,
entity string an entity type,
startOffset integer an inclusive, 0-based index of the first letter of the word in the text,
endOffset integer an exclusive, 0-based index just past the last letter of the word,
lemma string a lemma,
chunk string a chunk,
norm string and a norm, including "B-TIME" or "I-TIME".
Dependency @type "Dependency" A dependency is typed.
source {@id(Word)} It has a source ID referring to a Word,
destination {@id(Word)} a destination ID referring to a Word,
relation string and a relation.
Extraction @type "Extraction" An extraction is typed
@id IRI and provided an ID.
type string It has a description of type from the table below,
subtype string a description of subtype from the table below,
labels [string] a list of labels,
text [string] a text,
rule [string] a rule,
canonicalName string a canonical name,
groundings [Groundings] a list of groundings,
provenance [Provenance] a set of provenance values,
trigger Trigger a trigger,
states [State] a set of states,
arguments [Argument] and a list of arguments.
Groundings @type "Groundings" A groundings is typed.
name string It has a name such as "un", "wdi", or "fao"
values [Grounding] and a list of grounding values.
Grounding @type "Grounding" A grounding is typed.
ontologyConcept string It has an ontology concept
value string and a matching value.
Provenance @type "Provenance" A "provenance" is typed.
document {@id(Document)} It has a document ID referring to a Document,
documentCharPositions Interval an interval for characters within the document,
sentence {@id(Sentence)} a sentence ID referring to a Sentence,
sentenceWordPositions Interval and an interval for words within the sentence.
Interval @type "Interval" An interval is typed.
start integer For sentenceWordPositions, this is an inclusive, 1-based index of the first word of the interval in the sentence. For documentCharPositions, it is an inclusive, 0-based index of the first character of the interval in the document.
end integer For sentenceWordPositions, this is an inclusive, 1-based index of the last word of the interval in the sentence. For documentCharPositions, it is an inclusive, 0-based index of the last character of the interval in the document.
State @type "State" A state is typed.
type string It has an Eidos type such as INC, DEC, QUANT, TIMEX, or HEDGE, or a BBN type such as TENSE, POLARITY, and so on. This list is non-exhaustive. See details below.
text string a text,
value value_type a value of type detailed below,
provenance [Provenance] a set of provenance values, currently unused,
modifiers [Modifier] and a set of modifiers.
Modifier @type "Modifier" A modifier is typed.
text string It has a text,
provenance [Provenance] a set of provenance values, currently unused,
intercept double an intercept,
mu double a mu,
sigma double and a sigma.
Trigger @type "Trigger" A trigger is typed.
text string It has a text
text string Head word of the trigger.
provenance [Provenance] and a set of provenance values.
Argument @type "Argument" An argument is typed.
type string It has a description of type from the table below,
value {@id(Extraction)} and an ID referring to the extraction.
TimeExpression @type "TimeExpression" A time expression is typed
@id IRI and provided with an ID.
startOffset integer It has a starting, zero-based character offset into the sentence text,
endOffset integer and an ending, exclusive character offset,
text string along with the actual text determined by the offsets,
intervals [TimeInterval] and a list of time intervals.
TimeInterval @type "TimeInterval" A time interval is typed
@id IRI and provided with an ID.
start string It starts (in ISO-8601 format),
end string it ends (in ISO-8601 format),
duration integer and it lasts for a number of seconds.
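The different indexing conventions in the table above (exclusive end offsets for words, inclusive 1-based word intervals, inclusive 0-based character intervals) are easy to mix up. The Python sketch below shows one way to read them. The helper names are hypothetical, the sample data is made up, and it assumes a single-sentence document so that word offsets and document character positions index into the same string.

```python
# Sample data for illustration only; field names follow the table above.
text = "Hello, world!"
words = [
    {"@id": "_:Word_1", "text": "Hello", "startOffset": 0, "endOffset": 5},
    {"@id": "_:Word_2", "text": ",", "startOffset": 5, "endOffset": 6},
    {"@id": "_:Word_3", "text": "world", "startOffset": 7, "endOffset": 12},
    {"@id": "_:Word_4", "text": "!", "startOffset": 12, "endOffset": 13},
]


def word_span(text, word):
    """Word offsets: inclusive start, exclusive end, so they map onto a slice."""
    return text[word["startOffset"]:word["endOffset"]]


def words_in_interval(words, interval):
    """sentenceWordPositions: 1-based and inclusive on both ends."""
    return words[interval["start"] - 1:interval["end"]]


def chars_in_interval(text, interval):
    """documentCharPositions: 0-based and inclusive on both ends."""
    return text[interval["start"]:interval["end"] + 1]


print(word_span(text, words[2]))
print([w["text"] for w in words_in_interval(words, {"start": 3, "end": 3})])
print(chars_in_interval(text, {"start": 7, "end": 11}))
```

All three calls select the word "world", which is a quick way to confirm that a consumer applies the right convention to each field.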
Extraction type Extraction subtype Argument type Notes
"concept" This is just something being there.
"entity" none
"event" actor
place
time
"relation" Link between concepts
"causation" This is a directed relation.
"source" This source may appear multiple times.
"destination" This destination may appear multiple times.
"precondition" The "source" concept must have occurred/begun for the "destination" concept to happen. This is a directed relation.
"source" This source may appear multiple times.
"destination" This destination may appear multiple times.
"catalyst" The "source" concept increases the intensity of the "destination" concept, but does not cause the "destination" concept. This is a directed relation.
"source" This source may appear multiple times.
"destination" This destination may appear multiple times.
"mitigation" The "source" concept decreases the intensity of the "destination" concept, but does not cause the "destination" concept. This is a directed relation.
"source" This source may appear multiple times.
"destination" This destination may appear multiple times.
"prevention" The "source" concept will stop or prevent the 'destination' concept. This is a directed relation.
"source" This source may appear multiple times.
"destination" This destination may appear multiple times.
"temporallyPrecedes" The "source" concept occurs before the 'destination' concept. This is a directed relation.
"source" This source may appear multiple times.
"destination" This destination may appear multiple times.
"correlation" This is an undirected relation.
"argument" The argument must appear multiple times.
"unification" This is a directed relation.
"group" The argument must appear only once.
"member" The argument must appear only once.
"coreference" This is a directed relation.
"anchor" The argument must appear only once.
"reference" The argument must appear only once.
State type Value type Notes
INC Indicates that the Concept has been increased.
DEC Indicates that the Concept has been decreased.
QUANT The gradable adjective associated with the Concept.
TIMEX {@id(TimeExpression)} This is an ID referring to a time expression.
HEDGE Indicates that the Extraction has been hedged.
polarity Negative when it is explicitly indicated that the Event did not occur. All other Events are Positive.
modality Asserted when the author or speaker makes reference to it as though it were a real occurrence. All other Events are Other.
genericity Specific if it is a singular occurrence at a particular place and time, or a finite set of such occurrences. All other Events are Generic.
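The argument constraints listed for each relation subtype in the extraction type table above can be checked mechanically. The following Python sketch is one possible reading, not part of Eidos: the function name `check_arguments` is hypothetical, "must appear multiple times" is interpreted as at least two, and "must appear only once" as exactly one.

```python
from collections import Counter

# Directed relation subtypes whose arguments are "source" and "destination".
DIRECTED_SUBTYPES = {
    "causation", "precondition", "catalyst",
    "mitigation", "prevention", "temporallyPrecedes",
}

# Subtypes whose arguments must each appear only once (read here as exactly once).
ONCE_ONLY = {
    "unification": ("group", "member"),
    "coreference": ("anchor", "reference"),
}


def check_arguments(extraction):
    """Return a list of problems; an empty list means nothing was flagged."""
    subtype = extraction.get("subtype")
    counts = Counter(arg["type"] for arg in extraction.get("arguments", []))
    problems = []

    if subtype in DIRECTED_SUBTYPES:
        allowed = {"source", "destination"}  # each may appear multiple times
    elif subtype == "correlation":
        allowed = {"argument"}
        if counts["argument"] < 2:  # assumption: "multiple" means two or more
            problems.append("correlation needs at least two 'argument' entries")
    elif subtype in ONCE_ONLY:
        allowed = set(ONCE_ONLY[subtype])
        for name in ONCE_ONLY[subtype]:
            if counts[name] != 1:
                problems.append(f"{subtype} needs exactly one '{name}' argument")
    else:
        return problems  # subtypes without listed constraints are not checked

    for name in sorted(set(counts) - allowed):
        problems.append(f"unexpected argument type '{name}' for {subtype}")
    return problems


causation = {
    "subtype": "causation",
    "arguments": [{"type": "source"}, {"type": "destination"}],
}
print(check_arguments(causation))
```

Returning a list of messages rather than raising keeps the check usable as a lint pass over a whole corpus.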

This example shows valid JSON-LD syntax and links between elements, even though the linguistic analysis is fabricated.

{
  "@context" : {
    "Argument" : "https://w3id.org/wm/cag/Argument",
    "Corpus" : "https://w3id.org/wm/cag/Corpus",
    "Dependency" : "https://w3id.org/wm/cag/Dependency",
    "Document" : "https://w3id.org/wm/cag/Document",
    "Extraction" : "https://w3id.org/wm/cag/Extraction",
    "Interval" : "https://w3id.org/wm/cag/Interval",
    "Modifier" : "https://w3id.org/wm/cag/Modifier",
    "Provenance" : "https://w3id.org/wm/cag/Provenance",
    "Sentence" : "https://w3id.org/wm/cag/Sentence",
    "State" : "https://w3id.org/wm/cag/State",
    "Trigger" : "https://w3id.org/wm/cag/Trigger",
    "Word" : "https://w3id.org/wm/cag/Word"
  },
  "@type" : "Corpus",
  "documents" : [ {
    "@type" : "Document",
    "@id" : "_:Document_1",
    "title" : "Example Document",
    "sentences" : [ {
      "@type" : "Sentence",
      "@id" : "_:Sentence_1",
      "text" : "Hello, world!",
      "words" : [ {
        "@type" : "Word",
        "@id" : "_:Word_1",
        "text" : "Hello",
        "tag" : "UH",
        "entity" : "O",
        "startOffset": 0,
        "endOffset" : 5,
        "lemma" : "hello",
        "chunk" : "B-ADVP"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_2",
        "text" : ",",
        "tag" : ",",
        "entity" : "O",
        "startOffset" : 5,
        "endOffset" : 6,
        "lemma" : ",",
        "chunk" : "O"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_3",
        "text" : "world",
        "tag" : "NN",
        "entity" : "O",
        "startOffset" : 7,
        "endOffset" : 12,
        "lemma" : "world",
        "chunk" : "B-NP"
      }, {
        "@type" : "Word",
        "@id" : "_:Word_4",
        "text" : "!",
        "tag" : ".",
        "entity" : "O",
        "startOffset" : 12,
        "endOffset" : 13,
        "lemma" : "!",
        "chunk" : "O"
      } ],
      "dependencies" : [ {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_3"
        },
        "destination" : {
          "@id" : "_:Word_1"
        },
        "relation" : "discourse"
      }, {
        "@type" : "Dependency",
        "source" : {
          "@id" : "_:Word_3"
        },
        "destination" : {
          "@id" : "_:Word_2"
        },
        "relation" : "punct"
      } ]
    } ]
  } ],
  "extractions" : [ {
    "@type" : "Extraction",
    "@id" : "_:Extraction_1",
    "type" : "concept",
    "subtype" : "entity",
    "labels" : [ "NounPhrase", "Entity" ],
    "text" : "world",
    "rule" : "simple-np",
    "canonicalName" : "world",
    "grounding" : [ {
      "@type" : "Grounding",
      "ontologyConcept" : "/entities/human/livelihood",
      "value" : 0.47524851930210044
    }, {
      "@type" : "Grounding",
      "ontologyConcept" : "/entities/human/financial/economic/economy",
      "value" : 0.4713680118187502
    } ],
    "provenance" : [ {
      "@type" : "Provenance",
      "document" : {
        "@id" : "_:Document_1"
      },
      "sentence" : {
        "@id" : "_:Sentence_1"
      },
      "positions" : {
        "@type" : "Interval",
        "start" : 3,
        "end" : 3
      }
    } ],
    "states" : [ {
      "@type" : "State",
      "type" : "INC",
      "text" : "Hello",
      "modifiers" : [ {
        "@type" : "Modifier",
        "text" : "world",
        "intercept" : 0.6154,
        "mu" : 1.034E-5,
        "sigma" : -0.001123
      } ]
    } ]
  }, {
    "@type" : "Extraction",
    "@id" : "_:Extraction_2",
    "type" : "relation",
    "subtype" : "causation",
    "labels" : [ "Causal", "DirectedRelation", "EntityLinker", "Event" ],
    "text" : "Hello",
    "rule" : "dueToSyntax1-Causal",
    "canonicalName" : "hello",
    "provenance" : [ {
      "@type" : "Provenance",
      "document" : {
        "@id" : "_:Document_1"
      },
      "sentence" : {
        "@id" : "_:Sentence_1"
      },
      "positions" : {
        "@type" : "Interval",
        "start" : 1,
        "end" : 1
      }
    } ],
    "trigger" : {
      "@type" : "Trigger",
      "text" : "world",
      "provenance" : [ {
        "@type" : "Provenance",
        "document" : {
          "@id" : "_:Document_1"
        },
        "sentence" : {
          "@id" : "_:Sentence_1"
        },
        "positions" : {
          "@type" : "Interval",
          "start" :3,
          "end" : 3
        }
      } ]
    },
    "arguments" : [ {
      "@type" : "Argument",
      "type" : "source",
      "value" : {
        "@id" : "_:Extraction_1"
      }
    }, {
      "@type" : "Argument",  
      "type" : "destination",
      "value" : {
        "@id" : "_:Extraction_3"
      }
    } ]
  }, {
    "@type" : "Extraction",
    "@id" : "_:Extraction_3",
    "type" : "relation",
    "subtype" : "correlation",
    "labels" : [ "Correlation", "UndirectedRelation", "EntityLinker", "Event" ],
    "text" : "world",
    "rule" : "dueToSyntax1-Causal",
    "canonicalName" : "world",
    "provenance" : [ {
      "@type" : "Provenance",
      "document" : {
        "@id" : "_:Document_1"
      },
      "sentence" : {
        "@id" : "_:Sentence_1"
      },
      "positions" : {
        "@type" : "Interval",
        "start" : 3,
        "end" : 3
      }
    } ],
    "trigger" : {
      "@type" : "Trigger",
      "text" : "Hello",
      "provenance" : [ {
        "@type" : "Provenance",
        "document" : {
          "@id" : "_:Document_1"
        },
        "sentence" : {
          "@id" : "_:Sentence_1"
        },
        "positions" : {
          "@type" : "Interval",
          "start" : 1,
          "end" : 1
        }
      } ]
    },
    "arguments" : [ {
      "@type" : "Argument",
      "type" : "argument",
      "value" : {
        "@id" : "_:Extraction_2"
      }
    }, {
      "@type" : "Argument",
      "type" : "argument",
      "value" : {
        "@id" : "_:Extraction_1"
      }
    } ]
  } ]
}
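Because elements refer to each other by ID, a consumer usually needs an index from "@id" to the full element before it can follow the links. The Python sketch below shows one simple way to do that with the standard `json` module. It uses a trimmed-down corpus in the same shape as the example above, the helper names `index_by_id` and `resolve` are hypothetical, and it assumes that a map containing only "@id" is a reference rather than a definition.

```python
import json

# A trimmed-down corpus in the same shape as the example above.
corpus_json = """
{
  "@type": "Corpus",
  "documents": [ {
    "@type": "Document", "@id": "_:Document_1",
    "sentences": [ {
      "@type": "Sentence", "@id": "_:Sentence_1", "text": "Hello, world!"
    } ]
  } ],
  "extractions": [ {
    "@type": "Extraction", "@id": "_:Extraction_1",
    "type": "concept", "text": "world"
  }, {
    "@type": "Extraction", "@id": "_:Extraction_2",
    "type": "relation", "subtype": "causation",
    "arguments": [ {
      "@type": "Argument", "type": "source",
      "value": { "@id": "_:Extraction_1" }
    } ]
  } ]
}
"""


def index_by_id(node, index=None):
    """Walk the parsed JSON and map each "@id" to the element that defines it."""
    if index is None:
        index = {}
    if isinstance(node, dict):
        # Assumption: a map holding only "@id" is a reference, not a definition.
        if "@id" in node and len(node) > 1:
            index[node["@id"]] = node
        for value in node.values():
            index_by_id(value, index)
    elif isinstance(node, list):
        for item in node:
            index_by_id(item, index)
    return index


def resolve(reference, index):
    """Follow a {"@id": ...} reference to the element it names."""
    return index[reference["@id"]]


corpus = json.loads(corpus_json)
index = index_by_id(corpus)
relation = index["_:Extraction_2"]
sources = [
    resolve(arg["value"], index)
    for arg in relation["arguments"]
    if arg["type"] == "source"
]
print([source["text"] for source in sources])
```

A reference to an ID that is never defined raises a `KeyError` in `resolve`, which also makes this a cheap way to detect dangling links.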
