Zheng Tang edited this page Aug 21, 2021 · 4 revisions

Finders

While Eidos is very much a causal information extraction framework, the various things that are extracted are somewhat modular and can be enabled or disabled. They are enabled by being included in the list in EidosSystem.finders in eidos.conf. The default (as of this writing) is:

EidosSystem.finders = ["gazetteer", "rulebased", "geonorm", "timenorm", "context", "causal", "seasons"]

With everything currently available enabled, Eidos will find:

  • GradableAdjectives (and adverbs), which we refer to as Quantifiers
  • Properties (e.g., price, weight, etc.). These are fairly specific to the WorldModeler's use case
  • Locations (gazetteer)
  • Noun phrase and verb phrase "entities" (to serve as the potential causes and effects)
  • Geolocations from the geonorm model
  • Temporal mentions from the timenorm model
  • Rule-based temporal expressions
  • Causal relations
  • Season mentions

These are found through a series of Finders, which are applied sequentially in the order given in EidosSystem.finders, mentioned above. Here are more details about each of the possible finders that can be included, as well as how to configure and extend them.
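To make the idea of sequentially applied finders concrete, here is a minimal sketch in Python. It is not Eidos code: the `Mention` type, the `REGISTRY` mapping, and both toy finders are hypothetical stand-ins, and the only thing it is meant to show is that each finder sees the mentions produced by the finders listed before it.

```python
from typing import Callable, Dict, List, NamedTuple


class Mention(NamedTuple):
    label: str
    text: str


# Hypothetical finder signature: the text plus the mentions found so far.
Finder = Callable[[str, List[Mention]], List[Mention]]


def quantifier_finder(text: str, found: List[Mention]) -> List[Mention]:
    # Toy stand-in for a lexicon lookup.
    return [Mention("Quantifier", w) for w in text.split() if w == "severe"]


def causal_finder(text: str, found: List[Mention]) -> List[Mention]:
    # Toy stand-in: fires only if an earlier finder left something to build on.
    if found and "causes" in text.split():
        return [Mention("Causal", text)]
    return []


# Hypothetical mapping from names in the config list to finder functions.
REGISTRY: Dict[str, Finder] = {
    "gazetteer": quantifier_finder,
    "causal": causal_finder,
}


def run_finders(names: List[str], text: str) -> List[Mention]:
    """Apply the named finders in list order, accumulating their mentions."""
    mentions: List[Mention] = []
    for name in names:
        mentions.extend(REGISTRY[name](text, mentions))
    return mentions
```

With this toy setup, `run_finders(["gazetteer", "causal"], ...)` lets the causal stand-in build on the quantifier mention, while reversing the list leaves it with nothing to work from.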

gazetteer

By including gazetteer in the enabled finders, you initialize a GazetteerEntityFinder, which uses the LexiconNER from clulab processors to load one or more lexicons and then creates an Odin mention for each term found. Multi-word expressions are supported. You can specify which lexicons or gazetteers to use by modifying gazetteers.lexicons in the config (currently, this is not in eidos.conf, as we don't override the default, which is set in reference.conf). Each entry in that list is a path to a lexicon in the resources directory. The name of the lexicon file will be the label of the corresponding Mentions. Currently, we use these lexicons to find Quantifiers, some Locations, and Properties (though these aren't really in use). Note: of all the finders, this is the easiest to extend, as you can simply add entries to an existing lexicon or add new lexicons.
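The lexicon lookup described above can be sketched roughly as follows. This is an illustration in Python, not the LexiconNER implementation: the function names and the lexicon paths in the usage note are hypothetical, and preferring the longest matching entry at each position is an assumption of this sketch. It shows the two behaviors the text mentions, multi-word entries and labels taken from the lexicon file name.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Tuple


def label_from_path(path: str) -> str:
    # The label is the lexicon file name without its extension.
    return PurePosixPath(path).stem


def match_lexicons(
    tokens: List[str], lexicons: Dict[str, List[List[str]]]
) -> List[Tuple[str, int, int]]:
    """Return (label, start, end) token spans, end exclusive.

    Assumption of this sketch: at each position the longest matching
    entry wins, and matching resumes after the matched span.
    """
    spans = []
    i = 0
    while i < len(tokens):
        best = None  # (label, length) of the longest match starting at i
        for path, entries in lexicons.items():
            for entry in entries:
                n = len(entry)
                if n > 0 and tokens[i:i + n] == entry:
                    if best is None or n > best[1]:
                        best = (label_from_path(path), n)
        if best is None:
            i += 1
        else:
            spans.append((best[0], i, i + best[1]))
            i += best[1]
    return spans
```

For example, with a hypothetical `lexicons/Quantifier.tsv` containing "severe" and "very severe", the tokens of "very severe flooding" yield a single Quantifier span covering the first two tokens rather than a shorter match on "severe" alone.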

rulebased

By including rulebased in the enabled finders, you initialize a RuleBasedEntityFinder. This is a customized implementation of the class by the same name found in processors (and could potentially be replaced with a dependency on the CustomRuleBasedEntityFinder there). It finds the noun and verb phrases that will serve as the potential causes and effects. Because of certain design aspects of Odin (you can't write a rule that has a Mention point back at itself), we avoid certain lexical items (e.g., words we will later use as triggers). The grammars for this finder, covering both what to extract and what to avoid, are in the resources.
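One way to picture the "avoid" step is as a filter over candidate spans. The sketch below is a simplification in Python under stated assumptions, not the Eidos or Odin logic: in Eidos the avoided items come from a grammar rather than a word set, and this sketch simply drops any candidate that overlaps an avoided token, which may differ from what the real finder does. All names here are hypothetical.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int]  # token offsets, end exclusive


def overlaps(a: Span, b: Span) -> bool:
    # Two half-open spans overlap if each starts before the other ends.
    return a[0] < b[1] and b[0] < a[1]


def avoid_spans(tokens: List[str], avoid_words: Set[str]) -> List[Span]:
    # Mark every token that matches an avoided word (case-insensitive).
    return [(i, i + 1) for i, tok in enumerate(tokens) if tok.lower() in avoid_words]


def filter_entities(candidates: List[Span], avoided: List[Span]) -> List[Span]:
    # Keep only candidates that do not touch any avoided span.
    return [c for c in candidates if not any(overlaps(c, a) for a in avoided)]
```

Given the tokens of "heavy rainfall causes severe flooding" and "causes" as an avoided word, a candidate covering "causes" is discarded while the phrases on either side are kept as potential cause and effect.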

geonorm

This enables the neural geonorm system, which finds mentions of Geolocations and normalizes them to a geonames entry. Please see here for more detail.

timenorm

This enables the neural timenorm system, which finds Temporal mentions and normalizes them to a timenorm entry. Please see here for more detail.

context

This enables a lightweight rule-based backoff for time and location mentions. It applies rules given in context.yml.

seasons

This finder initializes a SeasonFinder, which uses both lexical triggers and a knowledge base of season starts and ends (specific to a location) to find and normalize mentions of seasons in text. Please see this wiki for more information.

causal

Finally, right? This is the finder that uses the rules imported in master.yml to find causal relation mentions. Note that this is run last so it has access to all the mentions found by earlier finders.