Keith Alcock edited this page Aug 23, 2021 · 8 revisions

Exporters

One key element of any good information extraction pipeline is the ability to export the information it finds. In Eidos, there are several ways to export mentions, depending on your needs. Some of these are standalone apps, while others are implemented as Exporters and used within an app by being specified in apps.conf. Here are some of the more useful apps and formats.

Apps

By far my two favorite apps for general-purpose needs are ExtractAndExport and ReconstituteAndExport. The former is used when you want to go from text to mentions, the latter when you already have mentions (as jsonld) and need to re-process and re-export them (in the same or a different format). For each of these, you can specify one or more export formats and they will magically appear. See the format descriptions below.

Other apps are plentiful and useful. Keith has done a fantastic job adding READMEs that briefly explain the purpose of each. Peruse and enjoy!

Exporter / formats

There are several ways you can export mention information, and each has its own emphasis. These can be selected for use in the apps mentioned above by listing them in apps.exportAs = [...] in apps.conf.

  • jsonld: the go-to export format, which is a proper serialization. This verbose output contains all the information about the mentions and the document from which they came.
  • serialized: produces a binary file with the Odin mentions serialized with Java serialization. This (a) doesn't include the EidosMentions, which carry a lot of the metadata, and (b) is prone to versioning issues as the code changes frequently, so use jsonld when you need a more reliable serialization.
  • grounding: produces a CSV file that can be used for evaluating system groundings. Note that this is currently likely compatible only with the flat groundings.
  • ground: extends the jsonld exporter, but grounds the mentions to the desired/specified ontology(ies) before exporting. It optionally creates a debug log with information about the groundings that were produced.
  • debugGrounding: produces a text log of the groundings produced by the SRLCompositionalGrounder. It provides less information than the groundingInsight exporter.
  • groundingInsight: produces a verbose text log of the groundings produced by the SRLCompositionalGrounder, including the semantic roles for the sentence, etc. Used for in-depth analysis of what groundings are produced and why.
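
As a minimal sketch of how the formats above might be selected, the fragment below sets apps.exportAs to two of the listed exporters. It assumes the format names are given as quoted strings exactly as they appear in the list; the choice of jsonld and grounding is only illustrative, and any other keys in apps.conf are left out.

```
# apps.conf (fragment; all other settings omitted)
# Assumption: exporter names are written as strings matching the list above.
# Illustrative choice: full jsonld output plus the CSV for grounding evaluation.
apps.exportAs = ["jsonld", "grounding"]
```

With more than one format listed, an app such as ExtractAndExport should produce output for each of them.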