Ben Newsom edited this page Jun 10, 2013 · 4 revisions

The advances made in artificial intelligence (under the guise of expert systems, machine learning, the Semantic Web, and so forth) have made an undeniable impact on our relationship with data. Yet the oppressive volume of data continues to overwhelm us. One reason, perhaps the most important, is the oft-lamented “semantic gap” between computational models built from bottom-up representations and the contextual structures that best approximate what humans use to interpret the data. That is, systems that attempt to perform inference from bottom-up data flows, generally through machine learning techniques, are unable to map into the complex cognitive frameworks that end users apply to problem solving. Conversely, symbolic systems have formal logical languages that are so far removed from statistical models that translation from one to the other remains a mystery. Textual data, found in abundance on the Web, is perhaps the most vivid illustration of this challenge: despite the advances in automated information extraction, keyword search is still the prevailing method for end users to work this data into their reasoning processes.

Consider the example of an analyst in search of evidence of relief claim fraud during a disaster response operation. In this type of scheme, recruiters enlist people to submit false claims to relief organizations; the claimants then give the recruiters a kick-back when they cash their relief checks. Clearly, searching on terms such as “kick-back” will fail to yield the information sought, for the evidence in this case is concealed (often deliberately) in a mountain of unrelated documents. Nevertheless, this type of fraud scheme follows patterns of roles, relationships, and activities that are well understood. The analyst requires a method for semantically targeting the search for these patterns.

Next Century Corporation is developing text analytic technology that crosses the semantic gap at the shallow (but broad) area of event representation. The Event Representation and Structuring of Text (EVEREST) system will search for mappings to a semantic event model, interactively suggesting evidence for the occurrence of whole or partial events for human analysis and reporting. Our semantic targeting approach extends the ideas of Open Information Extraction, Event Web, Semantic Web, and the Ozone Widget Framework. We believe that an event-centric approach will be critical for generating narratives that confer meaning upon large, complex, uncertain, and incomplete data sets.
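
To make the idea of mapping text to a semantic event model more concrete, the following is a minimal sketch in Python, not EVEREST's actual implementation. It assumes that an upstream Open Information Extraction step has already produced (subject, relation, object) triples, and it scores how much of a hand-written event model those triples cover. The names `Triple`, `EventModel`, `match_event`, and the slot and relation labels are all hypothetical, chosen only to illustrate the relief claim fraud example.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Triple:
    """A (subject, relation, object) assertion, assumed to come from an Open IE step."""
    subject: str
    relation: str
    obj: str


@dataclass
class EventModel:
    """Hypothetical event model: each slot lists relation phrases that can fill it."""
    name: str
    slots: Dict[str, Set[str]]


def match_event(model: EventModel, triples: List[Triple]) -> Tuple[float, Dict[str, List[Triple]]]:
    """Return the fraction of slots with supporting triples, plus the evidence per slot."""
    evidence: Dict[str, List[Triple]] = {}
    for slot, relations in model.slots.items():
        hits = [t for t in triples if t.relation in relations]
        if hits:
            evidence[slot] = hits
    coverage = len(evidence) / len(model.slots)
    return coverage, evidence


# Illustrative model of the fraud pattern described above (labels are assumptions).
RELIEF_FRAUD = EventModel(
    name="relief_claim_fraud",
    slots={
        "recruitment": {"recruited", "enlisted"},
        "claim_submission": {"submitted claim to", "filed claim with"},
        "kickback_payment": {"paid", "gave cash to"},
    },
)

# Made-up triples standing in for extraction output.
triples = [
    Triple("J. Doe", "recruited", "A. Smith"),
    Triple("A. Smith", "submitted claim to", "Relief Org"),
    Triple("Relief Org", "opened office in", "Springfield"),
]

coverage, evidence = match_event(RELIEF_FRAUD, triples)
for slot, hits in evidence.items():
    print(slot, "<-", [(t.subject, t.relation, t.obj) for t in hits])
print("slots covered:", len(evidence), "of", len(RELIEF_FRAUD.slots))
```

Here two of the three slots find support, so the match is a partial event: the kind of candidate that would be suggested to an analyst for review rather than reported as a confirmed finding. Exact string matching on relation phrases is a deliberate simplification; a real system would need a far more tolerant mapping from text to model.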
