energid-nlp

This is an open-source version of Energid's natural language processing (NLP) Python code. It provides the following components:

A DMAP-style parser, energid_nlp.parser.ConceptualParser. Parses text into concepts based on a fixed grammar.
An ICP-style parser, energid_nlp.parser.IndexedConceptParser. Parses text into best-matching concepts with more grammatical flexibility.
A simple conceptual memory implemented as a propositional knowledge base, energid_nlp.logic.PropKB, similar to a classic frame system. This is where the concepts that can be returned by the parsers are defined.

Installation and usage

To run unit tests:

python setup.py test

To install:

python setup.py install

To try the interactive parser with a test grammar and knowledge base:

$ python -m energid_nlp.parser energid_nlp/tests/test.fdl
? hog let us restart the talk on
CP:
[{'base': 'c-action-request',
  'slots': {'action': {'base': 'c-restart'},
            'addressee': {'base': 'c-hog'},
            'object': {'base': 'c-talkon'}}}]

To run the grammar and KB tests:

$ PYTHONPATH=. python -m energid_nlp.parser -t energid_nlp/tests/test.fdl
Class: c-action-request
  hog let us restart the talk on                                         ==> ok
  hog, let's start over with the talkon                                  ==> ok
Class: c-please-action-request
  please, hog, restart talk on                                           ==> ok
Ran 3 parse tests, 0 failed.

Glossary

Description

Descriptions (energid_nlp.logic.Description) are returned as the results of parsing. You can think of them as being like "frame literals". They can represent a single frame/concept without actually existing in memory. Descriptions consist of a class and a set of slots and slot values.

One thing you can do with Descriptions is find matching concepts in memory. In the following example, memory contains a concept representing Petunia, a small, gray cat, and we find her by creating a description of small cats:

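from energid_nlp import logic

# Build a knowledge base and assert three facts about Petunia.
kb = logic.PropKB()
kb.tell(logic.expr('ISA(Petunia, Cat)'))
kb.tell(logic.expr('Size(Petunia, Small)'))
kb.tell(logic.expr('Color(Petunia, Gray)'))

# Describe "a small cat" and find every concept in memory that matches.
d = logic.Description('Cat', {'Size': 'Small'})
d.find_all(kb)
==> [<Expr: 'Petunia'>]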

Direct Memory Access Parsing (DMAP)

An approach to parsing in which phrasal patterns are attached to memory structures representing concepts. Text is parsed by a recursive process of recognizing phrasal patterns in the text and constructing descriptions of the corresponding concepts. See Charles Martin's 1990 Ph.D. dissertation, Direct Memory Access Parsing.
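
The idea of attaching phrasal patterns to concepts and recognizing them recursively can be sketched in a few lines of plain Python. This is a toy illustration only: it does not use energid_nlp, and it is not the algorithm in ConceptualParser. The PATTERNS table, the match function, and the concept name c-restart-request are all hypothetical. The sketch also matches top-down from a known concept and does no backtracking, which is a deliberate simplification. The result mimics the 'base'/'slots' shape of the parser output shown under "Installation and usage".

```python
# Phrasal patterns attached to concepts. A string is a literal word; a
# (slot, concept) tuple means "recognize that concept here and store it
# in the named slot". All names here are hypothetical.
PATTERNS = {
    "c-hog": [["hog"]],
    "c-talkon": [["talk", "on"], ["talkon"]],
    "c-restart-request": [["restart", "the", ("object", "c-talkon")]],
}


def match(concept, tokens, start=0):
    """Try each pattern of `concept` at tokens[start:].

    Returns (description, next_position) for the first pattern that
    matches, or None if no pattern matches.
    """
    for pattern in PATTERNS.get(concept, []):
        pos, slots = start, {}
        for element in pattern:
            if isinstance(element, tuple):
                slot, filler = element
                sub = match(filler, tokens, pos)  # recurse into the sub-concept
                if sub is None:
                    break
                slots[slot], pos = sub
            elif pos < len(tokens) and tokens[pos] == element:
                pos += 1
            else:
                break
        else:  # no break: every element of the pattern was recognized
            description = {"base": concept}
            if slots:
                description["slots"] = slots
            return description, pos
    return None


result = match("c-restart-request", "restart the talk on".split())
print(result)
# ({'base': 'c-restart-request', 'slots': {'object': {'base': 'c-talkon'}}}, 4)
```

The second element of the returned pair is the position of the first unconsumed token, so a caller can check whether the whole input was recognized.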

Frame

A data structure used to represent concepts in memory. Frames have a class (type), and a set of slots and slot values. Slot values can be any data type, including other frames.

IS-A relationships are supported, and slots can be defined to be inheritable.
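
As a rough illustration of the structure described above, here is a minimal frame sketch in plain Python. It is not the energid_nlp implementation: the Frame class and its isa and get methods are hypothetical, and for simplicity every slot is treated as inheritable, whereas the library lets you choose which slots inherit.

```python
class Frame:
    """A toy frame: a name, an optional IS-A parent, and local slots."""

    def __init__(self, name, parent=None, slots=None):
        self.name = name
        self.parent = parent
        self.slots = dict(slots or {})

    def isa(self, other):
        """True if this frame is `other` or descends from it."""
        frame = self
        while frame is not None:
            if frame is other:
                return True
            frame = frame.parent
        return False

    def get(self, slot):
        """Look up a slot locally, then up the IS-A chain."""
        frame = self
        while frame is not None:
            if slot in frame.slots:
                return frame.slots[slot]
            frame = frame.parent
        return None


animal = Frame("c-animal", slots={"alive": True})
cat = Frame("c-cat", parent=animal, slots={"legs": 4})
petunia = Frame("i-petunia", parent=cat,
                slots={"name": "petunia", "size": "small"})

print(petunia.isa(animal))   # True
print(petunia.get("legs"))   # 4, inherited from c-cat
print(petunia.get("color"))  # None, not defined on any ancestor
```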

Indexed Concept Parsing (ICP)

An approach to parsing that looks for references to concepts in the input text and returns the concepts that match the text best. Practically, this means that you can get successful parses even if not every word you expect to be present is in the input text or if the input text contains unexpected words.

See Will Fitzgerald's 1994 Ph.D. dissertation, Building Embedded Conceptual Parsers.
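
The tolerance for missing and unexpected words can be illustrated with a simple word-overlap ranking. This is only a sketch of the general idea, assuming a hypothetical INDEX table and best_concepts function; it is not the scoring used by IndexedConceptParser or described in the dissertation.

```python
def best_concepts(text, index):
    """Rank concepts by the fraction of their index words found in `text`."""
    words = set(text.lower().replace(",", " ").split())
    scores = []
    for concept, index_words in index.items():
        hits = len(words & index_words)
        if hits:
            scores.append((hits / len(index_words), concept))
    # Highest score first; ties are broken alphabetically by concept name.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return scores


# Hypothetical index: each concept lists the words that suggest it.
INDEX = {
    "c-restart": {"restart", "start", "over"},
    "c-talkon": {"talk", "on", "talkon"},
    "c-hog": {"hog"},
}

ranked = best_concepts("um, hog please restart", INDEX)
for score, concept in ranked:
    print(concept, round(score, 2))
```

Here "um" and "please" are unexpected words and most of the index words for c-restart are missing, yet both c-hog and c-restart are still returned, with c-hog ranked first.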

Frame Description Language (FDL)

A way of defining frames and attaching language to them in an XML format. Here's a trivial example of defining a frame that represents a cat named Petunia, with an attached phrasal pattern that lets the input text "petunia" parse into the concept:

<frame id="i-petunia">
  <parent id="c-cat" />
  <slot name="name" value="petunia" />
  <phrase>petunia</phrase>
</frame>

See energid_nlp/fdl.xsd for the full XML schema. energid_nlp/tests/test.fdl has a more detailed example.
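
To make the structure of the frame definition above concrete, the following sketch reads that one snippet with Python's standard xml.etree.ElementTree module. It is not how energid_nlp loads FDL; the read_frame helper is hypothetical, and it handles only the elements shown in the example (frame, parent, slot, phrase). Real FDL files may contain other elements defined in the schema.

```python
import xml.etree.ElementTree as ET

FDL_FRAME = """
<frame id="i-petunia">
  <parent id="c-cat" />
  <slot name="name" value="petunia" />
  <phrase>petunia</phrase>
</frame>
"""


def read_frame(xml_text):
    """Pull the id, parents, slots, and phrases out of one <frame> element."""
    frame = ET.fromstring(xml_text)
    return {
        "id": frame.get("id"),
        "parents": [p.get("id") for p in frame.findall("parent")],
        "slots": {s.get("name"): s.get("value") for s in frame.findall("slot")},
        "phrases": [p.text for p in frame.findall("phrase")],
    }


info = read_frame(FDL_FRAME)
print(info)
# {'id': 'i-petunia', 'parents': ['c-cat'], 'slots': {'name': 'petunia'}, 'phrases': ['petunia']}
```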

License

Written by John Wiseman and Michael Hannemann.
