nalourie/parsem


Pars' 'Em!

Semantic parsing for the browser.

Pars' 'Em!, or parsem, is a minimal semantic parsing framework written in JavaScript and designed for the web browser. Semantic parsing converts natural language into a machine-interpretable format. Pars' 'Em! currently uses a chart-parsing approach, which is well suited to small-domain applications that need to run in resource-constrained environments. If you're looking to tackle a larger domain and have ample compute power available to your application, you may want to consider a deep learning approach.

The key design goals of Pars' 'Em! are:

  • composable: parsers should be reusable and composable, so that a parser written for one application can be reused in another or used to build a larger parser.
  • lightweight: Pars' 'Em! targets small use cases that need a prototype that can be built quickly and will run fast, rather than targeting accuracy or breadth of coverage.
  • in the client: Pars' 'Em! is written in JavaScript so that it can run in a web client or even on a static page.

To dive in, check out the Quickstart guide below. If you want to test out what the library can currently do, see the Demo section. To contribute, see Contributing. For other queries, take a look at Contact. Finally, if you'd like a more in-depth introduction to semantic parsing, check out the SippyCup Jupyter notebooks, and if you want a more fully featured semantic parsing framework, take a look at SEMPRE.

Demo

You can find a demo of Pars' 'Em! here.

Quickstart

To use this library, you'll have to vendor it and read some of the source code; this section gives you a quick taste of what it's like.

The key ideas in Pars' 'Em! are parses, parsers, and rankers. Parsers take strings and produce parses, and rankers take all the parses generated by a parser and re-rank them so that the best parses come first. The sections below walk through each of these in more depth.

Parses and Parsers

The Parse and Parser classes are defined in parse.js. See the source code for their documentation.

An instance of Parse represents one interpretation for a piece of text, and has a computeDenotation method that computes something based on how the parse interprets the text.

An instance of Parser exposes a method, Parser.parse, that maps strings to parses.
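To make the shape of this interface concrete, here is a minimal, self-contained sketch. The names ToyParse and ToyDigitParser are hypothetical stand-ins invented for illustration; they are not the classes from parse.js, whose constructors and extra methods may differ. The sketch only assumes what is described above: a parser has a parse method that returns an array of parses, and each parse has a computeDenotation method.

```javascript
// Hypothetical stand-ins for illustration; NOT the classes from parse.js.
class ToyParse {
  constructor(value) {
    this.value = value;
  }

  // Computes the value that this interpretation of the text denotes.
  computeDenotation() {
    return this.value;
  }
}

class ToyDigitParser {
  // Maps a string to an array of parses (empty when nothing matches).
  parse(text) {
    const trimmed = text.trim();
    return /^[0-9]$/.test(trimmed) ? [new ToyParse(Number(trimmed))] : [];
  }
}

const toyParses = new ToyDigitParser().parse("7");
const toyDenotations = toyParses.map(p => p.computeDenotation());
console.log(toyDenotations); // logs an array containing the number 7
```

Returning an empty array for unparseable input, rather than throwing, is a choice made for this sketch; check parse.js for how the real Parser behaves.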

Pars' 'Em! comes with a few example semantic parsers in the parsers directory. For example, arithmeticParser parses natural language arithmetic expressions:

import {
  arithmeticParser
} from './parsem/parsers/arithmetic';

const parses = arithmeticParser.parse("What is one plus one?");
const denotations = parses.map(p => p.computeDenotation());

In the above code, parses is an array of Parse instances for the sentence "What is one plus one?" while denotations is an array containing all the values that the sentence might denote (in this case, the correct denotation is 2).

Usually, but not always, parsers are defined by a grammar. arithmeticParser, numberParser, and ordinalParser provide examples of parsers defined by a grammar, while digitParser and ignorableParser give examples of parsers defined from scratch.

To build a parser from a grammar, use the Grammar and Rule classes. For example, we could encode the example above as:

const parser = new Grammar(
  ["$Root"],
  basicTokenizer,
  [],
  [
    new Rule(
      'root',
      '$Root', '$Num',
      x => x
    ),
    new Rule(
      'one',
      '$Num', 'one',
      () => 1
    ),
    new Rule(
      'plus',
      '$Plus', 'plus',
      () => (x, y) => x + y
    ),
    new Rule(
      'applyPlus',
      '$Num', '$Num $Plus $Num',
      (x, y, z) => y(x, z)
    )
  ]
);

Each rule consists of four parts: a tag identifying the rule, a left-hand-side symbol to output, a right-hand-side sequence of symbols to match, and a function that defines the semantics of the rule. The function maps the values of the produced parse's children to the value of the parse itself.
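To see how the semantic functions in the grammar above combine, the following sketch evaluates "one plus one" by hand. It does not use the library: the {tag, children} tree shape and the evaluate helper are hypothetical, written only to illustrate bottom-up evaluation, and it assumes that lexical rules such as 'one' ignore any arguments they receive.

```javascript
// The four semantic functions from the grammar above, keyed by rule tag.
const semantics = {
  root: x => x,
  one: () => 1,
  plus: () => (x, y) => x + y,
  applyPlus: (x, y, z) => y(x, z)
};

// A hand-written parse tree for "one plus one". The {tag, children} shape is
// hypothetical and is not the library's Parse representation.
const tree = {
  tag: 'root',
  children: [{
    tag: 'applyPlus',
    children: [
      { tag: 'one', children: [] },
      { tag: 'plus', children: [] },
      { tag: 'one', children: [] }
    ]
  }]
};

// Evaluate bottom-up: compute each child's value, then apply the rule's function.
function evaluate(node) {
  const childValues = node.children.map(evaluate);
  return semantics[node.tag](...childValues);
}

const result = evaluate(tree);
console.log(result); // 2
```

Note that the 'plus' rule produces a function rather than a number, which is why 'applyPlus' can call its middle child on the other two.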

See the parsers directory for more examples, parse for documentation and source code for the Parse and Parser base classes, and grammar for source code and documentation on the Grammar and Rule classes.

Rankers

The second major concept in Pars' 'Em! is that of a ranker. Because parsers tend to over-generate parses for a piece of text, it's necessary to re-rank or score the parses generated by a parser. The Ranker base class defines the interface a ranker should make available. In particular, a ranker can learn from data to rank parses using its fit method and then, given a piece of text, produce either the highest scoring parse (Ranker.topParse) or the full list of ranked parses (Ranker.scoresAndParses).

See LinearRanker for an example.
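As a rough illustration of the idea, here is a self-contained toy ranker. It is not the library's Ranker or LinearRanker: the class ToyTagRanker, the stub parser, the plain-object parse shape { tags, denotation }, the training-data format, and all method signatures are assumptions made for this sketch. Only the method names fit, topParse, and scoresAndParses come from the description above.

```javascript
// Hypothetical toy ranker; NOT the library's Ranker or LinearRanker.
// Each "parse" here is a plain object { tags: string[], denotation: any }.
class ToyTagRanker {
  constructor(parser) {
    this.parser = parser;     // anything with parse(text) -> array of parses
    this.weights = new Map(); // rule tag -> learned weight
  }

  // Score a parse as the sum of the weights of the rule tags it used.
  score(parse) {
    return parse.tags.reduce((sum, tag) => sum + (this.weights.get(tag) || 0), 0);
  }

  // Learn from [text, expectedDenotation] pairs (an assumed format): reward
  // tags used by correct parses and penalize tags used by incorrect ones.
  fit(examples) {
    for (const [text, expected] of examples) {
      for (const parse of this.parser.parse(text)) {
        const delta = parse.denotation === expected ? 1 : -1;
        for (const tag of parse.tags) {
          this.weights.set(tag, (this.weights.get(tag) || 0) + delta);
        }
      }
    }
  }

  // All parses paired with their scores, best first.
  scoresAndParses(text) {
    return this.parser.parse(text)
      .map(parse => [this.score(parse), parse])
      .sort((a, b) => b[0] - a[0]);
  }

  // The highest scoring parse, or null when there are no parses.
  topParse(text) {
    const ranked = this.scoresAndParses(text);
    return ranked.length > 0 ? ranked[0][1] : null;
  }
}

// A stub parser that over-generates: it returns both readings of
// "one plus one times two" regardless of its input.
const stubParser = {
  parse: () => [
    { tags: ['addFirst'], denotation: 4 }, // (1 + 1) * 2
    { tags: ['mulFirst'], denotation: 3 }  // 1 + (1 * 2)
  ]
};

const ranker = new ToyTagRanker(stubParser);
ranker.fit([["one plus one times two", 3]]);
const best = ranker.topParse("one plus one times two");
console.log(best.denotation); // 3
```

The counting update in fit is a deliberately crude stand-in for real learning; see LinearRanker for how the library actually fits its weights.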

Contributing

This library was built as a side project, and so I have no formal plans to maintain it. If you write a pull request that has properly tested code and solves a problem that fits well with the framework, there's a good chance I'll merge it (and you can always make a fork if I don't).

If you're looking to make a pull request with the aim of merging it, please open an issue to discuss before you put in the work. All code contributed must be MIT licensed, and by making a pull request you are assigning the copyright of the code to me, so that I can incorporate it into parsem under the MIT license.

Contact

Need to get in touch? Reach out by emailing contactnick at my website.
