Semantic parsing for the browser.
Pars' 'Em!, or parsem, is a minimal semantic parsing framework written
in JavaScript and designed for the web browser. Semantic parsing
converts natural language to a machine-interpretable format. Pars' 'Em!
currently uses a chart-parsing approach, which is well suited to
small-domain applications that need to run in resource-constrained
environments. If you're looking to tackle a larger domain and have ample
compute power available to your application, you may want to consider
a deep learning approach.
The key design characteristics of Pars' 'Em! are:
- composability: parsers should be reusable and composable, so that a parser for one application may be reused in another or used to build another parser.
- lightweight: Pars' 'Em! targets small use cases that need a prototype that can be built fast and will run fast, rather than targeting accuracy or breadth of coverage.
- in the client: Pars' 'Em! is built in JavaScript so that it can run in a web client or even on a static page.
To dive in, check out the Quickstart guide below. If you want to test out what the library can currently do, see the Demo section. To contribute, see Contributing. For other queries, take a look at Contact. Finally, if you'd like a more in-depth introduction to semantic parsing, check out the SippyCup Jupyter notebooks, and if you want a more fully featured semantic parsing framework, take a look at SEMPRE.
You can find a demo of Pars' 'Em! here.
If you want to use this library, you'll have to vendor it and read some source code; however, this section will give you a quick taste of what it's like.
The key ideas in Pars' 'Em! are parses, parsers, and rankers. Parsers take strings and produce parses, and rankers take all the parses generated by a parser and re-rank them so that the best parses come first. We'll walk through these topics in more depth.
The Parse and Parser classes are defined in
parse.js. See the source code for their documentation.
An instance of Parse represents one interpretation for a piece of
text, and has a computeDenotation method that computes something based
on how the parse interprets the text.
An instance of Parser exposes a method, Parser.parse, that maps
strings to parses.
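The relationship between the two classes can be sketched with a pair of stand-ins. This is a minimal illustration only: SketchParse and SketchDigitParser are hypothetical names invented here, and the real Parse and Parser classes in parse.js may have different constructors and extra methods. The only behaviour assumed from the description above is that parse maps a string to an array of parses and that each parse has a computeDenotation method.

```javascript
// Hypothetical stand-ins for illustration only; the real Parse and Parser
// classes live in parse.js and may differ in constructor and details.
class SketchParse {
  constructor(text, value) {
    this.text = text;   // the piece of text this parse interprets
    this.value = value; // what that interpretation denotes
  }

  // Compute what this interpretation of the text denotes.
  computeDenotation() {
    return this.value;
  }
}

class SketchDigitParser {
  // Map a string to an array of parses (empty when nothing matches).
  parse(text) {
    const trimmed = text.trim();
    return /^[0-9]$/.test(trimmed)
      ? [new SketchParse(trimmed, Number(trimmed))]
      : [];
  }
}

const digitParses = new SketchDigitParser().parse(' 7 ');
const digitDenotations = digitParses.map(p => p.computeDenotation());
console.log(digitDenotations);
```

Returning an array rather than a single parse leaves room for ambiguous text to have several interpretations, and for unparseable text to have none.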
Pars' 'Em! comes with a few example semantic parsers in the
parsers directory. For example, arithmeticParser parses
natural language arithmetic expressions:
import {
arithmeticParser
} from './parsem/parsers/arithmetic';
const parses = arithmeticParser.parse("What is one plus one?");
const denotations = parses.map(p => p.computeDenotation());
In the above code, parses is an array of Parse instances for the
sentence "What is one plus one?" while denotations is an array
containing all the values that the sentence might denote (in this
case, the correct denotation is 2).
Usually, but not always, parsers are defined by a grammar.
arithmeticParser, numberParser,
and ordinalParser provide examples of parsers defined
by a grammar, while digitParser and
ignorableParser give examples of parsers defined from
scratch.
To build a parser with a grammar, you can use the Grammar and Rule
classes. For example, we could encode a simplified version of the above
example, covering expressions such as "one plus one", as:
const parser = new Grammar(
  ["$Root"],
  basicTokenizer,
  [],
  [
    new Rule(
      'root',
      '$Root', '$Num',
      x => x
    ),
    new Rule(
      'one',
      '$Num', 'one',
      () => 1
    ),
    new Rule(
      'plus',
      '$Plus', 'plus',
      () => (x, y) => x + y
    ),
    new Rule(
      'applyPlus',
      '$Num', '$Num $Plus $Num',
      (x, y, z) => y(x, z)
    )
  ]
);
Each rule consists of four parts: a tag identifying the rule, a left-hand side symbol to produce, a right-hand side sequence of symbols to match, and a function that defines the semantics of the rule. That function maps the children of the produced parse to the value of the parse itself.
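To make the role of the semantics functions concrete, here is a rough sketch of how they might compose over a parse tree for "one plus one". It does not use the library: the semantics table, node helper, and evaluate function are hypothetical names invented for this illustration, and it assumes that lexical rules receive no arguments while other rules receive the values of their children in order.

```javascript
// Hypothetical sketch: not the library's internals. The table maps each
// rule tag to the semantics function from the grammar above.
const semantics = {
  root: x => x,
  one: () => 1,
  plus: () => (x, y) => x + y,
  applyPlus: (x, y, z) => y(x, z),
};

// Build a tree node from a rule tag and its child nodes.
const node = (tag, ...children) => ({ tag, children });

// Evaluate bottom-up: compute the child values first, then pass them to
// the semantics function of the rule that produced this node.
function evaluate(n) {
  const childValues = n.children.map(evaluate);
  return semantics[n.tag](...childValues);
}

// Tree for "one plus one": $Root -> $Num -> $Num $Plus $Num
const tree = node('root',
  node('applyPlus', node('one'), node('plus'), node('one')));

console.log(evaluate(tree)); // 2
```

Note that the 'plus' rule denotes a function rather than a number, which is what lets 'applyPlus' combine its three children with y(x, z).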
See the parsers directory for more examples,
parse for documentation and source code for the Parse
and Parser base classes, and grammar for source code
and documentation on the Grammar and Rule classes.
The second major concept for Pars' 'Em! is that of a ranker. Because
parsers tend to over-generate parses for a piece of text, it's necessary
to re-rank or score the parses generated by a parser. The
Ranker base class defines the interface a ranker should
make available. In particular, rankers can learn from data to rank
parses using their fit method, and then given a piece of text produce
either the highest scoring parse (Ranker.topParse) or the full list of
ranked parses (Ranker.scoresAndParses).
See LinearRanker for an example.
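As a rough illustration of the ranking idea, the sketch below scores candidates with a weighted sum of features and sorts them from best to worst. It is not the LinearRanker API: the feature names, the weights, the candidate objects, and the rankCandidates function are all invented for this example, and the learning step (what fit would do) is left out by hard-coding the weights.

```javascript
// Hypothetical sketch of linear scoring; the feature names, weights, and
// candidate shape are invented and are not the LinearRanker API.
const weights = { usesApplyPlus: 1.5, ruleCount: -0.1 };

// Score = sum of feature value times weight (unknown features weigh 0).
function score(features) {
  return Object.entries(features)
    .reduce((sum, [name, value]) => sum + (weights[name] || 0) * value, 0);
}

// Return [score, candidate] pairs, highest score first.
function rankCandidates(candidates) {
  return candidates
    .map(c => [score(c.features), c])
    .sort((a, b) => b[0] - a[0]);
}

// Two made-up candidate interpretations with made-up features.
const candidates = [
  { denotation: 11, features: { usesApplyPlus: 0, ruleCount: 2 } },
  { denotation: 2, features: { usesApplyPlus: 1, ruleCount: 4 } },
];

const ranked = rankCandidates(candidates);
console.log(ranked[0][1].denotation);
```

In a learned ranker, the weights would come from training data rather than being written by hand.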
This library was built as a side project, and so I have no formal plans to maintain it. If you write a pull request that has properly tested code and solves a problem that fits well with the framework, there's a good chance I'll merge it (and you can always make a fork if I don't).
If you're looking to make a pull request with the aim of merging it, please open an issue to discuss before you put in the work. All code contributed must be MIT licensed, and by making a pull request you are assigning the copyright of the code to me, so that I can incorporate it into parsem under the MIT license.
Need to get in touch? Reach out by emailing contactnick at my website.