Lexicon learning

This package implements the PAL algorithm for lexicon learning.

Runnable Example

See the Geoquery Semantic Parsing Experiment for an example that runs PAL and trains a semantic parser with the generated lexicon.


There are three steps required to run PAL on a data set. Each step is illustrated with example code applying PAL to Geoquery.

  1. Create training examples -- Training data for PAL is provided as instances of AlignmentExample. These can be generated from questions paired with sets of logical forms, as in GeoqueryInduceLexicon.readTrainingData. When generating these examples, there are various parameters that can be adjusted to control the possible splits of logical forms.

  2. Train PAL -- This step creates a ParametricCfgAlignmentModel and runs EM to estimate its parameters. See the example code in GeoqueryInduceLexicon.trainAlignmentModel. The parameters of this step can be adjusted to run the concave or coupled models from the paper. The output of this step is a CfgAlignmentModel.

  3. Generate a lexicon -- Run CfgAlignmentModel.generateLexicon on the training examples to generate lexicon entries. These lexicon entries can be used to train a CCG semantic parser, as in GeoqueryInduceLexicon.runFold.
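
To make the shape of the three steps concrete, here is a minimal, self-contained sketch. It does not use the PAL model or this package's API. It swaps in a much simpler word-to-predicate alignment model (in the style of IBM Model 1) trained with EM. Every name in it (ToyLexiconInduction, Example, sampleData, train, generateLexicon) and the toy data are assumptions made for illustration. For the real calls, see GeoqueryInduceLexicon.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy stand-in for lexicon induction: NOT the PAL model and NOT this package's API.
class ToyLexiconInduction {

    // Hypothetical simplification of a training example: the question's words
    // plus the predicates that appear in its paired logical form.
    static final class Example {
        final List<String> words;
        final List<String> predicates;

        Example(List<String> words, List<String> predicates) {
            this.words = words;
            this.predicates = predicates;
        }
    }

    // Step 1: create training examples (hand-written toy data).
    static List<Example> sampleData() {
        List<Example> data = new ArrayList<>();
        data.add(new Example(Arrays.asList("borders", "texas"),
                Arrays.asList("pred:next_to", "const:texas")));
        data.add(new Example(Arrays.asList("borders", "ohio"),
                Arrays.asList("pred:next_to", "const:ohio")));
        data.add(new Example(Arrays.asList("capital", "texas"),
                Arrays.asList("pred:capital", "const:texas")));
        return data;
    }

    // Step 2: run EM to estimate t(word | predicate), stored as t[predicate][word].
    static Map<String, Map<String, Double>> train(List<Example> data, int iterations) {
        Map<String, Map<String, Double>> t = new HashMap<>();
        // Uniform (unnormalized) starting scores for every co-occurring pair.
        for (Example e : data)
            for (String p : e.predicates)
                for (String w : e.words)
                    t.computeIfAbsent(p, k -> new HashMap<>()).put(w, 1.0);

        for (int iter = 0; iter < iterations; iter++) {
            // E-step: split each word's unit count among the example's predicates.
            Map<String, Map<String, Double>> counts = new HashMap<>();
            for (Example e : data) {
                for (String w : e.words) {
                    double z = 0.0;
                    for (String p : e.predicates) z += t.get(p).get(w);
                    for (String p : e.predicates) {
                        double posterior = t.get(p).get(w) / z;
                        counts.computeIfAbsent(p, k -> new HashMap<>())
                              .merge(w, posterior, Double::sum);
                    }
                }
            }
            // M-step: renormalize the expected counts for each predicate.
            for (Map.Entry<String, Map<String, Double>> row : counts.entrySet()) {
                double total = 0.0;
                for (double c : row.getValue().values()) total += c;
                for (Map.Entry<String, Double> cell : row.getValue().entrySet())
                    t.get(row.getKey()).put(cell.getKey(), cell.getValue() / total);
            }
        }
        return t;
    }

    // Step 3: generate a lexicon by mapping each word to its highest-scoring predicate.
    static Map<String, String> generateLexicon(Map<String, Map<String, Double>> t) {
        Map<String, String> lexicon = new TreeMap<>();
        Map<String, Double> bestScore = new HashMap<>();
        for (Map.Entry<String, Map<String, Double>> row : t.entrySet()) {
            for (Map.Entry<String, Double> cell : row.getValue().entrySet()) {
                if (cell.getValue() > bestScore.getOrDefault(cell.getKey(), -1.0)) {
                    bestScore.put(cell.getKey(), cell.getValue());
                    lexicon.put(cell.getKey(), row.getKey());
                }
            }
        }
        return lexicon;
    }
}
```

Calling `ToyLexiconInduction.generateLexicon(ToyLexiconInduction.train(ToyLexiconInduction.sampleData(), 20))` on this toy data pairs each word with the predicate it co-occurs with most consistently, for example "borders" with pred:next_to. PAL itself aligns words to pieces of a split logical form under a CFG, so its lexicon entries are richer than these single-predicate pairs.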


If you use the PAL algorithm, please cite the following paper:

Jayant Krishnamurthy. Probabilistic Models for Learning a Semantic Parser Lexicon. NAACL 2016