StringBasedPCFG

This is a simple module for inducing probabilistic context-free grammars (PCFGs) from a treebank. It can also evaluate one tree against another.

All of its features rely on string parsing, which is where the module gets its name.

Tree syntax

Traditionally, parse trees are written in Lisp-like syntax. A tree for the sentence "The dog bites John" might look like this (using the Penn Treebank tagset):
(S (NP (DT 'The')(NN 'dog'))(VP (VBD 'bites')(NP (NNP 'John'))))

The StringBasedPCFG module natively uses a slightly different syntax: it puts parentheses around the leaves as well. The tree above then looks like this:
(S (NP (DT ('The'))(NN ('dog')))(VP (VBD ('bites'))(NP (NNP ('John')))))

This discrepancy with the traditional syntax will be addressed in the future.
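Until then, trees in the traditional syntax need their leaves wrapped before the module can read them. The following is a minimal sketch of that conversion, not the module's own code: the function name wrap_leaves is hypothetical, and it assumes every leaf is single-quoted and contains no single quote itself.

```python
import re

def wrap_leaves(tree: str) -> str:
    """Wrap every single-quoted leaf in its own pair of parentheses.

    Assumption: leaves are written as '...' and contain no single quotes.
    """
    return re.sub(r"'[^']*'", lambda m: "(" + m.group(0) + ")", tree)

traditional = "(S (NP (DT 'The')(NN 'dog'))(VP (VBD 'bites')(NP (NNP 'John'))))"
converted = wrap_leaves(traditional)
print(converted)
```

Applied to the traditional tree above, this yields the parenthesized-leaf form shown in this section.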

Inducing a PCFG

The script extract_rules.pl provides an interface for inducing a PCFG. To induce a PCFG, call
extract_rules.pl <treebank>
If you only care about the rules and not their probabilities, call
extract_rules.pl --no-probabilities <treebank>
to omit the probabilities.

Evaluating a tree

To evaluate your grammar, use the script evaluateTrees.pl. Pass the two trees as strings, like so:
evaluateTrees.pl <auto-annotated-tree> <gold-annotated-tree>
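The README does not say which metric the evaluation script reports. One common way to compare an automatically annotated tree with a gold tree is labeled bracket precision and recall, and the sketch below shows that technique under stated assumptions. It is not the script's implementation: labeled_spans and evaluate are hypothetical names, the trees are assumed to use the parenthesized-leaf syntax, and part-of-speech brackets are counted like any other constituent.

```python
import re
from collections import Counter

# Tokens: "(", ")", a single-quoted word, or a bare category label.
TOKEN = re.compile(r"\(|\)|'[^']*'|[^\s()']+")

def labeled_spans(tree):
    """Return a Counter of (label, start, end) spans over word positions."""
    tokens = TOKEN.findall(tree)
    spans = Counter()
    stack = []        # open constituents as (label, start position)
    n_words = 0       # number of words seen so far
    i = 0
    while i < len(tokens):
        if tokens[i] == "(":
            if tokens[i + 1].startswith("'"):
                n_words += 1              # leaf ('word'): skip "(", word, ")"
                i += 3
            else:
                stack.append((tokens[i + 1], n_words))
                i += 2                    # skip "(" and the label
        else:                             # ")" closes the innermost constituent
            label, start = stack.pop()
            spans[(label, start, n_words)] += 1
            i += 1
    return spans

def evaluate(auto_tree, gold_tree):
    """Return (precision, recall, f1) over labeled brackets."""
    auto, gold = labeled_spans(auto_tree), labeled_spans(gold_tree)
    matched = sum((auto & gold).values())
    precision = matched / sum(auto.values()) if auto else 0.0
    recall = matched / sum(gold.values()) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1

gold = "(S (NP (DT ('The'))(NN ('dog')))(VP (VBD ('bites'))(NP (NNP ('John')))))"
auto = "(S (NP (DT ('The'))(NN ('dog')))(VP (VBD ('bites'))(NP (NN ('John')))))"
print(evaluate(auto, gold))
```

Here the automatic tree mislabels "John" as NN, so seven of its eight brackets match the gold tree.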

Simon-Will/StringBasedPCFG