Skip to content

rycolab/wpda

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Algorithms for Weighted Pushdown Automata

Code accompanying the EMNLP 2022 publication "Algorithms for Weighted Pushdown Automata".


Installation

$ git clone https://github.com/rycolab/wpda.git
$ cd wpda
$ pip install -e .

Example

from rayuela.base.symbol import Sym
from rayuela.base.semiring import Real
from rayuela.fsa.state import State
from rayuela.cfg.nonterminal import NT, S

Random WPDA

Create a random WPDA:

from rayuela.pda.random_pda import random_pda

Sigma = [Sym("a"), Sym("b"), Sym("c")]
Gamma = [S, NT("X"), NT("Y"), NT("Z")]
Q = [State('0'), State('1'), State('2')]

pda = random_pda(Sigma=Sigma, Gamma=Gamma, Q=Q, R=Real, num_transitions=3)
print(pda)

Output:

1 -- a --> 0	[Y, S]	[X, S, X, Z]	0.036
2 -- b --> 0	[X]	[S]	0.135
2 -- a --> 0	[S, Z, X, Y]	[]	0.092

Stringsums

from rayuela.pda.pda import PDA
from rayuela.pda.parser import Parser

pda = PDA(R=Real)
# initial and final states
pda.set_I(State('0'), Real.one)
pda.set_F(State('2'), Real.one)
# add transitions
pda.add(Real(0.18), State('0'), Sym("a"), State('1'), (NT("X"),))
pda.add(Real(0.23), State('1'), Sym("a"), State('2'), (S,), *(NT("X"),))

print(pda)
# compute stringsum
parser = Parser(pda)
print(parser.parse("aa", strategy="bottom-up"))

Output:

0 -- a --> 1	[]	[X]	0.18
1 -- a --> 2	[X]	[S]	0.23

0.0414

Binarization

pda = PDA(R=Real)
# initial and final states
pda.set_I(State('0'), Real.one)
pda.set_F(State('2'), Real.one)

# add transitions
pda.add(Real(0.18), State('0'), Sym("a"), State('1'), (NT("X"),))
pda.add(Real(0.23), State('1'), Sym("a"), State('2'), (S,), *(NT("X"),NT("Y"),NT("Z"),NT("W")))

# binarize
npda = pda.binarize(strategy="bottom-up")
print(npda)

Output:

0 -- a --> 1	[]	[X]	0.18
1 -- a --> @3	[X, Y]	[Y]	0.23
@3 -- ε --> @4	[Y, Z]	[Z]	1.0
@4 -- ε --> @5	[Z, W]	[W]	1.0
@5 -- ε --> 2	[Z, W]	[S]	1.0

Unary Removal

pda = PDA(R=Real)
# initial and final states
pda.set_I(State('0'), Real.one)
pda.set_F(State('3'), Real.one)

# add transitions
pda.add(Real(0.18), State('0'), Sym("a"), State('1'), (NT("Y"),))
pda.add(Real(0.35), State('1'), ε, State('2'), (NT("X"),), *(NT("Y"),))
pda.add(Real(0.23), State('2'), Sym("a"), State('3'), (S,), *(NT("X"),))

# remove unaries
npda = pda.remove_unary(strategy="bottom-up")
print(npda)

Output:

0 -- a --> 2	[]	[X]	0.063
2 -- a --> 3	[X]	[S]	0.23

Cite

Alexandra Butoi, Brian DuSell, Tim Vieira, Ryan Cotterell, and David Chiang. 2022. Algorithms for weighted pushdown automata. In Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics.

@inproceedings{butoi+al.emnlp22,
     abstract = {Weighted pushdown automata (WPDAs) are at the core of many natural language processing tasks, like syntax-based statistical machine translation and transition-based dependency parsing. As most existing dynamic programming algorithms are designed for context-free grammars (CFGs), algorithms for PDAs often resort to a PDA-to-CFG conversion. In this paper, we develop novel algorithms that operate directly on WPDAs. Our algorithms are inspired by Lang's algorithm, but use a more general definition of pushdown automaton and either reduce the space requirements by a factor of |Γ|(the size of the stack alphabet) or reduce the runtime by a factor of more than |Q| (the number of states). When run on the same class of PDAs as Lang's algorithm, our algorithm is both more space-efficient by a factor of |Γ| and more time-efficient by a factor of |Q|⋅|Γ|.},
     address = {Abu Dhabi, United Arab Emirates},
     arxiv = {https://arxiv.org/abs/2210.06884},
     author = {Butoi, Alexandra and 
    DuSell, Brian and 
    Vieira, Tim and 
    Cotterell, Ryan and 
    Chiang, David},
     booktitle = {Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing},
     code = {https://github.com/rycolab/wpda},
     month = {December},
     publisher = {Association for Computational Linguistics},
     title = {Algorithms for Weighted Pushdown Automata},
     url = {https://arxiv.org/abs/2210.06884},
     venue = {EMNLP},
     year = {2022}
}

Contact

To ask questions or report problems, please open an issue or email alexandra.butoi@inf.ethz.ch.

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages