Skip to content

Commit

Permalink
Validate inputs to ACE to prevent deadlocks.
Browse files Browse the repository at this point in the history
Also, fix a bug that caused the "Malformed output from ACE" error to
be reported excessively.

Fixes #155
  • Loading branch information
goodmami committed Aug 20, 2018
1 parent ee920de commit 78a0bd9
Show file tree
Hide file tree
Showing 2 changed files with 70 additions and 18 deletions.
29 changes: 17 additions & 12 deletions CHANGELOG.md
Expand Up @@ -25,29 +25,34 @@
hierarchy when looking for candidate nodes
* `delphin.mrs.xmrs.Xmrs.from_xmrs()` and `delphin.mrs.eds.Eds.from_xmrs()`
now take a `**kwargs` argument to facilitate the `convert` command (#160)
* The following delphin.tdl functions are now private: `delphin.tdl._parse_avm()`,
`delphin.tdl._parse_affixes()`, `delphin.tdl._parse_typedef()`,
`delphin.tdl._parse_attr_val()`, `delphin.tdl._parse_cons_list()`,
`delphin.tdl._parse_conjunction()`, `delphin.tdl._parse_diff_list()` (#81)
* The following `delphin.tdl` functions are now private:
`delphin.tdl._parse_avm()`, `delphin.tdl._parse_affixes()`,
`delphin.tdl._parse_typedef()`, `delphin.tdl._parse_attr_val()`,
`delphin.tdl._parse_cons_list()`, `delphin.tdl._parse_conjunction()`,
`delphin.tdl._parse_diff_list()` (#81)

### Fixed

* Converting to PENMAN via the `convert` command should no longer crash for
disconnected graphs, but print a log message to stderr, print a blank line
to stdout, and then continue (#161)
* Updated the docstrings for `delphin.mrs.xmrs.Xmrs.args()`,
`delphin.mrs.xmrs.Xmrs.outgoing_args()`, and `delphin.mrs.xmrs.Xmrs.incoming_args()`,
from "DMRS-style undirected links" to "MOD/EQ links" and updated the Return
value of `Xmrs.args()` and `Xmrs.outgoing_args` from `{nodeid: {}}` to
`{role: tgt}`(#133)
`delphin.mrs.xmrs.Xmrs.outgoing_args()`, and
`delphin.mrs.xmrs.Xmrs.incoming_args()`, from "DMRS-style undirected links"
to "MOD/EQ links" and updated the return value of `Xmrs.args()` and
`Xmrs.outgoing_args` from `{nodeid: {}}` to `{role: tgt}` (#133)
* `delphin.mrs.compare.isomorphic()` compares predicates using a normalized form
* Updated the code and the docstrings for references to 'string' and 'grammar'
predicates to refer to 'surface' and 'abstract' predicates (#117)
* `delphin.tdl.parse()` now accepts either a file or a filename as its argument (#104)
* The following dump methods now allow either a file or filename as their arguments like
`delphin.mrs.penman.dump()`: `delphin.mrs.eds.dump()`, `delphin.mrs.simplemrs.dump()`, `delphin.mrs.simpledmrs.dump()`, `delphin.mrs.mrx.dump()`, `delphin.mrs.dmrx.dump()`,
* `delphin.tdl.parse()` now accepts either a file or a filename argument (#104)
* The following dump methods now allow either a file or filename as their
arguments like `delphin.mrs.penman.dump()`: `delphin.mrs.eds.dump()`,
`delphin.mrs.simplemrs.dump()`, `delphin.mrs.simpledmrs.dump()`,
`delphin.mrs.mrx.dump()`, `delphin.mrs.dmrx.dump()`,
`delphin.mrs.prolog.dump()` (#152)

* `delphin.interfaces.ace` now validates parser, transfer, and generator inputs
and refuses to process invalid inputs (#155)
* `delphin.interfaces.ace` handles whitespace in s-expressions a bit better

### Deprecated

Expand Down
59 changes: 53 additions & 6 deletions delphin/interfaces/ace.py
Expand Up @@ -255,17 +255,28 @@ def interact(self, datum):
This is the recommended method for sending and receiving data
to/from an ACE process as it reduces the chances of
over-filling or reading past the end of the buffer. If input
item identifiers need to be tracked throughout processing, see
:meth:`process_item`.
over-filling or reading past the end of the buffer. It also
performs a simple validation of the input to help ensure that
one complete item is processed at a time.
If input item identifiers need to be tracked throughout
processing, see :meth:`process_item`.
Args:
datum (str): the input sentence or MRS
Returns:
:class:`~delphin.interfaces.ParseResponse`
"""
self.send(datum)
result = self.receive()
validated = self._validate_input(datum)
if validated:
self.send(validated)
result = self.receive()
else:
result, lines = _make_response(
[('NOTE: PyDelphin could not validate the input and '
'refused to send it to ACE'),
'SKIP: {}'.format(datum)],
self.run_info)
result['input'] = datum
return result

Expand Down Expand Up @@ -315,6 +326,11 @@ class AceParser(AceProcess):
task = 'parse'
_termini = [re.compile(r'^$'), re.compile(r'^$')]

def _validate_input(self, datum):
# valid input for parsing is non-empty
# (this relies on an empty string evaluating to False)
return datum.strip()

def _default_receive(self):
lines = self._result_lines()
response, lines = _make_response(lines, self.run_info)
Expand Down Expand Up @@ -353,6 +369,9 @@ def __init__(self, grm, cmdargs=None, executable=None, env=None,
tsdbinfo=False, **kwargs
)

def _validate_input(self, datum):
return _possible_mrs(datum)

def _default_receive(self):
lines = self._result_lines()
response, lines = _make_response(lines, self.run_info)
Expand All @@ -371,6 +390,9 @@ class AceGenerator(AceProcess):
_cmdargs = ['-e', '--tsdb-notes']
_termini = [re.compile(r'NOTE: tsdb parse: ')]

def _validate_input(self, datum):
return _possible_mrs(datum)

def _default_receive(self):
show_tree = '--show-realization-trees' in self.cmdargs
show_mrs = '--show-realization-mrses' in self.cmdargs
Expand Down Expand Up @@ -544,6 +566,31 @@ def _ace_version(executable):
return version


def _possible_mrs(s):
start, end = -1, -1
depth = 0
for i, c in enumerate(s):
if c == '[':
if depth == 0:
start = i
depth += 1
elif c == ']':
depth -= 1
if depth == 0:
end = i + 1
break
# only valid if neither start nor end is -1
# note: this ignores any secondary MRSs on the same line
if start != -1 and end != -1:
# only log if taking a substring
if start != 0 and end != len(s):
logging.debug('Possible MRS found at <%d:%d>: %s', start, end, s)
s = s[start:end]
return s
else:
return False


def _make_response(lines, run):
response = ParseResponse({
'NOTES': [],
Expand Down Expand Up @@ -575,7 +622,7 @@ def _sexpr_data(line):
if len(expr.data) != 2:
logging.error('Malformed output from ACE: {}'.format(line))
break
line = expr.remainder
line = expr.remainder.lstrip()
yield expr.data


Expand Down

0 comments on commit 78a0bd9

Please sign in to comment.