Commit
alternator: limit expression length and recursion depth
DynamoDB limits the length of all expressions (ConditionExpression, UpdateExpression, ProjectionExpression, FilterExpression, KeyConditionExpression) to just 4096 bytes. Until now, Alternator did not enforce this limit, and we had an xfailing test showing this.

But it turns out that not enforcing this limit can be dangerous: The user can pass arbitrarily long and arbitrarily nested expressions, such as:

    a<b and (a<b and (a<b and (a<b and (a<b and (a<b and (...))))))

or

    (((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((

and those can cause recursive algorithms in Alternator's parser, and later the code that applies expressions, to recurse very deeply, overflow the stack, and crash.

This patch includes new tests that demonstrate how Scylla crashes during parsing before enforcing the 4096-byte length limit on expressions. The patch then enforces this length limit, and these tests stop crashing. We also verify that deeply-nested expressions shorter than the 4096-byte limit are apparently short enough for our recursion ability, and work as expected.

Unfortunately, running these tests many times showed that the 4096-byte limit is not low enough to avoid all crashes, so this patch needs to do more:

The parsers created by ANTLR are recursive, and there is no way to limit the depth of their recursion (i.e., nothing like YACC's YYMAXDEPTH). Very deep recursion can overflow the stack and crash Scylla. After we limited the length of expression strings to 4096 bytes, this was *almost* enough to prevent stack overflows. But unfortunately the tests revealed that even when limited to 4096 bytes, the expression can sometimes recurse too deeply: Consider the expression "((((((....((((" with 4000 parentheses. To realize this is a syntax error, the parser needs to do a recursive call 4000 times. Or worse: because of other ANTLR limitations (see rants in comments in expressions.g) it's actually 12000 recursive calls, and each of these calls has a pretty large frame. In some cases, this overflows the stack.

The solution used in this patch is not pretty, but works. We add to the rules in alternator/expressions.g that recurse (there are two of those, "value" and "boolean_expression") an integer "depth" parameter, which we increase when the rule recurses. Moreover, we add a so-called predicate "{depth<MAX_DEPTH}?" that stops the parsing when this limit is reached. When the parsing is stopped, the user will see a special kind of parse error, saying "expression nested too deeply".

With this last modification to expressions.g, the tests for deeply-nested but still-below-4096-bytes expressions (test_limits.py::test_deeply_nested_expression_*) no longer fail sporadically as they did without it.

While adding the "expression nested too deeply" case, I also made the general syntax-error reporting in Alternator nicer: It no longer prints the internal "expression_syntax_error" type name (an exception type will only be printed if some sort of unexpected exception happens), and it prints the character position where the syntax error (or too deeply nested expression) was recognized.

Fixes #14473

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #14477
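
The two guards described in this message (a length check before parsing, then a depth counter threaded through the recursive rule) can be illustrated with a small standalone sketch. This is not Alternator's code, which is an ANTLR grammar compiled to C++. It is a toy Python parser that only checks parenthesis nesting. The names `check_nesting`, `_parse_sequence`, and `ExpressionSyntaxError`, and the value `MAX_DEPTH = 100`, are assumptions made for illustration. Only the 4096-byte limit and the "expression nested too deeply" message come from the commit message.

```python
MAX_EXPRESSION_LENGTH = 4096  # byte limit stated in the commit message
MAX_DEPTH = 100  # hypothetical value; the commit message does not state the real one


class ExpressionSyntaxError(Exception):
    """Hypothetical error type that records where parsing stopped."""

    def __init__(self, message, position):
        super().__init__(f"{message} at char {position}")
        self.position = position


def _parse_sequence(text, pos, depth):
    """Consume characters and nested "( ... )" groups; return the stop position."""
    # Plays the role of the "{depth<MAX_DEPTH}?" predicate: refuse to go deeper.
    if depth >= MAX_DEPTH:
        raise ExpressionSyntaxError("expression nested too deeply", pos)
    while pos < len(text) and text[pos] != ")":
        if text[pos] == "(":
            # The only recursive call, so depth grows by one per open parenthesis.
            pos = _parse_sequence(text, pos + 1, depth + 1)
            if pos >= len(text):
                raise ExpressionSyntaxError("missing ')'", pos)
            pos += 1  # consume the matching ')'
        else:
            pos += 1  # any other character is treated as part of an operand
    return pos


def check_nesting(expression):
    """Reject over-long input first, then parse with bounded recursion."""
    if len(expression.encode("utf-8")) > MAX_EXPRESSION_LENGTH:
        raise ValueError(
            f"expression length exceeds {MAX_EXPRESSION_LENGTH} bytes")
    end = _parse_sequence(expression, 0, 0)
    if end != len(expression):
        # The top-level sequence stopped at a ')' that has no matching '('.
        raise ExpressionSyntaxError("unexpected ')'", end)
```

In this sketch, `check_nesting("(" * 4000)` passes the length check but stops after 100 levels with "expression nested too deeply at char 100", instead of recursing 4000 times. An input longer than 4096 bytes is rejected before any parsing starts.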
Showing 6 changed files with 355 additions and 47 deletions.