Skip to content

Commit

Permalink
syntax: fix bug when parsing ((?x))
Browse files Browse the repository at this point in the history
This fixes yet another bug with our handling of (?flags) directives in
the regex. This time, we try to be a bit more principled and
specifically treat a (?flags) directive as a valid empty sub-expression.
While this means we could remove errors reported from previous fixes for
things like `(?i)+`, we retain those for now since they are a bit weird.
Although `((?i))+` is now allowed, which is equivalent. We should
probably allow `(?i)+` in the future for consistency sake.

Fixes #527
  • Loading branch information
BurntSushi committed Mar 30, 2019
1 parent 7b1599f commit 2316432
Show file tree
Hide file tree
Showing 3 changed files with 22 additions and 5 deletions.
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -10,6 +10,8 @@ Performance improvements:

Bug fixes:

* [BUG #527](https://github.com/rust-lang/regex/issues/527):
Fix a bug where the parser would panic on patterns like `((?x))`.
* [BUG #555](https://github.com/rust-lang/regex/issues/555):
Fix a bug where the parser would panic on patterns like `(?m){1,1}`.
* [BUG #557](https://github.com/rust-lang/regex/issues/557):
Expand Down
19 changes: 14 additions & 5 deletions regex-syntax/src/hir/translate.rs
Original file line number Diff line number Diff line change
Expand Up @@ -240,11 +240,6 @@ impl<'t, 'p> Visitor for TranslatorI<'t, 'p> {
type Err = Error;

fn finish(self) -> Result<Hir> {
if self.trans().stack.borrow().is_empty() {
// This can happen if the Ast given consists of a single set of
// flags. e.g., `(?i)`. /shrug
return Ok(Hir::empty());
}
// ... otherwise, we should have exactly one HIR on the stack.
assert_eq!(self.trans().stack.borrow().len(), 1);
Ok(self.pop().unwrap().unwrap_expr())
Expand Down Expand Up @@ -287,6 +282,16 @@ impl<'t, 'p> Visitor for TranslatorI<'t, 'p> {
}
Ast::Flags(ref x) => {
self.set_flags(&x.flags);
// Flags in the AST are generally considered directives and
// not actual sub-expressions. However, they can be used in
// the concrete syntax like `((?i))`, and we need some kind of
// indication of an expression there, and Empty is the correct
// choice.
//
// There can also be things like `(?i)+`, but we rule those out
// in the parser. In the future, we might allow them for
// consistency sake.
self.push(HirFrame::Expr(Hir::empty()));
}
Ast::Literal(ref x) => {
self.push(HirFrame::Expr(self.hir_literal(x)?));
Expand Down Expand Up @@ -1547,6 +1552,10 @@ mod tests {
hir_group_name(2, "foo", hir_lit("b")),
hir_group(3, hir_lit("c")),
]));
assert_eq!(t("()"), hir_group(1, Hir::empty()));
assert_eq!(t("((?i))"), hir_group(1, Hir::empty()));
assert_eq!(t("((?x))"), hir_group(1, Hir::empty()));
assert_eq!(t("(((?x)))"), hir_group(1, hir_group(2, Hir::empty())));
}

#[test]
Expand Down
6 changes: 6 additions & 0 deletions tests/regression.rs
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,12 @@ fn regression_invalid_repetition_expr() {
assert!(regex_new!("(?m){1,1}").is_err());
}

// See: https://github.com/rust-lang/regex/issues/527
#[test]
fn regression_invalid_flags_expression() {
assert!(regex_new!("(((?x)))").is_ok());
}

// See: https://github.com/rust-lang/regex/issues/75
mat!(regression_unsorted_binary_search_1, r"(?i)[a_]+", "A_", Some((0, 2)));
mat!(regression_unsorted_binary_search_2, r"(?i)[A_]+", "a_", Some((0, 2)));
Expand Down

0 comments on commit 2316432

Please sign in to comment.