Quantifiers add unnecessary non-capturing group #70
Comments
This comes from the current implementation, which wraps any value longer than a single character (https://github.com/yoav-lavi/melody/blob/main/crates/melody_compiler/src/regex/utils.rs#3). That is always correct but not ideal. It should be changed to something that can identify sequences that can be quantified directly, without grouping. Thanks for the report!
@trezy I think this should solve it. Are you aware of any other cases that are directly quantifiable?
Original:
pub fn wrap_quantified(value: String) -> String {
    let is_grouped = value.starts_with('(') && value.ends_with(')');
    if !is_grouped && value.chars().count() > 1 {
        format!("(?:{value})")
    } else {
        value
    }
}
Updated:
pub fn wrap_quantified(value: String) -> String {
    if !directly_quantifiable(&value) {
        format!("(?:{value})")
    } else {
        value
    }
}
fn directly_quantifiable(value: &str) -> bool {
    let value_char_count = value.chars().count();
    // missing (and currently unsupported):
    // \p{...}
    // \P{...}
    // \xYY
    // \ddd
    // \uYYYY
    match value_char_count {
        // single char values
        1 => true,
        // escaped single char values
        2 => value.starts_with('\\'),
        // groups and character classes
        _ => match value.chars().next() {
            Some('(') => value.ends_with(')'),
            Some('[') => value.ends_with(']'),
            _ => false,
        },
    }
}
Note that this will only ever receive a single expression (i.e. not
This produces the correct output:
#[test]
fn directly_quantifiable() {
    let output = compiler(
        r#"
        5 of <word>;
        "#,
    );
    assert_eq!(output.unwrap(), r"\w{5}");
}
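For reference, a standalone sketch that exercises the updated check on a few more single-expression inputs. The two functions follow the logic posted above (with the branches of `wrap_quantified` flipped for readability); the sample inputs are assumptions chosen for illustration, not cases taken from the Melody test suite. Escapes listed as missing, such as `\p{...}` and `\xYY`, fall through to the default branch, so they are still wrapped: redundant, but safe.

```rust
/// Wraps `value` in a non-capturing group unless a quantifier can be
/// attached to it directly.
pub fn wrap_quantified(value: String) -> String {
    if directly_quantifiable(&value) {
        value
    } else {
        format!("(?:{value})")
    }
}

/// Returns true for a single char, an escaped single char, a group,
/// or a character class. Assumes `value` is a single expression.
fn directly_quantifiable(value: &str) -> bool {
    match value.chars().count() {
        // single char values, e.g. `a`
        1 => true,
        // escaped single char values, e.g. `\w`
        2 => value.starts_with('\\'),
        // groups and character classes, e.g. `(?:ab)` or `[a-z]`
        _ => match value.chars().next() {
            Some('(') => value.ends_with(')'),
            Some('[') => value.ends_with(']'),
            // everything else, including longer escapes such as `\p{L}`
            _ => false,
        },
    }
}
```

With these definitions, `\w`, `[a-z]`, and `(?:ab)` come back unchanged, while `ab`, `\p{L}`, and `\x41` are wrapped in `(?:...)`.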
Fixed
Describe the bug
When using some of ... with symbols, a non-capturing group is added unconditionally. This is unnecessary for most individual symbols (e.g. <word>).

To Reproduce
Steps to reproduce the behavior:
Use some of ... with a single symbol, e.g.:

Expected behavior
RegEx output should only wrap things in non-capturing groups if necessary or if implicitly required (via match {}).

Examples
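A minimal sketch of the expected behavior, assuming `some of` compiles to the `+` quantifier. The names `some_of` and `needs_group` are hypothetical and are not part of the Melody compiler; the point is only that the group is added when the expression cannot take a quantifier directly.

```rust
/// Hypothetical helper: appends `+` to a single expression, adding a
/// non-capturing group only when the expression needs one.
pub fn some_of(expression: &str) -> String {
    if needs_group(expression) {
        format!("(?:{expression})+")
    } else {
        format!("{expression}+")
    }
}

/// Hypothetical check: true when a quantifier placed after `expression`
/// would otherwise bind to only its last element.
fn needs_group(expression: &str) -> bool {
    let char_count = expression.chars().count();
    let is_single_char = char_count == 1;
    let is_escaped_char = char_count == 2 && expression.starts_with('\\');
    let is_group = expression.starts_with('(') && expression.ends_with(')');
    let is_class = expression.starts_with('[') && expression.ends_with(']');
    !(is_single_char || is_escaped_char || is_group || is_class)
}
```

Under these assumptions, `some_of(r"\w")` gives `\w+` rather than `(?:\w)+`, while a multi-character literal such as `ab` still becomes `(?:ab)+`.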