malformed boolean query for valid regex search that includes () #1500
Comments
I just noticed a similar quirk when trying to filter for a filename that included parens: The filter |
The relevant source files are:
Halo @esm7, I imagine any time you might have for tasks would go on CSS stuff right now, but if you had any insights on this in the future, they would be appreciated. |
Interesting... This suggests that the |
I'm on mobile currently so can't debug this. If anyone were to work on it, I would suggest adding more info to the error messages along the way. |
Note to self:
Which adds quotes around the component expressions. And may be applied to the entire line??? I wonder whether the quotes should only be added around the individual filters that are obtained by the results from applying I also wonder what will happen if there are |
Interesting issue. I can work on it if you prefer (probably next week), or just point out what I'd do next to understand the issue: add some debug prints to see what the line looks like after the preprocessing (e.g. the |
Thank you for the reply @esm7 - all helpful. If you can work on it, that would be great - but absolutely no pressure. 😄 |
I'm having an initial look at this. |
I added this test:
it('should work with filter containing parenthesis', () => {
    // This tests the fix for
    //     https://github.com/obsidian-tasks-group/obsidian-tasks/issues/1500
    //     malformed boolean query for valid regex search that includes ()
    const filter = createValidFilter(
        '( description regex matches /(buy|order)/i ) OR ( path includes Home/Shopping )',
    );
    testWithDescription(filter, 'buy stuff', true);
});
It failed because filter was undefined, as expected from this bug.
I added this debug output:
Index: src/Query/Filter/BooleanField.ts
IDEA additional info:
Subsystem: com.intellij.openapi.diff.impl.patch.CharsetEP
<+>UTF-8
===================================================================
diff --git a/src/Query/Filter/BooleanField.ts b/src/Query/Filter/BooleanField.ts
--- a/src/Query/Filter/BooleanField.ts (revision 8f3612552c5f8605e36fc0e6971e9b54b430c44b)
+++ b/src/Query/Filter/BooleanField.ts (date 1675617479202)
@@ -112,7 +112,11 @@
// Prepare the query to be processed by boon-js.
// Boon doesn't process expression with spaces unless they are surrounded by quotes, so replace
// (due today) by ("due today").
- return line.replace(/\(([^()]+)\)/g, '("$1")');
+
+ const result_old = line.replace(/\(([^()]+)\)/g, '("$1")');
+ console.log(`preprocessExpression: in ${line}
+ : -> ${result_old}`);
+ return result_old;
}
/*
It produced this output for the query in the new test:
So the double-quotes have been put inside the regular expression, instead of around the whole description... Other examples of the output look like this, where you can see that the
I had thought that this would be fixable by applying the That's all I've got for now. I can't see a general fix for this, given that |
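The behaviour described in this comment can be reproduced outside the plugin. The sketch below copies the substitution from the diff above into a standalone function; the function name is taken from the debug message in that diff, and wrapping it this way (rather than calling the plugin's own code) is an assumption made for illustration.

```typescript
// Standalone copy of the substitution shown in the diff above.
// It wraps the contents of every innermost (...) pair in double quotes.
function preprocessExpression(line: string): string {
    return line.replace(/\(([^()]+)\)/g, '("$1")');
}

// The simple case the original code comment describes.
console.log(preprocessExpression('(due today)'));
// ("due today")

// The query from the new test: the innermost parentheses are the ones
// inside the regular expression, so the quotes land there.
const failingQuery =
    '( description regex matches /(buy|order)/i ) OR ( path includes Home/Shopping )';
console.log(preprocessExpression(failingQuery));
// ( description regex matches /("buy|order")/i ) OR (" path includes Home/Shopping ")
```

Because `[^()]+` refuses to cross any parenthesis, the outer pair around the first filter never matches, and only the `(buy|order)` group gets quoted.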
Indeed the preprocessing code is too simplistic to handle these internal regular expressions. |
Yes I do agree with that. I couldn’t get my head around how to implement it when there are nested filters - the matching brackets could be a long way apart. |
I am seeing the same issue with filenames containing parentheses: But using I was really confused as to why a working Tasks query from months ago (granted, I didn't test it since February) suddenly didn't work anymore... |
Moved to #1852 |
Just leaving this here, in case it is useful to my future self.... I recently learned about Lazy quantifiers, from: More succinct reference: More detailed reference: JavaScript article: I am wildly speculating that esm7's suggestion (quoted above) in combination with lazy quantifiers to match the fewest number of times, may help with this. Caveat: I have not checked the code to see if it is already using lazy quantifiers... |
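For readers unfamiliar with lazy quantifiers, here is a minimal sketch of the difference between greedy and lazy matching in JavaScript regular expressions. The helper `firstMatch` and the sample lines are invented for illustration and are not part of the Tasks code.

```typescript
// Hypothetical helper: returns the first match of `re` in `text`, or '' if none.
function firstMatch(re: RegExp, text: string): string {
    const m = text.match(re);
    return m ? m[0] : '';
}

const twoFilters = '(due today) OR (is recurring)';

// Greedy: .* grabs as much as possible, so the match runs to the last ')'.
console.log(firstMatch(/\(.*\)/, twoFilters));
// (due today) OR (is recurring)

// Lazy: .*? grabs as little as possible, so the match stops at the first ')'.
console.log(firstMatch(/\(.*?\)/, twoFilters));
// (due today)

// On its own, a lazy quantifier still stops at a ')' inside a regex filter.
const regexFilter = '( description regex matches /(buy|order)/i ) OR ( path includes Home )';
console.log(firstMatch(/\(.*?\)/, regexFilter));
// ( description regex matches /(buy|order)
```

The last example suggests that lazy matching would probably need to be combined with anchoring on the Boolean operators, as speculated above, rather than used alone.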
Another option is to adopt a proper parsing library: |
I've been thinking about this a lot, and have an alternative idea that I think will be easier to reason about and to explain to users - and easier on developers too. (Also, I remain of the feeling that complex regular expressions usually mask bugs... such as this recently discovered example: 1be22e3 - so would prefer not to create any more 🤣 )
New proposal
Change the pre-processing step so that instead of adding quotes it does the following:
1 Split the line at the Boolean boundaries
Split it at the operators and include all adjacent parens. So for example, divide this:
Into this:
2 Give some kind of name or symbol to each of the non-operator lines
So the example becomes something like:
and save a lookup table to map f1, f2 and f3 to the corresponding filter
3 Reassemble the line and give it to boon-js
And this is what gets passed to boon-js - so there are no quotes and no
4 Adjust the search code
This kind of adds another level of indirection, so the Boolean search code will need to look up which filter to apply.
5 Help users with lines that still fail to parse
There will still be cases that go wrong, for example:
will be split as something like
6 If there are still parsing errors after this fix...
Add a help message telling the user:
|
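Steps 1 to 3 of the proposal above could be sketched roughly as follows. This is only an illustration under stated assumptions: `PreprocessedLine`, `boundary` and `preprocessWithPlaceholders` are hypothetical names, the operator list (AND, OR, XOR, NOT) is assumed, and nothing here is taken from the plugin's actual implementation.

```typescript
// Hypothetical sketch of the proposed preprocessing; not the plugin's code.
interface PreprocessedLine {
    simplified: string; // e.g. "( f1 ) OR ( f2 )", with no spaces inside operands
    filters: Record<string, string>; // placeholder name -> original filter text
}

// A boundary is a Boolean operator together with its adjacent parentheses,
// or the run of parentheses at the very start or very end of the line.
const boundary =
    /(^\s*(?:NOT\s*)?\((?:\s*\()*\s*|\s*\)(?:\s*\))*\s*(?:AND|OR|XOR)\s*(?:NOT\s*)?\((?:\s*\()*\s*|\s*\)(?:\s*\))*\s*$)/;

function preprocessWithPlaceholders(line: string): PreprocessedLine {
    const filters: Record<string, string> = {};
    let simplified = '';
    let count = 0;
    // split() with one capturing group keeps the boundaries in the result:
    // even indices hold filter text, odd indices hold the boundaries.
    line.split(boundary).forEach((part, index) => {
        if (index % 2 === 1) {
            simplified += part; // operator and parentheses, kept verbatim
        } else if (part.trim() !== '') {
            count += 1;
            const name = `f${count}`;
            filters[name] = part.trim(); // lookup table for step 4
            simplified += name;
        }
    });
    return { simplified, filters };
}

const preprocessed = preprocessWithPlaceholders(
    '( description regex matches /(buy|order)/i ) OR ( path includes Home/Shopping )',
);
console.log(preprocessed.simplified);
// ( f1 ) OR ( f2 )
console.log(preprocessed.filters);
```

The parentheses inside `/(buy|order)/i` are left alone because they are not adjacent to an operator or to either end of the line. As step 5 notes, a filter whose own text contains something like `) OR (` would still be split in the wrong place.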
If I understand correctly, your suggestion is equivalent to mine here, in its basic notion of gluing the parenthesis to their adjacent operators. |
Yes, my proposal goes:
If it’s not hard to implement, I think it goes more from 80% or 90% to > 99%… As in, I doubt many real-world search strings will contain those Boolean operators. There is another option I am considering. I really want to provide syntax highlighting in Tasks code blocks, and so I have been looking at how that works. This has involved looking at code that others have written to parse programming languages for syntax highlighting. There is a small chance I may eventually understand enough about the CodeMirror parsing mechanisms to come up with a better parsing solution for this issue too. |
PS You’re correct, I hadn’t spotted that I was restating your earlier suggestion. Thanks. |
Hi @aubreyz, Moving #1852 (comment) to here... |
Hi @aubreyz, Re the following causing error messages because of this issue, when the file name contains
Sure.
|
Wow, thank you so much Clare (didn't work with the slashes though - but great if all on one line, at least with the current non-beta plugin version). \ not either |
Hi @aubreyz,
(There aren't supposed to be any beta versions... if you can see one, could you please send a link?) That's weird. I just:
and it worked fine each time. If you're inclined to investigate:
If it still doesn't work, and you have time, a new bug report, including the error message, would be much appreciated. |
Oops my bad. I just realised that particular vault had not updated to 5.0.0... |
Please check that this issue hasn't been reported before.
Expected Behavior
That the following search should work:
Similar issue
This is similar to #1068, but the workaround there was to ignore tasks blocks in template files, whereas in this case the search is meant to be valid.
Current behaviour
It gives:
Steps to reproduce
Paste the following
And preview the results. The error will be seen.
Note: This will work, which is why I believe that the problem is in the boolean-parsing code:
Which Operating Systems are you using?
Obsidian Version
1.1.9
Tasks Plugin Version
1.22.0
Checks
Possible solution
It seems that the regex-parsing code is finding brackets inside search strings.
A workaround is to break down the regular expression to:
But with my example above, with more query strings, that gets a bit onerous.