New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add boolean "not" query operator #1728
Conversation
* Add support for user friendly "not" operator in queries without requiring to use regular expressions, via adding NotQuery to beets.dbcore.query. * Update the prefix list at library.parse_query_parts() and the query parsing mechanisms in order to create a NotQuery when the prefix is found. * Add two TestCases, NotQueryMatchTest as the negated version of MatchTest, and the more generic NotQueryTest for testing the integration of the NotQuery with the rest of the existing Query subclasses.
@@ -447,6 +447,24 @@ def match(self, item): | |||
return any([q.match(item) for q in self.subqueries]) | |||
|
|||
|
|||
class NotQuery(MutableCollectionQuery): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As hinted on the main comment, the reason for being a subclass of MutableCollectionQuery
is rather weak, and mainly due to laziness. Conceptually, I can't really think of an argument for NotQuery
having more than one subquery (and if there is a case for that, one could always use a NotQuery(AndQuery([...]))
or similar).
The "only one subquery" is not enforced at the moment, and has some impact on readability (self.clause_with_joiner(joiner)
never uses the joiner, self.match()
iterates over only one element, constructor receives a list of one element as its first parameter, etc). I'm still revising the implications, but seems like preserving the rest of niceties and special methods of MutableCollectionQuery
(or at least CollectionQuery
) would be desirable. Any hints on what would be the best option? May just be a case of overriding __init__
, accepting only one subquery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I can see why this was easy but might feel awkward. Probably the most straightforward approach is to not inherit from anything special (i.e., make Query itself the superclass). The __init__
method would take a single subquery, and clause
and match
would work more or less as they do now.
Awesome! Thank you for tackling this! You've made a lot of progress already. ✨ On the specific syntax: Personally, I would find a prefix outside of the query term the most intuitive. For example, you would write
Also, on the concrete character used: It would be nice to have an alternative when ¬ is hard to type. It's perfectly easy to get on my Mac, of course, but I'm sure it's tricky on Windows or whatever. The natural thing to use would be a minus - sign, but that has clear problems with being mistaken for command-line options. So here's an idea: let's allow either ¬ or -, and then on the command line use the old This will also work well with other interfaces, like the web plugin, where - doesn't pose a problem. |
Thanks for the feedback and for sharing your thoughts, in particular with the syntax issue! As mentioned, the choice was rather arbitrary and meant to be changed indeed. I'd be open to dropping |
* Modify NotQuery so it subclasses Query instead of MutableCollectionQuery. * Update instances where NotQuery objects are created on tests and queryparse, as NotQuery expects a single Query as a parameter to the constructor instead of a list of Queries.
* Update the negation query sintax so that the negation prefix ("-" or "¬") is always the first part of a query, instead of after the ":" separator. * Modify queryparse, so that the detection of the negation is done inside parse_query_part() (using the modified PARSE_QUERY_PART_REGEX, and returning the negation flag) instead of construct_query_part. * Revert the prefixes dict on beets.library to the original dict (only one item, with the RegexpQuery prefix).
* Add tests to NotQueryTest for testing the results of using queries with negation. * Fix issue on test_dbcore due to the modifications on the tuple returned by parse_query_part (the number of elements was changed from 3 to 4).
* Fix issue with an assertion that was order-sensitive and caused a problem on some tox runs.
Updated the code with the suggested changes:
Also, I added a handful of new tests checking the behaviour of |
Awesome! This is looking great. I'll do a full code review now. On the syntax: here's one other idea for a magic character. The carat, |
if clause: | ||
return 'not ({0})'.format(clause), subvals | ||
else: | ||
# special case for RegexpQuery, (None, ()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In fact, this is actually the case for any "slow" query. For explicitness, I might suggest putting this in __init__
:
self.fast = subquery.fast
And then here:
if self.fast:
clause, subvals = ...
return ...
else:
# A slow query.
return None, ()
as in the clause
method of FieldQuery
, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After playing a bit with the suggestion and inspecting the code more carefully, I'm wondering if the suggested change for explicitness would actually decrease readability? In particular:
- only the
FieldQuery
subclasses have thefast
attribute, and ideallyNotQuery
should work fine with any 'Query' (for example,-foo
results in aNotQuery(AnyQuery(...))
). RegexpQuery
is still a bit problematic, as it is the onlyFieldQuery
descendant that does not implementcol_clause()
. As a result, I can't seem to find a way to avoid the explicit checking forclause != None
at 458-462, otherwiseNotQuery(RegexpQuery(..., fast=True).clause()
ends up returning(u'not (None)', ())
, which breaks many test cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point; I forgot that reading the fast
flag would assume a FieldQuery.
Let's keep it like this. I suggest a comment something like this:
If there is no clause, there is nothing to negate. All the logic is handled by
match()
for slow queries.
The tests are looking awesome, by the way! Extremely thorough. |
* Revise the NotQuery syntax, replacing the '¬' character with '^'. Fix tests to conform to this change, and cleanup the PARSE_QUERY_PART_REGEX. * Modify parse_query_part() docstring to mention the negate parameter on the returned tuple, and added an example.
Thanks for the review! Just pushed some of the changes: the main difference is that now the syntax uses |
@@ -447,6 +447,34 @@ def match(self, item): | |||
return any([q.match(item) for q in self.subqueries]) | |||
|
|||
|
|||
class NotQuery(Query): | |||
"""A query that matches the negation of its `subquery`, as a shorcut for | |||
performing `not(subquery)` without using regular expressions.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny style nit: PEP8 suggests putting the closing """
on its own line (for some reason).
Looking great! The only loose ends from here are a changelog entry and documentation on the page about queries. Then, please feel free to merge at will! Afterward, we can make sure |
* Add changelog and query.rst documentation entries for the usage of negated queries. * Cleanup NotQuery class as suggested during code review (PEP conforming docstring, clarification on empty clause match behaviour).
Updated the code with the suggested changes, and added an entry to As for the next steps, I'd be up for taking a look at them during next week (via a new pull request, I assume?), seems like a logical follow-up. I did not test the |
Add boolean "not" query operator
- Use a marginally more realistic example in the changelog. - The -- convention is actually not the purview of shells; it's just a de facto standard for command-line parsers. Fortunately, argparse supports it out of the box.
Wonderful! Thank you again---this is an awesome addition! ✨ Much gratitude on behalf of the entire beets community. I merged with just a few tiny changes to the docs. |
A tentative effort for tackling #819, which hopefully will be welcome by users such as myself whose brain decides to forget regular expression syntax more often than not.
It started as a small test on my end (basically, wrap the existing SQL queries with
NOT ()
) and might have turned into a rough but working prototype for several scenarios, and even though the formal tests are far for complete, user should be able to do things like:I'll add some line comments and clarifications within a day, but the main points are:
NOT
), negating thematch()
of the subquery)? I was a bit hesitant at first, but kept adding tests and glancing at the sqlite SQL specs and, well, haven't found large issues yet.NotQuery
definition is currently a bit messy: it is a subclass ofMutableCollectionQuery
for ... historical reasons (ie. it was the quickest way for me to play with a Query that had subqueries). I'm not really sure if there is a valid case where more than one subquery will be needed, and tests have gone in that direction.field:¬value
, as it was the only option (the others beingfield.not:value
andnot field:value
, unless I missed some comments) that allowed quickly reusing theprefixes
mechanism, and avoid introduce extra code for resolving ambiguities (specially in the case of the third option). I'm wondering if a consensus has been reached on what would be the preferred syntax?NotQuery
plays with the differentQuery
subclasses, and comparing it to existing tests ontest_query.py
- I did not run tests for other modules yet.