Introduce two filters to select documents with null and empty fields #3571

Merged
merged 21 commits into from
Apr 27, 2023

Conversation

Kerollmops
Member

@Kerollmops Kerollmops commented Mar 8, 2023

Pull Request

Related issue

This PR implements the X IS NULL, X IS NOT NULL, X IS EMPTY, and X IS NOT EMPTY filters, which are described in detail in this comment.

What does this PR do?

IS NULL and IS NOT NULL

This PR will be exposed as a prototype for now. Below is the copy/pasted version of a spec that defines this filter.

  • IS NULL matches fields that EXISTS AND whose value is null
  • IS NOT NULL matches fields that NOT EXISTS OR whose value is not null
  1. {"name": "A", "price": null}
  2. {"name": "A", "price": 10}
  3. {"name": "A"}

price IS NULL would match 1
price IS NOT NULL or NOT price IS NULL would match 2,3
price EXISTS would match 1, 2
price NOT EXISTS or NOT price EXISTS would match 3

common query : (price EXISTS) AND (price IS NOT NULL) would match 2
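
The matching rules above can be modeled in a few lines of Python. This is only a sketch of the semantics stated in the spec, not the Rust implementation in this PR; the helper names (`exists`, `is_null`, `select`, `negate`) are hypothetical and chosen for illustration.

```python
# Model of the EXISTS / IS NULL semantics from the spec (illustrative only).
docs = {
    1: {"name": "A", "price": None},  # JSON null maps to Python None
    2: {"name": "A", "price": 10},
    3: {"name": "A"},
}

def exists(doc, field):
    # EXISTS: the field is present, whatever its value (including null).
    return field in doc

def is_null(doc, field):
    # IS NULL: the field is present AND its value is null.
    return field in doc and doc[field] is None

def negate(pred):
    # NOT <pred>: everything the predicate does not match.
    return lambda doc, field: not pred(doc, field)

def select(pred, field):
    # Ids of the documents matched by the predicate, in insertion order.
    return [doc_id for doc_id, doc in docs.items() if pred(doc, field)]

print(select(is_null, "price"))          # [1]
print(select(negate(is_null), "price"))  # [2, 3]
print(select(exists, "price"))           # [1, 2]
print(select(negate(exists), "price"))   # [3]

# Common query: (price EXISTS) AND (price IS NOT NULL)
common = [i for i in select(exists, "price") if i in select(negate(is_null), "price")]
print(common)                            # [2]
```

Note that IS NOT NULL is the plain negation of IS NULL, which is why it also matches document 3, where the field is missing.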

IS EMPTY and IS NOT EMPTY

  • IS EMPTY matches fields that EXISTS and hold an empty array [], an empty object {}, or an empty string ""
  • IS NOT EMPTY matches fields that NOT EXISTS OR are not empty.
  1. {"name": "A", "tags": null}
  2. {"name": "A", "tags": [null]}
  3. {"name": "A", "tags": []}
  4. {"name": "A", "tags": ["hello","world"]}
  5. {"name": "A", "tags": [""]}
  6. {"name": "A"}
  7. {"name": "A", "tags": {}}
  8. {"name": "A", "tags": {"t1":"v1"}}
  9. {"name": "A", "tags": {"t1":""}}
  10. {"name": "A", "tags": ""}

tags IS EMPTY would match 3,7,10
tags IS NOT EMPTY or NOT tags IS EMPTY would match 1,2,4,5,6,8,9
tags IS NULL would match 1
tags IS NOT NULL or NOT tags IS NULL would match 2,3,4,5,6,7,8,9,10
tags EXISTS would match 1,2,3,4,5,7,8,9,10
tags NOT EXISTS or NOT tags EXISTS would match 6

common query : (tags EXISTS) AND (tags IS NOT NULL) AND (tags IS NOT EMPTY) would match 2,4,5,8,9
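
The same kind of model works for IS EMPTY. Again, this is a sketch of the semantics listed above under the assumption that "empty" means a zero-length array, object, or string; the helper names are hypothetical and this is not the code from the PR.

```python
# Model of the IS EMPTY semantics over the ten example documents (illustrative only).
docs = {
    1: {"name": "A", "tags": None},
    2: {"name": "A", "tags": [None]},
    3: {"name": "A", "tags": []},
    4: {"name": "A", "tags": ["hello", "world"]},
    5: {"name": "A", "tags": [""]},
    6: {"name": "A"},
    7: {"name": "A", "tags": {}},
    8: {"name": "A", "tags": {"t1": "v1"}},
    9: {"name": "A", "tags": {"t1": ""}},
    10: {"name": "A", "tags": ""},
}

def exists(doc, field):
    return field in doc

def is_null(doc, field):
    return field in doc and doc[field] is None

def is_empty(doc, field):
    # IS EMPTY: the field is present AND is [], {} or "".
    # Only the top-level value is inspected: [""] and {"t1": ""} are not empty.
    if field not in doc:
        return False
    value = doc[field]
    return isinstance(value, (list, dict, str)) and len(value) == 0

def select(pred, field):
    return [doc_id for doc_id, doc in docs.items() if pred(doc, field)]

print(select(is_empty, "tags"))                          # [3, 7, 10]
print(select(lambda d, f: not is_empty(d, f), "tags"))   # [1, 2, 4, 5, 6, 8, 9]

# Common query: (tags EXISTS) AND (tags IS NOT NULL) AND (tags IS NOT EMPTY)
common = select(
    lambda d, f: exists(d, f) and not is_null(d, f) and not is_empty(d, f),
    "tags",
)
print(common)                                            # [2, 4, 5, 8, 9]
```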

What should the reviewer do?

  • Check that I tested the filters
  • Check that I deleted the ids of the documents when deleting documents

@Kerollmops Kerollmops added this to the v1.2.0 milestone Mar 8, 2023
@github-actions

github-actions bot commented Mar 8, 2023

Uffizzi Ephemeral Environment deployment-18470

☁️ https://app.uffizzi.com/github.com/meilisearch/meilisearch/pull/3571

📄 View Application Logs etc.

The meilisearch preview environment contains a web terminal from which you can run the
meilisearch command. You should be able to access this instance of meilisearch running in
the preview through the Meilisearch Endpoint link given below.

Web Terminal Endpoint :
Meilisearch Endpoint : /meilisearch

@Kerollmops Kerollmops changed the title Introduce an IS NULL filter Introduce a filter to select documents with NULL fields Mar 8, 2023
@Kerollmops Kerollmops requested a review from dureuill March 9, 2023 09:07
@Kerollmops Kerollmops marked this pull request as ready for review March 9, 2023 09:07
Contributor

@dureuill dureuill left a comment


Thank you for the PR! Looking good (I trust you on the grenad stuff, as I couldn't follow everything it is doing, but it is evidently similar to the existing code).

Regarding tests:

  • doesn't appear in milli/src/snapshot_tests. Similarly we could have a db_snap with facet_ids_null_docids in the tests of delete_documents.rs
  • in milli/src/update/index_documents/mod.rs we have an index_documents_check_exists_database, do we want the same thing for the new db? index_documents_check_null_database?
  • I see the tests in search reimplement a way to extract the expected results from the filter. I'm a bit less comfortable with that than with tests against known, hardcoded expected results, since there could be a bug in that implementation. It looks correct upon reading it though.

Regarding the correct deletion of the docids, it looks similar to the exists db, so if there are no bugs there, there are no bugs here.

I have a few suggestions/comments (see below)

Member

@irevoire irevoire left a comment


Everything looks good to me except for the name on one method 🙆

milli/src/update/delete_documents.rs (outdated, resolved)
Co-authored-by: Louis Dureuil <louis@meilisearch.com>

dureuill previously approved these changes Mar 9, 2023
Contributor

@dureuill dureuill left a comment


lovely ❤️

@Kerollmops
Member Author

So I added the tests you were talking about, @dureuill. Thank you for pointing that out. I made the changes you asked for.

I am now ready to create a tag for the prototype. Could you please 👍 on this comment if you think I can create it right now?

@Kerollmops
Member Author

I just created a prototype-null-filter-0 tag, and the prototype CI is currently running.

@ahmednfwela

Thanks for this great PR! I just tested it on dart client and it works perfectly

ahmednfwela added a commit to Bdaya-Dev/meilisearch-dart that referenced this pull request Mar 13, 2023
@Kerollmops Kerollmops changed the title Introduce a filter to select documents with null fields Introduce two filters to select documents with null and empty fields Mar 15, 2023
@ahmednfwela

should IS EMPTY match empty strings too? if they can be matched already using "", can't users just do
(tags EXISTS) AND (tags IS NOT NULL) AND (tags IS NOT EMPTY) AND (tags != "")

@Kerollmops
Member Author

Hey @ahmednfwela,

should IS EMPTY match empty strings too? if they can be matched already using "", can't users just do (tags EXISTS) AND (tags IS NOT NULL) AND (tags IS NOT EMPTY) AND (tags != "")

Indeed, IS EMPTY matches empty strings, empty arrays, and empty objects. Meilisearch doesn't take empty strings into account; empty facets are not possible. The main reason is that we want to search facets by prefix, and the empty one ("") would cause more problems than it solves.


For your information, I pushed a new prototype version named prototype-null-empty-filters-0 in which I added the new IS EMPTY filter and the prototype CI is currently running.

I updated the main message to reflect the new syntax.

dureuill previously approved these changes Mar 16, 2023
Contributor

@dureuill dureuill left a comment


I wonder if we should maybe add automated tests on the failed parsing of "value NULL", "value NOT NULL", "value IS", "value IS NOT", "value IS EXISTS", etc.

Still would be interested in this kind of tests, but otherwise the code still looks good to me after the IS EMPTY addition. Thank you.
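
To make the kind of negative parsing test suggested here concrete, the sketch below checks the listed malformed expressions against a toy grammar. It is an assumption-laden illustration in Python: the regular expression and the `is_valid_condition` helper are hypothetical, cover only single `field OPERATOR` conditions (no leading NOT, no AND/OR, no parentheses), and are unrelated to the actual filter parser in the repository.

```python
import re

# Toy grammar (hypothetical): <field> IS [NOT] NULL|EMPTY  or  <field> [NOT] EXISTS
_CONDITION = re.compile(
    r"(\w+)\s+(?:IS\s+(?:NOT\s+)?(?:NULL|EMPTY)|(?:NOT\s+)?EXISTS)"
)

def is_valid_condition(expr: str) -> bool:
    # The whole expression must match; partial matches are rejected.
    return _CONDITION.fullmatch(expr.strip()) is not None

# Expressions from the review comment that should fail to parse.
malformed = [
    "value NULL",
    "value NOT NULL",
    "value IS",
    "value IS NOT",
    "value IS EXISTS",
]

# Expressions the toy grammar should accept.
well_formed = [
    "value IS NULL",
    "value IS NOT NULL",
    "value IS EMPTY",
    "value IS NOT EMPTY",
    "value EXISTS",
    "value NOT EXISTS",
]

for expr in malformed:
    assert not is_valid_condition(expr), expr
for expr in well_formed:
    assert is_valid_condition(expr), expr
```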

@Kerollmops
Member Author

I wonder if we should maybe add automated tests on the failed parsing of "value NULL", "value NOT NULL", "value IS", "value IS NOT", "value IS EXISTS", etc.

Still would be interested in this kind of tests, but otherwise the code still looks good to me after the IS EMPTY addition. Thank you.

Just did the tests for that, tell me what you think about my insta tests 🔬

Contributor

@dureuill dureuill left a comment


😭 beautiful thank you

@ahmednfwela

what do you think about adding filters for field types as well ?
e.g.:
IS OBJECT
IS ARRAY
IS STRING
IS NUMBER
IS BOOL

@curquiza
Member

curquiza commented Mar 23, 2023

Hello @ahmednfwela
Thanks for the suggestion! To avoid losing track of suggestions, we recommend not putting them in PRs but posting them in the product repository discussions instead.
You can create a new discussion if one has not already been created, and let us know about your use case there 😊

Thanks again for your involvement in this feature! ❤️

@ahmednfwela

Thanks for informing me, I created the discussion meilisearch/product#634

This was linked to issues Mar 28, 2023
@tonivega

It would be nice to have this merged. @irevoire, is there something else needed?

@irevoire
Member

Hey @tonivega, we're just checking a few last things on our side. But this feature should land in the next release 👍

@curquiza
Member

Hello @tonivega

To be more transparent, we were expecting more tests and feedback from the users before merging it, to ensure this is what the users want.
Have you tested it?

As @irevoire said, we indeed plan to merge it, so any feedback would be appreciated

Member

@irevoire irevoire left a comment


bors merge

@bors
Contributor

bors bot commented Apr 27, 2023

Build succeeded:

@bors bors bot merged commit 414b3fa into main Apr 27, 2023
@bors bors bot deleted the filter-is-null-fields branch April 27, 2023 14:04
@meili-bot meili-bot added the v1.2.0 PRs/issues solved in v1.2.0 released on 2023-06-05 label Jun 13, 2023
Successfully merging this pull request may close these issues.

IS NULL filter operator IS EMPTY filter operator
7 participants