Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force exhaustive search when faceting is requested #3453

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

dureuill
Copy link
Contributor

@dureuill dureuill commented Feb 2, 2023

Pull Request

Related issue

Related to #3426.

What does this PR do?

When not using pagination mode, the search is not exhaustive. This can result in incomplete results for facet distribution and wrong results for facet stats (as implemented in #3423).

This PR forces the search to be as exhaustive as possible1, even when not using pagination mode, as soon as a facet distribution (or facet stats) is required.

Tests

Unfortunately, I was not able to come up with a test case that demonstrates how the search is not exhaustive by default. I'm still opening the PR should the issue pop up.

Footnotes

  1. My understanding is that we can still have differences in distribution/stats when distinct attributes are used in the query.

@dureuill
Copy link
Contributor Author

dureuill commented Feb 2, 2023

cc @ManyTheFish

@ManyTheFish
Copy link
Member

Hello @dureuill,

thank you for the PR, I'd personally wait for proof or an issue showing that the facet distribution is not sufficiently exhaustive to merge this PR, I'm a bit concerned by the performance impact. 😕

Did you see any significant difference in the search performances?

@dureuill
Copy link
Contributor Author

dureuill commented Feb 6, 2023

Did you see any significant difference in the search performances?

Testing on my dataset was insufficient to show anything one way or the other. We'd need tests or some more realistic use cases.

thank you for the PR, I'd personally wait for proof or an issue showing that the facet distribution is not sufficiently exhaustive to merge this PR, I'm a bit concerned by the performance impact. 😕

Sure! Let's keep this PR as draft for now

@dureuill dureuill marked this pull request as draft February 6, 2023 12:16
@gmourier
Copy link
Member

gmourier commented Feb 6, 2023

@dureuill We could test it on the dataset @Strift will use to make an e-commerce demo

@Strift
Copy link
Contributor

Strift commented Feb 6, 2023

Hey there. Here is the link to the aforementioned ecommerce dataset.

To be honest, I haven't taken much time to challenge the relevance of the data. So if you see anything weird, you can let me know!

Cheers,

@Kerollmops
Copy link
Member

Kerollmops commented Mar 2, 2023

I am also very concerned about the performance impact it can have. I want to avoid falling into this kind of performance issue. Having estimated counts is enough for some of our users that search performance over accuracy, but it can be an issue for other users.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants