Large quantity of facets causing degradation of Meilisearch performance #2349

Closed
shivaylamba opened this issue Apr 25, 2022 · 5 comments
Labels
bug: Something isn't working as expected; milli: Related to the milli workspace; performance: Related to the performance in term of search/indexation speed or RAM/CPU/Disk consumption; v0.28.0: PRs/issues solved in v0.28.0
@shivaylamba
Contributor

shivaylamba commented Apr 25, 2022

Describe the bug
I am building an e-commerce demo using an Amazon dataset that contains more than 1 million items. It has 300k brands and 100k tags, all of which are configured as filterable attributes. I am using Instant Meilisearch, so these attributes also appear in the refinement list.
The time required to compute the facet distribution grows with the number of facet values, which degrades search performance. With 300k brands and 100k tags, the initial load can take a long time.

To Reproduce
Steps to reproduce the behavior:

  1. Visit https://medusajs-storefront.vercel.app/
  2. Click on Inspect element to open Developer tools
  3. Go to the Network tab and click on the search request
  4. Click on the Response tab
  5. You can see the response.

Expected behavior
Performance should be quick.

Screenshots
image

image

Meilisearch version: v0.26.0

Additional context
Browser: Chrome
OS: macOS
Device: MacBook Pro

@curquiza curquiza added the performance Related to the performance in term of search/indexation speed or RAM/CPU/Disk consumption label Apr 25, 2022
@Kerollmops
Member

Kerollmops commented Apr 25, 2022

Hey @shivaylamba,

I ran the query that was taking a lot of time, and the slowdown is indeed related to the number of facets returned in the results. More specifically, we are returning 172700 facet values with their counts, which is a lot.

You can run this jq query on the file linked below. Make sure to unzip it first.

output.json.zip

cat output.json | jq '. | .facetsDistribution.brand | length'

I remember a PR of mine that removed the limit on the number of facet values returned by the engine. This limit should probably be reintroduced.
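
For readers who prefer Python to jq, the same count can be sketched as follows. This is a minimal illustration, not part of the original comment: the helper name `facet_value_counts` and the sample payload are made up for the example, and the only thing taken from the thread is the `facetsDistribution` key used in the jq query above.

```python
import json


def facet_value_counts(response: dict) -> dict:
    """Return how many distinct values the response carries for each facet."""
    # `facetsDistribution` is the response key used in the jq query above.
    distribution = response.get("facetsDistribution") or {}
    return {facet: len(values) for facet, values in distribution.items()}


# Tiny made-up response, only to show the expected shape of the payload.
sample = json.loads("""
{
  "hits": [],
  "facetsDistribution": {
    "brand": {"Acme": 12, "Globex": 7, "Initech": 3},
    "tag": {"sale": 20, "new": 2}
  }
}
""")

counts = facet_value_counts(sample)
print(counts)
```

To run it against the unzipped file, load `output.json` with `json.load` and pass the resulting dict to `facet_value_counts`.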

@shivaylamba
Contributor Author

Thanks for your response!

So for now, would you recommend reducing the number of facet values, @Kerollmops?

@curquiza curquiza added this to the v0.28.0 milestone Apr 26, 2022
@curquiza
Member

Discussed with @gmourier and @Kerollmops -> this will be fixed in v0.28.0 by introducing a hard limit (it will be customizable in the future, but not in v0.28.0)
I will open an issue regarding this next week

@shivaylamba
Contributor Author

Alright, thank you @curquiza

@curquiza curquiza added the bug Something isn't working as expected label May 17, 2022
@Kerollmops Kerollmops changed the title from "Large quantity of facets causing degradation of Meilisearch peformance." to "Large quantity of facets causing degradation of Meilisearch performance" May 18, 2022
bors bot added a commit to meilisearch/milli that referenced this issue Jun 1, 2022
535: Reintroduce the max values by facet limit r=ManyTheFish a=Kerollmops

This PR reintroduces the max values by facet limit. It is related to meilisearch/meilisearch#2349.

~I would like some help in deciding on whether I keep the default 100 max values in milli and set up the `FacetDistribution` settings in Meilisearch to use 1000 as the new value, I expose the `max_values_by_facet` for this purpose.~

I changed the default value to 1000 and the max to 10000, thank you `@ManyTheFish` for the help!

Co-authored-by: Kerollmops <clement@meilisearch.com>
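
The effect of such a limit can be sketched in Python. This is a rough illustration under stated assumptions, not the milli implementation: the function names are hypothetical, the constants only mirror the default (1000) and maximum (10000) quoted in the commit message, and keeping the first values in lexicographic order is an assumption, since the thread does not say which values survive the cut.

```python
DEFAULT_VALUES_BY_FACET = 1000  # default quoted in the commit message
MAX_VALUES_BY_FACET = 10000     # upper bound quoted in the commit message


def clamp_limit(requested=None):
    """Resolve the effective per-facet limit (hypothetical helper)."""
    if requested is None:
        return DEFAULT_VALUES_BY_FACET
    # Never exceed the hard maximum, never go below zero.
    return max(0, min(requested, MAX_VALUES_BY_FACET))


def limit_facet_distribution(distribution, max_values_by_facet=None):
    """Keep at most `limit` values per facet.

    Assumption: values are kept in lexicographic order of the facet value;
    the real engine may choose differently.
    """
    limit = clamp_limit(max_values_by_facet)
    limited = {}
    for facet, counts in distribution.items():
        kept = sorted(counts)[:limit]
        limited[facet] = {value: counts[value] for value in kept}
    return limited
```

With a cap like this, a facet carrying 172700 values, as in the response analyzed earlier in the thread, would come back with at most 1000 values by default and never more than 10000, which bounds the size of the response.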
@Kerollmops Kerollmops added the milli Related to the milli workspace label Jun 2, 2022
@curquiza
Member

curquiza commented Jun 8, 2022

Closed by #2468

@curquiza curquiza closed this as completed Jun 8, 2022
@curquiza curquiza added the v0.28.0 PRs/issues solved in v0.28.0 label Aug 24, 2022

3 participants