Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Meilisearch crashes when using a lot of filter clauses #2163

Closed
martijnderidder opened this issue Feb 10, 2022 · 9 comments
Closed

Meilisearch crashes when using a lot of filter clauses #2163

martijnderidder opened this issue Feb 10, 2022 · 9 comments
Labels
bug Something isn't working as expected feature request & feedback Go to https://github.com/meilisearch/product/

Comments

@martijnderidder
Copy link

Describe the bug
When using Meilisearch in production i want to exclude a lot of id's. But there is a limit when adding a lot of filters clauses.
Is there maybe another way of fixing this? I have a large set of invoices in one index. And i want to select only the documents thats have a relation_id from the user. But in our production website, some users already have 3000 more relations.

To Reproduce
When using a basic Laravel (PHP) envoirement i'm typing something like this. (Its a crazy example, but lets say i only want to exclude all id's except id 73)...

Route::get('/', function () {
    $ids = range(1, 2400);
    unset($ids[73]);

    $user = User::search('', function ($meilisearch, $query, $options) use ($ids) {
        $options['filter'] = '';
        foreach ($ids as $id) {
            $options['filter'] .= 'id != "'. $id .'" AND ';
        }

        $options['filter'] = substr($options['filter'], 0, -4);
        $options['sort'] = ['name:asc'];
        return $meilisearch->search($query, $options);
    })->paginate(20);

    dd($user);
});

I get an error;

cURL error 52: Empty reply from server (see https://curl.haxx.se/libcurl/c/libcurl-errors.html) for http://meilisearch:7700/indexes/users/search

Expected behavior
I would expect that i can use a "unlimited" amount of filter clauses.

Meilisearch version: v0.25.2

@irevoire irevoire added bug Something isn't working as expected feature request & feedback Go to https://github.com/meilisearch/product/ labels Feb 10, 2022
@irevoire
Copy link
Member

irevoire commented Feb 10, 2022

Ok, so there are two things.
First, you’re not supposed to get an empty reply; that’s supposed to be fixed by meilisearch/milli#421 in the v0.25.xx.

Second, since you’re supposed to be getting an error, we can consider this:

I would expect that I can use an "unlimited" amount of filter clauses.

As a feature request, you are currently limited to a certain amount of terms in your filters.

One fix we thought would be to create an IN operation that would allow you to write things like:

NOT id IN [0, 1, 2, 3, 4, ... ]

That would make your request appear as one operation and thus work.

You can read more about it here.


Now on another note, I’m wondering, isn't there a better way to do it?
Because even with this operator, you would be sending something like 12.000 bytes per request (12ko). That’s really a lot and doesn’t seem like a good long term solution to me 🤔

@martijnderidder
Copy link
Author

Ok i guess if there is no IN option, or the filters are limited to a max length, we can not use this search engine for our company in production. I really love Meilisearch, but it doesn't look quite finished yet (it isn't ofc, there is no v1.0). Keep me posted when you guys are ready with an IN function that can have a couple thousand of id's.

@irevoire

Now on another note, I’m wondering, isn't there a better way to do it?
Because even with this operator, you would be sending something like 12.000 bytes per request (12ko). That’s really a lot and doesn’t seem like a good long term solution to me thinking

I don't know how; we use have a large database with milions of records; just look at this example:

Users table:

  • id: 1, name: John, type: entrepreneur
  • id 2, name: Melissa, type: employee
  • id 3, name: Peter, type: employee

Relation table:

  • id 1: user 1 and user 2
  • id 2: user 1 and user 3

Invoices table:

  • Invoice 1: Relation 1
  • Invoice 2: Relation 1
  • Invoice 3: Relation 2

So John can see 3 invoices. But i dont want to include john user id in the invoices table. The whole system is based on the idea of something is owned by a relation (beteen 2 users). So i need a query like: `SELECT id, name FROM invoices WHERE relation = 1 OR relation = 2) ... but some users have thousands of relations.

We are now using Elasticsearch, and that is working. But its very complicated, and not supported by Laravel 9 (with the current package we are using).

@irevoire
Copy link
Member

In reality the filter are limited to a certain depth. And each time you write a OR or AND the depth increase by one.
So once we implement the IN filter you'll have a solution equivalent to how you do it in elastic search I guess 👍

@martijnderidder
Copy link
Author

@irevoire In the time-beeing i have found a solution to our problem. Its is so easy that it is almost stupid i did not think about it haha. But i can ofcourse always add extra fields during the indexing of documents. I can just add a entrepreneur_id and employee_id in the index.

This problem can only be solved if a fix amount of users (in our case 2 users: the person who owns a relation, and who is connected) can be add in our index. If a document can be viewed by a dynamic group of lets say 50 users, its no good option to add extra user_id in the indexes.

@gmourier
Copy link
Member

gmourier commented Feb 14, 2022

Hi @martijnderidder, indeed Meilisearch is not based on a relationship model.

It forces to denormalize the SQL structure in some way. The most direct solution I see would be to deport the many-to-many relationship on each invoice document by integrating an array of user ids that would have access to it.

e.g invoice 1 JSON document
"user_ids": [1, 2]

this way, if user 1 is logged in and wants to be able to find the invoices he can access, a simple filter clause user_ids = 1 should suffice.

Does your many-to-many relationship only accept two users per invoice?

@MarinPostma
Copy link
Contributor

@gmourier this is not a good idea, adding a new user requires re-indexing the database, and this is very inefficient storagewise

@martijnderidder
Copy link
Author

@gmourier this is not a good idea, adding a new user requires re-indexing the database, and this is very inefficient storagewise

This is true. Thats why i already was warning that this solution can only be used with a fix amount of users that can be viewed. So when you add the document, you already know wich users this document can be viewed.

@francoism90
Copy link

I'm also interested in an IN filter as I'm now forced to limit them (e.g. pick 500 IDs).

For now I'm using a hybrid solution; SQL for making lists and Meili when doing filtering with limited results.

@curquiza
Copy link
Member

curquiza commented Mar 2, 2022

Hello everyone here.

I see two concerns

  • Meilisearch crashes when there are too many filters -> this issue will be fixed with v0.26.0 (released 14.03.2022). Here is the related issue: Stack overflow error with huge list of filters #1967
  • The IN operator is definitely a feature request and should be handled on the product side. I opened a ticket regarding it here: IN operator product#407. Feel free to share your use case and your need in this discussion. This will definitely help the product team consider this request.

I close this issue since the bug will be fixed, and the ticket is open on the product side :)

@curquiza curquiza closed this as completed Mar 2, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working as expected feature request & feedback Go to https://github.com/meilisearch/product/
Projects
None yet
Development

No branches or pull requests

6 participants