Enable filtering on populate parameter in em.find() #567

followben · 2020-05-12T07:08:06Z

Is your feature request related to a problem? Please describe.
We have Fee and Discount entities with a nullable (and inversed) 1:many relation of fee.discounts. Each of these also has a nullable many:1 relation with Organisation. I need to select only fees and associated discounts that (among other things) match a particular organisation or none at all, i.e. excluding all other organisations.

As it stands I can construct a "match or null" where clause on em.getRespository(Fee).find() to retrieve the fee entities, but I can only populate all discounts for a fee or none at all.

Describe the solution you'd like
I'd like to be able to filter the relation populated like I can filter the primary entity.

Describe alternatives you've considered
We've encountered this requirement numerous times while using mikro-orm. As a workaround we currently either:

populate all entities in the relation and filter in the application code, or
execute multiple queries, or
use QueryBuilder to select multiple rows per primary entity (each with the relation matching a where on the join) and map result columns to entities manually

1 & 2 are suboptimal performance-wise. 3 is better for the db but feels unwieldy and unnecessary - constructing a (psuedo-)sql query and deserialising results is... why we have an ORM :)

Additional context
LLBL achieves this via the PrefetchPath to which you can attach a predicate, sort etc.

Sequelize enables the same via an include statement during eager loading. IMHO, a similar syntax wouldn't look out of place here.

Looks like work is well underway on LoadStrategy at #556. Could this be achieved via the subquery option suggested in #440?

The text was updated successfully, but these errors were encountered:

B4nan · 2020-05-12T11:06:50Z

I'd like to be able to filter the relation populated like I can filter the primary entity.

As discussed on slack, this is already possible, the partial condition that apply to child/populated entity will be propagated to the query that loads it.

const res = await orm.em.find(Author2, { books: { title: { $in: ['b1', 'b2'] } } }, ['books']);
// [query] select "e0".* from "author2" as "e0" left join "book2" as "e1" on "e0"."id" = "e1"."author_id" where "e1"."title" in ($1, $2) [took 6 ms]
// [query] select "e0".*, "e0".price * 1.19 as "price_taxed" from "book2" as "e0" where "e0"."author_id" in ($1) and "e0"."title" in ($2, $3) order by "e0"."title" asc [took 5 ms]

1 & 2 are suboptimal performance-wise. 3 is better for the db but feels unwieldy and unnecessary - constructing a (psuedo-)sql query and deserialising results is... why we have an ORM :)

First of all, the approach with multiple queries can be actually faster when the DB is really large. Having too complex queries can result in real slowness, I've been there. I don't know why everybody thinks that more simple queries are always slower, that is simply not true, and its not true especially for large databases (and we are talking about those anyway, right?).

Why do you think we need subqueries? How is that different from joined query? (from my experience, subqueries are usually slower)

Sequelize enables the same via an include statement during eager loading. IMHO, a similar syntax wouldn't look out of place here.

I said it somewhere already, I can't understand why we should have so complex api, why not use a query builder for that? QB does work with entities, and support mapping results to entities. Also supports auto-joining, so the API is really similar to the one from EM. And with #556 it will also support mapping of multiple (nested) entities from single (joined) query.

followben · 2020-05-13T04:28:42Z

this is already possible the partial condition that apply to child/populated entity will be propagated to the query that loads it

That's great news. Alas, I'm struggling to get this working - I've created a test case to demonstrate. Be great if you could set me straight on what we're doing wrong :)

I don't know why everybody thinks that more simple queries are always slower, that is simply not true

You're right of course - I meant to say "1 and 2 are potentially suboptimal". OTOH, complex single queries can be faster the multiples if they're correctly designed, planned and optimised. If em.find() supports both then we have the option to choose.

Why do you think we need subqueries?

I don't. You're right: joins can be (although aren't always) better optimised. I was just asking if that's how we'd achieve populate filtering using the forthcoming changes on #556, given that's what Will had mentioned originally at #440.

why not use a query builder... it will also support mapping of multiple (nested) entities from single (joined) query

Because it doesn't support nested mapping yet :) Once it does, we can certainly use QB to achieve this.

I'm arguing filter on populate is sufficiently useful it should be incorporated in higher-level find methods. If it already is and we're just using it wrong... great! If not, why not add it?

QB is essentially SQL re-written in a DSL: powerful and great to have at hand, but verbose and somewhat cumbersome for a useful feature that could have a simpler api.

B4nan · 2020-05-13T08:38:39Z

Your problem with first query is the { foos: null } in the query condition, this is not yet supported (not even sure if we will support it like this, I was thinking about adding new operators for this use case, $empty and $size). Here is an issue about it #548.

You can do this:

const result1 = await orm.em.find(Bar, { $or: [{ foos: { prop: 'a' } }, { foos: { id: null } }] }, ['foos']);

which does produce this query:

select "e0".* from "bar" as "e0" left join "foo" as "e1" on "e0"."id" = "e1"."bar_id" where ("e1"."prop" = 'a' or "e1"."id" is null)

Still not what you want, as it will return only one entity. In fact there are relations set in both your entities.

So the question is more like - what SQL are you expecting to solve this?

In the comment you have:

Goal: find all bars but only populate the foo relation if foo.prop is 'a', i.e.

Well but this is not what you are doing in the query above. You are requesting bars that either have a relation with foo.prop = a, or there is no relation at all. The condition is first applied to the root entity, so you can't get all bars this way.

If you want to have all bars but populate only something, correct solution (and I am pretty sure that also the fastest when it comes to big data) is to do one query to load the roots and another to populate:

const result1 = await orm.em.find(Bar, {});
await orm.em.populate(result1, ['foos'], { foos: { prop: 'a' } } as any); // there are some issues in v4 typings of the populate method, so had to cast as any otherwise the type of result1 was not used as T
expect(result1).toHaveLength(2);
expect(flatMap(result1, b => b.foos.getItems())).toHaveLength(1);

Doing it otherwise just seems overcomplicated. Also your links to other solutions are about something different I think, I don't think they can do this in single query (and let me say it one more time, the solution in typeorm looks terrible, I would never let this pass my code review , if we want to optimise for performance, its much easier to use raw sql instead).

Let's put it this way - if you want to do it in single query, how do you think the query should look like?

If you are fine with multiple queries, then my approach with em.populate() feels much more readable.

Btw I don't understand what are you expecting from having 1:m nullable. Its an inverse side, collection, it can't be nullable, there is nothing to be nulled (you can have it there, but its a no-op, nullable have a meaning only on owning sides).

QB is essentially SQL re-written in a DSL: powerful and great to have at hand, but verbose and somewhat cumbersome for a useful feature that could have a simpler api.

QB in MikroORM is much more powerful, it handles auto-joins, it is aware of entity metadata. I would argue about it being verbose, it can be used almost the same way as with the EM API. You can use the very same query condition here and there, no need to manually join things. And what you are proposing here would basically make the EM api more close to the QB in terms of complexity.

B4nan · 2020-06-16T07:56:33Z

Closing due to inactivity. As said above, this is already possible, conditions are propagated to the populate queries, and in v4 it is possible to use the joined strategy too. I don't see a value in having another strategy, introducing the joined one ended up being the second most complex thing in the whole ORM development (right after composite keys).

followben added the enhancement New feature or request label May 12, 2020

B4nan closed this as completed Jun 16, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enable filtering on populate parameter in em.find() #567

Enable filtering on populate parameter in em.find() #567

followben commented May 12, 2020 •

edited

B4nan commented May 12, 2020

followben commented May 13, 2020 •

edited

B4nan commented May 13, 2020 •

edited

B4nan commented Jun 16, 2020

Enable filtering on populate parameter in em.find() #567

Enable filtering on populate parameter in em.find() #567

Comments

followben commented May 12, 2020 • edited

B4nan commented May 12, 2020

followben commented May 13, 2020 • edited

B4nan commented May 13, 2020 • edited

B4nan commented Jun 16, 2020

followben commented May 12, 2020 •

edited

followben commented May 13, 2020 •

edited

B4nan commented May 13, 2020 •

edited