Offer Result.intoMap(), intoGroups(), intoSet() functionality as Collector #9288

lukaseder · 2019-09-27T07:20:20Z

We should have a composable, possibly Collector based API to collect complex results into nested DTO projections, etc. An example is this support request here:
#8597

The goal (among others) would be to produce nested POJOs, such as for example:

class Author {
  String firstName;
  String lastName;
  List<Book> books;
}

class Book {
  String title;
}

Perhaps, some inspiration can be found in SimpleFlatMapper: https://simpleflatmapper.org/0106-getting-started-jooq.html

This issue will be split into a set of subtasks as we evaluate what steps are necessary to implement what users find most missing in this area, right now. It is also used as a host for discussions around the issue.

Tasks

To avoid conflicts with other statically imported collectors from java.util.stream.Collectors, such as toList(), ours will be prefixed into, e.g. intoMap(), just like the Result.intoMap() methods they represent / replace.

Interactions with `ExecuteListener`

While the Result refactorings are backwards compatible, refactoring existing ResultQuery implementations are not, if we skip firing the ExecuteListener::resultStart and ExecuteListener::resultEnd events. To allow for reverting to the current behaviour, let's offer a new Settings that governs whether we create the intermediate Result data structure: #11892, the default being not to create it if not explicitly requested

The text was updated successfully, but these errors were encountered:

knutwannheden · 2019-09-27T07:29:23Z

Let me first see if I got this straight:

While some people regard jOOQ as an ORM, it is certainly foremost a query builder. And since ORMs (or FRMs) have their place, it would also make sense for jOOQ to provide more value in this area, as currently jOOQ more or less only makes it easy to map a single tuple to a single Java object. (Mapping a tuple representing a chain of one-to-one relations to a corresponding chain of connected Java objects also works quite well.)

AFAICT what customers most want is a simple way to load a given set of rows / tuples into a "corresponding" collection of "connected" Java objects (e.g. an aggregate with a one-to-many relation like customer-address). And for simple use cases (e.g. only one-to-many relations (with foreign keys) and one-to-one mappings between tables and classes), jOOQ should make this dead simple, while it should also be more or less easy to achieve this goal for more complex scenarios.

In jOOQ we want to implement this functionality on top of the Collector API (and ResultQuery#collect(Collector)) and thus provide some kind of DSL API to create the required Collector.

Since the new API will somehow need to capture how the rows and columns get mapped to objects and properties (also references), I am wondering if we should also design for the use case of allowing the actual SELECT to be derived from this "specification", rather than just a Collector, which processes the results of a user-defined query. Or would we be tapping too far into ORM-land with this? If not, could this then in the future even allow deriving INSERT, UPDATE (and UPSERT) queries for such aggregates?

lukaseder · 2019-09-27T07:50:24Z

While some people regard jOOQ as an ORM, it is certainly foremost a query builder

The paradigm I think suits most here is: jOOQ is SQL. The fact that there's a query builder is secondary, even if it does appear to be the primary feature. But I think it's better to think of jOOQ = SQL, and derive all other discussion from that paradigm.

And since ORMs (or FRMs) have their place, it would also make sense for jOOQ to provide more value in this area

jOOQ does add more value in this area where we do not oppose the philosophies of SQL. I personally think the "third manifesto" didn't get enough traction outside of Oracle and PostgreSQL yet, and even there, it didn't find the adoption it deserves.

The ideal way to implement what I believe most customers want is to support MULTISET and related operators: #3884, that would allow for nesting collections in SQL. Related paradigms that have seen broader adoption are JSON and XML for nesting data structures.

Apart from that, additional client side convenience that works in similar fashions is definitely desireable as well, as long as it does not oppose SQL. By that, I mean we're mainly working on sets of tuples (values), not graphs of entities (objects). As long as we avoid introducing the notion of a record or object "identity", we will be able to "keep it simple".

Having said so, yes, we definitely need much more convenience to produce hierarchical data structures from queries.

AFAICT what customers most want is a simple way to load a given set of rows / tuples into a "corresponding" collection of "connected" Java objects (e.g. an aggregate with a one-to-many relation like customer-address). And for simple use cases (e.g. only one-to-many relations (with foreign keys) and one-to-one mappings between tables and classes), jOOQ should make this dead simple, while it should also be more or less easy to achieve this goal for more complex scenarios.

Yes, of course. But the difficulty will always be the mismatch between the denormalising join operator, and what customers actually wanted, the multiset operator. I don't think it is possible to re-connect denormalised tuples to what they were intended to be. I would really like to avoid any cleverness around that, because it breaks apart very easily.

I think it is important to make sure everyone using jOOQ (or SQL) knows that a join is a filtered cartesian product, not a way to nest collections.

However, as the popularity of fetchGroups() and third party mappers has shown, providing convenience around the nesting of single one-to-many relationships can be useful, as long as users don't think this is a good approach in general (e.g. when there are more than 1 one-to-many relationships, or several levels of nested collections)

In jOOQ we want to implement this functionality on top of the Collector API (and ResultQuery#collect(Collector)) and thus provide some kind of DSL API to create the required Collector.

Yes, Collector is a very good abstraction for this task. jOOλ provides a variety of useful collectors, and we can push this even further, maybe move some of jOOλ into jOOQ:
https://www.jooq.org/products/jOOλ/javadoc/latest/org/jooq/lambda/Agg.html

I think that we should still also offer convenience on ResultQuery and Result for the most common collection scenarios, as such API is much more discoverable than collectors. But the strategic goal is to embrace collectors.

This will limit new API features to the Java 8+ distributions, which I think is OK.

Since the new API will somehow need to capture how the rows and columns get mapped to objects and properties (also references), I am wondering if we should also design for the use case of allowing the actual SELECT to be derived from this "specification", rather than just a Collector, which processes the results of a user-defined query. Or would we be tapping too far into ORM-land with this? If not, could this then in the future even allow deriving INSERT, UPDATE (and UPSERT) queries for such aggregates?

You just re-invented my MULTISET operator idea :) Yes, we should be able to derive some collections from the projection automatically. Ideally, using operators that can also be implemented in SQL.

But some emulations are possible as well. E.g. embeddable types (#2530), which are going to be worked on as well in jOOQ 3.13 are nothing but an emulation of a nested record.

Another recent idea was this one: #9126. Another way to declare a nested record.

This is not opposed to the ideas of the SQL language. In PostgreSQL and Oracle, we can create "object types" / "record types", which then allow for things like:

-- Oracle
SELECT my_object_type(a, b, c) FROM t

-- PostgreSQL
SELECT (a, b, c)::my_object_type FROM t

It is perfectly reasonable to 1:1 map Java DTO / POJO types onto such nested record types in SQL, and provide convenience around that, using reasonable API.

The main risk here is that we'll be doing this in the SELECT clause, which is not the last clause in the logical order of operations in a SELECT:
https://blog.jooq.org/2016/12/09/a-beginners-guide-to-the-true-order-of-sql-operations/

After SELECT, there's also DISTINCT, UNION (and the others), and ORDER BY, which all interact with the projection from SELECT, so we must make sure that we're not limiting the possible SQL constructs too much by whatever features we introduce.

Summary

This issue was originally about more powerful client side data collection constructs (does not affect or interact with the SQL query, but operates on the org.jooq.Result, without knowledge on how it came to be)

There are other issues that talk about deriving such collection algorithms from a more powerful way to express the SELECT clause. These mainly include:

Nesting collections (e.g. using MULTISET)
Nesting records (e.g. using object types / embeddable types / etc)

knutwannheden · 2019-10-19T20:24:05Z

I have been thinking if it could make sense to define the projection as part of the jOOQ DSL API, using some kind of mini DSL extension. Possibly something along the lines of this:

project(BOOK.ID, BOOK.TITLE)
    .using(jpaProjection(Book.class))
    .from(BOOK);

Possibly even allow projecting nested records:

project(BOOK.ID, BOOK.TITLE,
    project(AUTHOR.FIRST_NAME, AUTHOR.LAST_NAME)
        .using(jpaProjection(Author.class)))
    .using(jpaProjection(Book.class))
    .from(BOOK)
        .join(AUTHOR).on(BOOK.AUTHOR_ID.eq(AUTHOR.ID));

The project() clause would basically just render the same SQL as a select(), but would then know how to project the tuples client-side into Java objects.

lukaseder · 2019-10-21T08:55:59Z

People like jOOQ because it mimicks SQL almost perfectly (unless the Java language prevents it, e.g. with CTE or derived tables). Replacing the SELECT clause by something "clever" is going to drastically increase the learning curve and feed the arguments that jOOQ is not exactly SQL.

This is just me criticising the suggested execution here. We could definitely profit from supporting additional projection capabilities. See also the suggestion here: #9126

In other words, we're not gaining anything from avoiding the SQL keywords select() and row(), by replacing them by project(). Nested projections are already possible today using row or rowField() (the Field suffix being a workaround for an IDE / compiler performance problem, see #5233)

I'm sure we can implement this functionality entirely by enhancing the existing select() or from() clause, without introducing alternative, synthetic clauses

Besides, I really think we should not get distracted too much by what JPA is offering here. JPA has a major flaw:

Defaulting to eager fetching is silly, because we're always fetching too much data when we don't need it
Defaulting to lazy fetching is silly, because we're always producing N+1 queries when we do need it

The workaround is to use JOIN FETCH in JPQL or HQL queries, but those syntaxes are not available as annotations (luckily!) in the JPA meta model. Hence, we should not reproduce the mess produced by JPA's defaulting to produce queries from 2 equally bad default choices of eager/lazy fetching nested collections.

While I'm not excluding adding value in this area in the future, our main focus here should not be on JPA, but on the SQL view of how collections can be nested.

knutwannheden · 2019-10-21T13:49:00Z

People like jOOQ because it mimicks SQL almost perfectly (unless the Java language prevents it, e.g. with CTE or derived tables). Replacing the SELECT clause by something "clever" is going to drastically increase the learning curve and feed the arguments that jOOQ is not exactly SQL.

Makes sense. Of course some kind of project() expression could also be passed as an argument to select().

This is just me criticising the suggested execution here. We could definitely profit from supporting additional projection capabilities. See also the suggestion here: #9126

I have seen that feature request, which is of course related. One of the proposals there was to add a project() method to Table.

In other words, we're not gaining anything from avoiding the SQL keywords select() and row(), by replacing them by project(). Nested projections are already possible today using row or rowField().

Right. Using RVEs to help map to nested Java objects sounds interesting. I have never seen this done before, but it could be a good fit.

Besides, I really think we should not get distracted too much by what JPA is offering here.

The jpaProjection(MyPojo.class) was only meant as an example. In #9126 the proposal is to allow something like select(MyPojo.class) and then internally register / do record-mapping and record-mapping as appropriate. I am however wondering if it wouldn't make sense to make this more transparent and easier for the user to control by adding some kind of Projection type / SPI. The user would thus write something like select(jpa(MyEntity.class)) or select(pojo(MyPojo.class)) or similar. Maybe tools like JOLO could also profit from this.

It however seems like my comment should instead be on #9126.

lukaseder · 2019-10-22T10:26:55Z

Makes sense. Of course some kind of project() expression could also be passed as an argument to select().

Yes, exactly. I think that's where we might be heading. But the more cleverness we try to implement, the more brittle this will be in edge cases that depend on the actual SELECT clause (DISTINCT, UNION, ORDER BY, OFFSET .. FETCH, FOR UPDATE), so it's really a very slippery slope :)

I have seen that feature request, which is of course related. One of the proposals there was to add a project() method to Table.

Yes, I already regret that API proposition...

Right. Using RVEs to help map to nested Java objects sounds interesting. I have never seen this done before, but it could be a good fit.

There's native support in Oracle (with named types) and PostgreSQL. I've done this quite a few times in Oracle, myself...

The jpaProjection(MyPojo.class) was only meant as an example. In #9126 the proposal is to allow something like select(MyPojo.class) and then internally register / do record-mapping and record-mapping as appropriate. I am however wondering if it wouldn't make sense to make this more transparent and easier for the user to control by adding some kind of Projection type / SPI. The user would thus write something like select(jpa(MyEntity.class)) or select(pojo(MyPojo.class)) or similar. Maybe tools like JOLO could also profit from this.

All potential cleverness here will ultimately run against a wall (again), when users expect nested collections to be supported. Both of these APIs, as well as your reference to JOLO, suggest that such nested collections would be supported, just from reverse engineering some client data structure. I think that's just not going to be good enough. The same is true for #9126, btw.

If we do support nested collections, we should go "all in" ans support MULTISET: #3884, because that's the only way we can implement the full potential of the desired feature. Deriving "80%" usefulness from a client model is not good enough.

Irrespective of this discussion...

It however seems like my comment should instead be on #9126.

... yes indeed. This issue here can be implemented regardless of how we start supporting nested records and collections in the jOOQ API.

This issue here is only about post-processing whatever the SQL query and jOOQ's built in mapping algorithms produce.

lukaseder · 2021-05-10T13:04:19Z

The scope of this issue here becomes more clear in the context of numerous improvements that have been done for jOOQ 3.15, including the most important ones:

Let Row[N]<T1, T2, ..., TN> extend SelectField<Record[N]<T1, T2, ..., TN>> #11812 Ability to nest Row[N] expressions as SelectField<Record[N]>
Add <U> Field<T>.convert(Binding<T, U>), convert(Converter<T, U>), convert(Class<U>, Function<T, U>, Function<U, T>) #7100 Ability to create ad-hoc conversions directly on projected fields, including the above Row[N] types

It is now trivial to see that all of the current convenience mapping on ResultQuery (fetchXYZ()) and Result (intoXYZ()) can be achieved via a new set of composable Collector utilities, which could be placed in the new org.jooq.Records, which also hosts some of the ad-hoc conversion utilities.

For example:

public static final <K, V, R extends Record2<K, V>> Collector<R, ?, Map<K, V>> toMap() {
    return toMap(Record2::value1, Record2::value2);
}

public static final <K, R extends Record> Collector<R, ?, Map<K, R>> toMap(Function<? super R, ? extends K> keyMapper) {
    return toMap(keyMapper, r -> r);
}

public static final <K, V, R extends Record> Collector<R, ?, Map<K, V>> toMap(
    Function<? super R, ? extends K> keyMapper,
    Function<? super R, ? extends V> valueMapper
) {
    return Collectors.toMap(
        keyMapper,
        valueMapper,
        (k1, k2) -> {
            throw new InvalidResultException("Key " + k1 + " is not unique in Result");
        },
        LinkedHashMap::new
    );
}

These are just simple utilities built on top of JDK Collectors::toMap with the following properties:

Some capture <K, V, R extends Record2<K, V>> types to avoid repetition of field expressions (see example below)
The InvalidResultException semantics is maintained (similar to the JDK's default Collectors::toMap option of disallowing merging conflicting values)
The result is a LinkedHashMap, which preserves any SQL ORDER BY semantics

It's not more than that. An example:

record Book(int id, String title) {}
record Author(int id, String firstName, String lastName) {}

Map<Book, Author> books =
create().select(
            row(
                T_BOOK.ID,
                T_BOOK.TITLE
            ).mapping(Book::new),
            row(
                T_BOOK.author().ID,
                T_BOOK.author().FIRST_NAME,
                T_BOOK.author().LAST_NAME
            ).mapping(Author::new)
        )
        .from(T_BOOK)
        .orderBy(T_BOOK.ID)
        .collect(Records.toMap());

All of the types and mappings are declared only once. There is no more need to re-declare columns as Map keys and values outside of the SELECT clause.

One additional benefit of using collect() is that we can very easily skip creating a ResultImpl as an intermediate container of records.

All of these ideas also work very well with the current working drafts of creating nested collections (ARRAY or MULTISET)

functionality as Collector

lukaseder · 2021-05-10T20:18:04Z

The naming conflicts with Collectors.toList etc. are unfortunate. As it is right now, users can static import only one or the other.

lukaseder · 2021-05-10T20:20:10Z

Also, when someone reads the code that has collect(toList()), it may be confusing to immediately understand whether the result is a List<Record1<T>> or a List<T>

lukaseder · 2021-05-17T08:25:48Z

I'll mark this as an incompatible change (of behaviour), because if we re-use the new Collector API via ResultQuery.collect(), then we will no longer go via the ExecuteListener::resultStart and ::resultEnd lifecycle events (see #11871)

lukaseder · 2021-05-17T08:48:00Z

Or even better, we can be backwards compatible by creating the intermediary Result only in the presence of ExecuteListener instances, and not otherwise, making this configurable: #11892

New users will then have the performance benefit, and power users who require an ExecuteListener will have the current penalty, but they can opt out of creating intermediary Result instances explicitly.

- In Result.intoSet() - In Result.getValues() - In ResultQuery.fetch() - In ResultQuery.fetchSet()

This includes: - [#3619] Result.intoGroups() and intoMap() do not run through RecordListener

lukaseder added T: Enhancement C: Functionality P: Medium E: All Editions labels Sep 27, 2019

lukaseder added this to the Version 3.13.0 milestone Sep 27, 2019

Mordavolt mentioned this issue Sep 28, 2019

SELECT clause from mapped classes PicnicSupermarket/jolo#3

Closed

This was referenced Oct 2, 2019

Add fetchGroups() also on Cursor #6046

Closed

Add ResultQuery.fetchOneGroup(), fetchOptionalGroup(), fetchSingleGroup() #7208

Closed

lukaseder modified the milestones: Version 3.13.0, Version 3.14.0 Jan 20, 2020

lukaseder modified the milestones: Version 3.14.0, Version 3.15.0 Aug 11, 2020

lukaseder mentioned this issue Oct 22, 2020

Remove @Nullable annotation from ResultQuery.fetchSingleInto(Class) and Record.into(Class) #10774

Closed

lukaseder added this to To do in 3.15 Other improvements via automation May 10, 2021

lukaseder changed the title ~~Implement more reusable ResultQuery.fetch() methods~~ Offer Result.intoMap(), intoGroups(), intoSet() functionality as Collector May 10, 2021

lukaseder added a commit that referenced this issue May 10, 2021

[#9288] Offer Result.intoMap(), intoGroups(), intoSet()

b50ec71

functionality as Collector

lukaseder added a commit that referenced this issue May 10, 2021

[#9288] Make Mappable internal API for now

b35c11d

lukaseder added a commit that referenced this issue May 10, 2021

[#9288] Add some Javadoc to hint to the new functionality

2d3c6a8

lukaseder mentioned this issue May 11, 2021

Fetched result is not logged when using collect(), only with fetch() #10019

Closed

lukaseder added a commit that referenced this issue May 17, 2021

[#9288] Avoid naming conflicts with j.u.s.Collectors

02296fc

lukaseder added the T: Incompatible change label May 17, 2021

lukaseder mentioned this issue May 17, 2021

Add Settings.fetchIntermediateResult #11892

Closed

5 tasks

lukaseder removed the T: Incompatible change label May 17, 2021

lukaseder added a commit that referenced this issue May 17, 2021

[#9288] Avoid naming conflicts with j.u.s.Collectors

8fcb14e

lukaseder added a commit that referenced this issue May 17, 2021

[#9288] More refactorings to reuse collect()

bb3f3e3

- In Result.intoSet() - In Result.getValues() - In ResultQuery.fetch() - In ResultQuery.fetchSet()

lukaseder added a commit that referenced this issue May 17, 2021

[#9288] Restore IllegalArgumentException

5024f76

lukaseder mentioned this issue May 18, 2021

java.lang.ClassCastException: class java.lang.Class cannot be cast to class java.lang.reflect.ParameterizedType when custom data type extends List #10910

Closed

lukaseder added the T: Incompatible change label May 19, 2021

This was referenced May 19, 2021

Result.intoGroups() and intoMap() do not run through RecordListener #3619

Closed

Change Result.intoMap(Field[], Class) to return Map<Record, E> instead of Map<List<?>, E> #2601

Closed

Implement more optimal ResultQuery.forEach() method #11901

Closed

lukaseder added a commit that referenced this issue May 19, 2021

[#9288] Added mappers for int[], String[], Name[], Field<?>[]

8d39ea9

This includes: - [#3619] Result.intoGroups() and intoMap() do not run through RecordListener

lukaseder added a commit that referenced this issue May 19, 2021

[#9288] Added mappers for int[], String[], Name[], Field<?>[]

c93903f

lukaseder added a commit that referenced this issue May 20, 2021

[#9288] Added Records.intoArray() collectors

541a768

lukaseder added the R: Fixed label May 26, 2021

lukaseder closed this as completed May 26, 2021

3.15 Other improvements automation moved this from To do to Done May 26, 2021

lukaseder mentioned this issue Jun 7, 2021

Add some static RecordMapper construction methods for common use-cases #6326

Closed

lukaseder mentioned this issue Oct 15, 2021

Question regarding best practice for mapping optional nested records via row().mapping() #12512

Closed

lukaseder mentioned this issue Mar 21, 2022

Let fetchGroups() and fetchMap() return ResultMap #7763

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Offer Result.intoMap(), intoGroups(), intoSet() functionality as Collector #9288

Offer Result.intoMap(), intoGroups(), intoSet() functionality as Collector #9288

lukaseder commented Sep 27, 2019 •

edited

knutwannheden commented Sep 27, 2019

lukaseder commented Sep 27, 2019

knutwannheden commented Oct 19, 2019

lukaseder commented Oct 21, 2019

knutwannheden commented Oct 21, 2019

lukaseder commented Oct 22, 2019

lukaseder commented May 10, 2021

lukaseder commented May 10, 2021

lukaseder commented May 10, 2021

lukaseder commented May 17, 2021

lukaseder commented May 17, 2021

Offer Result.intoMap(), intoGroups(), intoSet() functionality as Collector #9288

Offer Result.intoMap(), intoGroups(), intoSet() functionality as Collector #9288

Comments

lukaseder commented Sep 27, 2019 • edited

Tasks

Interactions with ExecuteListener

knutwannheden commented Sep 27, 2019

lukaseder commented Sep 27, 2019

Summary

knutwannheden commented Oct 19, 2019

lukaseder commented Oct 21, 2019

knutwannheden commented Oct 21, 2019

lukaseder commented Oct 22, 2019

lukaseder commented May 10, 2021

lukaseder commented May 10, 2021

lukaseder commented May 10, 2021

lukaseder commented May 17, 2021

lukaseder commented May 17, 2021

lukaseder commented Sep 27, 2019 •

edited

Interactions with `ExecuteListener`