Support joins with scala collections using temporary tables #799

hvesalai · 2014-05-08T09:05:24Z

Traditionally, ORMs have supported joins over external data by using the SQL IN keyword, i.e.

select * from "foo" where "bar" IN ( /* list of values */ );

Provided the values are distinct, this is equivalent to creating a temporary table and doing an inner join on that table.

create temporary table "t1" ( "bar" text ) on commit drop;
-- insert the list of values to t1
select * from "foo" natural join "t1";

While the latter admittedly has more complex SQL syntax, it has the following benefits:

it has potentially better performance
the select statement can be compiled and prepared
it is much more versatile

As to the last point, here is something you cannot do with the IN keyword.

create temporary table "t2" ( "bar" text, "zot" numeric ) on commit drop;
-- insert the list of tuples to t2
select * from "foo" natural join "t2";

I.e. selecting all the rows from foo where bar = t2.bar and zot = t2.zot.

What I'm requesting is that Slick support a nice DSL for doing a join with a Scala collection, in which the user of the API wouldn't have to handle (or even know about) the creation of the temporary table and the insertion of the values into that table.

E.g. with the current inner join syntax:

val names = List("Heikki" -> "Vesalainen", "Stefan" -> "Zeiger")
val q = persons join names on { (p, n) => p.firstName === n._1 && p.lastName === n._2 }

would result in

create temporary table "t1" ( "x1" text, "x2" text ) on commit drop;
-- insert names to t1
select * from "persons" as "p" inner join "t1" as "n" on "p"."firstName" = "n"."x1" and "p"."lastName" = "n"."x2";
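
As a rough illustration of the expansion described above, the following Scala sketch renders the two statements that would precede the select. It is not Slick API: `TempTableJoinSketch`, `createSql` and `insertSql` are hypothetical names, and the sketch assumes every column is `text` and that the insert would be run as a JDBC batch with one parameter set per tuple.

```scala
// Hypothetical helper, not part of Slick: renders the statements a
// collection join could expand to, given the arity of the tuples.
object TempTableJoinSketch {
  // Double-quote an identifier, escaping embedded double quotes.
  def quote(ident: String): String =
    "\"" + ident.replace("\"", "\"\"") + "\""

  // Columns are named x1..xN, mirroring the generated SQL in the issue.
  def columns(arity: Int): Seq[String] = (1 to arity).map(i => "x" + i)

  // Assumption: all columns are text, as in the names example.
  def createSql(table: String, arity: Int): String = {
    val cols = columns(arity).map(c => quote(c) + " text").mkString(", ")
    "create temporary table " + quote(table) + " ( " + cols + " ) on commit drop"
  }

  // One parameterised insert, meant to be executed once per tuple in a batch.
  def insertSql(table: String, arity: Int): String = {
    val placeholders = Seq.fill(arity)("?").mkString(", ")
    "insert into " + quote(table) + " values ( " + placeholders + " )"
  }
}
```

Because `on commit drop` ties the table to the transaction, all three statements (create, batched insert, select) would have to run inside the same transaction.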


cvogt · 2014-05-13T17:55:38Z

An alternative to this would be fetching all persons and doing the join in memory. Which one is more efficient depends on which collection is larger: the in-memory one or the (permanent) database table. We will probably need an API for exposing this decision to the user, not only for temporary tables but also for distributed queries joining tables from multiple databases. Another option could be an automatic decision based on profiling or heuristics.
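
For comparison, the in-memory alternative might look like the sketch below. `Person` and `joinInMemory` are hypothetical names, and the `persons` argument stands in for rows that have already been fetched from the database in full. Note that it filters by set membership, so duplicate entries in `names` do not produce duplicate rows, unlike a true inner join.

```scala
object InMemoryJoinSketch {
  // Stand-in for a row of the "persons" table.
  final case class Person(firstName: String, lastName: String)

  // Keeps the persons whose (firstName, lastName) pair occurs in `names`.
  def joinInMemory(persons: Seq[Person], names: Seq[(String, String)]): Seq[Person] = {
    val wanted = names.toSet // hash lookup instead of a nested loop
    persons.filter(p => wanted.contains((p.firstName, p.lastName)))
  }
}
```

The cost here is dominated by transferring the whole table to the application, which is why the relative sizes of the two collections matter.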

hvesalai · 2015-03-09T09:07:23Z

I would have needed this again today.

KLBonn · 2015-07-10T09:48:14Z

Me too.
For now it seems I have to do it in-memory.

Side note: I know of close to zero cases where the in-memory (application input) collection would exceed the respective database table in size. In the vast majority of cases, collection comparisons are (or should be) done on the database side.

mboogerd · 2015-08-04T06:58:54Z

@cvogt, you're of course correct that efficiency depends on the size of the Scala set vs. the table size; however, I agree with @KLBonn that in the majority of use cases the former will be smaller. In our current project we do quite a bit of micro-batching, which would be helped tremendously if this were a Slick feature!

Pyppe · 2015-08-14T10:17:02Z

👍

thirdy · 2016-04-22T08:39:46Z

Would have helped me too. Any updates on this?

francescopellegrini · 2016-05-18T15:59:42Z

👍

cvogt · 2016-05-18T21:49:11Z

maybe something @radsaggi can look at later this summer during GSOC

alexander-myltsev · 2016-06-14T13:25:11Z

👍

virusdave · 2016-10-29T02:23:12Z

👍
This is something I run into all the time as well.

kemmar · 2019-10-09T10:40:00Z

I also would love this feature

tobiatesan · 2019-12-02T13:47:01Z

Just pointing out that in the absence of this feature, people are more likely to store chunks of results inside a Scala variable, which carries a performance penalty, and potentially try to use it in an IN SET clause.

Which might then involve running into #1739

WayneWang12 · 2019-12-13T08:06:50Z

It seems like quite a good feature to me.

hvesalai · 2024-03-25T20:40:06Z

damn I've had some great ideas in the past. 10 years ago.

nafg · 2024-03-27T00:16:27Z

Couldn't it be done with a VALUES expression? At least in Postgres I'm pretty sure it could.
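
A sketch of what the VALUES approach could generate, assuming PostgreSQL's `(values ...) as alias (columns)` form. `ValuesJoinSketch` and `valuesRelation` are hypothetical names, not Slick API, and the x1..xN column naming simply follows the example in the issue description.

```scala
// Hypothetical helper: renders an inline VALUES relation with JDBC
// placeholders, usable in place of a temporary table in a join.
object ValuesJoinSketch {
  def valuesRelation(alias: String, arity: Int, rows: Int): String = {
    require(arity > 0 && rows > 0, "VALUES needs at least one row and one column")
    // One "(?, ?, ...)" group per tuple in the collection.
    val row = Seq.fill(arity)("?").mkString("(", ", ", ")")
    val body = Seq.fill(rows)(row).mkString(", ")
    // Column aliases x1..xN so the join condition can refer to them.
    val cols = (1 to arity).map(i => "\"x" + i + "\"").mkString(", ")
    "(values " + body + ") as \"" + alias + "\" (" + cols + ")"
  }
}
```

For the two-name example, `valuesRelation("n", 2, 2)` yields `(values (?, ?), (?, ?)) as "n" ("x1", "x2")`, which could replace `"t1" as "n"` in the select from the issue description. Non-text columns may need explicit casts on the placeholders, and the statement text changes with the number of rows, so it would not be prepared once and reused the way the temporary-table select can be.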


szeiger added the improvement label May 20, 2014

szeiger added this to the Future milestone May 20, 2014

cvogt added the UPVOTED label May 18, 2016

cvogt assigned radsaggi May 18, 2016

cvogt added the community label Nov 4, 2016

cvogt unassigned radsaggi Nov 4, 2016

hvesalai mentioned this issue Mar 1, 2018

tuple support for inSet #517

Open
