Support real life SQL #21

niner · 2018-08-21T07:41:45Z

A lesson taught by all existing attempts at creating ORMs (in any language) but one that has yet to be learned is that while most SQL statements are rather simple, there will be use cases that require some real SQL mastery and if the ORM does not allow for the latter, it's value is greatly diminished.

It seems like every ORM starts out with the simple stuff. How can we make fetching one row from a table simple? How do we allow for boolean expressions in the where clause? Let's make joining simple tables by id key trivial.

These (and a couple more) are the things that ORMs make really simple but then real users come along and ask for more and more complicated things which then get bolted on top of the simple mechanisms. And suddenly you reach the point, where writing the SQL query is the easy part, but getting your ORM to generate that query starts getting really hard or downright impossible. Then you end up with an application that's a crude mixture of ORM code and hand crafted SQL statements combining the disadvantages of both.

The conclusion is that for the API design one should start with the absolute worst and most complicated SQL statement imaginable and find an API that allows one to express this. This statement should at least contain:

Common Table Expressions (preferably recursive)
Window Functions
Correlated subqueries in the select clause
Subqueries in the from clause
(Nested) calls to SQL functions with calling conventions that do not like C (i.e. extract(hour from some_timestamp_column)
Subqueries as arguments to function calls
Unions
Type coercion ("foo::int" or "interval '1 day'")
Grouping and ordering
Special operators like #> (Get JSON object at specified path in PostgreSQL)
Conditionals (CASE foo WHEN bar THEN baz ELSE qux END)

I'm sure there are a couple more interesting aspects of SQL, but this should get you started. They are all features that are in use in our code base. The trick will be to make these possible while keeping the trivial cases simple.

To finish, here's an example of a real life query from our production code that DBIx::Class is completely unequipped to handle:

with product_active(product_id, active, changetime, active_end) as (
    select product_id,
        active,
        changetime,
        lead(product_active_history.changetime, 1, now()::timestamp without time zone)
            OVER (
                PARTITION BY product_active_history.product_id
                ORDER BY product_active_history.changetime
            )
            as active_end
    from product_active_history
        join products on product_active_history.product_id = products.id
        join customers on products.customer_id = customers.id
    where products.article_id = 62
        and country_id in (1, 2, 3, 4)
        and products.customer_id not in (885, 840, 246, 362, 3233, 6378)
    order by product_active_history.product_id, product_active_history.changetime
),
valuation_date as (
    select (generate_series('2000-01-01'::date, now(), interval '1 year'))::date as day
)
select *,
    100.0 * (cancelled - lag(cancelled) over (order by day)) / online as cancelled_percent,
    100.0 * (redesign - lag(redesign) over (order by day)) / online as redesign_percent
from (
    select
        day,
        count(nullif(product_active.active_end > valuation_date.day, false)) as finished,
        count(nullif(product_active.active_end > valuation_date.day and product_active.active, false)) as online,
        count(nullif(product_active.active = false and product_active.active_end > valuation_date.day, false)) as offline,
        count(nullif(product_active.active = false and not exists (
            select * from products successor where successor.predecessor_id = product_active.product_id
        ) and product_active.active_end > valuation_date.day, false)) as cancelled,
        count(nullif(
            product_active.active = false
                and exists (select * from products successor where successor.predecessor_id = product_active.product_id)
                and product_active.active_end > valuation_date.day,
            false)) as redesign
    from
        valuation_date
        join product_active on product_active.changetime <= valuation_date.day
    group by day
) as data
order by day

The text was updated successfully, but these errors were encountered:

FCO · 2018-08-21T12:29:02Z

I think if we find a generic and composable way to represent each SQL feature it will be possible to create large and complex query.

So, I’ll put it here some thoughts about some of the “special” features you used on your example. Not all at once. And please comment every one you’d like to!

FCO · 2018-08-21T12:29:31Z

I don’t know if it is possible but the way I think to do something like this

WITH RECURSIVE t(n) AS (
    VALUES (1)
  UNION ALL
    SELECT n+1 FROM t WHERE n < 100
)
SELECT sum(n) FROM t;

Is with something like this

(Red.select({ 1 }), * + 1 ... * < 100).sum

FCO · 2018-08-21T13:33:50Z

WITH RECURSIVE included_parts(sub_part, part, quantity) AS (
    SELECT sub_part, part, quantity FROM parts WHERE part = 'our_product'
  UNION ALL
    SELECT p.sub_part, p.part, p.quantity
    FROM included_parts pr, parts p
    WHERE p.part = pr.sub_part
  )
SELECT sub_part, SUM(quantity) as total_quantity
FROM included_parts
GROUP BY sub_part

I know, it’s not complete... and I don’t know if it is possible (yet). But the way I’m thinking of taking is something like:

(IncludedParts.where({ .part eq “our_product” }), { .^where: .part == IncludedParts.sub-part } ... *).classify: *.sub-part, :reduce-field{ .quantity.sum }

FCO · 2018-08-24T13:35:40Z

I was thinking about that and Im starting to disagree...
The way Im thinking now is that the Red user should be able to do with Red any thing it could with local data, what SQL would it generate isn't a problem. What do you guys think?
I mean: we should map language syntax to sql, and not sql tho language syntax...

MattOates · 2018-09-03T22:47:11Z

Having recently spent a week fighting hand and fist against SQLAlchemy to get it to just produce a union query with selects from subqueries with aggregate and window functions, that took me ten minutes to actually write by hand. You want to be able to go from SQL into language syntax thats like SQL, as well as language syntax thats idiomatic to whatever crazy rendered SQL that can get the job done. Personally I am a fan of the escape hatch abstraction. Something that's very thinly abstracting the SQL so you avoid vendor lock in, but close enough I can translate almost directly. Might not have to be related to the ORM really, just work relatively nicely with SQL generation in concert with the ORM. So perhaps a subquery is from an ORM expression, the other is some vendor specific directly optimised by hand query close to raw SQL, and the two are nicely combined to be rendered together with a parent query like object referencing either as if it was more ORM like.

FCO · 2018-12-19T20:47:44Z

I think Red is tracking a way to transform perl6 code to SQL, something like:https://github.com/FCO/Red/wiki/Better-map-grep and https://gist.github.com/FCO/87831dd190e1bb12adb138ce866acb73

FCO added help wanted Extra attention is needed question Further information is requested labels Aug 21, 2018

This was referenced Jan 4, 2019

Support for 'in' expressions in generated SQL #87

Closed

Support for infix '&&' and '||' in SQL generation #86

Closed

Support for union of resultseqs in SQL generation. #88

Closed

FCO closed this as completed Feb 25, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support real life SQL #21

Support real life SQL #21

niner commented Aug 21, 2018

FCO commented Aug 21, 2018 •

edited

Loading

FCO commented Aug 21, 2018 •

edited

Loading

FCO commented Aug 21, 2018 •

edited

Loading

FCO commented Aug 24, 2018 •

edited

Loading

MattOates commented Sep 3, 2018 •

edited

Loading

FCO commented Dec 19, 2018

Support real life SQL #21

Support real life SQL #21

Comments

niner commented Aug 21, 2018

FCO commented Aug 21, 2018 • edited Loading

FCO commented Aug 21, 2018 • edited Loading

FCO commented Aug 21, 2018 • edited Loading

FCO commented Aug 24, 2018 • edited Loading

MattOates commented Sep 3, 2018 • edited Loading

FCO commented Dec 19, 2018

FCO commented Aug 21, 2018 •

edited

Loading

FCO commented Aug 21, 2018 •

edited

Loading

FCO commented Aug 21, 2018 •

edited

Loading

FCO commented Aug 24, 2018 •

edited

Loading

MattOates commented Sep 3, 2018 •

edited

Loading