Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Computed overload with "returning" #1473

Merged

Conversation

wolfgangwalther
Copy link
Member

While working on #1464 I understood the codebase a bit more. Even though mutate queries do have a similiar query to the RPC regarding the pgrst_source CTE, overloaded computed columns work there properly.

While this is the basic structure of an RPC query:

WITH pg_source AS (
  SELECT * FROM search(1)
)
SELECT pg_source.id, pg_source.computed_overload FROM pg_source;

A mutation query looks like this

WITH pg_source AS (
  MUTATE anything RETURNING anything.id, anything.computed_overload
)
SELECT pg_source.id, pg_source.computed_overload FROM pg_source;

Note that the outer select is created by the {selectQuery} - the same as in the RPC case. It's just that the column names - including the computed one - are repeated on the inner query (RETURNING). So while on the inner query anything.computed_overload is really the function call, for the outer select it's a "real" column now.

Compared to #1464 there is now a solution that is a lot easier: Just use the same code that generates the RETURNING column names and put those into the inner query for RPC calls, so that the result is:

WITH pg_source AS (
  SELECT search.id, search.computed_overload FROM search(1)
)
SELECT pg_source.id, pg_source.computed_overload FROM pg_source;

The first commits in here are just the tests taken from the other PR. The last commit implements that. Probably needs some improvements here and there, but it works great.

@wolfgangwalther
Copy link
Member Author

wolfgangwalther commented Apr 6, 2020

Just a note on this:
I think my current way of getting "returnings" separately in App.hs and in mutateQuery is not the nicest way to do it. I think it would be way better to actually compose those columns in readRequest and provide them as part of the ReadRequest. I tried that at first, but failed, because of my limited Haskell experience ;)

That would actually make sense, because those columns are exactly what will be read from the query, no matter whether its read, write or proc. And the readRequest will be available in every case anyways: in mutateRequest (and therefore in mutateRequestToQuery), in readRequestToQuery and possibly in requestToCallProcQuery. Although the latter should probably move "out" to where readSqlParts and mutateSqlParts are and have a similiar thing like invokeSqlParts or something.

Doing this would also allow to give the read request a similiar structure: so just have a SELECT returnings WHERE conditions in the source CTE and then have the same {selectQuery} part on the outer part of th query. This would be very similar to the write and proc cases then and could lead to more similarity between all those queries and possibly some reduction of code redundancy. Maybe the three functions in Statements.hs could at one point be joined together, because the actual statement would be very similar for all cases.

Just some thoughts on how to maybe make all those cases more similiar to each other, as that difference between those queries was the source for the problem with the computed columns in the first place.

@steve-chavez
Copy link
Member

@wolfgangwalther Nice and succinct fix! Also, congrats on learning Haskell for making this PR!

Your ideas about simplifying Statements.hs make sense and less code is always good. I also was thinking that the call proc query should be in a type similar to ReadQuery/MutateQuery. That way the returningCols wouldn't have to go out of the DbRequestBuilder module.

However your change is pretty minimal so I think a refactor could be done in a next PR.

Could you add an entry to the CHANGELOG? I'll merge after that.

Really appreciate the wording corrections btw :)

@steve-chavez steve-chavez merged commit c524531 into PostgREST:master Apr 6, 2020
@ruslantalpa
Copy link
Contributor

@wolfgangwalther @steve-chavez the absence of CTEs (as much as possible) in read request was by design, a lot of the time they introduce performance problems (at least in older versions of PG). Their presence (CTEs) in the write requests are because of a "need", not "want", so trying to make the "read" query more like "write" query is the opposite of the original intention.

The primary objective is to have performant queries, nice haskell code structure is secondary to that.
I am not saying your suggestions are wrong, i am saying the exploration in that direction should start with rigorously testing the performance of the new query structure and compare it to the old one. Only after that step has good results (at least the same performance as the old one) should work on the haskell side start.

@steve-chavez
Copy link
Member

The https://github.com/PostgREST/postgrest/blob/master/test/QueryCost.hs test suite is already in place for testing the query cost. I don't think a major change was being suggested here but some tests could definitely be added.

@ruslantalpa
Copy link
Contributor

ruslantalpa commented Apr 7, 2020 via email

@wolfgangwalther wolfgangwalther deleted the computed-overload-returning branch April 10, 2020 08:57
@wolfgangwalther
Copy link
Member Author

wolfgangwalther commented Apr 10, 2020

@ruslantalpa Thanks for the hint about CTEs and performance. I looked that up a bit and learned a few things about Postgres CTEs being "optimization fences".

The gist is: Before postgres 12 conditions couldn't be pushed into / out of CTEs. CTEs act more or less like a temp table. Postgres 12 introduces WITH xxx AS [NOT] MATERIALIZED to control this.

Refactoring the statements would have to take this into account, as this could potentially a big performance kille, if done wrong. Although the current solution is not affected by this.

vwwv pushed a commit to on-ramp/postgrest that referenced this pull request Apr 20, 2020
* fix some typos and spelling

* fix pg_source CTE name should be prefixed with pgrst_

* added tests for overloaded computed columns on patch calls
monacoremo pushed a commit to monacoremo/postgrest that referenced this pull request Jul 17, 2021
* fix some typos and spelling

* fix pg_source CTE name should be prefixed with pgrst_

* added tests for overloaded computed columns on patch calls
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants