Performance of computed columns in connections #265
For the query:

we could potentially solve it with a single SQL statement along the lines of:

```sql
select
  to_json(__local_0__) as value,
  json_build_object(
    'summary',
    to_json("perf"."post_summary"(__local_0__))
  ) as computed_value
from "perf"."post" as __local_0__
where "id" is not null
order by "id" using <
```

and then combine `value`/`computed_value` with `Object.assign` or similar. You could go further with explicit column lists (like what's required by #151), minimising encoding of data that isn't used, e.g. something like:

```sql
select
  json_build_object(
    'id',
    to_json(__local_0__."id"),
    'body',
    to_json(__local_0__."body"),
    'summary',
    to_json("perf"."post_summary"(__local_0__))
  ) as computed_value
from "perf"."post" as __local_0__
where "id" is not null
order by "id" using <
```

(I'm not sure if the […])
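The "combine with `Object.assign`" step could look roughly like this (a sketch with made-up row shapes, not code from the actual project):

```javascript
// Hypothetical rows in the shape the combined query above would return:
// each row carries `value` (the plain table columns) and
// `computed_value` (the computed columns) as separate JSON objects.
const rows = [
  { value: { id: 1, body: 'Hello world' }, computed_value: { summary: 'Hello' } },
  { value: { id: 2, body: 'Another post' }, computed_value: { summary: 'Another' } },
]

// Merge each pair into a single record, so computed columns look just
// like ordinary columns to everything downstream.
const merged = rows.map(row =>
  Object.assign({}, row.value, row.computed_value)
)
```

Computed keys are applied last, so a computed column would shadow a plain column of the same name, which is presumably the behaviour we'd want.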
Also, if we were to do this we'd be able to sort by computed columns as requested in #262 by @Valorize:

```sql
select
  json_build_object(
    'id',
    to_json(__local_0__."id"),
    'body',
    to_json(__local_0__."body"),
    'summary',
    to_json("perf"."post_summary"(__local_0__))
  ) as computed_value
from "perf"."post" as __local_0__
where "id" is not null
order by "perf"."post_summary"(__local_0__) using <
```

(This would only work for computed columns that take no additional arguments, as the ones that take arguments cannot be expressed as static strings in our […].)
So I spent an hour digging through how PostGraphQL works; I thought it might be useful to share some of my scattered learnings. I'm a GraphQL newbie, so apologies if some of this is obvious.

We start in src/postgraphql/schema/createPostGraphQLSchema, which runs the introspection query against the PG database and then passes its learnings on to createGqlSchema. Particularly of interest for this specific issue (#265) in this file is the […].

The root of the schema generation is src/graphql/schema/createGqlSchema, which recurses through and creates everything that's necessary based on the learnings from the PostgreSQL introspection. It defines the query root node. This then loops through, creating the type for each "collection" (table) via […].

If you spit out […] you get something like:

```js
{
  values: [
    {value: {id: 1, field: 'value', otherField: 'otherValue'}, cursor: [1]},
    {value: {...}, cursor: ...},
    ...
  ],
  hasNextPage: function...,
  hasPreviousPage: function...,
}
```

where the `value` objects are the […]. It's later that this data is fed back into the resolver for the computed columns (via the […]).

The computed columns are effectively added here: https://github.com/calebmer/postgraphql/blob/c43ab8d852b4f83704571287c64662b1a3e6d915/src/graphql/schema/collection/getCollectionGqlType.ts#L81-L84; the columns are fetched later on in that file, here: https://github.com/calebmer/postgraphql/blob/c43ab8d852b4f83704571287c64662b1a3e6d915/src/graphql/schema/collection/getCollectionGqlType.ts#L108
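For concreteness, the intermediate page shape above could be consumed by a connection resolver along these lines (an illustrative sketch only; the field names `node`/`cursor` follow the usual GraphQL connection convention, not the actual PostGraphQL internals):

```javascript
// A hypothetical page object in the shape shown above.
const page = {
  values: [
    { value: { id: 1, field: 'value' }, cursor: [1] },
    { value: { id: 2, field: 'other' }, cursor: [2] },
  ],
  hasNextPage: () => false,
  hasPreviousPage: () => false,
}

// Turn the page into GraphQL-style edges: each edge pairs a node
// (the raw row value) with the cursor used for pagination.
const edges = page.values.map(({ value, cursor }) => ({
  node: value,
  cursor,
}))
```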
What we optimally want to do is find the list of all the fields that are required for a particular row/collection and then run them all via one query, as above. It's not yet clear to me how to go about doing that.
@benjie thanks for sharing your learnings! This is very helpful to me.
Exactly. This has been on my mind for a while, because it also means we could do away with […].
Another less efficient but easier to implement option would be to split the query @benjie wrote into two queries:

```sql
select
  to_json(__local_0__) as value
from "perf"."post" as __local_0__
where "id" is not null
order by "id" using <
```

Then once we get that back:

```sql
select
  json_build_object(
    'summary',
    to_json("perf"."post_summary"(__local_0__))
  ) as computed_value
from "perf"."post" as __local_0__
where "id" in ($1)
order by "id" using <
```

where `$1` is the set of ids returned by the first query.

Not as efficient as the optimal solution, but more efficient than what we have now. We should at least be able to optimize the most common case (returning a scalar from a procedure), and I have a PR that I've been sitting on which should make this easier to implement. I'll see if I can clean it up and send it soon.
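The two-phase approach could be glued together along these lines (a sketch with invented names; the real implementation would run actual SQL queries instead of using inline data):

```javascript
// Results of the hypothetical first query (plain row values) and
// second query (computed columns for the same ids). Because both
// queries end with `order by "id" using <`, the result sets line up
// row-for-row and can be zipped positionally.
const values = [
  { id: 1, body: 'first post' },
  { id: 2, body: 'second post' },
]
const computedValues = [
  { summary: 'first' },
  { summary: 'second' },
]

function mergeByPosition(values, computedValues) {
  // Zip the two ordered result sets and merge each pair into one record.
  return values.map((value, i) =>
    Object.assign({}, value, computedValues[i])
  )
}

const rows = mergeByPosition(values, computedValues)
```

This is two round trips instead of one, but still O(1) queries per connection rather than one query per row per computed column.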
Excellent feedback @calebmer, thanks for helping us to understand. Regarding the […]: my API has some large JSON objects and arrays, so your suggested modification, which doesn't pipe the fetched data straight back into Postgres again, would be considerably better than the current solution 👍. Some of my queries take 800ms+ locally at the moment; this should reduce their duration significantly, not least because it goes from O(N) to O(1) queries.
I would be super happy with any performance boost. My requests currently take about 1360ms for 20 items with some other nested items.
I believe I've implemented the optimal solution in #340; there's still work to be done before it can be merged, but the foundations are mostly there.
@Valorize Are you able to try my latest stuff in #342 to see if it solves your 1360ms issue? Check out master, install everything, then do […]. Then […]. You can then repeat the process with the […].
Fixed in PostGraphQL v4 👍 |
I am lacking the proper terminology to express the problem correctly, but here you go.

For every row in a table the computed-column procedure is executed, hence every additional row contributes significantly to the execution time of the query. I am just guessing here, but could we maybe use `with` queries (CTEs) to 'inline' the queries 🙈

You can find a 'test case' here: https://github.com/ferdinandsalis/computed-columns-perf

@calebmer please let me know if you were expecting something else for the issue and the test case.
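The `with`-query idea might look something like this rough sketch (based on the queries discussed elsewhere in this thread, not tested against the project; note that `perf.post_summary` expects a `perf.post` composite value, so a cast from the CTE's row type may be required):

```sql
-- Sketch only: scan the base table once in a CTE, then produce both
-- the plain columns and the computed columns from that one scan.
with post_rows as (
  select * from "perf"."post" where "id" is not null
)
select
  to_json(post_rows) as value,
  json_build_object(
    'summary',
    to_json("perf"."post_summary"(post_rows::"perf"."post"))
  ) as computed_value
from post_rows
order by "id" using <
```

Whether this actually reduces the per-row function calls compared with calling the procedure directly in the select list is an open question; the main win would be avoiding one extra round trip per row.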