WIP: Reduce the number of SQL queries #342

benjie · 2017-02-04T11:54:16Z

calculate hasNextPage/hasPreviousPage in the same query as fetching the data
remove a source of N+1 queries relating to relations (foo.barsByBazId)

TODO:

merge master
try and remove additional parameters where possible (requires @calebmer's help)
look into using CTEs to bypass performance issues, a la fix(postgres): use CTEs for paginators #380 #396

KNOWN ISSUES:

I've been informed by @cusxio that the following query returns null, but that including the username field fixes this - sounds like I need to add username to the list of implicit fields on the search endpoints:

query {
  userByUsername(username:"cusx") {
    id
  }
}

Workaround:

query {
  userByUsername(username:"cusx") {
    id
    username
  }
}

Fixes #219
Fixes #265

valoricDe · 2017-02-28T10:02:07Z

I was too eager. Throwing out the externalFieldNameDependencies meant that the sub-sub-queries were no longer fetched. OK, so we should rather account for classicIds with

query: sql.query`${sql.identifier(aliasIdentifier)}.${sql.identifier(externalFieldName === 'row_id' ? 'id' : externalFieldName)}`,

But that's just a hack. @benjie do you have a better idea?

valoricDe · 2017-02-28T10:26:25Z

Subjective Evaluation: Request time cut in half 💛

benjie · 2017-02-28T11:06:39Z

externalFieldName is the name of the PostgreSQL column (without the id -> rowId shenanigans) that this field represents.

externalFieldNameDependencies is a list of external field names that we need to fetch in order for the system to work, because if we don't fetch them then we cannot generate the required field. Examples are the __id computed field (which is still __id because I've not merged the latest master yet) and relations that need to know the row's fooId so they can look up barsByFooId. When sorting, I think we also have to add the sort criteria to the list of fields to fetch, otherwise the logic in the paginators falls down.
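
As a rough illustration of the idea described above, the set of columns to select could be assembled from the requested fields, their dependencies, and the sort criteria. This is only a sketch under assumptions: the FieldSpec shape and the columnsToFetch helper are hypothetical names made up for this example and are not taken from the PostGraphQL codebase.

```typescript
// Hypothetical description of a GraphQL field; not PostGraphQL's real type.
interface FieldSpec {
  name: string;
  // Column this field maps to directly, if any (computed fields have none).
  externalFieldName?: string;
  // Extra columns that must be fetched for this field to be generated.
  externalFieldNameDependencies?: string[];
}

// Collect every column the query must select: directly requested columns,
// their dependencies, and any columns used for ordering.
function columnsToFetch(fields: FieldSpec[], orderBy: string[] = []): string[] {
  const columns = new Set<string>();
  fields.forEach((field) => {
    if (field.externalFieldName !== undefined) {
      columns.add(field.externalFieldName);
    }
    (field.externalFieldNameDependencies || []).forEach((dep) => {
      columns.add(dep);
    });
  });
  orderBy.forEach((column) => {
    columns.add(column);
  });
  // A Set keeps insertion order and drops duplicates.
  return Array.from(columns);
}

// Example: a plain column, a computed __id field that depends on "id",
// and an ordering on "created_at".
const exampleColumns = columnsToFetch(
  [
    { name: "name", externalFieldName: "name" },
    { name: "__id", externalFieldNameDependencies: ["id"] },
  ],
  ["created_at"],
);
// exampleColumns is ["name", "id", "created_at"]
```

Using a Set means a column that is both requested and needed as a dependency is only selected once.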

benjie · 2017-02-28T11:08:05Z

I'm not sure how best to deal with rowId and I'm sure I'm getting various things from places I oughtn't since I just hacked it out as a proof of concept, so I'm awaiting Caleb's ideas on this point.

benjie · 2017-03-01T13:45:18Z

Just discovered a bug in this relating to one-to-many relations on sub-entries on mutation payloads.

benjie · 2017-03-01T14:35:23Z

Found it 👍

0x6368656174 · 2017-03-29T03:54:37Z

On this branch, GRANT (column) ON [TABLE] does not work, unlike on the bg/performance-experiment branch. And I really need support for GRANT (column) ON [TABLE] =(((

benjie · 2017-03-29T08:12:56Z

@0x6368656174 Do you know why it doesn't work on this branch? Given it's based on the other (which didn't pass tests), I'm theorising it's because this branch pulls in the other required columns that are necessary for PostGraphQL internals, e.g. if you order by a field, that field gets implicitly added to the selection; similarly if you filter by a field.

0x6368656174 · 2017-03-29T23:13:17Z

@benjie if I execute the query:

{
  allUsers {
    nodes {
      id
      name
    }
  }
}

On this branch, the following SQL query is generated:

select coalesce(json_agg(__local_0__), '[]'::json) as "rows", ( select count(*) from ( select * from "public"."users" as __local_1__ where "id" is not null and true ) as __local_2__ ) ::integer as "totalCount", ( select count(*) from ( select * from "public"."users" as __local_1__ where "id" is not null and true ) as __local_3__ where true ) > count(*) as "hasNextPage", false as "hasPreviousPage" from ( select json_build_object( 'value', json_build_object('id', __local_4__."id", 'name', __local_4__."name"), 'cursor', json_build_array("id") ) as __local_0__ from ( select * from "public"."users" as __local_1__ where "id" is not null and true ) as __local_4__ where true and true order by "id" using < limit all ) as __local_5__

And it should be:

select coalesce(json_agg(__local_0__), '[]'::json) as "rows", ( select count(*) from ( select "id", "name" from "public"."users" as __local_1__ where "id" is not null and true ) as __local_2__ ) ::integer as "totalCount", ( select count(*) from ( select "id", "name" from "public"."users" as __local_1__ where "id" is not null and true ) as __local_3__ where true ) > count(*) as "hasNextPage", false as "hasPreviousPage" from ( select json_build_object( 'value', json_build_object('id', __local_4__."id", 'name', __local_4__."name"), 'cursor', json_build_array("id") ) as __local_0__ from ( select "id", "name" from "public"."users" as __local_1__ where "id" is not null and true ) as __local_4__ where true and true order by "id" using < limit all ) as __local_5__;

Plus, I do not need totalCount, hasNextPage or hasPreviousPage, but they are still included in the SQL query.

benjie · 2017-03-30T08:35:59Z

Formatted, so I can compare:

SELECT coalesce(json_agg(__local_0__), '[]'::json) AS "rows",

  (SELECT count(*)
   FROM
     (SELECT *
      FROM "public"."users" AS __local_1__
      WHERE "id" IS NOT NULL
        AND TRUE ) AS __local_2__) ::integer AS "totalCount",

  (SELECT count(*)
   FROM
     (SELECT *
      FROM "public"."users" AS __local_1__
      WHERE "id" IS NOT NULL
        AND TRUE ) AS __local_3__
   WHERE TRUE ) > count(*) AS "hasNextPage",
       FALSE AS "hasPreviousPage"
FROM
  (SELECT json_build_object('value', json_build_object('id', __local_4__."id", 'name', __local_4__."name"), 'cursor', json_build_array("id")) AS __local_0__
   FROM
     (SELECT *
      FROM "public"."users" AS __local_1__
      WHERE "id" IS NOT NULL
        AND TRUE ) AS __local_4__
   WHERE TRUE
     AND TRUE
   ORDER BY "id" USING <
   LIMIT ALL) AS __local_5__

SELECT coalesce(json_agg(__local_0__), '[]'::json) AS "rows",

  (SELECT count(*)
   FROM
     (SELECT "id",
             "name"
      FROM "public"."users" AS __local_1__
      WHERE "id" IS NOT NULL
        AND TRUE ) AS __local_2__) ::integer AS "totalCount",

  (SELECT count(*)
   FROM
     (SELECT "id",
             "name"
      FROM "public"."users" AS __local_1__
      WHERE "id" IS NOT NULL
        AND TRUE ) AS __local_3__
   WHERE TRUE ) > count(*) AS "hasNextPage",
       FALSE AS "hasPreviousPage"
FROM
  (SELECT json_build_object('value', json_build_object('id', __local_4__."id", 'name', __local_4__."name"), 'cursor', json_build_array("id")) AS __local_0__
   FROM
     (SELECT "id",
             "name"
      FROM "public"."users" AS __local_1__
      WHERE "id" IS NOT NULL
        AND TRUE ) AS __local_4__
   WHERE TRUE
     AND TRUE
   ORDER BY "id" USING <
   LIMIT ALL) AS __local_5__;

benjie · 2017-03-30T08:44:19Z

@0x6368656174 So you're saying that the select * in the count statements and the source table are causing you trouble.

I'm aware that we over-fetch the hasPreviousPage/etc - I ran out of time whilst writing this proof of concept.

The intention of this PR was performance, rather than excluding fields for permissions reasons, and I'm hesitant to advance the code any further knowing that we'll probably be rebuilding a fair chunk of it. That said, you're welcome to build your own branch off of this one that solves your issue - if it's simple enough then I may consider merging, but you'll have to be careful to include all required fields, not just the ones that are directly requested (because PostGraphQL requires some internally for various things: sorting, relations, etc).

Personally I don't use column-level grants on select statements, I just split the tables based on permissions boundaries into sub-tables, e.g. instead of a monolithic user table, I might have user, user_profile, user_email, user_auth, etc.

0x6368656174 · 2017-03-30T09:13:51Z

@benjie Splitting one monolithic table into several is not a way out of the problem. First, there can be too many small tables that will degrade performance. Secondly, if one column is available for two user roles, but the rest are not available, then you get a whole table with only one column.

0x6368656174 · 2017-03-30T09:17:28Z

@benjie The easiest solution is to replace select * with select "id", and everything will work, because "id" in PostGraphQL should always be available to the role if at least one other column is available to it.

benjie · 2017-03-30T10:42:39Z

Well, for the counts simply select 1 should suffice and doesn't require us to figure out the primary key(s).
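
A minimal sketch of what the select 1 suggestion could look like when building the count sub-query. The quoteIdent and buildCountQuery helpers and the __count__ alias are hypothetical names for illustration only; the real branch builds its SQL through its own sql template helpers, and whereSql is assumed to be a trusted, already-escaped fragment.

```typescript
// Quote a PostgreSQL identifier by doubling any embedded double quotes.
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Build a count query whose inner select list is the constant 1, so the
// select list does not need to name the primary key or any other column.
// ASSUMPTION: whereSql is a trusted SQL fragment, never raw user input.
function buildCountQuery(
  schema: string,
  table: string,
  whereSql: string = "true",
): string {
  const source = quoteIdent(schema) + "." + quoteIdent(table);
  return (
    "select count(*) from (select 1 from " +
    source +
    " where " +
    whereSql +
    ") as __count__"
  );
}

const countSql = buildCountQuery("public", "users", '"id" is not null');
// countSql is:
// select count(*) from (select 1 from "public"."users" where "id" is not null) as __count__
```

Note that any column referenced in the where fragment still has to be readable by the current role; only the select list is freed from naming columns.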

I don't find performance degrades too much with splitting concerns into separate tables; one-to-one indexed joins are very fast.

benjie · 2017-05-22T12:04:49Z

This code:

https://github.com/postgraphql/postgraphql/blob/bg/performance-experiment-2/src/graphql/schema/collection/createCollectionQueryFieldEntries.ts#L62

needs to specify externalFieldNameDependencies.

benjie · 2017-05-22T12:28:27Z

There's also an issue here:

https://github.com/postgraphql/postgraphql/blob/bg/performance-experiment-2/src/postgraphql/schema/procedures/createPgProcedureObjectTypeGqlFieldEntry.ts#L70

If you have a procedure which returns a row from the database, this isn't sufficient to parse it into a Map as is required by much of the postgraphql internals - it just leaves it as a vanilla JSON object. If you then try and access attributes on this, it throws an error like value.get is not a function
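
To illustrate the failure mode described above: a row that arrives as a plain JSON object has no get method, so code written against a Map breaks. The sketch below uses a hypothetical rowToMap helper to show the kind of conversion that is missing; it is not the project's own transform and only handles a flat object.

```typescript
// Convert a plain JSON row object into a Map keyed by column name.
// Hypothetical helper for illustration; nested values are left untouched.
function rowToMap(row: Record<string, unknown>): Map<string, unknown> {
  const map = new Map<string, unknown>();
  Object.keys(row).forEach((key) => {
    map.set(key, row[key]);
  });
  return map;
}

// A row as it might come back from a procedure, left as vanilla JSON.
const rawRow: Record<string, unknown> = { id: 1, name: "Alice" };

// Calling rawRow.get("id") at runtime would fail with
// "get is not a function", because plain objects have no get method.
const rowAsMap = rowToMap(rawRow);
const idValue = rowAsMap.get("id");
// idValue is 1
```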

benjie · 2017-05-22T12:36:56Z

Using fixtures.return.type.transformPgValueIntoValue seems to fix the latter.

benjie · 2017-05-22T13:20:12Z

Seems this same issue was fixed on master only a couple months ago.

24cb295

benjie · 2017-07-18T07:14:24Z

Closing in favour of V4: #506

benjie added 30 commits February 1, 2017 22:46

fix(postgres): fix processing money type

8dc0b9b

Move the common select fragment into its own file

c390862

Some type browsing within resolveInfo

cd22083

Args

b614cf3

Nearly there

6d94bf7

Make fields a map from name to SQL fragment

3a85c3b

Actually query

e4f3939

Send resolveInfo through to read

c266056

Support non-relay format

4b1208c

Desperate tweaks

b9ebaea

POPME

0ba7e1f

I can't believe it's working

8435b27

This commit doesn't exist, you're imagining things

59e05b3

Use SQL name for fields rather than GQL name

9738fcf

More tweaking

e25b80d

Typo

16a5129

Neater implementation

bc8ec2b

Working using type._typeConfig hack

e6668f3

Alias procedure columns so can call multiple times with different arg…

48865dd

…uments

Use graphql-js internal argument getter

b4c3606

Unused

8822038

Rename function

a9d0b81

Reformat

6cf1af6

Add support for InlineFragments

f6bf5f0

Safer against future refactoring

6e7819f

Pass field name through to sqlName/sqlExpression

a2c8c97

Convert paginator to use only SQL

12f9428

More familiar

47d8f9b

Beginnings of SQL-ification of PGPaginatorOrderingAttributes

e3ebd86

Move hasNextPage/hasPreviousPage into SQL

5bb3898

Handle fallback gracefully

3304343

calebmer mentioned this pull request Mar 5, 2017

Transitioning from "big framework"? e.g. Rails, Django #307

Closed

benjie mentioned this pull request Mar 5, 2017

Unexpected error behaviour, because of failing transactions #375

Closed

This was referenced Mar 16, 2017

Slow performance when using orderBy #380

Closed

fix(postgres): use CTEs for paginators #380 #396

Merged

benjie mentioned this pull request Mar 29, 2017

Support GRANT (column) ON [TABLE] #415

Closed

EyMaddis mentioned this pull request Apr 7, 2017

Allow conditions on computed properties (optimize computed property query) #430

Closed

Change credentials to same-origin

e9d8035

Copy fix from master

387d5f0

24cb295

benjie mentioned this pull request Jul 8, 2017

PostGraphQL V4 - announcing graphile-build #506

Merged

30 tasks

benjie closed this Jul 18, 2017
