Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cursor-based pagination postgresql performance issue with desc order #12650

Closed
ziimakc opened this issue Apr 3, 2022 · 6 comments · Fixed by prisma/prisma-engines#3661
Closed
Labels
bug/1-unconfirmed Bug should have enough information for reproduction, but confirmation has not happened yet. domain/client Issue in the "Client" domain: Prisma Client, Prisma Studio etc. kind/bug A reported bug. topic: cursor topic: performance/queries topic: performance

Comments

@ziimakc
Copy link

ziimakc commented Apr 3, 2022

Bug description

For a simple postgresql database table test with one column id of type pk integer, filled with 1.000.000 rows desc cursor based pagination takes ~300ms while asc takes ~0.3ms.

How to reproduce

Schema

model Test {
  id Int @id @default(autoincrement())

  @@map("test")
}

ASC query

Query:

prisma.test.findMany({ take: 4, skip: 1, cursor: {  id: 500, }, orderBy: { id: 'asc', }, });

Generated SQL:

select
	"public"."test"."id"
from
	"public"."test",
	(
	select
		"public"."test"."id" as "Test_id_0"
	from
		"public"."test"
	where
		("public"."test"."id") = (500)) as "order_cmp"
where
	"public"."test"."id" >= "order_cmp"."Test_id_0"
order by
	"public"."test"."id" asc
limit 4 offset 1;

Explain analyze on this query

Limit  (cost=0.95..1.35 rows=4 width=4) (actual time=0.205..0.207 rows=4 loops=1)
  ->  Nested Loop  (cost=0.85..33021.47 rows=333333 width=4) (actual time=0.204..0.206 rows=5 loops=1)
        Join Filter: (test.id >= test_1.id)
        Rows Removed by Join Filter: 499
        ->  Index Only Scan using test_pkey on test  (cost=0.42..18019.92 rows=1000000 width=4) (actual time=0.017..0.062 rows=504 loops=1)
              Heap Fetches: 0
        ->  Materialize  (cost=0.42..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=504)
              ->  Index Only Scan using test_pkey on test test_1  (cost=0.42..1.54 rows=1 width=4) (actual time=0.005..0.005 rows=1 loops=1)
                    Index Cond: (id = 500)
                    Heap Fetches: 0
Planning Time: 0.088 ms
Execution Time: 0.223 ms

DESC query

Query:

prisma.test.findMany({ take: 4, skip: 1, cursor: {  id: 500, }, orderBy: { id: 'desc', }, });

Generated SQL:

explain analyze select
	"public"."test"."id"
from
	"public"."test",
	(
	select
		"public"."test"."id" as "Test_id_0"
	from
		"public"."test"
	where
		("public"."test"."id") = (500)) as "order_cmp"
where
	"public"."test"."id" <= "order_cmp"."Test_id_0"
order by
	"public"."test"."id" desc
limit 4 offset 1;

Explain analyze on this query

Limit  (cost=0.95..1.35 rows=4 width=4) (actual time=373.832..373.834 rows=4 loops=1)
  ->  Nested Loop  (cost=0.85..33021.47 rows=333333 width=4) (actual time=373.830..373.832 rows=5 loops=1)
        Join Filter: (test.id <= test_1.id)
        Rows Removed by Join Filter: 999500
        ->  Index Only Scan Backward using test_pkey on test  (cost=0.42..18019.92 rows=1000000 width=4) (actual time=0.015..102.618 rows=999505 loops=1)
              Heap Fetches: 0
        ->  Materialize  (cost=0.42..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=999505)
              ->  Index Only Scan using test_pkey on test test_1  (cost=0.42..1.54 rows=1 width=4) (actual time=0.007..0.008 rows=1 loops=1)
                    Index Cond: (id = 500)
                    Heap Fetches: 0
Planning Time: 0.129 ms
Execution Time: 373.863 ms

DESC query with replaced Test_id_0 to actual value:

Interesting thing if we change query little bit and replace "order_cmp"."Test_id_0" with it's value, performance gets back on track.

explain analyze select
	"public"."test"."id"
from
	"public"."test",
	(
	select
		"public"."test"."id" as "Test_id_0"
	from
		"public"."test"
	where
		("public"."test"."id") = (500)) as "order_cmp"
where
	"public"."test"."id" <= 500
order by
	"public"."test"."id" desc
limit 4 offset 1;

Explain analyze on this query

Limit  (cost=0.89..1.03 rows=4 width=4) (actual time=0.016..0.018 rows=4 loops=1)
  ->  Nested Loop  (cost=0.85..19.38 rows=507 width=4) (actual time=0.014..0.017 rows=5 loops=1)
        ->  Index Only Scan Backward using test_pkey on test  (cost=0.42..11.50 rows=507 width=4) (actual time=0.009..0.009 rows=5 loops=1)
              Index Cond: (id <= 500)
              Heap Fetches: 0
        ->  Materialize  (cost=0.42..1.55 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=5)
              ->  Index Only Scan using test_pkey on test test_1  (cost=0.42..1.54 rows=1 width=0) (actual time=0.003..0.003 rows=1 loops=1)
                    Index Cond: (id = 500)
                    Heap Fetches: 0
Planning Time: 0.113 ms
Execution Time: 0.034 ms

Expected behavior

Cursor-based pagination postgresql performance should be fast for desc order, same as asc.

Prisma information

Environment & setup

  • OS: Windows 11
  • Database: PostgreSQL 14.2
  • Node.js version: 16.14.0

Prisma Version

prisma                  : 3.11.0
@prisma/client          : 3.11.0
Current platform        : windows
Query Engine (Node-API) : libquery-engine b371888aaf8f51357c7457d836b86d12da91658b (at ..\..\node_modules\.pnpm\@prisma+engines@3.11.0-48.b371888aaf8f51357c7457d836b86d12da91658b\node_modules\@prisma\engines\query_engine-windows.dll.node)
Migration Engine        : migration-engine-cli b371888aaf8f51357c7457d836b86d12da91658b (at ..\..\node_modules\.pnpm\@prisma+engines@3.11.0-48.b371888aaf8f51357c7457d836b86d12da91658b\node_modules\@prisma\engines\migration-engine-windows.exe)
Introspection Engine    : introspection-core b371888aaf8f51357c7457d836b86d12da91658b (at ..\..\node_modules\.pnpm\@prisma+engines@3.11.0-48.b371888aaf8f51357c7457d836b86d12da91658b\node_modules\@prisma\engines\introspection-engine-windows.exe)
Format Binary           : prisma-fmt b371888aaf8f51357c7457d836b86d12da91658b (at ..\..\node_modules\.pnpm\@prisma+engines@3.11.0-48.b371888aaf8f51357c7457d836b86d12da91658b\node_modules\@prisma\engines\prisma-fmt-windows.exe)
Default Engines Hash    : b371888aaf8f51357c7457d836b86d12da91658b
Studio                  : 0.458.0
Preview Features        : interactiveTransactions
@ziimakc ziimakc added the kind/bug A reported bug. label Apr 3, 2022
@janpio janpio added domain/client Issue in the "Client" domain: Prisma Client, Prisma Studio etc. topic: performance bug/1-unconfirmed Bug should have enough information for reproduction, but confirmation has not happened yet. topic: cursor labels Apr 4, 2022
@okomarov
Copy link

okomarov commented Apr 12, 2022

This does not seem Prisma related.

There's a bunch of posts about how pagination with LIMIT and OFFSET degrades when the table grows and if changing the direction of the sort on an indexed column (which default by ASC). Suggestions include to add an ascending and a descending index, use row numbers to seek rather than to offset.

An example of such post is: https://stackoverflow.com/questions/34110504/optimize-query-with-offset-on-large-table

@ziimakc
Copy link
Author

ziimakc commented Apr 12, 2022

@okomarov cursor-based pagination doesn't suffer or degrade for large tables much because it's doesn't use OFFSET (offset 1 in example is because we use skip: 1, but it's a very small value and never be more). Index for primary key created by default. It maybe a postgres issue, but anyway prisma may do optimizations to make it better and it's possible as in last example.

@casey-chow
Copy link

I wonder what the relation between this issue and #11138 is.

@ziimakc
Copy link
Author

ziimakc commented Apr 14, 2022

@casey-chow it's the same, here is just simplified to the minimum with possible solution. One thing is that I think we also should test it without skip = 1 to make sure it has no influence.

@okomarov
Copy link

@ziimakc offset of 1 will still trigger a full sort of the table even with the cursor based pagination, or am I missing something?

@ziimakc
Copy link
Author

ziimakc commented Apr 14, 2022

@okomarov only big offset value will decrease query performance, the more offset is, the more database is forced to read. Because cursor based pagination doesn't use offset more that one it's not a problem here.

Just tested without offset 1 and results are the same, so offset is not an issue here:

explain analyze select
	"public"."test"."id"
from
	"public"."test",
	(
	select
		"public"."test"."id" as "Test_id_0"
	from
		"public"."test"
	where
		("public"."test"."id") = (500)) as "order_cmp"
where
	"public"."test"."id" <= "order_cmp"."Test_id_0"
order by
	"public"."test"."id" desc
limit 4;
Limit  (cost=0.85..1.25 rows=4 width=4) (actual time=381.888..381.891 rows=4 loops=1)
  ->  Nested Loop  (cost=0.85..33021.47 rows=333333 width=4) (actual time=381.887..381.889 rows=4 loops=1)
        Join Filter: (test.id <= test_1.id)
        Rows Removed by Join Filter: 999500
        ->  Index Only Scan Backward using test_pk on test  (cost=0.42..18019.92 rows=1000000 width=4) (actual time=0.028..113.052 rows=999504 loops=1)
              Heap Fetches: 0
        ->  Materialize  (cost=0.42..1.55 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=999504)
              ->  Index Only Scan using test_pk on test test_1  (cost=0.42..1.54 rows=1 width=4) (actual time=0.007..0.008 rows=1 loops=1)
                    Index Cond: (id = 500)
                    Heap Fetches: 0
Planning Time: 0.148 ms
Execution Time: 381.958 ms

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug/1-unconfirmed Bug should have enough information for reproduction, but confirmation has not happened yet. domain/client Issue in the "Client" domain: Prisma Client, Prisma Studio etc. kind/bug A reported bug. topic: cursor topic: performance/queries topic: performance
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants