Prisma generated query takes 10x as long than a manually composed one #14402

jschuur · 2022-07-20T10:37:48Z

I have a pretty standard GraphQL resolver Prisma query that ends up being horribly inefficient when I let Prisma generate the SQL, as opposed to just writing the (Postgre) SQL myself. At my current database size, it takes over 5 seconds via Prisma, and about 400 ms via SQL that I have composed.

At this point, I am not sure if this is just a fundamental Prisma issue in how it generates SQL in my case, or if there is anything I can do to avoid having to resort to a raw query.

More details, including my indexes and an EXPLAIN (ANALYZE, BUFFERS) output over at Stack Overflow, but here is the Prisma syntax...

  const videos = await ctx.prisma.video.findMany({
    where: {
      channel: {
        NOT: {
          status: {
            in: [ChannelStatus.HIDDEN, ChannelStatus.ARCHIVED],
          },
        },
      },
      status: {
        in: [VideoStatus.UPCOMING, VideoStatus.LIVE, VideoStatus.PUBLISHED],
      },
    },
    include: {
      channel: {
        include: {
          links: true,
        },
      },
    },
    cursor: _args.cursor
      ? {
          id: _args.cursor,
        }
      : undefined,
    skip: _args.cursor ? 1 : 0,
    orderBy: {
      sortTime: 'desc',
    },
    take: Math.min(_args.limit, config.GRAPHQL_MAX_RECENT_VIDEOS),
  });

...the relevant parts of my Prisma schema (some fields left out for brevity, and some of my indexes have temporarily been defined outside of my Prisma schema, directly on the DB while I debug this)...

model Channel {
  id                 Int           @id @default(autoincrement())
  status             ChannelStatus @default(value: ACTIVE)
  videos             Video[]
  links              ChannelLink[]
}

model Video {
  id                 Int           @id @default(autoincrement())
  channel            Channel       @relation(fields: [channelId], references: [id])
  channelId          Int
  status             VideoStatus   @default(value: PUBLISHED)
  sortTime           DateTime

  @@index([status])
}

...the SQL that this generates (Prisma 4.0.0)...

SELECT
	"public"."Video"."id",
	"public"."Video"."youtubeId",
	"public"."Video"."channelId",
	"public"."Video"."type",
	"public"."Video"."status",
	"public"."Video"."reviewed",
	"public"."Video"."category",
	"public"."Video"."youtubeTags",
	"public"."Video"."language",
	"public"."Video"."title",
	"public"."Video"."description",
	"public"."Video"."duration",
	"public"."Video"."durationSeconds",
	"public"."Video"."viewCount",
	"public"."Video"."likeCount",
	"public"."Video"."commentCount",
	"public"."Video"."scheduledStartTime",
	"public"."Video"."actualStartTime",
	"public"."Video"."actualEndTime",
	"public"."Video"."sortTime",
	"public"."Video"."createdAt",
	"public"."Video"."updatedAt",
	"public"."Video"."publishedAt"
FROM
	"public"."Video",
	(
		SELECT
			"public"."Video"."sortTime" AS "Video_sortTime_0"
		FROM
			"public"."Video"
		WHERE ("public"."Video"."id") = (29949)) AS "order_cmp"
WHERE (("public"."Video"."id")
	IN(
		SELECT
			"t0"."id" FROM "public"."Video" AS "t0"
			INNER JOIN "public"."Channel" AS "j0" ON ("j0"."id") = ("t0"."channelId")
		WHERE ((NOT "j0"."status" IN('HIDDEN', 'ARCHIVED'))
		AND "t0"."id" IS NOT NULL))
	AND "public"."Video"."status" IN('UPCOMING', 'LIVE', 'PUBLISHED')
	AND "public"."Video"."sortTime" <= "order_cmp"."Video_sortTime_0")
ORDER BY
	"public"."Video"."sortTime" DESC OFFSET 0;

...and my much faster version version that I wrote:

SELECT
	*
FROM
	"Video",
	"Channel"
WHERE
	"Channel".id = "Video"."channelId"
	AND "Channel".status NOT IN ('HIDDEN', 'ARCHIVED')
	AND "Video"."status" in('UPCOMING', 'LIVE', 'PUBLISHED')
	AND "Video"."sortTime" <= (
		SELECT
			"sortTime"
		FROM
			"Video"
		WHERE
			id = 29949)
	ORDER BY
		"sortTime" DESC
	LIMIT 84;

I also noticed that the Prisma syntax does not add a LIMIT statement, despite there being a take parameter. Is that to be expected? Curiously, if I add an explicit LIMIT to the Prisma generated SQL, the query takes about twice as long.

I've also tried using a composite index (CREATE INDEX "Video_sortTime_status_idx" ON "Video"("sortTime" DESC, status)) as suggested on SO, but this hasn't helped (updated EXPLAIN output). Over on SO, they suggested Postgres will only use one index, and I guess it's favouring Prisma's default one it set up for the primary key?

The text was updated successfully, but these errors were encountered:

matus-sabo · 2022-07-20T12:11:55Z

Maybe its relatated to cursor pagination with desc order #12650

jschuur · 2022-07-20T13:11:31Z

Maybe its relatated to cursor pagination with desc order #12650

I do have an explicit DESC index set for sortTime though.

matus-sabo · 2022-07-20T19:58:49Z

U can try implement cursor pagination without prisma cursor. You can find some inspiration there #11138 (comment)

matus-sabo · 2022-07-20T20:00:59Z

Prisma cursor don't work when u provide non existent cursor #14283

jschuur · 2022-07-21T06:34:52Z

At this point, I've gone back to using an offset based pagination (code or see it in action here when you scroll to the bottom of the page). While this is normally not ideal, I usually only need the first few hundred records of a sorted list of over 130k, so the offset value never gets to big that the usual performance problems of having to traverse too much data via an offset will come up.

As there are a few cursor-related issues reported here, perhaps there will be some optimizations on the Prisma side down the line so I can revisit this.

jschuur · 2022-07-21T06:41:42Z

U can try implement cursor pagination without prisma cursor. You can find some inspiration there #11138 (comment)

I can't do a simple greater than comparison on the ID as a cursor replacement like it's suggested in this comment. The sortTime order and my ID sort order are not always in sync, so I would need to do something like this:

AND "Video"."sortTime" <= (
	SELECT
		"sortTime"
	FROM
		"Video"
	WHERE
		id = 29949)

I can't do that natively in Prisma (without Prisma cursors), and if I split this into two Prisma calls to get the sortTime first, that's another performance hit that negates any improvements.

pranavbadami · 2022-09-03T19:32:08Z

Having a very similar issue to this with related tables as well.

In my case, I'm trying to filter on a few related tables and seeing much more complicated logic generated by prisma versus when I add a simple join manually.

Here's the query I'm passing to prismaClient:

const invocations = await prismaClient.triggerInvocation.findMany({
    include: {
      triggerStep: {
        include: {
          transitions: true,
        },
      },
    },
    where: {
      scheduledNextStepCount: 0,
      scheduledTask: {
        status: "completed",
      },
      variables: {
        path: ["user", "id"],
        equals: userId,
      },
  
      triggerStep: {
        platform: platform,
        communityId: communityId,
        trigger: {
          isActive: true, < === THIS IS THE PROBLEMATIC PART
        },
        transitions: {
          some: {
            OR: USER_RESPONSE_TYPES.map((type) => ({
              when: {
                path: ["userResponse", "type"],
                equals: type,
              },
            })),
          },
        },
      },
    },
  });

I added the trigger: {isActive: true} because I realized for my app this was going to be more performant. Here's how I added this join in my own SQL (left) vs how Prisma added it (right):

The full query diff is here

Does anyone know why Prisma created the join this way? When I run explain analyze with my query I had 198ms execution time and Prisma's is 1.9 seconds (10x difference)

I would love to use the query syntax and not use a raw query, so any help would be greatly appreciated! Thanks in advance 🙏

pranavbadami · 2022-09-03T20:44:35Z

My issue above seems to be related to this: #13306 Seems like I might have to go with a raw query after all...

aqrln added team/client Issue for team Client. topic: performance topic: postgresql bug/1-unconfirmed Bug should have enough information for reproduction, but confirmation has not happened yet. kind/bug A reported bug. labels Jul 21, 2022

janpio added the topic: performance/queries label Apr 27, 2023

Weakky added bug/2-confirmed Bug has been reproduced and confirmed. and removed bug/1-unconfirmed Bug should have enough information for reproduction, but confirmation has not happened yet. labels Aug 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Prisma generated query takes 10x as long than a manually composed one #14402

Prisma generated query takes 10x as long than a manually composed one #14402

jschuur commented Jul 20, 2022 •

edited

matus-sabo commented Jul 20, 2022

jschuur commented Jul 20, 2022 •

edited

matus-sabo commented Jul 20, 2022

matus-sabo commented Jul 20, 2022

jschuur commented Jul 21, 2022 •

edited

jschuur commented Jul 21, 2022 •

edited

pranavbadami commented Sep 3, 2022

pranavbadami commented Sep 3, 2022

Prisma generated query takes 10x as long than a manually composed one #14402

Prisma generated query takes 10x as long than a manually composed one #14402

Comments

jschuur commented Jul 20, 2022 • edited

matus-sabo commented Jul 20, 2022

jschuur commented Jul 20, 2022 • edited

matus-sabo commented Jul 20, 2022

matus-sabo commented Jul 20, 2022

jschuur commented Jul 21, 2022 • edited

jschuur commented Jul 21, 2022 • edited

pranavbadami commented Sep 3, 2022

pranavbadami commented Sep 3, 2022

jschuur commented Jul 20, 2022 •

edited

jschuur commented Jul 20, 2022 •

edited

jschuur commented Jul 21, 2022 •

edited

jschuur commented Jul 21, 2022 •

edited