Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

findUnique() batching should deduplicate equality checks when where clause parameters are the same #22902

Open
astnmsn opened this issue Jan 31, 2024 · 11 comments
Labels
domain/client Issue in the "Client" domain: Prisma Client, Prisma Studio etc. kind/improvement An improvement to existing feature and code. tech/engines Issue for tech Engines. topic: batching topic: findUnique() topic: performance/queries topic: performance topic: where / filter

Comments

@astnmsn
Copy link

astnmsn commented Jan 31, 2024

Problem

When findUnique queries are issued with the same where and include parameters within the same tick, they are batched into a single findUnique with each of the original parameters OR'd together. Resulting in an output from the prisma query log like this:

("public"."Roles"."userId" = $1 AND "public"."Roles"."projectId" = $2) 
OR ("public"."Roles"."userId" = $3 AND "public"."Roles"."projectId" = $4)
...
OR ("public"."Roles"."userId" = $N AND "public"."Roles"."projectId" = $(N+1))

However, in cases that the where clause values are the same across queries, this could be flattened to reduce the number of equality checks like so:

("public"."Roles"."userId" = $1 AND "public"."Roles"."projectId" = $2) 
OR ("public"."Roles"."userId" = $3 AND "public"."Roles"."projectId" = $4)

where $1 === $3 and $2 === $4

would become

("public"."Roles"."userId" = $1 AND "public"."Roles"."projectId" = $2)

Suggested solution

Deduplicate these queries to reduce the size of the where clause generated

Alternatives

None

Additional context

This is especially useful when using graphQL and trying to reduce the overhead associated with the N+1 problem.

@aqrln aqrln added kind/improvement An improvement to existing feature and code. tech/engines Issue for tech Engines. domain/client Issue in the "Client" domain: Prisma Client, Prisma Studio etc. topic: batching labels Feb 1, 2024
@aqrln
Copy link
Member

aqrln commented Feb 1, 2024

This sounds like a nice improvement, thank you for the suggestion. Have you observed any real performance issues because of duplication (e.g. increased latency, optimiser not being able to collapse them on the database side, etc) or is it more of a thing that just came up when you saw the query?

@janpio janpio changed the title findUnique batching should deduplicate equality checks when where clause parameters are the same findUnique batching should deduplicate equality checks when where clause parameters are the same Feb 1, 2024
@astnmsn
Copy link
Author

astnmsn commented Feb 3, 2024

I don’t have exact metrics at the moment (but can pull them if it helps). I have found some instances where the optimizer does not correctly collapse the clauses, but that is a rare occurrence. In our use case we can see this number of clauses in the WHERE statement reach into the thousands. So if nothing else it is consistently having an impact on the time parse the query itself and create and the execution plan

@janpio janpio changed the title findUnique batching should deduplicate equality checks when where clause parameters are the same findUnique() batching should deduplicate equality checks when where clause parameters are the same Mar 14, 2024
@dickfickling
Copy link

@aqrln I have run into performance issues with this due to our extension that adds soft delete support. Here's what our extension looks like:

const prisma = basePrisma.$extends({
  name: "Soft Delete",
  query: {
    $allModels: {
      findUnique: ({ args, query, model }) => {
        return query({ ...args, where: { deletedAt: null, ...args.where } });
      },
      findUniqueOrThrow: ({ args, query, model }) => {
        return query({ ...args, where: { deletedAt: null, ...args.where } });
      },
    },
  },
});

Concurrent queries that look like this are handled two different ways depending on whether the extension is enabled:

prisma.myTable.findUniqueOrThrow({ where: { id: "a" } });
prisma.myTable.findUniqueOrThrow({ where: { id: "b" } });
prisma.myTable.findUniqueOrThrow({ where: { id: "c" } });
prisma.myTable.findUniqueOrThrow({ where: { id: "d" } });

Without the extension, it becomes:

select * from "myTable" where id in ("a", "b", "c", "d");

With the extension, it becomes:

select * from "myTable" where (id = "a" and "isDeleted" is null) or (id = "b" and "isDeleted" is null) or (id = "c" and "isDeleted" is null) or (id = "d" and "isDeleted" is null)

Ideally, it would be batched as:

select * from "myTable" where id in ("a", "b", "c", "d") and isDeleted is null;

Sometimes upwards of 2-3k of these queries are batched at once, and where the original or ideal query work just fine, the query where every id/isDeleted combination is specified explicitly causes our database to hang.

@chris-basis
Copy link

Hey piggybacking on this. Running into the same issue that ends up throwing an error because we're exceeding the number of bound parameters allowed by our driver.

We would not be running into this issue just yet if Prisma was a little more efficient in the batch query generated.

Also using this thread to suggest limiting the batch size to prevent parameter binding limits being hit.

@janpio
Copy link
Member

janpio commented Apr 23, 2024

@chris-basis Please make sure to also open a bug report issue if you are running into a problem/crash because of this. We need that information to fully understand the situation. Thanks.

@chris-basis
Copy link

@chris-basis Please make sure to also open a bug report issue if you are running into a problem/crash because of this. We need that information to fully understand the situation. Thanks.

#23957 👍

@janpio janpio changed the title findUnique() batching should deduplicate equality checks when where clause parameters are the same findUnique() batching should deduplicate equality checks when where clause parameters are the same May 31, 2024
@janpio
Copy link
Member

janpio commented May 31, 2024

@dickfickling Do you know why this is causing a performance problem? Generally the database should be smart enough to use the same indexes in a query where "isDeleted" is null is repeated multiple times. And just the additional characters in the query string also should not have any significant impact.

If you have an explicit reproduction, that would be amazing. Thanks!


Aside from that, better and more readable queries are of course desirable, which this improvement would lead to.

@dickfickling
Copy link

dickfickling commented Jun 3, 2024

@janpio I haven't dug into the issue much, but here's a repo that displays the error on my machine: https://github.com/dickfickling/prisma_perf. To reproduce: yarn infra, yarn push, and yarn go

With this specific setup the error (yarn go just hangs forever) seems to crop up with queries for 5000-7000 records. We've reproduced it in our prod env with as few as 2000 records.

@janpio
Copy link
Member

janpio commented Jun 3, 2024

@dickfickling Hm, that works fine for me:

yarn run v1.22.22
$ dotenv -e .env.dev -- node --loader ts-node/esm src/index.ts
prisma:query SELECT "public"."User"."id", "public"."User"."firstName", "public"."User"."lastName", "public"."User"."email", "public"."User"."createdAt", "public"."User"."updatedAt", "public"."User"."deletedAt" FROM "public"."User" WHERE (("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $1) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $2) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $3) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $4) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $5) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $6) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $7) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $8) OR ("public"."User"."deletedAt" IS NUL [...] $9940) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9941) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9942) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9943) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9944) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9945) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9946) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9947) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9948) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9949) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9950) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9951) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9952) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9953) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9954) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9955) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9956) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9957) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9958) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9959) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9960) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9961) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9962) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9963) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9964) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9965) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9966) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9967) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9968) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9969) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9970) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9971) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9972) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9973) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9974) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9975) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9976) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9977) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9978) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9979) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9980) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9981) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9982) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9983) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9984) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9985) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9986) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9987) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9988) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9989) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9990) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9991) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9992) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9993) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9994) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9995) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9996) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9997) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9998) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $9999) OR ("public"."User"."deletedAt" IS NULL AND "public"."User"."id" = $10000)) OFFSET $10001
[
  null,
  {
    id: 1,
    firstName: '0',
    lastName: '0',
    email: '0@0.co',
    createdAt: 2024-06-03T14:57:47.460Z,
    updatedAt: 2024-06-03T14:57:47.460Z,
    deletedAt: null
  },
  {
    id: 2,
    firstName: '1',
    lastName: '1',
    email: '1@1.co',
    createdAt: 2024-06-03T14:57:47.460Z,
    updatedAt: 2024-06-03T14:57:47.460Z,
    deletedAt: null
  },
  [...]
  {
    id: 98,
    firstName: '97',
    lastName: '97',
    email: '97@97.co',
    createdAt: 2024-06-03T14:57:47.460Z,
    updatedAt: 2024-06-03T14:57:47.460Z,
    deletedAt: null
  },
  {
    id: 99,
    firstName: '98',
    lastName: '98',
    email: '98@98.co',
    createdAt: 2024-06-03T14:57:47.460Z,
    updatedAt: 2024-06-03T14:57:47.460Z,
    deletedAt: null
  },
  ... 9900 more items
]
Done in 12.50s.

Should this fail every time, or just sometimes?

@dickfickling
Copy link

dickfickling commented Jun 4, 2024

if you increase from 10k is there a number where the extended prisma with middleware stops working but the base prisma still works?

i think it's something to do with memory or CPU limits in parsing the query. for context i'm running postgres with 2gb of memory and 4vcpus

edit: at 10k it fails every time for me

@janpio
Copy link
Member

janpio commented Jun 4, 2024

I split out the discussion with @dickfickling about the database hanging into its own issue: #24419 (I will hide the comments above some time in the future so clean up this issue thread.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain/client Issue in the "Client" domain: Prisma Client, Prisma Studio etc. kind/improvement An improvement to existing feature and code. tech/engines Issue for tech Engines. topic: batching topic: findUnique() topic: performance/queries topic: performance topic: where / filter
Projects
None yet
Development

No branches or pull requests

5 participants