Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unexpected query leading to querying full table when using batched findUnique() #23343

Closed
rgmvisser opened this issue Mar 4, 2024 · 10 comments · Fixed by prisma/prisma-engines#4789
Assignees
Labels
bug/2-confirmed Bug has been reproduced and confirmed. kind/bug A reported bug. team/client Issue for team Client. topic: batching topic: findUnique()
Milestone

Comments

@rgmvisser
Copy link

rgmvisser commented Mar 4, 2024

Bug description

This bug occurs when running multiple findUnique queries when extending the prisma client.

Resulting query:

SELECT
	"public"."Post"."tenantId",
	"public"."Post"."userId",
	"public"."Post"."id",
	"public"."Post"."text"
FROM "public"."Post"
WHERE (
	("public"."Post"."tenantId" = $1 AND "public"."Post"."userId" = $2)
	OR "public"."Post"."tenantId" = $3
	OR ("public"."Post"."tenantId" = $4 AND "public"."Post"."userId" = $5)
	OR "public"."Post"."tenantId" = $6
)
OFFSET $7

Because we are dynamically adding the tenantId to every query, this results in a query where all posts of the tenant are being queried (see OR "public"."Post"."tenantId" = $3 OR). It does return the correct results, but it causes a lot of load as the database will return all Posts for a tenant.

This was super hard to debug, but luckily found this article eventually that lead me in this direction: https://www.prisma.io/docs/orm/prisma-client/queries/query-optimization-performance

At the moment, I have changed all queries to "findFirst", but this is a bit of a footgun as it results in very unexpected behavior.

How to reproduce

I have made a repo with a reproduction here: https://github.com/rgmvisser/prisma-bug-reproduction

Expected behavior

That findUnique also takes into account the added "where" clause of the tenant inside the OR instead of outside. So instead of:

SELECT
	"public"."Post"."tenantId",
	"public"."Post"."userId",
	"public"."Post"."id",
	"public"."Post"."text"
FROM "public"."Post"
WHERE (
	("public"."Post"."tenantId" = $1 AND "public"."Post"."userId" = $2)
	OR "public"."Post"."tenantId" = $3
	OR ("public"."Post"."tenantId" = $4 AND "public"."Post"."userId" = $5)
	OR "public"."Post"."tenantId" = $6
)
OFFSET $7

A query like:

SELECT
	"public"."Post"."tenantId",
	"public"."Post"."userId",
	"public"."Post"."id",
	"public"."Post"."text"
FROM "public"."Post"
WHERE (
	(
		"public"."Post"."tenantId" = $1
		AND "public"."Post"."userId" = $2
		AND "public"."Post"."tenantId" = $3
	)
	OR (
		"public"."Post"."tenantId" = $4
		AND "public"."Post"."userId" = $5
		AND "public"."Post"."tenantId" = $6
	)
)
OFFSET $7

Prisma information

Schema

generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "postgresql"
  url      = env("DATABASE_URL")
}

model Post {
  id       Int
  tenantId String
  userId   Int
  text     String

  @@unique([tenantId, userId])
}

Code

import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient({ log: ["query", "info", "warn"] });

function getTenantClient(tenantId: string) {
  return prisma.$extends({
    query: {
      $allModels: {
        $allOperations({ model, operation, args, query }) {
          let argsWithTenant: Record<string, any> = args;

          if (argsWithTenant?.where?.tenantId) {
            return query(args);
          } else if (argsWithTenant?.where) {
            argsWithTenant.where = {
              ...argsWithTenant.where,
              tenantId: tenantId,
            };
          } else if (argsWithTenant) {
            argsWithTenant.where = { tenantId: tenantId };
          } else {
            argsWithTenant = { where: { tenantId: tenantId } };
          }

          return query(argsWithTenant);
        },
      },
    },
  });
}

type TenantClient = ReturnType<typeof getTenantClient>;

async function main() {
  await prisma.post.deleteMany({});

  const tenant1Client = getTenantClient("tenant1");
  const tenant2Client = getTenantClient("tenant2");

  await prisma.post.create({
    data: {
      id: 1,
      tenantId: "tenant1",
      userId: 1,
      text: "Post 1!",
    },
  });
  await prisma.post.create({
    data: {
      id: 2,
      tenantId: "tenant1",
      userId: 2,
      text: "Post 2!",
    },
  });
  await prisma.post.create({
    data: {
      id: 3,
      tenantId: "tenant2",
      userId: 3,
      text: "Post 3!",
    },
  });
  await prisma.post.create({
    data: {
      id: 4,
      tenantId: "tenant2",
      userId: 4,
      text: "Post 4!",
    },
  });

  const queryPost = (client: TenantClient, tenantId: string, userId: number) =>
    client.post.findUnique({
      where: {
        tenantId_userId: {
          tenantId,
          userId,
        },
      },
    });

  const concurrentPosts = await Promise.all([
    queryPost(tenant1Client, "tenant1", 1),
    queryPost(tenant2Client, "tenant2", 3),
  ]);

  console.log({ concurrentPosts });

  /* 
 {
  concurrentPosts: [
    { id: 1, tenantId: 'tenant1', userId: 1, text: 'Post 1!' },
    { id: 3, tenantId: 'tenant2', userId: 3, text: 'Post 3!' }
  ]
}
 */
}

main().catch((e) => console.error(e));

Environment & setup

  • OS: macOS
  • Database: Postgres
  • Node.js version: v21.5

Prisma Version

prisma                  : 5.10.2
@prisma/client          : 5.10.2
Computed binaryTarget   : darwin-arm64
Operating System        : darwin
Architecture            : arm64
Node.js                 : v21.5.0
Query Engine (Node-API) : libquery-engine 5a9203d0590c951969e85a7d07215503f4672eb9 (at node_modules/@prisma/engines/libquery_engine-darwin-arm64.dylib.node)
Schema Engine           : schema-engine-cli 5a9203d0590c951969e85a7d07215503f4672eb9 (at node_modules/@prisma/engines/schema-engine-darwin-arm64)
Schema Wasm             : @prisma/prisma-schema-wasm 5.10.0-34.5a9203d0590c951969e85a7d07215503f4672eb9
Default Engines Hash    : 5a9203d0590c951969e85a7d07215503f4672eb9
Studio                  : 0.499.0
@rgmvisser rgmvisser added the kind/bug A reported bug. label Mar 4, 2024
@janpio janpio changed the title Unexpected query leading to querying full table when using findUnique in combination with extending the prisma client Unexpected query leading to querying full table when using findUnique() in combination with extending the prisma client Mar 4, 2024
@janpio
Copy link
Member

janpio commented Mar 4, 2024

Do I understand correctly that running a single findUnique() will return the correct data, but then when you run multiple per tick to trigger Prisma's batching, the query becomes invalid/too broad?

@rgmvisser
Copy link
Author

rgmvisser commented Mar 4, 2024

Do I understand correctly that running a single findUnique() will return the correct data, but then when you run multiple per tick to trigger Prisma's batching, the query becomes invalid/too broad?

Correct, it becomes too broad. The SQL query will return all posts of that specific tenant. The result that prisma returns is still correct, but my guess is that's because you filter under the hood.

@janpio
Copy link
Member

janpio commented Mar 4, 2024

Could you maybe try to run the similar complete queries manually in parallel, instead of modifying them via client extensions? That would let us isolate if this is a Client extension problem, or a general batching problem for the type of query you are constructing via the extension.

@rgmvisser
Copy link
Author

rgmvisser commented Mar 4, 2024

No problem @janpio , running the code like this (assuming this is what you meant):

const queryPost = (tenantId: string, userId: number) =>
    prisma.post.findUnique({
      where: {
        tenantId_userId: {
          tenantId,
          userId,
        },
        tenantId,
      },
    });

  const concurrentPosts = await Promise.all([queryPost("tenant1", 1), queryPost("tenant2", 3)]);

Results in the SQL query:

SELECT
	"public"."Post"."tenantId",
	"public"."Post"."userId",
	"public"."Post"."id",
	"public"."Post"."text"
FROM "public"."Post"
WHERE (
	("public"."Post"."tenantId" = $1 AND "public"."Post"."userId" = $2)
	OR "public"."Post"."tenantId" = $3
	OR ("public"."Post"."tenantId" = $4 AND "public"."Post"."userId" = $5)
	OR "public"."Post"."tenantId" = $6
) OFFSET $7

So it looks like it does not have anything to do with extending the model, but with findUnique batching.

@janpio
Copy link
Member

janpio commented Mar 4, 2024

That was exactly what I was hoping for, so this is isolated to "just" batching and we can figure out why this is happening. Thanks!

@janpio janpio added bug/1-unconfirmed Bug should have enough information for reproduction, but confirmation has not happened yet. topic: findUnique() team/client Issue for team Client. labels Mar 4, 2024
@janpio janpio changed the title Unexpected query leading to querying full table when using findUnique() in combination with extending the prisma client Unexpected query leading to querying full table when using batched findUnique() Mar 4, 2024
@rgmvisser
Copy link
Author

@janpio Just curious if there was any updates on this?

@janpio
Copy link
Member

janpio commented Mar 18, 2024

No, this is now waiting to be picked up by someone from the engineering team for a full reproduction attempt.

@Druue Druue self-assigned this Mar 21, 2024
@Druue Druue added bug/2-confirmed Bug has been reproduced and confirmed. and removed bug/1-unconfirmed Bug should have enough information for reproduction, but confirmation has not happened yet. labels Mar 27, 2024
@Druue
Copy link
Contributor

Druue commented Mar 27, 2024

Hey @rgmvisser,

Thanks for highlighting this issue!
So what you ran into here with the following WHERE statement being generated was a bug in how we handle indices in findUnique{where}

WHERE (
	("public"."Post"."tenantId" = $1 AND "public"."Post"."userId" = $2)
	OR "public"."Post"."tenantId" = $3
	OR ("public"."Post"."tenantId" = $4 AND "public"."Post"."userId" = $5)
	OR "public"."Post"."tenantId" = $6
)

This was reproducible even without the second query as it was caused by the appearance of a field twice: once by itself, and once through the compound key. Our logic was set-up where if we encounter a key twice, we treated that as a new query, hence the many ORs. So, just one findUnique() call was enough to reproduce as that gave us the following sql:

WHERE (
	("public"."Post"."tenantId" = $1 AND "public"."Post"."userId" = $2)
	OR "public"."Post"."tenantId" = $3
)

Where really that OR should be an AND.

Thanks again!

@rgmvisser
Copy link
Author

Thanks for the update! :)

Is there a timeline for when this will be resolved?

@Druue
Copy link
Contributor

Druue commented Mar 27, 2024

You can track progress in the linked PR :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug/2-confirmed Bug has been reproduced and confirmed. kind/bug A reported bug. team/client Issue for team Client. topic: batching topic: findUnique()
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants