Query is 5 times slower when using findMany compared to running identical query via queryRaw
#11130
Comments
Can you share the log output of the query generated and executed by Prisma under the hood? (https://pris.ly/d/logging or via https://github.com/peterrogov/prisma-performance-test/blob/master/src/index.ts#L10) |
Here's the query generated by Prisma:
SELECT "public"."StringIdModel"."id", "public"."StringIdModel"."value" FROM "public"."StringIdModel" WHERE 1=1 ORDER BY "public"."StringIdModel"."id" ASC LIMIT $1 OFFSET $2
If I modify my test code to re-execute the generated query via
|
Apparently, most of the execution time difference comes from this: ORDER BY "public"."StringIdModel"."id" ASC. Also, apparently, this bit is a must. Otherwise Postgres will return N rows, but not necessarily the same ones each time... To get exactly the same rows it has to order them first. However, even after rewriting my custom query and including Also, I do not fully understand why sequential integer Id requires |
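To illustrate why the ORDER BY matters for pagination, here is a minimal in-memory sketch. It does not talk to Postgres; the two arrays are an assumption standing in for two possible orders in which the database could hand back the same rows when no ordering is requested, and the names page and orderedPage are hypothetical.

```typescript
interface Row {
  id: string;
  value: string;
}

// Emulates LIMIT/OFFSET applied to rows in whatever order storage returns them.
function page(rows: Row[], offset: number, limit: number): Row[] {
  return rows.slice(offset, offset + limit);
}

// Emulates ORDER BY id ASC followed by LIMIT/OFFSET.
function orderedPage(rows: Row[], offset: number, limit: number): Row[] {
  const sorted = [...rows].sort((a, b) => (a.id < b.id ? -1 : a.id > b.id ? 1 : 0));
  return page(sorted, offset, limit);
}

// Same four rows, two different (assumed) scan orders.
const scanA: Row[] = [
  { id: "c", value: "3" },
  { id: "a", value: "1" },
  { id: "d", value: "4" },
  { id: "b", value: "2" },
];
const scanB: Row[] = [...scanA].reverse();

const ids = (rows: Row[]): string => rows.map((r) => r.id).join(",");

// Without ordering, the "same" page differs between scans.
console.log(ids(page(scanA, 2, 2))); // d,b
console.log(ids(page(scanB, 2, 2))); // a,c

// With ordering, both scans yield the same page.
console.log(ids(orderedPage(scanA, 2, 2))); // c,d
console.log(ids(orderedPage(scanB, 2, 2))); // c,d
```

The sort is also where the extra cost comes from: the ordered variant has to look at every row before it can skip to the requested offset.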
Below are the time measurements for three different models. One has a String UUID ID, another an Int ID, and the last one a BigInt ID.
Custom query now includes |
But do I see correctly that we are down from the 130x performance impact in the initial post to <20% for using the Prisma Client API vs. running the same generated query with the $queryRaw method? What is the difference left between the generated and the custom query? |
@janpio Yes, you are right the 130x difference was due to ORDER BY but that was my silly mistake. I can't see any other difference that would make sense apart from a more verbose syntax of Prisma queries. But since the final query is still quite short and simple I don't think that alone could create a ~20% difference in execution time. Also, as you can see, when I take the same query (generated by Prisma) and execute it via It feels like there's still something happening in prisma that adds 10-20% overhead to query execution even for such a simplistic scenario. I didn't test on more complex queries but I probably should. And I wish to emphasize that this difference in execution time is worse for models that have a string ID column. Those where ID is Int or BigInt are very close in terms of execution time. |
That is the difference between running Prisma the ORM, and using Prisma to execute a plain, predefined SQL statement. The additional work around typing, creating the query and so on is expected to add some overhead. We still might want to investigate where this overhead comes from, and if we can improve it of course. But it will not have very high priority as we are currently still focused on building out missing functionality. For most users this overhead is totally acceptable (and if not, they can use the raw query as you described). |
Just came across this thread. There is an optimisation for order by and skip, to wrap the order query and do the skip after. Like this... SELECT * FROM ( Is there any way to hint Prisma to do this, or do I have to go to raw SQL? Just asking out of curiosity. TIA. |
Not sure if this is an oversight or the root cause, but in your example you swapped the skip and take arguments (the offset and the limit) in the findMany call |
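A small sketch of what the swap does to the resulting query parameters. The helper toLimitOffset is hypothetical and written only for illustration (it is not Prisma's query builder); it assumes the mapping visible in the generated query quoted in this thread, where take becomes LIMIT and skip becomes OFFSET.

```typescript
// Hypothetical helper for illustration only; not part of Prisma.
interface PageArgs {
  skip: number; // rows to skip, becomes OFFSET
  take: number; // rows to return, becomes LIMIT
}

interface PagedQuery {
  sql: string;
  params: [number, number]; // [limit, offset]
}

function toLimitOffset(table: string, { skip, take }: PageArgs): PagedQuery {
  return {
    sql: `SELECT * FROM "${table}" ORDER BY "id" ASC LIMIT $1 OFFSET $2`,
    params: [take, skip],
  };
}

const queryLimit = 1000;
const queryOffset = 950000;

// As written in the reproduction script: the two values are swapped.
const swapped = toLimitOffset("StringIdModel", { skip: queryLimit, take: queryOffset });

// What the raw query in the same script actually asks for.
const intended = toLimitOffset("StringIdModel", { skip: queryOffset, take: queryLimit });

console.log(swapped.params); // LIMIT 950000, OFFSET 1000
console.log(intended.params); // LIMIT 1000, OFFSET 950000
```

With the swapped arguments the findMany call fetches 950,000 rows instead of 1,000, so the two timings are not measuring the same work.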
While in the original repro the difference was mainly caused by the different query that was missing an
The order in which these queries are run doesn't matter:
Seeding script:
import { PrismaClient } from '@prisma/client'

async function main() {
  const prisma = new PrismaClient()
  const total = 1_000_000;
  for (let i = 0; i < total; i++) {
    if (i % 1000 === 0) {
      console.log(`${Math.round((i / total) * 100)}%`)
    }
    await prisma.stringIdModel.create({
      data: {
        value: String(Math.random()),
      },
    })
  }
}

void main()

Reproduction script:
import { PrismaClient } from '@prisma/client'

async function main() {
  const prisma = new PrismaClient({
    log: ['query'],
  })
  const queryLimit = 1000
  const queryOffset = 950000
  console.time('findMany')
  await prisma.stringIdModel.findMany({ skip: queryLimit, take: queryOffset })
  console.timeEnd('findMany')
  console.time('queryRaw')
  await prisma.$queryRaw`SELECT "public"."StringIdModel"."id", "public"."StringIdModel"."value" FROM "public"."StringIdModel" WHERE 1=1 ORDER BY "public"."StringIdModel"."id" ASC LIMIT ${queryLimit} OFFSET ${queryOffset}`
  console.timeEnd('queryRaw')
}

void main()

The schema is based on the original report, posting it here for convenience as well:
generator client {
  provider = "prisma-client-js"
}

datasource db {
  provider = "postgresql"
  url      = env("POSTGRESQL_DATABASE_URL")
}

model StringIdModel {
  id    String @id @default(uuid())
  value String
}

I get approximately the same difference both with the latest Prisma 5 and with the reported Prisma 3.8.0. Looking at the traces, one significant contributor for Marking this as "confirmed" for now so we can look into it in more detail and see if it makes sense and whether that's expected or not. |
Thanks so much for catching this! I totally overlooked both this and your comment, and copied the query as is from the original repro. Fixing this indeed brings the difference into a much more expected range:
Tracing shows that the actual overhead of I'll go ahead and close the issue. Please let me know if something was not taken into account and it should be reopened, or open an issue if you are facing a significant overhead with Prisma for any query. |
Bug description
I am experiencing a huge discrepancy in query execution time when using prisma 3.8.0 with postgres (local installation).
When I query a large table made of simple records and use pagination params (skip and take) this query takes a very long time to execute via prisma.<model>.findMany however when I use a similar query with prisma.queryRaw the execution time is much less. Normal prisma queries are dramatically slower.
The model I use is very simple and defined as follows
Assuming the database table is filled with 1M random records and I run this code:
This is the result I get
As you can see the difference is 130 (!) times.
I have tried running this many times with different ID column types, different dataset sizes and I can't see any other reason why this is the case. The only thing I can think of is something going wrong inside prisma.
Here's a repo that reproduces the problem
https://github.com/peterrogov/prisma-performance-test
How to reproduce
Use the following repository that has code with reproduction https://github.com/peterrogov/prisma-performance-test
OR
prisma.<model>.findMany and specify skip and take params. Provide skip and take such that you're fetching from the end of the dataset. Measure query execution time.
prisma.queryRaw and make query like SELECT * FROM ... OFFSET ... LIMIT ... . Run the query and measure execution time.
Expected behavior
There shouldn't be any significant difference executing such simple queries directly via SQL or via prisma methods. The time it takes a query with prisma must be close to a direct raw SQL.
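One way to check this expectation with less noise than a single console.time pair is to time each variant several times and compare medians. The sketch below is a generic harness under stated assumptions: the names median and measure are hypothetical, the run and warm-up counts are arbitrary defaults, and Date.now() only gives millisecond resolution.

```typescript
// Median of a non-empty list of numbers.
function median(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error("median requires at least one sample");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Runs `fn` untimed `warmup` times, then `runs` times with timing,
// and returns the median duration in milliseconds.
async function measure(
  fn: () => Promise<unknown>,
  runs: number = 5,
  warmup: number = 1,
): Promise<number> {
  for (let i = 0; i < warmup; i++) {
    await fn();
  }
  const samples: number[] = [];
  for (let i = 0; i < runs; i++) {
    const start = Date.now();
    await fn();
    samples.push(Date.now() - start);
  }
  return median(samples);
}
```

In the reproduction from this thread, fn would wrap the prisma.stringIdModel.findMany call for one measurement and the prisma.$queryRaw call for the other, with the same skip/take and LIMIT/OFFSET values passed to both.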
Prisma information
@prisma/client 3.8.0
Environment & setup
PostgreSQL 12.4
MacOS 11.6.2
NodeJS 17.3
Prisma Version