-
Notifications
You must be signed in to change notification settings - Fork 5.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf(core): Accelerate common execution queries #9817
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
packages/cli/src/databases/migrations/common/1717498465932-RefactorExecutionIndices.ts
Outdated
Show resolved
Hide resolved
Thanks, from the story I was under the impression we only wanted to touch these, rather than all of them. |
packages/cli/src/databases/migrations/common/1717498465932-RefactorExecutionIndices.ts
Outdated
Show resolved
Hide resolved
@@ -19,9 +19,18 @@ abstract class IndexOperation extends LazyPromise<void> { | |||
protected tablePrefix: string, | |||
queryRunner: QueryRunner, | |||
protected customIndexName?: string, | |||
protected skipIfMissing = false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should only be on the DropIndex
class
export class DropIndex extends IndexOperation {
constructor(
tableName: string,
columnNames: string[],
tablePrefix: string,
queryRunner: QueryRunner,
customIndexName?: string,
protected skipIfMissing = false,
) {
super(tableName, columnNames, tablePrefix, queryRunner, customIndexName);
}
async execute(queryRunner: QueryRunner) {
return await queryRunner
.dropIndex(this.fullTableName, this.customIndexName ?? this.fullIndexName)
.catch((error) => {
if (error instanceof Error && error.message.includes('not found') && this.skipIfMissing) {
return;
}
throw error;
});
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we somehow use the IF EXISTS
parameter for DBs that support it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should only be on the
DropIndex
class
could we somehow use the IF EXISTS parameter for DBs that support it?
We could call queryRunner.query
with handcrafted SQL and then fall back to ignoring the error in a catch clause if IF EXISTS
is unsupported, but I think the current approach is simpler. Let me know otherwise!
packages/cli/src/databases/migrations/common/1717498465932-RefactorExecutionIndices.ts
Outdated
Show resolved
Hide resolved
packages/cli/src/databases/migrations/common/1717498465932-RefactorExecutionIndices.ts
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple comments, mainly about the structure of the indexes
return await queryRunner | ||
.dropIndex(this.fullTableName, this.customIndexName ?? this.fullIndexName) | ||
.catch((error) => { | ||
if (error instanceof Error && error.message.includes('not found') && this.skipIfMissing) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we certain that this is the message returned by all DBs? It's unfortunate that we need to rely on this type of hack just because MySQL doesn't support IF EXISTS
. We could also also use the IF EXISTS
for those that support it (SQLite, PG) and rely on this for MySQL. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They all throw the same error TypeORMError: Supplied index {indexId} was not found in table {table}
.
Re: handling differently depending on DB, I replied about this above - my reasoning is it's simpler to have one way of handling this. Once we remove MySQL this will no longer be an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, makes sense 👍 Maybe we can be more specific and expect a TypeORMError
instead of any generic Error
?
* - `status, startedAt` for `ExecutionRepository.findManyByRangeQuery` (default query) | ||
* - `workflowId, status, startedAt` for `ExecutionRepository.findManyByRangeQuery` (filter query) | ||
* - `waitTill, status` for `ExecutionRepository.getWaitingExecutions` | ||
* - `stoppedAt, deletedAt, status` for `ExecutionRepository.softDeletePrunableExecutions` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we actually done any investigation how much these indexes help and how much they impact write performance? Having multiple indexes (especially composite) is not cheap as they need to be updated with every insert, update and delete.
Looking at the fields, the (status, startedAt)
index can probably also serve the ExecutionRepository.findManyByRangeQuery
query as the only column missing is workflowId
. We could verify this.
For the last two indexes we could create a partial index using WHERE
clause to include only rows that have waitTill
and stoppedAt
as NOT NULL
and deletedAt
as IS NULL
. That way the index size is more limited and performs better as that's how we use them in the queries. Again, MySQL doesn't support partial indexes so in that case we could create the indexes without the WHERE clause.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have we actually done any investigation how much these indexes help and how much they impact write performance? Having multiple indexes (especially composite) is not cheap as they need to be updated with every insert, update and delete. Looking at the fields, the (status, startedAt) index can probably also serve the ExecutionRepository.findManyByRangeQuery query as the only column missing is workflowId. We could verify this.
I think Omar did but that's now lost. If you have time tomorrow, let's pair on this and you can show me how?
For the last two indexes we could create a partial index using WHERE clause to include only rows that have waitTill and stoppedAt as NOT NULL and deletedAt as IS NULL. That way the index size is more limited and performs better as that's how we use them in the queries. Again, MySQL doesn't support partial indexes so in that case we could create the indexes without the WHERE clause.
Will do tomorrow 👍🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, let's do it 👍 Basically the steps are:
- Have a DB with relevant amount of data
- Create the indexes
- Run different queries to see how they perform. Run
EXPLAIN ANALYZE
to see if the DBMS is utilizing the indexes - Compare results
await schemaBuilder.createIndex('execution_entity', ['status', 'startedAt']); | ||
await schemaBuilder.createIndex('execution_entity', ['workflowId', 'status', 'startedAt']); | ||
await schemaBuilder.createIndex('execution_entity', ['waitTill', 'status']); | ||
await schemaBuilder.createIndex('execution_entity', ['stoppedAt', 'deletedAt', 'status']); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we give explicit names to these so they are better identifiable and modifying them later on is easier
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I meant name them explicitly, so it's obvious from the code what the index names will be, instead of letting the ORM generate the names. But those are also fine as long as they are stable across all different DBs
https://linear.app/n8n/issue/PAY-1609