Delete runs in daterange in batches #433
Great job @Roemer! Thanks a lot for fixing that code. I am happy to see that people use Sorry Cypress to run so many tests that we are facing performance issues 😬
Just a few minor comments:
- batched iteration on results is great and simple! I think the same result could be achieved with a native MongoDB cursor, but your implementation is totally fine!
- if we are really looking to optimize it, we can limit the fields we get from `getRunsInDateRange` by explicitly fetching only `runId`s. If I am not mistaken, MongoDB won't even read data from the disk and will send the results back straight from the `runId` index on the runs collection.
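As a rough illustration of the projection idea in the second point, the sketch below builds the filter and options one might pass to a MongoDB `find(filter, options)` call. The helper name `buildRunIdQuery`, the `RunIdQuery` type, and the `createdAt` field are assumptions made for this example, not names taken from the project.

```typescript
// Hypothetical shape for illustration; not the project's actual types.
interface RunIdQuery {
  filter: { createdAt: { $gte: Date; $lte: Date } };
  options: { projection: { _id: 0; runId: 1 }; limit: number };
}

// Builds arguments for a `find(filter, options)` call.
// `createdAt` is an assumed name for the run's timestamp field.
function buildRunIdQuery(
  startDate: Date,
  endDate: Date,
  limit: number = 0
): RunIdQuery {
  return {
    filter: { createdAt: { $gte: startDate, $lte: endDate } },
    options: {
      // Return only runId. _id is excluded explicitly because
      // MongoDB includes it in results by default.
      projection: { _id: 0, runId: 1 },
      // A limit of 0 means "no limit" in MongoDB.
      limit,
    },
  };
}

const query = buildRunIdQuery(
  new Date("2021-01-01"),
  new Date("2021-03-01"),
  100
);
console.log(JSON.stringify(query.options));
```

Whether MongoDB can answer such a query from an index alone depends on the index containing both the filtered and the projected fields, so it is worth checking with `explain()` on real data.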
packages/api/src/datasources/runs.ts
Outdated
async deleteRunsInDateRange(
  startDate: Date,
  endDate: Date,
  limit: number = 0
) {
  const getResult = await this.getRunsInDateRange(startDate, endDate, limit);
  return this.deleteRunsByIds(getResult.runIds);
}
This one is not used, please delete
  const runIds = response.map((x) => x.run[0].runId) as string[];
  return await this.deleteInstancesByRunIds(runIds);
}
👍🏻
    break;
  }
  // Delete the runs
  const runsDeleteResponse = await resolvers.Mutation.deleteRuns(
Nice reuse of mutation 👍🏻
P.S. I've merged #427 into master and updated this branch, don't be scared
a2f4b95 to 5c6a202
I rebased it onto master instead of merging, removed the unneeded method, and added a projection when getting the runs.
6694160 to 5c6a202
I have now tested it and I could delete 2 months' worth of data in a few seconds ;) So from my side, this PR can be merged.
This PR changes the behavior of the deletion of runs in a specified date range.
Previously, it used a very slow `aggregate` to find the `instances` for the `runs` that should be deleted and tried to delete them all at once. This timed out at various points in larger setups.
With this change, it will now process the deletion in batches:
- Get a batch of `runIds`
- Delete the `instances` for those `runIds`
- Delete the `runs` for those `runIds`
- Repeat until no more `runs` are found for the defined date-range
To make it even faster, a new index for `runId` is added to the `instance` collection.
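The batched flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `RunStore` interface, `deleteRunsInBatches`, and the in-memory `InMemoryRunStore` are hypothetical names, and the store stands in for the real MongoDB datasource.

```typescript
type Run = { runId: string; createdAt: Date };
type Instance = { instanceId: string; runId: string };

// Hypothetical interface standing in for the real datasource.
interface RunStore {
  getRunIdsInDateRange(start: Date, end: Date, limit: number): Promise<string[]>;
  deleteInstancesByRunIds(runIds: string[]): Promise<number>;
  deleteRunsByIds(runIds: string[]): Promise<number>;
}

// Deletes runs (and their instances) in the date range, one batch at a time.
async function deleteRunsInBatches(
  store: RunStore,
  start: Date,
  end: Date,
  batchSize: number
): Promise<number> {
  let deletedRuns = 0;
  while (true) {
    // Fetch the next batch of runIds.
    const runIds = await store.getRunIdsInDateRange(start, end, batchSize);
    // Stop once no runs are left in the range.
    if (runIds.length === 0) break;
    // Delete the instances first so none are left without a run.
    await store.deleteInstancesByRunIds(runIds);
    // Then delete the runs themselves.
    const removed = await store.deleteRunsByIds(runIds);
    // Guard against looping forever if nothing could be deleted.
    if (removed === 0) break;
    deletedRuns += removed;
  }
  return deletedRuns;
}

// In-memory stand-in used only to exercise the loop.
class InMemoryRunStore implements RunStore {
  runs: Run[];
  instances: Instance[];

  constructor(runs: Run[], instances: Instance[]) {
    this.runs = runs;
    this.instances = instances;
  }

  async getRunIdsInDateRange(start: Date, end: Date, limit: number): Promise<string[]> {
    return this.runs
      .filter(
        (r) =>
          r.createdAt.getTime() >= start.getTime() &&
          r.createdAt.getTime() <= end.getTime()
      )
      .slice(0, limit)
      .map((r) => r.runId);
  }

  async deleteInstancesByRunIds(runIds: string[]): Promise<number> {
    const ids = new Set(runIds);
    const before = this.instances.length;
    this.instances = this.instances.filter((i) => !ids.has(i.runId));
    return before - this.instances.length;
  }

  async deleteRunsByIds(runIds: string[]): Promise<number> {
    const ids = new Set(runIds);
    const before = this.runs.length;
    this.runs = this.runs.filter((r) => !ids.has(r.runId));
    return before - this.runs.length;
  }
}
```

Because each iteration only touches `batchSize` runs, no single database call has to cover the whole date range, which is what made the previous all-at-once approach time out.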