Optimize find and modify #476
Conversation
Assigned this first to the 0.10.0 milestone, like the original PR was, but I realised this needs to be in 1.0.0 since it has a breaking change.
@joeframbach I added you here in case you wanna have a look.
Very cool @lushc, one question/idea though about your
@loris yeah it's quite a rare scenario, so it's definitely less than ideal having the 2nd query that handles it called so often, even if the performance impact is low. I'll have to experiment with the implementation of a separate interval, although one issue that comes to mind is respecting a definition's lock lifetime (and so passing the "runs a one-time job after its lock expires" test). We'll have to iterate through each definition to work out the job's lock deadline, but if the interval is infrequent then there's the chance the job gets picked up later than it should have been. However, with it being an edge case maybe that's OK? Especially if this "unstuck" interval is configurable (which it'll need to be to pass the test) and documented. In the meantime, if you could test the performance of this branch to see how it fits your use case, that'd be very much appreciated 👍

@OmgImAlexis saw your comment in the other PR; would be interested to hear if you're still getting those test timeouts? I've tried to replicate and couldn't on MongoDB 2.6.10.
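The per-definition deadline iteration described above could look roughly like this. This is a hypothetical sketch: `lockLifetime`, the 10-minute default, and the query shape are assumptions for illustration, not Agenda's actual API.

```javascript
// A job counts as stuck once its lockedAt is older than now minus that
// definition's lock lifetime (so each definition gets its own deadline).
function lockDeadline(definition, now) {
  const lifetime = definition.lockLifetime || 10 * 60 * 1000; // assumed default: 10 minutes
  return new Date(now.getTime() - lifetime);
}

// An "unstuck" interval would need one clause per definition, because
// every definition can carry a different lifetime.
function stuckJobQuery(definitions, now) {
  return {
    $or: Object.keys(definitions).map(name => ({
      name,
      lockedAt: { $lte: lockDeadline(definitions[name], now) },
    })),
  };
}
```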
```diff
@@ -125,13 +123,9 @@ function handleLegacyCreateIndex(err, result, self, cb){
   // Looks like a mongo.version < 2.4.x
   err = null;
   self._collection.ensureIndex(
```
Going forward in mongodb, ensureIndex is deprecated in favor of createIndex. Is this an appropriate time to also make this change?
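For reference, the modern driver call is a straightforward swap. This is a hedged sketch: the collection handle, index keys, and index name below are illustrative, not this repo's exact code.

```javascript
// createIndex is the supported replacement for the deprecated ensureIndex
// in the Node.js MongoDB driver. `collection` is assumed to be a connected
// Collection handle.
async function createJobIndex(collection) {
  return collection.createIndex(
    { name: 1, nextRunAt: 1, lockedAt: 1 },
    { name: 'findAndLockNextJobIndex' }
  );
}
```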
Might a separate PR be better, to keep this one more focused?
Agree to keep it focused. IMO, the `ensureIndex`/`createIndex` call should be removed in favor of a simple instruction in the setup section of README.md
Follow ups: #490
`$setOnInsert` operators. This fixes the 'should not run if job is disabled' test sometimes failing due to the upsert exception causing the next update operation to set the 'disabled' property to false (because the job isn't persisted and thus the original attribute value is lost)
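The fix in this commit can be sketched as follows. This is not Agenda's exact code; the helper name and field defaults are assumptions based on the commit message.

```javascript
// Defaults go in $setOnInsert, so they are written only when the upsert
// actually inserts a new document; a later update of an existing job can
// no longer clobber a user-set `disabled: true` back to false.
function buildJobUpsert(name, attrs) {
  return {
    query: { name },
    update: {
      $set: attrs,
      $setOnInsert: { disabled: false, lastFinishedAt: new Date(0) },
    },
    options: { upsert: true },
  };
}
```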
…adding a 2nd query for stuck jobs. By explicitly setting the `lastFinishedAt` field (so it appears in all documents) and indexing it, we avoid recursive lock warnings caused by document scans checking whether the field exists
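The indexing idea in this commit message can be illustrated with a small normalizer. The exact defaults are assumptions drawn from the thread (epoch sentinel instead of a missing field).

```javascript
// Default lastFinishedAt (and friends) to the Unix epoch so the field
// exists in every document. An ordinary index on lastFinishedAt can then
// serve queries that would otherwise need an unindexable $exists check.
const EPOCH = new Date(0);
function normalizeJob(doc) {
  return Object.assign(
    { lockedAt: EPOCH, lastFinishedAt: EPOCH, disabled: false },
    doc
  );
}
```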
@lushc Any ETA on this PR and v1.1.0? Got a feeling this would help when used with Azure CosmosDB, which currently consumes quite a bit of RU (CosmosDB's billable unit for how much capacity is being used) since the query is quite complex.
…x usage. Ports over changes made in agenda#476, which in turn references agenda#300. Co-authored-by: will123195 <will123195@gmail.com>
I've revisited this on version 2.0.2 with a MongoDB 3.2 instance and from my tests it looks like these changes would now incur a performance penalty. I've attached the
Woohoo, thanks for checking it out!
Rebased & included the fixes in #300 (comment).
As per @will123195's original comment this is a breaking change (e.g. for Agendash), since an unlocked job now has its `lockedAt` field set to the Unix epoch rather than null, plus the field is always present in the document, so any `$exists: false` queries won't work. Likewise with the `disabled` field.

Overall, significant performance gains should be made by getting rid of the recursive lock warnings. With 2,000,000 jobs in the database, query time has gone from ~4000ms to ~400ms. Deviating from the original PR, a 2nd query is introduced to fix the failing test highlighted in my comment, but it also appears to fix #410, where unfinished jobs weren't being resumed. For the execution plan to use the index effectively the `$or` has to remain top-level, so the query duplication is unavoidable, but thankfully it's low cost (~50ms).
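The top-level `$or` constraint can be pictured with a hypothetical query builder. The predicates below are assumptions following the thread (epoch sentinel in place of `lockedAt: null`), not the PR's literal query.

```javascript
// With $or at the top level, the planner can run one indexed scan per
// branch and union the results; nesting the $or under other operators
// would force a collection scan instead.
const EPOCH = new Date(0);
function nextJobQuery(name, now, lockDeadline) {
  return {
    $or: [
      // branch 1: unlocked jobs that are due to run
      { name, lockedAt: EPOCH, nextRunAt: { $lte: now }, disabled: { $ne: true } },
      // branch 2: jobs whose lock has expired (covers the stuck-job case)
      { name, lockedAt: { $lte: lockDeadline }, disabled: { $ne: true } },
    ],
  };
}
```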