Enable a custom starting id #1634
Need to be able to set a custom starting ID for a collection - to prevent collisions when combining datasets, for example.
See mentioned issue. Also needs tests.
@@ -59,7 +65,7 @@ public function generate(DocumentManager $dm, $document)
         $key = $this->key ?: $dm->getDocumentCollection($className)->getName();

         $query = array('_id' => $key);
-        $newObj = array('$inc' => array('current_id' => 1));
+        $newObj = array('$inc' => array('current_id' => $this->startingId));
This doesn't change the starting value; it changes the increment. Setting startingId to 5 would create the sequence [5, 10, 15, 20, ...], not [5, 6, 7, 8, 9, ...] as one would expect.
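The difference can be reproduced without a database. The following is a minimal sketch, assuming a hypothetical helper named sequence() that stands in for repeated $inc calls on the counter document; it is not part of the patch.

```php
<?php
// Hypothetical in-memory stand-in for the counter document; no MongoDB involved.
function sequence(int $step, int $count, int $initial = 0): array
{
    $current = $initial;
    $ids = [];
    for ($i = 0; $i < $count; $i++) {
        $current += $step; // models one {'$inc': {'current_id': $step}} call
        $ids[] = $current;
    }
    return $ids;
}

// Patch as written: startingId (5) is used as the increment.
$asPatched = sequence(5, 4);         // [5, 10, 15, 20]

// Expected behaviour: the counter is seeded so the first id is 5, then +1 per call.
$asIntended = sequence(1, 5, 5 - 1); // [5, 6, 7, 8, 9]
```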
oops.
Had to look it up in the docs, but you can use $setOnInsert to set the initial value if the document was inserted:
$newObj = [
'$inc' => ['current_id' => 1],
'$setOnInsert' => ['current_id' => $this->startingId],
];
@alcaeus you legend. Will commit shortly. With tests.
Unable to use '$inc' and '$setOnInsert' together due to a known bug: https://jira.mongodb.org/browse/SERVER-10711
Results in error: Cannot update 'current_id' and 'current_id' at the same time
Workaround from here: https://stackoverflow.com/questions/41552405/mongodb-collection-update-initialize-a-document-with-default-values/41953190#41953190
_Tests to follow..._
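As a rough illustration of why the combined update is rejected, the sketch below models the check in plain PHP. The conflictingFields() function is hypothetical and deliberately simplified: it only compares top-level field names across operators and is not how the server implements the rule.

```php
<?php
// Simplified, hypothetical model of the conflict check: collect the fields
// each update operator touches and report any field claimed by more than one.
function conflictingFields(array $update): array
{
    $seen = [];
    $conflicts = [];
    foreach ($update as $operator => $fields) {
        foreach (array_keys($fields) as $field) {
            if (isset($seen[$field])) {
                $conflicts[$field] = true;
            }
            $seen[$field] = $operator;
        }
    }
    return array_keys($conflicts);
}

$startingId = 5; // example value, chosen for illustration

// The update document suggested earlier in the thread.
$update = [
    '$inc'         => ['current_id' => 1],
    '$setOnInsert' => ['current_id' => $startingId],
];

// Both operators target 'current_id', which matches the reported error.
$conflicts = conflictingFields($update); // ['current_id']
```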
$command['new'] = true;
/*
 * Unable to use '$inc' and '$setOnInsert' together due to known bug.
 * @see https://jira.mongodb.org/browse/SERVER-10711
Almost 4 years and counting. Yay! Workaround looks good though 👍
Only thing that isn't 100% is the atomicity - I assume the upsert is atomic, but this workaround certainly isn't. Not sure whether adding a collection lock would suck performance...
The upsert would be atomic, yes. With the workaround, I'm hoping that the combination of "how often will this feature be used" and "how often would an upsert have to happen" narrows it down enough for it not to be a problem.
I think we can preserve atomicity with the following two-step process:
// 1. Optimize for the common case: increment existing sequence
$command = [
'findAndModify' => $coll,
'query' => ['_id' => $key, 'current_id' => ['$exists' => true]],
'update' => ['$inc' => ['current_id' => $this->startingId]],
'upsert' => false,
'new' => true,
];
/* 2. If no result was returned, we upsert a new sequence. Don't bother to
 * incorporate {$exists: false} into the criteria, as that won't avoid
 * an exception during a possible race condition. */
$command = [
'findAndModify' => $coll,
'query' => ['_id' => $key],
'update' => ['$inc' => ['current_id' => $this->startingId]],
'upsert' => true,
'new' => true,
];
/* 3. If a duplicate key exception was thrown, we encountered a race condition
 * where another process created the sequence between our previous findAndModify
 * commands. In that case, we can ignore the exception and attempt the first,
 * increment approach one more time (perhaps throwing our own exception if that
 * then fails, indicating that someone is rapidly inserting and deleting
 * documents in the sequence table). */
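The control flow of the process above can be sketched without a database. This is an in-memory illustration only: InMemorySequences, DuplicateKeyException, and nextId() are hypothetical names, not Doctrine or driver APIs. The sketch also assumes that the common-case step increments by 1 and that only the creation step uses the starting ID.

```php
<?php
// Hypothetical stand-in for the duplicate key error a driver would raise.
final class DuplicateKeyException extends RuntimeException {}

// Minimal in-memory stand-in for the sequence collection (not a MongoDB API).
final class InMemorySequences
{
    /** @var array<string, int> */
    private array $docs = [];

    // Step 1: models findAndModify with upsert=false; null when nothing matched.
    public function incrementExisting(string $key): ?int
    {
        if (!isset($this->docs[$key])) {
            return null;
        }
        return ++$this->docs[$key];
    }

    // Step 2: models creating the sequence. Throwing when the key already
    // exists stands in for losing the race to another process.
    public function create(string $key, int $startingId): int
    {
        if (isset($this->docs[$key])) {
            throw new DuplicateKeyException("duplicate key: $key");
        }
        return $this->docs[$key] = $startingId;
    }
}

function nextId(InMemorySequences $store, string $key, int $startingId): int
{
    // 1. Common case: the sequence already exists.
    $id = $store->incrementExisting($key);
    if ($id !== null) {
        return $id;
    }
    try {
        // 2. No sequence yet: create it at the starting ID.
        return $store->create($key, $startingId);
    } catch (DuplicateKeyException $e) {
        // 3. Another process created it first: retry the increment once.
        $id = $store->incrementExisting($key);
        if ($id === null) {
            throw new RuntimeException("sequence '$key' vanished during retry", 0, $e);
        }
        return $id;
    }
}

$store = new InMemorySequences();
$ids = [];
for ($i = 0; $i < 4; $i++) {
    $ids[] = nextId($store, 'users', 5);
}
// $ids is now [5, 6, 7, 8]
```

A single-threaded script cannot produce the race itself, so the retry branch is only reachable here by calling create() on a key that already exists.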
Ok I've made some of the recommended changes, just need to find and catch the duplicate key exception...
... and write tests...
    'insert' => $coll,
    'documents' => array(array('_id' => $key, 'current_id' => $this->startingId))
);
$result = $db->command($command);
If we're going to perform an insert, this should use Collection::insert() for consistency instead of manually crafting an insert command (something the PHP library doesn't even do).
 * Results in error: Cannot update 'current_id' and 'current_id' at the same time
 */
$command = array(
    'findandmodify' => $coll,
I don't know offhand what case variations mongod allows, but the canonical name is findAndModify.
@stampycode could you take a look at the issues @jmikola and I mentioned? I'd like to wrap up ODM 1.2 soon and this is currently on the milestone.
@jmikola @alcaeus
Unfortunately, and for reasons I have not yet fathomed, the suggested changes do not have the desired effect:
I have no idea what is going on here.
@stampycode I took a look at the changes, looks like there was an error in the code @jmikola suggested. The first command of course has to increment the |
@alcaeus: Indeed, my example was incrementing by |
Need to be able to set a custom starting point for an auto-increment field - to prevent collisions when combining datasets, for example.