Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable a custom starting id #1634

Closed
wants to merge 4 commits into from
Closed

Conversation

stampycode
Copy link
Contributor

Need to be able to set a custom starting point for an auto-increment field - to prevent collisions when combining datasets, for example.

Need to be able to set a custom starting ID for a collection - to prevent collisions when combining datasets, for example.
Copy link
Member

@alcaeus alcaeus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See mentioned issue. Also needs tests.

@@ -59,7 +65,7 @@ public function generate(DocumentManager $dm, $document)
$key = $this->key ?: $dm->getDocumentCollection($className)->getName();

$query = array('_id' => $key);
$newObj = array('$inc' => array('current_id' => 1));
$newObj = array('$inc' => array('current_id' => $this->startingId));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't change the starting value, but changes the increment. Setting startingId to 5 would create a sequence of [5, 10, 15, 20, ...], not [5, 6, 7, 8, 9, ...] as one would expect.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops.

Copy link
Member

@alcaeus alcaeus Aug 22, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Had to look it up in the docs, but you can use $setOnInsert to set the initial value if the document was inserted:

$newObj = [
    '$inc' => ['current_id' => 1],
    '$setOnInsert' => ['current_id' => $this->startingId],
];

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alcaeus you legend. Will commit shortly. With tests.

@alcaeus alcaeus added this to the 1.2 milestone Aug 22, 2017
Unable to use '$inc' and '$setOnInsert' together due to known bug.
https://jira.mongodb.org/browse/SERVER-10711
Results in error: Cannot update 'current_id' and 'current_id' at the same time

Workaround from here:
https://stackoverflow.com/questions/41552405/mongodb-collection-update-initialize-a-document-with-default-values/41953190#41953190

_Tests to follow..._
$command['new'] = true;
/*
* Unable to use '$inc' and '$setOnInsert' together due to known bug.
* @see https://jira.mongodb.org/browse/SERVER-10711
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost 4 years and counting. Yay! Workaround looks good though 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only thing that isn't 100% is the atomicity - I assume the upsert is atomic, but this workaround certainly isn't. Not sure whether adding a collection lock would suck performance...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The upsert would be atomic, yes. With the workaround, I'm hoping that the combination of "how often will this feature be used" and "how often would an upsert have to happen" narrows it down enough for it not to be a problem.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can preserve atomicity with the following two-step process:

// 1. Optimize for the common case: increment existing sequence
$command = [
    'findAndModify' => $coll,
    'query' => ['_id' => $key, 'current_id' => ['$exists' => true]],
    'update' => ['$inc' => ['current_id' => $this->startingId]],
    'upsert' => false,
    'new' => true,
];

/* 2. If no result was returned, we upsert a new sequence. Don't bother to
 * incorporate {$exists: false} into the criteria as that doesn't won't avoid
 * an exception during a possible race condition. */
$command = [
    'findAndModify' => $coll,
    'query' => ['_id' => $key],
    'update' => ['$inc' => ['current_id' => $this->startingId]],
    'upsert' => true,
    'new' => true,
];

/* 3. If a duplicate key exception was thrown, we encountered a race condition
 * where another process created the sequence between our previous findAndModify
 * commands. In that case, we can ignore the exception and attempt the first,
 * increment approach one more time (perhaps throwing our exception if that then
 * fails -- indicating that someone is rapidly inserting and deleting documents
 * in the sequence table. */

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I've made some of the recommended changes, just need to find and catch the duplicate key exception...
... and write tests...

'insert' => $coll,
'documents' => array(array('_id' => $key, 'current_id' => $this->startingId))
);
$result = $db->command($command);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're going to perform an insert, this should use Collection::insert() for consistency instead of manually crafting an insert command (something the PHP library) doesn't even do).

* Results in error: Cannot update 'current_id' and 'current_id' at the same time
*/
$command = array(
'findandmodify' => $coll,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know offhand what case variations mongod allows, but the canonical name is findAndModify.

$command['new'] = true;
/*
* Unable to use '$inc' and '$setOnInsert' together due to known bug.
* @see https://jira.mongodb.org/browse/SERVER-10711
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can preserve atomicity with the following two-step process:

// 1. Optimize for the common case: increment existing sequence
$command = [
    'findAndModify' => $coll,
    'query' => ['_id' => $key, 'current_id' => ['$exists' => true]],
    'update' => ['$inc' => ['current_id' => $this->startingId]],
    'upsert' => false,
    'new' => true,
];

/* 2. If no result was returned, we upsert a new sequence. Don't bother to
 * incorporate {$exists: false} into the criteria as that doesn't won't avoid
 * an exception during a possible race condition. */
$command = [
    'findAndModify' => $coll,
    'query' => ['_id' => $key],
    'update' => ['$inc' => ['current_id' => $this->startingId]],
    'upsert' => true,
    'new' => true,
];

/* 3. If a duplicate key exception was thrown, we encountered a race condition
 * where another process created the sequence between our previous findAndModify
 * commands. In that case, we can ignore the exception and attempt the first,
 * increment approach one more time (perhaps throwing our exception if that then
 * fails -- indicating that someone is rapidly inserting and deleting documents
 * in the sequence table. */

@alcaeus
Copy link
Member

alcaeus commented Sep 27, 2017

@stampycode could you take a look at the issues @jmikola and I mentioned? I'd like to wrap up ODM 1.2 soon and this is currently on the milestone.

@jmikola jmikola self-assigned this Oct 10, 2017
@stampycode
Copy link
Contributor Author

@jmikola @alcaeus
So I've run into a stumbling block in my tests.
Using the code I originally suggested works ok:

  • set $startingId to 1000001
  • first element fails to find non-existent counter
  • creates counter with number 1000001
  • returns 1000001
  • second request finds counter, increments it
  • returns 1000002

Unfortunately, and for reasons I have not yet fathomed, the suggested changes do not have the desired effect:

  • set $startingId to 1000001
  • first element fails to find non-existent counter
  • creates counter with number 2000001
  • returns 2000001
  • second request finds counter, increments it
  • returns 3000001

I have no idea what is going on here.

@alcaeus alcaeus removed this from the 1.2 milestone Oct 22, 2017
@alcaeus
Copy link
Member

alcaeus commented Oct 22, 2017

@stampycode I took a look at the changes, looks like there was an error in the code @jmikola suggested. The first command of course has to increment the current_id field by one, not by the starting value. Only the second command that is supposed to upsert a record may increment it with the starting value. I've pushed a fix to #1661 and merge it once the build has finished. Thanks!

@alcaeus alcaeus closed this Oct 22, 2017
@jmikola
Copy link
Member

jmikola commented Oct 23, 2017

@alcaeus: Indeed, my example was incrementing by startingId instead of 1. @stampycode: Sorry for the misleading suggestion!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants