get_or_create should be atomic #478
Comments
It can't be atomic, both because of the syntax and because it determines whether the document existed prior to possibly being created. Semantically, this is different from an upsert: you might create a new instance but not save it to disk because this is part of a bigger operation. However, this does create a race condition, and other approaches might be more applicable to your use case. I have updated the docs to highlight this. |
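To make the race concrete, here is a minimal sketch of a get-then-create flow where two clients interleave. The `collection` list and the `get` / `create` helpers are hypothetical in-memory stand-ins, not MongoEngine's API; the interleaving is written out by hand so the outcome is deterministic.

```python
# In-memory stand-in for a collection (illustrative only, not MongoEngine).
collection = []


def get(name):
    """Return the first document with this name, or None if there is none."""
    return next((doc for doc in collection if doc["name"] == name), None)


def create(name):
    """Insert a new document unconditionally and return it."""
    doc = {"name": name}
    collection.append(doc)
    return doc


# Two clients interleave: both run the "get" step before either runs "create".
seen_by_a = get("ross")  # client A: not found
seen_by_b = get("ross")  # client B: not found either

if seen_by_a is None:
    create("ross")  # client A creates the document
if seen_by_b is None:
    create("ross")  # client B creates a duplicate

duplicates = [doc for doc in collection if doc["name"] == "ross"]
print(len(duplicates))  # prints 2
```

Because the lookup and the insert are two separate operations, nothing stops the second client from inserting after the first one has already done so.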
OK, this is really strange. I would never know from the documentation that the document is not really saved. For me, created implies saving. Anyway, this is what happens in Django. So if you are targeting semantic equivalence with Django, then it is really strange that it would not work like this. Anyway, |
Hi, Well, people wanted to do a fetch / create then save as part of their flow - so The semantics aren't the same as an upsert - in fact, doing an upsert could result in the loss of data, as you'd have to query against the whole document, otherwise an upsert would overwrite the existing documents. For that reason, I'm not reopening. Also, the same race condition exists in Django - see: https://github.com/django/django/blob/master/django/db/models/query.py#L432-465 There are two separate operations:
Cheers, Ross |
I know that Django's version also has a race condition, but this is because it is hard to do it right when you need to support multiple database backends. For MongoDB we could find a better solution. |
@mitar but there isn't a better solution for many scenarios, as semantically this is the only way to achieve what some people want. You can always use an Won't fix |
@mitar @rozza afaik there is not a race condition in Django if the fields you are using to query are unique=True: in case of an IntegrityError it rolls back and then (line 467) returns the result of a simple .get(). We've just faced a race condition using get_or_create and, to be honest, I don't know what the workaround could be... |
So unique=True stops the race condition? Not true: there is a catch for the exception, and the code handles that race condition if you fall into that category. I'm happy to add the error catch here, but it is still leaky - whether you get a duplicate index error back from the server depends on your write concerns. You should just use an upsert if you can - it's what they are there for; otherwise versioning and locking patterns can be used. There is a note on the method describing the race condition, so it is use at your own risk. |
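The "error catch" idea can be sketched roughly as follows. Everything here is an assumption for illustration: `UniqueCollection` is an in-memory stand-in whose lock plays the role of a server-side unique index, and `DuplicateKeyError` is a hypothetical exception class, not the driver's. The sketch also assumes the write is acknowledged, so the duplicate error actually reaches the client.

```python
import threading


class DuplicateKeyError(Exception):
    """Hypothetical stand-in for the server's unique-index violation error."""


class UniqueCollection:
    """In-memory collection with a unique index on 'name' (illustrative only)."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()  # plays the role of the index check

    def get(self, name):
        with self._lock:
            return self._docs.get(name)

    def insert(self, name, **fields):
        with self._lock:
            if name in self._docs:
                raise DuplicateKeyError(name)
            doc = {"name": name, **fields}
            self._docs[name] = doc
            return doc


def get_or_create(coll, name, **defaults):
    """Return (doc, created); a lost race falls back to a plain get."""
    doc = coll.get(name)
    if doc is not None:
        return doc, False
    try:
        return coll.insert(name, **defaults), True
    except DuplicateKeyError:
        # Another writer inserted between our get and our insert.
        return coll.get(name), False


coll = UniqueCollection()
results = []
results_lock = threading.Lock()


def worker():
    outcome = get_or_create(coll, "ross", role="maintainer")
    with results_lock:
        results.append(outcome)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

created_count = sum(1 for _, created in results if created)
print(created_count)  # prints 1
```

However the threads interleave, only one insert can succeed, so exactly one caller sees `created=True` and the rest get the existing document.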
Marked the method for deprecation - but thinking about it, we can do it better. The race condition doesn't exist where unique=True - exactly the same as Django (indexes that enforce uniqueness will stop multiple items being added). We should enforce a safe write and catch an index error. We could also code a rollback if created: add an extra query to match all items and delete all but the first ordered by object id (creation time) - those would be items created in a race - and then return the results of a get with created = False. |
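The rollback idea for the non-unique case might look something like the sketch below. It is one possible reading of the proposal, not an implementation from the project: the `collection` list, the integer `_id` counter (standing in for object ids that sort by creation time), and the `insert` / `resolve_duplicates` helpers are all hypothetical.

```python
import itertools

_next_id = itertools.count(1)  # stand-in for object ids ordered by creation time
collection = []                # in-memory stand-in with no unique index


def insert(name):
    """Insert unconditionally, as a racing get_or_create would."""
    doc = {"_id": next(_next_id), "name": name}
    collection.append(doc)
    return doc


def resolve_duplicates(name, mine):
    """Keep the oldest match and delete the rest; return (survivor, created)."""
    matches = sorted(
        (doc for doc in collection if doc["name"] == name),
        key=lambda doc: doc["_id"],
    )
    survivor = matches[0]
    for extra in matches[1:]:
        collection.remove(extra)  # roll back documents created in the race
    # Assumption: only the writer whose document survived reports created=True.
    return survivor, survivor["_id"] == mine["_id"]


# Two writers both missed on their get and both inserted.
doc_a = insert("ross")
doc_b = insert("ross")

result_b = resolve_duplicates("ross", doc_b)
result_a = resolve_duplicates("ross", doc_a)

print(len(collection))  # prints 1
```

Here `result_b` is the older document with `created=False`, while `result_a` reports `created=True` because its own insert is the one that survived.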
Added new ticket: MongoEngine#35 |
Current implementation of get_or_create is not atomic. It should probably use upsert:
http://stackoverflow.com/questions/6703330/how-do-i-do-a-get-or-create-in-pymongo-python-mongodb
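As a rough illustration of the upsert approach, the sketch below models a single atomic "insert only if absent" step. The `Collection` class and its `upsert` method are hypothetical in-memory stand-ins; with PyMongo, the closest real call would be `update_one` with `upsert=True` and a `$setOnInsert` update, which applies the defaults only when a new document is inserted.

```python
import threading


class Collection:
    """In-memory stand-in for a collection supporting an atomic upsert."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()

    def upsert(self, name, set_on_insert):
        """Atomically return (doc, created), inserting defaults only if absent."""
        with self._lock:
            doc = self._docs.get(name)
            created = doc is None
            if created:
                doc = {"name": name, **set_on_insert}
                self._docs[name] = doc
            # An existing document is returned untouched, never overwritten.
            return doc, created


coll = Collection()
first, created_first = coll.upsert("ross", {"role": "maintainer"})
second, created_second = coll.upsert("ross", {"role": "guest"})

print(created_first, created_second, second["role"])  # prints True False maintainer
```

Applying the defaults only on insert is what keeps the second call from overwriting the existing document's fields.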