Remove get_or_create #300

Closed
rozza opened this issue Apr 30, 2013 · 8 comments

@rozza
Contributor

rozza commented Apr 30, 2013

It's bad and has been marked for removal.

@honzajavorek

I do not understand how I can easily do the following:

  • I have a model with some unique fields.
  • I have an object with some data that is not saved yet.
  • Now I need an operation that sends the document to Mongo and either updates or inserts it, depending on whether it would violate the uniqueness of those fields. The operation should be easy to write and to automate. I found this, but it is in no way easy to write when I have a dictionary of data or an already instantiated object. Something like Object.save(upsert=True) does not work for an object without _id.

Maybe the docs could be more educational and helpful about this. I do this in almost every app I make, because there is always some synchronization where some of the incoming data is identical to what the previous cron job delivered and some is new. I usually get the data as already-filled model instances from scrapers or data importers. I really do not see a nice way to deal with this right now.

@honzajavorek

I wrote and use something like this in my models:

    def sync(self):
        """Insert or update, depending on unique fields."""
        cls = self.__class__  # model class

        # prepare data as in save
        self.clean()

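        # collect every field that takes part in a uniqueness constraint
        unique_fields = {}
        for key in self._data.keys():
            field = getattr(cls, key)  # field object
            if field.unique:
                unique_fields[key] = getattr(self, key)  # value
            # unique_with may be a single field name or a list of names
            unique_with = field.unique_with or []
            if isinstance(unique_with, str):
                unique_with = [unique_with]
            for other_key in unique_with:
                unique_fields[other_key] = getattr(self, other_key)  # value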

        # select the object by its unique fields
        query = cls.objects(**unique_fields)

        # prepare data to set (the id must not be part of the update)
        data = {}
        for key, value in self._data.items():
            data['set__' + key] = value
        data.pop('set__id', None)  # no error if the model has no 'id' key

        # perform upsert
        query.update_one(upsert=True, **data)

Not perfect, but close to what I really tend to need in my applications. Maybe I am missing something, but I can't find any nice alternative to this provided directly by MongoEngine.

@Karmak23
Contributor

Karmak23 commented Sep 5, 2013

Your code seems fine. My workaround is quite different though:

  • I query on the unique fields first;
  • if nothing exists, I create a document with only the unique fields defined;
  • if that fails with a duplicate key error (the race condition has happened), I redo the query;
  • then (and only then) I update the non-unique fields on the document, whether it was just created or not.

My code is lengthy and far from optimal (not friendly to the database). I have duplicated it in most documents because I wasn't aware of field.unique.

Your code would be a very nice addition to a common mixin for any document class. The part that finds the unique fields could easily be cached in a class attribute, or even determined once and for all in a subclassed __init__().
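
The four steps listed above might look roughly like the following. This is only a sketch: `FakeCollection`, `DuplicateKeyError`, and `sync` are hypothetical in-memory stand-ins written for illustration, not MongoEngine or PyMongo APIs, and the example assumes a single unique index.

```python
class DuplicateKeyError(Exception):
    """Hypothetical stand-in for the driver's duplicate-key error."""


class FakeCollection:
    """Tiny in-memory stand-in for a collection with one unique index."""

    def __init__(self, unique_fields):
        self.unique_fields = tuple(unique_fields)
        self.docs = {}  # unique key tuple -> document dict

    def _key(self, values):
        return tuple(values[name] for name in self.unique_fields)

    def find_one(self, **unique_values):
        return self.docs.get(self._key(unique_values))

    def insert(self, **unique_values):
        key = self._key(unique_values)
        if key in self.docs:
            raise DuplicateKeyError(key)
        self.docs[key] = dict(unique_values)
        return self.docs[key]


def sync(collection, unique_values, other_values):
    """Query, create if missing, re-query on a lost race, then update."""
    doc = collection.find_one(**unique_values)
    created = False
    if doc is None:
        try:
            doc = collection.insert(**unique_values)
            created = True
        except DuplicateKeyError:
            # Another writer inserted the same key between query and insert.
            doc = collection.find_one(**unique_values)
    doc.update(other_values)  # only now touch the non-unique fields
    return doc, created


articles = FakeCollection(unique_fields=["url"])
first, created_first = sync(
    articles, {"url": "http://example.com/a"}, {"title": "A"})
second, created_second = sync(
    articles, {"url": "http://example.com/a"}, {"title": "A (updated)"})
```

The second call finds the existing document and only updates its title, so the stand-in collection still holds a single entry. Against a real database the same shape would need the actual duplicate-key exception and a real unique index.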

regards,

@MRigal
Member

MRigal commented Nov 28, 2014

I agree that it is not easy.

Another argument for something other than upsert is that update with upsert doesn't seem to properly handle fields that have a different 'db_field'. At least, I can't use it, whereas the save method works...

@MRigal
Member

MRigal commented Jun 12, 2015

Merged and removed

@MRigal MRigal closed this as completed Jun 12, 2015
@thedrow
Contributor

thedrow commented Jun 13, 2015

Wait, that makes our queryset incompatible with Django's queryset.
That's fine if it's really required, but I don't understand why it's so bad to have get_or_create().

@MRigal
Member

MRigal commented Jun 14, 2015

I don't know what "compatible with Django's queryset" really means. We have a couple of other things that are different, like "scalar"/"list_values".

It had been deprecated for quite a long time (2 years), and the problem is that it is simply a lie. With Mongo, you just can't have a get_or_create method: unless you add retries and work with special write_concern settings, you can never be completely sure that the object you create with such a method won't be created a second time if the same function is called simultaneously from somewhere else. It is just inherent to the DB type.
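
A minimal sketch of the race described above. It uses a plain Python list as a hypothetical stand-in for a collection without a unique index (no MongoDB or MongoEngine is involved), and the interleaving of the two callers is written out by hand so the result is deterministic.

```python
def get(store, name):
    """The 'get' half: return the matching document, or None if absent."""
    return next((doc for doc in store if doc["name"] == name), None)


def create(store, name):
    """The 'create' half: append a new document unconditionally."""
    doc = {"name": name}
    store.append(doc)
    return doc


store = []  # hypothetical collection with no unique index

# Two callers run a get-then-create sequence for "bob" at the same time.
# Both perform the 'get' step before either reaches 'create'.
seen_by_a = get(store, "bob")
seen_by_b = get(store, "bob")

if seen_by_a is None:
    create(store, "bob")
if seen_by_b is None:
    create(store, "bob")

copies = sum(1 for doc in store if doc["name"] == "bob")
print(copies)  # prints 2: each caller believed it was the first
```

Because the check and the insert are two separate steps, nothing stops both callers from passing the check. A unique index would turn the second insert into a duplicate key error, which the caller then has to catch and handle with a retry.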

@thedrow
Contributor

thedrow commented Jun 15, 2015

I mean that MongoEngine's queryset should be compatible with Django's queryset where possible, because that makes integration with the rest of the Django ecosystem just work without much effort on our part.
I guess what you're saying makes sense since, unlike an RDBMS, Mongo isn't transactional.

jtushman added a commit to monarch-org/monarch that referenced this issue Oct 13, 2017
mongoengine has deprecated/removed the get_or_create function
(MongoEngine/mongoengine#300)

So this should fix it