Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ReferenceField lazy/auto dereference #42

Closed
lafrech opened this issue Aug 29, 2016 · 8 comments
Closed

ReferenceField lazy/auto dereference #42

lafrech opened this issue Aug 29, 2016 · 8 comments

Comments

@lafrech
Copy link
Collaborator

lafrech commented Aug 29, 2016

Currently, a ReferenceField deserializes as (for instance) a PyMongoReference, and must be dereferenced manually using fetch().

from pymongo import MongoClient
from umongo import Instance, Document, fields

db = MongoClient().test
instance = Instance(db)

@instance.register
class Ref(Document):
    a = fields.IntField()

@instance.register
class Doc(Document):
    ref = fields.ReferenceField(Ref)

ref = Ref(a=12)
ref.commit()

# Here's my object
print(ref)
# <object Document __main__.Ref({'a': 12, '_id': ObjectId('57c4ae2cf66737156a9ebfff')})>

# Let's reference it in another object
doc = Doc(ref=ref)

# I can't get it directly from here
print(doc.ref)
# <object umongo.frameworks.pymongo.PyMongoReference(document=Ref, pk=ObjectId('57c4ae2cf66737156a9ebfff'))>

# I need to call fetch manually
print(doc.ref.fetch())
# <object Document __main__.Ref({'a': 12, '_id': ObjectId('57c4ae2cf66737156a9ebfff')})>

This is not as practical as automatic dereferencing (systematic or lazy as in MongoEngine).

Is this a design choice (simplicity) or something that you'd like to see improved?

In practice, I could call fetch() every time I'm accessing a ref field, but there are cases where you need to pass an object to another method you can't modify (imported library) and you really want doc.ref to be the referenced document, not a PyMongoReference.

print(doc.ref.a)
AttributeError: 'PyMongoReference' object has no attribute 'a'

And I can't even dereference every reference manually to get a "clean" object:

# This won't work
doc.ref = doc.ref.fetch()
print(doc.ref)
# <object umongo.frameworks.pymongo.PyMongoReference(document=Ref, pk=ObjectId('57c4ae2cf66737156a9ebfff'))>

I'm kinda stuck. Is there a way around this? Am I missing something?

@touilleMan
Copy link
Member

Is this a design choice (simplicity) or something that you'd like to see improved?

It's a design choice !
Given umongo support asynchronous drivers, retrieving data from the database must be explicit in order to yield on this operation:

# using asyncio
async def get_ref_a(doc):
    referenced = await doc.ref.fetch()
    return referenced.a

Beside, I like the idea of explicitness in fetching data from the database (this way you easily know how much you hit the database in a single request, which is pretty nice for optimizing)

Regarding your particular usecase, you can add a property on the document

@instance.register
class MyDoc(Document):
    ref = fields.ReferenceField(ReferencedDocument)
   @property
   def ref_fetched(self):
       return self.ref.fetch()

referenced = ReferencedDocument(a=42)
my_doc = MyDoc(ref=referenced)
assert my_doc.ref_fetched.a == 42

Another more generic possibility would be to subclass PyMongoReference with a PyMongoReferenceAutoFetch (basically what's in your PR #43 ) then use it in a subclass of PyMongoBuilder (see https://github.com/Scille/umongo/blob/master/umongo/frameworks/pymongo.py#L312)

To do that you need to register your new builder with register_builder (see https://github.com/Scille/umongo/blob/master/umongo/frameworks/__init__.py#L105) before creating the Instance registering your Document.

I've just pushed a patch (3c679c1) to make the overloading of a builder possible.

From this, maybe we could provide your PyMongoReferenceAutoFetch and PyMongoBuilderAutoFetch as convenient bonus that users need to manually register to use them.

@lafrech
Copy link
Collaborator Author

lafrech commented Aug 31, 2016

Thanks for this. I'll have a look at it.

@lafrech
Copy link
Collaborator Author

lafrech commented Oct 5, 2016

I tried the PyMongoReference subclass approach as you explained.

To get it to work, I also had to create a PyMongoInstanceAutoFetch.

AFAIU, register_builder is useless here because in PyMongoInstance,the buillder is call explicitly, so find_from_db is not called. It would be called from an Instance subclass, but not a LazyLoaderInstance.

    self.BUILDER_CLS = import_module('umongo.frameworks.pymongo').PyMongoBuilder

@lafrech
Copy link
Collaborator Author

lafrech commented Oct 5, 2016

(Also, I'm not sure you need to overload __init__ in PyMongoReference, since self._document = None is done in Reference already.)

@lafrech
Copy link
Collaborator Author

lafrech commented Oct 5, 2016

For the record, here's what I have in my application code (I put the whole thing in application code and I don't depend on the branch in #43):

instance.py content:

# -*- coding: utf-8 -*-
"""Custom PyMongoInstance"""

from umongo.fields import ReferenceField
from umongo.frameworks.pymongo import PyMongoReference, PyMongoBuilder
from umongo.instance import LazyLoaderInstance


class PyMongoReferenceAutoFetch(PyMongoReference):

    def __init__(self, document_cls, pk):
        # Attribute getter/setter/deleter are overridden to provide
        # direct access to referenced document. We must use __dict__
        # directly to get around those.
        self.__dict__['document_cls'] = document_cls
        self.__dict__['pk'] = pk
        self.__dict__['_document'] = None

    def __getitem__(self, name):
        return self.fetch()[name]

    def __setitem__(self, name, value):
        self.fetch()[name] = value

    def __delitem__(self, name):
        del self.fetch()[name]

    def __getattr__(self, name):
        if name in self.__dict__:
            return self.__dict__[name]
        return getattr(self.fetch(), name)

    def __setattr__(self, name, value):
        if name in self.__dict__:
            self.__dict__[name] = value
        else:
            setattr(self.fetch(), name)

    def __delattr__(self, name):
        if name in self.__dict__:
            del self.__dict__[name]
        else:
            delattr(self.fetch(), name)


class PyMongoBuilderAutoFetch(PyMongoBuilder):

    def _patch_field(self, field):
        super()._patch_field(field)

        if isinstance(field, ReferenceField):
            field.reference_cls = PyMongoReferenceAutoFetch


class PyMongoInstanceAutoFetch(LazyLoaderInstance):
    """
    :class:`umongo.instance.LazyLoaderInstance` implementation for pymongo
    """
    def __init__(self, *args, **kwargs):
        self.BUILDER_CLS = PyMongoBuilderAutoFetch
        super().__init__(*args, **kwargs)

I still think this PyMongoReferenceAutoFetch implementation is not satisfying. It could be improved by reserving __attributes (double underscore) rather than using __dict__, or maybe done differently by using __get__, but I'm not familiar enough with it so I didn't find out any better way yet.

@touilleMan
Copy link
Member

PyMongoInstanceAutoFetch is not usefull if you use the register_builder:

umongo.frameworks.register_builder(PyMongoBuilderAutoFetch)  # do this somewhere before creating instance

db = MongoClient().test
instance = umongo.Instance(db)

Beside, I'm thinking of adding this as a bonus for the pymongo driver... stay tunned !

PS: you do not need the header # -*- coding: utf-8 -*- anymore in Python3 given the file is considered as UTF8 by default.

@lafrech
Copy link
Collaborator Author

lafrech commented Dec 7, 2016

Thank you for your feedback on this.

I just realized that:

  • I don't use this so widely in my code. There is currently only one use case. We use an oauth library that expect the objects we provide it to behave "normally", so it won't call fetch() and we can't pre-call fetch(). But this could be arranged with a property just like you suggest. This will work only on attribute access, not attribute set, but...

  • The implementation I proposed in ReferenceField lazy dereference #43 allows to modify fields in the referenced Document seamlessly but it is tricky because when committing the document, the referenced Document does not get committed and the modification is lost.

Overall, I'm leaning to your side. Rather make things explicit than try to hide complexity and create dangerous corner cases.

Anyone wanting to pick this up, keep this in mind. __setattr_ may work (as shown above, it has an error, see #43 for the fix) but the referenced Document won't be committed. You'd rather not try to add another layer of complexity to autocommit referenced Documents, so at some point, you're gonna need to fetch that Document anyway.

Beside, I'm thinking of adding this as a bonus for the pymongo driver... stay tunned !

OK maybe your implementation will be better and more useful than mine. I'm leaving this open, then.

PS: you do not need the header # -- coding: utf-8 -- anymore in Python3 given the file is considered as UTF8 by default.

Thanks for the tip.

@lafrech
Copy link
Collaborator Author

lafrech commented Jan 13, 2019

Let's close this for now and keep fetch manual.

The conversation is interesting. It provides a way to do auto-dereferencing and what issues to expect.

@lafrech lafrech closed this as completed Jan 13, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants