Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ORM mode: Add support for arbitrary class instances #562

Merged
merged 10 commits into from Jun 6, 2019
Merged

Conversation

samuelcolvin
Copy link
Member

My take on ORM support. @tiangolo please let me know if this would work for you?

This is just a WIP, will need more tests and docs but I want your opinion on it before I got any further.

Related issue number

replace #520

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable
  • HISTORY.rst has been updated
    • if this is the first change since a release, please add a new section
    • include the issue number or this pull request number #<number>
    • include your github username @<whomever>

@samuelcolvin samuelcolvin changed the title Support ORM objects to parse_obj Support ORM objects in parse_obj May 29, 2019
@codecov
Copy link

codecov bot commented May 29, 2019

Codecov Report

Merging #562 into master will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff          @@
##           master   #562   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files          15     15           
  Lines        2508   2531   +23     
  Branches      501    503    +2     
=====================================
+ Hits         2508   2531   +23

pydantic/main.py Outdated

@classmethod
def _decompose_class(cls, obj: Any) -> 'DictStrAny':
return obj.__dict__
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't want to (or can't) use .__dict__, having this as a separate function means you can override _decompose_class and build a dict some other way.

pydantic/main.py Outdated
exc = TypeError(f'{cls.__name__} expected dict not {type(obj).__name__}')
raise ValidationError([ErrorWrapper(exc, loc='__obj__')]) from e
else:
obj = cls._decompose_class(obj)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to catch attribute error at least here.

@tiangolo
Copy link
Member

Thank you!

Let me pull it and play with it.

@tiangolo
Copy link
Member

Thanks for your effort on this.

I have a couple of observations.

An ORM with __getattr__

Let's say we have this class, that simulates how some popular ORMs work:

class FooGetAttr:
    def __getattr__(self, key: str):
        if key == 'foo':
            return 'Foo'
        else:
            raise AttributeError

And let's say we have this test (to simulate the use case):

def test_object_with_getattr():
    foo_valid = FooGetAttr()
    model = Model.parse_obj(foo_valid)
    assert model.foo == 'Foo'
    assert model.bar == 1
    assert model.dict(skip_defaults=True) == {'foo': 'Foo'}

To override _decompose_class for this to work it would be needed to duplicate a lot of the code in validate_model:

from pydantic.utils import ForwardRef
from pydantic.errors import ConfigError, DictError, ExtraError, MissingError
from pydantic.main import _missing


class Model(BaseModel):
    foo: str
    bar: int = 1

    @classmethod
    def _decompose_class(model, obj: Any):
        config = model.__config__
        values = {}
        for name, field in model.__fields__.items():
            if type(field.type_) == ForwardRef:
                raise ConfigError(
                    f'field "{field.name}" not yet prepared so type is still a ForwardRef, '
                    f'you might need to call {model.__class__.__name__}.update_forward_refs().'
                )
            value = getattr(obj, field.alias, _missing)
            if value is _missing and config.allow_population_by_alias and field.alt_alias:
                value = getattr(obj, field.name, _missing)
            if value is not _missing:
                values[name] = value
        return values

And all that _decompose_class would be code that users would have to write themselves.

And it would be needed for each of their models. Or in a main base model and inherit all their models from it.


One of the main objectives I have is to reduce the code complexity and duplication needed in users' code (in my case, let's say FastAPI users).

One option for me would be to create and expose a custom class inheriting from BaseModel, including a similar method override. But I would prefer to avoid that.

I would like FastAPI to be fully compatible with standard Pydantic, so that users can re-use Pydantic models for other scenarios, and bring their Pydantic models from other places to FastAPI, without having to use a customized version.

Simplifying the problem

Now, simplifying the problem to the minimum that I would like to support in FastAPI:

What I really need is to be able to use getattr based on some models .__fields__. And the call chain actually starts at Field.validate (not at Model.__init__ or Model.validate_model).

One option I can think of that doesn't involve changing a Model's __init__ signature (as in #520) is:

Adding a parameter to Field.validate to be used to extract an object's data. This parameter would be None by default and the normal mechanisms would be used to extract data from the object (dict.get).

Then this function that would be injected into Field.validate, if provided, would have to be injected in all the Field validators (because they call Field.validate themselves), _validate_tuple, _validate_mapping, _validate_singleton, _apply_validators, etc.

And then, in _apply_validators, use that function (if provided) with the model class to extract each of the fields from the object.

And then I could have that special function to be injected as part of FastAPI and do it automatically for response models.

This approach would have the benefit of not modifying the signature of BaseModel.__init__.

The drawbacks

The drawbacks of this approach are:

  • The number of changes would be quite higher I expect. And the points/methods it would touch would be more, as the custom function would need to be passed/injected by all the methods that call each other.
  • This would make it work only as a response model in FastAPI, but if an external user wanted to use it somewhere else to convert an ORM object to Pydantic (e.g. in a queue worker) it wouldn't be possible.
  • It wouldn't work (not even in FastAPI) if the user wants to convert the ORM model using Pydantic and return it directly. Or use Pydantic to serialize it and send it to a queue worker, etc.

Alternative model

Other option I can think of (although I think it wouldn't be very clean) is for Pydantic to provide another sub-class, e.g ORMBaseModel, including a _decompose_class using getattrs.

Drawbacks

  • Possible code duplication.
  • Probably increased maintenance burden.
  • Splitting Pydantic users' codebase in ORM and non-ORM models, that might lead to confusion (maybe).

What do you think?

What are the main drawbacks you see in #520 ? Maybe I can think of alternatives to avoid them.

@samuelcolvin
Copy link
Member Author

First of all, you don't need all that logic, because the dict returned by _decompose_class can have extra unused fields. Similarly it, can be missing elements, that will be dealt with later.

So the code above, just becomes

    @classmethod
    def _decompose_class(model, obj: Any):
        return model.__fields__

We could alternatively use something like:

    @classmethod
    def _decompose_class(model, obj: Any):
        return {name: getattr(obj, name) for name in dir(obj)}

But that would have other problems.

We could also look for a custom method on a class like __get_pydantic_fields__() and use that if it exists to avoid users having to customise _decompose_class.

I'll look throught the rest of what you've said when I get a chance.

@tiangolo
Copy link
Member

Hmm, I'm probably not understanding how to use _decompose_class.

I tried replacing the example from above with both suggestions and testing it (with the same test in the comment above) and it didn't work (I got testing errors).

The single line version I can get working, not breaking skip_defaults functionality, is:

    @classmethod
    def _decompose_class(model, obj: Any):
        return {field.name: getattr(obj, field.alias, _missing) for field in model.__fields__.values() if getattr(obj, field.alias, _missing) is not _missing}

But that's quite long. And it doesn't have into account allow_population_by_alias.


Hmm, so __get_pydantic_fields__ would be something users would add to an ORM class?


I'll look throught the rest of what you've said when I get a chance.

Thank you.

@samuelcolvin
Copy link
Member Author

I think you're confused here _decompose_class is implemented on the model, it takes an instance of the ORM class, not a pydantic model.

Here's an example of it working with sqlalchemy:

from typing import List
from devtools import debug
from sqlalchemy import Column, Integer, String
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.ext.declarative import declarative_base
from pydantic import BaseModel, constr

Base = declarative_base()

class Company(Base):
    __tablename__ = 'companies'
    id = Column(Integer, primary_key=True, nullable=False)
    public_key = Column(String(20), index=True, nullable=False, unique=True)
    name = Column(String(63), unique=True)
    domains = Column(ARRAY(String(255)))

co = Company(id=123, public_key='foobar', name='Testing', domains=['example.com', 'foobar.com'])
debug(co, co.__dict__)

class CompanyModel(BaseModel):
    id: int
    public_key: constr(max_length=20)
    name: constr(max_length=63)
    domains: List[constr(max_length=255)]

debug(CompanyModel.parse_obj(co))

outputs:

image

Which is surely what you want?

@samuelcolvin
Copy link
Member Author

_decompose_class is just converting a class (eg. an ORM item instance) into a dict which pydantic can process as it normally would. Aliases will work fine.

@tiangolo
Copy link
Member

tiangolo commented May 29, 2019

Ah, yes, I think I understand. The first parameter is the same Pydantic class itself (being a @classmethod) and the second parameter is the arbitrary object.

Generating a dict with whatever can be extracted from the object is more or less what I'm already doing in the jsonable_encoder in FastAPI, in fact, I have a bit of hacky/ugly code to omit those _sa attributes.

The jsonable_encoder, used that way, has created several problems:

  • Relationships are not fetched populated, even if declared as attributes in the Pydantic model.
  • Dynamic and hybrid properties don't work.
  • Some types of IDs (I think?) didn't work either.

In general, things that were fetched/generated by SQLAlchemy on attribute access, with __getattr__


But yes, models with normal columns worked OK, more or less.

@samuelcolvin
Copy link
Member Author

samuelcolvin commented May 29, 2019

Ok, what about this?

I'm now wrapping classes in GetterDict, which will mean converting an ORM instance to a dict is no-longer eager and @property etc. should work.

Unfortunately there's no obvious way to check if something is an instance of some class to process or a simple variable like int, so we have to protect this feature behind orm_mode. There might be a better way of doing this.

Does this work better?

If so can you live with orm_mode config switch?

@tiangolo
Copy link
Member

Hacky McHack face.
😂


This looks great, thank you!

I'm finishing a big uncommitted mess/work in progress in FastAPI, so I haven't been able to install it and test it locally with the changes in FastAPI I had created for it. But as soon as I finish this mess (commit/discard) I'll switch to that branch and test it.

@samuelcolvin samuelcolvin changed the title Support ORM objects in parse_obj ORM mode: Add support for arbitrary class instances May 30, 2019
@tiangolo
Copy link
Member

I was finally able to test this properly for the last 2 hours or so.

With this, I can solve all the use cases I can imagine. Including SQLAlchemy hybrid properties and others.

That's awesome. Thank you! 🎉 🚀 🍰

And yes, I think orm_mode is fine.

@samuelcolvin
Copy link
Member Author

great news!

I'm going to release v0.27 since it's got cython on it, I don't want to add any more complication to it.

I'll then work on completing this and we an add it to the next release.

@tiangolo
Copy link
Member

Cython! I'm pretty excited about that too. 🚀


That's awesome! Thank you for this. I know you're not into ORMs, but thanks for putting the effort into finding a way to solve it 🎉 🍰

I think there's quite a bunch of people that will be excited about it 😄

@samuelcolvin
Copy link
Member Author

@tiangolo please review and let me know if you're happy with this.

@tiangolo
Copy link
Member

tiangolo commented Jun 4, 2019

Great! I'm on it.

Copy link
Member

@tiangolo tiangolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is awesome! 🌮 🎉

A couple of minor things that I'm also OK without them.

Apart from that, LG(reat)TM.

pydantic/utils.py Outdated Show resolved Hide resolved
tests/test_edge_cases.py Outdated Show resolved Hide resolved
Copy link
Member

@tiangolo tiangolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! 🎉

LGTM

@samuelcolvin samuelcolvin merged commit 3dfae21 into master Jun 6, 2019
@samuelcolvin samuelcolvin deleted the orm-mode branch June 6, 2019 10:29
@tiangolo
Copy link
Member

tiangolo commented Jun 6, 2019

🎉 This is great! Thanks!

I know several people that will celebrate it. 🎉 🍰


@classmethod
def _decompose_class(cls: Type['Model'], obj: Any) -> GetterDict:
return GetterDict(obj)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@samuelcolvin this entire PR is extremely clever, really awesome job and solution even though you don't like ORMs!

alexdrydew pushed a commit to alexdrydew/pydantic that referenced this pull request Dec 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants