Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ORM mode: Add support for arbitrary class instances #562

Merged
merged 10 commits into from Jun 6, 2019

Conversation

Projects
None yet
2 participants
@samuelcolvin
Copy link
Owner

commented May 29, 2019

My take on ORM support. @tiangolo please let me know if this would work for you?

This is just a WIP, will need more tests and docs but I want your opinion on it before I got any further.

Related issue number

replace #520

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable
  • HISTORY.rst has been updated
    • if this is the first change since a release, please add a new section
    • include the issue number or this pull request number #<number>
    • include your github username @<whomever>

@samuelcolvin samuelcolvin force-pushed the orm-mode branch from 6e06673 to 0bd417c May 29, 2019

@samuelcolvin samuelcolvin changed the title Support ORM objects to parse_obj Support ORM objects in parse_obj May 29, 2019

@codecov

This comment has been minimized.

Copy link

commented May 29, 2019

Codecov Report

Merging #562 into master will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff          @@
##           master   #562   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files          15     15           
  Lines        2508   2531   +23     
  Branches      501    503    +2     
=====================================
+ Hits         2508   2531   +23

@samuelcolvin samuelcolvin referenced this pull request May 29, 2019

Closed

Add support for arbitrary class instances #520

4 of 4 tasks complete

@classmethod
def _decompose_class(cls, obj: Any) -> 'DictStrAny':
return obj.__dict__

This comment has been minimized.

Copy link
@samuelcolvin

samuelcolvin May 29, 2019

Author Owner

If you don't want to (or can't) use .__dict__, having this as a separate function means you can override _decompose_class and build a dict some other way.

exc = TypeError(f'{cls.__name__} expected dict not {type(obj).__name__}')
raise ValidationError([ErrorWrapper(exc, loc='__obj__')]) from e
else:
obj = cls._decompose_class(obj)

This comment has been minimized.

Copy link
@samuelcolvin

samuelcolvin May 29, 2019

Author Owner

need to catch attribute error at least here.

@tiangolo

This comment has been minimized.

Copy link
Collaborator

commented May 29, 2019

Thank you!

Let me pull it and play with it.

@tiangolo

This comment has been minimized.

Copy link
Collaborator

commented May 29, 2019

Thanks for your effort on this.

I have a couple of observations.

An ORM with __getattr__

Let's say we have this class, that simulates how some popular ORMs work:

class FooGetAttr:
    def __getattr__(self, key: str):
        if key == 'foo':
            return 'Foo'
        else:
            raise AttributeError

And let's say we have this test (to simulate the use case):

def test_object_with_getattr():
    foo_valid = FooGetAttr()
    model = Model.parse_obj(foo_valid)
    assert model.foo == 'Foo'
    assert model.bar == 1
    assert model.dict(skip_defaults=True) == {'foo': 'Foo'}

To override _decompose_class for this to work it would be needed to duplicate a lot of the code in validate_model:

from pydantic.utils import ForwardRef
from pydantic.errors import ConfigError, DictError, ExtraError, MissingError
from pydantic.main import _missing


class Model(BaseModel):
    foo: str
    bar: int = 1

    @classmethod
    def _decompose_class(model, obj: Any):
        config = model.__config__
        values = {}
        for name, field in model.__fields__.items():
            if type(field.type_) == ForwardRef:
                raise ConfigError(
                    f'field "{field.name}" not yet prepared so type is still a ForwardRef, '
                    f'you might need to call {model.__class__.__name__}.update_forward_refs().'
                )
            value = getattr(obj, field.alias, _missing)
            if value is _missing and config.allow_population_by_alias and field.alt_alias:
                value = getattr(obj, field.name, _missing)
            if value is not _missing:
                values[name] = value
        return values

And all that _decompose_class would be code that users would have to write themselves.

And it would be needed for each of their models. Or in a main base model and inherit all their models from it.


One of the main objectives I have is to reduce the code complexity and duplication needed in users' code (in my case, let's say FastAPI users).

One option for me would be to create and expose a custom class inheriting from BaseModel, including a similar method override. But I would prefer to avoid that.

I would like FastAPI to be fully compatible with standard Pydantic, so that users can re-use Pydantic models for other scenarios, and bring their Pydantic models from other places to FastAPI, without having to use a customized version.

Simplifying the problem

Now, simplifying the problem to the minimum that I would like to support in FastAPI:

What I really need is to be able to use getattr based on some models .__fields__. And the call chain actually starts at Field.validate (not at Model.__init__ or Model.validate_model).

One option I can think of that doesn't involve changing a Model's __init__ signature (as in #520) is:

Adding a parameter to Field.validate to be used to extract an object's data. This parameter would be None by default and the normal mechanisms would be used to extract data from the object (dict.get).

Then this function that would be injected into Field.validate, if provided, would have to be injected in all the Field validators (because they call Field.validate themselves), _validate_tuple, _validate_mapping, _validate_singleton, _apply_validators, etc.

And then, in _apply_validators, use that function (if provided) with the model class to extract each of the fields from the object.

And then I could have that special function to be injected as part of FastAPI and do it automatically for response models.

This approach would have the benefit of not modifying the signature of BaseModel.__init__.

The drawbacks

The drawbacks of this approach are:

  • The number of changes would be quite higher I expect. And the points/methods it would touch would be more, as the custom function would need to be passed/injected by all the methods that call each other.
  • This would make it work only as a response model in FastAPI, but if an external user wanted to use it somewhere else to convert an ORM object to Pydantic (e.g. in a queue worker) it wouldn't be possible.
  • It wouldn't work (not even in FastAPI) if the user wants to convert the ORM model using Pydantic and return it directly. Or use Pydantic to serialize it and send it to a queue worker, etc.

Alternative model

Other option I can think of (although I think it wouldn't be very clean) is for Pydantic to provide another sub-class, e.g ORMBaseModel, including a _decompose_class using getattrs.

Drawbacks

  • Possible code duplication.
  • Probably increased maintenance burden.
  • Splitting Pydantic users' codebase in ORM and non-ORM models, that might lead to confusion (maybe).

What do you think?

What are the main drawbacks you see in #520 ? Maybe I can think of alternatives to avoid them.

@samuelcolvin

This comment has been minimized.

Copy link
Owner Author

commented May 29, 2019

First of all, you don't need all that logic, because the dict returned by _decompose_class can have extra unused fields. Similarly it, can be missing elements, that will be dealt with later.

So the code above, just becomes

    @classmethod
    def _decompose_class(model, obj: Any):
        return model.__fields__

We could alternatively use something like:

    @classmethod
    def _decompose_class(model, obj: Any):
        return {name: getattr(obj, name) for name in dir(obj)}

But that would have other problems.

We could also look for a custom method on a class like __get_pydantic_fields__() and use that if it exists to avoid users having to customise _decompose_class.

I'll look throught the rest of what you've said when I get a chance.

@tiangolo

This comment has been minimized.

Copy link
Collaborator

commented May 29, 2019

Hmm, I'm probably not understanding how to use _decompose_class.

I tried replacing the example from above with both suggestions and testing it (with the same test in the comment above) and it didn't work (I got testing errors).

The single line version I can get working, not breaking skip_defaults functionality, is:

    @classmethod
    def _decompose_class(model, obj: Any):
        return {field.name: getattr(obj, field.alias, _missing) for field in model.__fields__.values() if getattr(obj, field.alias, _missing) is not _missing}

But that's quite long. And it doesn't have into account allow_population_by_alias.


Hmm, so __get_pydantic_fields__ would be something users would add to an ORM class?


I'll look throught the rest of what you've said when I get a chance.

Thank you.

@samuelcolvin

This comment has been minimized.

Copy link
Owner Author

commented May 29, 2019

I think you're confused here _decompose_class is implemented on the model, it takes an instance of the ORM class, not a pydantic model.

Here's an example of it working with sqlalchemy:

from typing import List
from devtools import debug
from sqlalchemy import Column, Integer, String
from sqlalchemy.dialects.postgresql import ARRAY
from sqlalchemy.ext.declarative import declarative_base
from pydantic import BaseModel, constr

Base = declarative_base()

class Company(Base):
    __tablename__ = 'companies'
    id = Column(Integer, primary_key=True, nullable=False)
    public_key = Column(String(20), index=True, nullable=False, unique=True)
    name = Column(String(63), unique=True)
    domains = Column(ARRAY(String(255)))

co = Company(id=123, public_key='foobar', name='Testing', domains=['example.com', 'foobar.com'])
debug(co, co.__dict__)

class CompanyModel(BaseModel):
    id: int
    public_key: constr(max_length=20)
    name: constr(max_length=63)
    domains: List[constr(max_length=255)]

debug(CompanyModel.parse_obj(co))

outputs:

image

Which is surely what you want?

@samuelcolvin

This comment has been minimized.

Copy link
Owner Author

commented May 29, 2019

_decompose_class is just converting a class (eg. an ORM item instance) into a dict which pydantic can process as it normally would. Aliases will work fine.

@tiangolo

This comment has been minimized.

Copy link
Collaborator

commented May 29, 2019

Ah, yes, I think I understand. The first parameter is the same Pydantic class itself (being a @classmethod) and the second parameter is the arbitrary object.

Generating a dict with whatever can be extracted from the object is more or less what I'm already doing in the jsonable_encoder in FastAPI, in fact, I have a bit of hacky/ugly code to omit those _sa attributes.

The jsonable_encoder, used that way, has created several problems:

  • Relationships are not fetched populated, even if declared as attributes in the Pydantic model.
  • Dynamic and hybrid properties don't work.
  • Some types of IDs (I think?) didn't work either.

In general, things that were fetched/generated by SQLAlchemy on attribute access, with __getattr__


But yes, models with normal columns worked OK, more or less.

@samuelcolvin samuelcolvin force-pushed the orm-mode branch from cec77a4 to 28665dc May 29, 2019

@samuelcolvin

This comment has been minimized.

Copy link
Owner Author

commented May 29, 2019

Ok, what about this?

I'm now wrapping classes in GetterDict, which will mean converting an ORM instance to a dict is no-longer eager and @property etc. should work.

Unfortunately there's no obvious way to check if something is an instance of some class to process or a simple variable like int, so we have to protect this feature behind orm_mode. There might be a better way of doing this.

Does this work better?

If so can you live with orm_mode config switch?

@tiangolo

This comment has been minimized.

Copy link
Collaborator

commented May 29, 2019

Hacky McHack face.
😂


This looks great, thank you!

I'm finishing a big uncommitted mess/work in progress in FastAPI, so I haven't been able to install it and test it locally with the changes in FastAPI I had created for it. But as soon as I finish this mess (commit/discard) I'll switch to that branch and test it.

@samuelcolvin samuelcolvin changed the title Support ORM objects in parse_obj ORM mode: Add support for arbitrary class instances May 30, 2019

@tiangolo

This comment has been minimized.

Copy link
Collaborator

commented May 30, 2019

I was finally able to test this properly for the last 2 hours or so.

With this, I can solve all the use cases I can imagine. Including SQLAlchemy hybrid properties and others.

That's awesome. Thank you! 🎉 🚀 🍰

And yes, I think orm_mode is fine.

@samuelcolvin

This comment has been minimized.

Copy link
Owner Author

commented May 30, 2019

great news!

I'm going to release v0.27 since it's got cython on it, I don't want to add any more complication to it.

I'll then work on completing this and we an add it to the next release.

@tiangolo

This comment has been minimized.

Copy link
Collaborator

commented May 30, 2019

Cython! I'm pretty excited about that too. 🚀


That's awesome! Thank you for this. I know you're not into ORMs, but thanks for putting the effort into finding a way to solve it 🎉 🍰

I think there's quite a bunch of people that will be excited about it 😄

samuelcolvin added some commits Jun 4, 2019

@samuelcolvin samuelcolvin force-pushed the orm-mode branch from bc3af20 to 773bbb6 Jun 4, 2019

@samuelcolvin

This comment has been minimized.

Copy link
Owner Author

commented Jun 4, 2019

@tiangolo please review and let me know if you're happy with this.

@tiangolo

This comment has been minimized.

Copy link
Collaborator

commented Jun 4, 2019

Great! I'm on it.

@tiangolo
Copy link
Collaborator

left a comment

This is awesome! 🌮 🎉

A couple of minor things that I'm also OK without them.

Apart from that, LG(reat)TM.

Show resolved Hide resolved pydantic/utils.py Outdated
Show resolved Hide resolved tests/test_edge_cases.py Outdated

samuelcolvin added some commits Jun 5, 2019

@tiangolo
Copy link
Collaborator

left a comment

Awesome! 🎉

LGTM

@samuelcolvin samuelcolvin merged commit 3dfae21 into master Jun 6, 2019

7 of 10 checks passed

Header rules No header rules processed
Details
Pages changed All files already uploaded
Details
Redirect rules No redirect rules processed
Details
Mixed content No mixed content detected
Details
codecov/project 100% (+0%) compared to 6d5c48e
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details
deploy/netlify Deploy preview ready!
Details
pyup.io/safety-ci No dependencies with known security vulnerabilities.
Details
samuelcolvin.pydantic Build #20190606.5 succeeded
Details

@samuelcolvin samuelcolvin deleted the orm-mode branch Jun 6, 2019

@tiangolo

This comment has been minimized.

Copy link
Collaborator

commented Jun 6, 2019

🎉 This is great! Thanks!

I know several people that will celebrate it. 🎉 🍰

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.