Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Tracking for setting attributes #389

Merged
merged 18 commits into from Feb 13, 2019

Conversation

Projects
None yet
2 participants
@dgasmith
Copy link
Contributor

dgasmith commented Feb 10, 2019

Change Summary

Tracking when default arguments are override and allows a skip_defaults kwarg in the dict method.

A problem that I have not yet solved is that a BaseModel goes through the dict_validator and is then passed back into BaseModel.validate after being cast to a dict, this does not allow the __fields_set__ to come through at the moment and recursive models do not work. Any suggestions here?

make benchmark-pydantic
Before:           pydantic best=29.726μs/iter avg=30.439μs/iter stdev=0.563μs/iter
After:            pydantic best=29.061μs/iter avg=29.695μs/iter stdev=0.473μs/iter

Related issue number

Discussion in #378.

Checklist

  • Unit tests for the changes exist
  • Tests pass on CI and coverage remains at 100%
  • Documentation reflects the changes where applicable
  • HISTORY.rst has been updated
    • if this is the first change since a release, please add a new section
    • include the issue number or this pull request number #<number>
    • include your github username @<whomever>

dgasmith added some commits Feb 10, 2019

@samuelcolvin

This comment has been minimized.

Copy link
Owner

samuelcolvin commented Feb 10, 2019

Single quotes please everywhere, also you'll need to respect black. Use make format.

@dgasmith

This comment has been minimized.

Copy link
Contributor Author

dgasmith commented Feb 11, 2019

Of course, I was holding off on that area until we had an idea of what to do with the recursive copy operation. Easiest thing would be to write a custom copy validator for BaseModels rather than relying on the standard dict validator. However, that solution isn't as elegant as the current code.

@samuelcolvin

This comment has been minimized.

Copy link
Owner

samuelcolvin commented Feb 11, 2019

Oh I see. Maybe related to #265?

Do you know how to proceed or do you have a specific question?

@dgasmith

This comment has been minimized.

Copy link
Contributor Author

dgasmith commented Feb 11, 2019

Similar to #265. If you are fine with having a custom validator for BaseModels rather than the dict_validator we could solve both the issue here and #265.

@samuelcolvin

This comment has been minimized.

Copy link
Owner

samuelcolvin commented Feb 11, 2019

great

dgasmith added some commits Feb 11, 2019

@dgasmith

This comment has been minimized.

Copy link
Contributor Author

dgasmith commented Feb 11, 2019

#265 was deeper than I expected. I think that can be solved however, but I will leave it for another PR. This is now ready to review.

@codecov

This comment has been minimized.

Copy link

codecov bot commented Feb 11, 2019

Codecov Report

Merging #389 into master will not change coverage.
The diff coverage is 100%.

@@          Coverage Diff          @@
##           master   #389   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files          14     14           
  Lines        2031   2059   +28     
  Branches      409    419   +10     
=====================================
+ Hits         2031   2059   +28
@samuelcolvin
Copy link
Owner

samuelcolvin left a comment

Are you sure about the benchmark numbers? They sound to good to be true.


def __init__(self, **data: Any) -> None:
if TYPE_CHECKING: # pragma: no cover
self.__values__: Dict[str, Any] = {}
self.__setstate__(self._process_values(data))
self.__fields_set__: Set[str] = set()
fields_set = data.pop("__fields_set__", set(data.keys()))

This comment has been minimized.

@samuelcolvin

samuelcolvin Feb 12, 2019

Owner

Strict rule in pydantic that you can;t pass anything other than values to the class like this.

Users should be able to create a model with MyModel(**whatever) without having to think about what's in whatever.

Remove this and either call __setstate__ directly, or set model.__fields_set__ = thing directly.

Show resolved Hide resolved pydantic/main.py Outdated
Show resolved Hide resolved pydantic/main.py Outdated
@@ -324,7 +340,8 @@ def construct(cls, **values: Any) -> 'BaseModel':
Chances are you don't want to use this method directly.
"""
m = cls.__new__(cls)
m.__setstate__(values)
fields_set = values.pop("__fields_set__", None)

This comment has been minimized.

@samuelcolvin

samuelcolvin Feb 12, 2019

Owner

Single quotes always please unless the string contains single quotes.

But again, can we avoid doing values.pop().

This comment has been minimized.

@dgasmith

dgasmith Feb 12, 2019

Author Contributor

I think I need to add the ability to pass in some sort of state from the calling instance. Can I change construct to take in a fields_set or should I add a new function like construct_with_fields or similar?

This comment has been minimized.

@samuelcolvin

samuelcolvin Feb 12, 2019

Owner

construct can take more arguments, just __init__ that should remain pure.

@@ -342,15 +359,13 @@ def copy(
"""
if include is None and exclude is None and update is None:
# skip constructing values if no arguments are passed
v = self.__values__
v = self.__values__.copy()

This comment has been minimized.

@samuelcolvin

samuelcolvin Feb 12, 2019

Owner

do we need this copy?

This comment has been minimized.

@dgasmith

dgasmith Feb 12, 2019

Author Contributor

No, not if we remove the explicit add of __field_set__ below.

self, include: Set[str] = None, exclude: Optional[Set[str]] = set(), skip_defaults: bool = False
) -> Set[str]:
if skip_defaults:
keys = self.__fields_set__.copy()

This comment has been minimized.

@samuelcolvin

samuelcolvin Feb 12, 2019

Owner

i doubt we need copy() here.

This comment has been minimized.

@dgasmith

dgasmith Feb 12, 2019

Author Contributor

We do as we modify the set in-place later for speed.

This comment has been minimized.

@samuelcolvin
if skip_defaults:
keys = self.__fields_set__.copy()
else:
keys = set(self.__values__.keys())

This comment has been minimized.

@samuelcolvin

samuelcolvin Feb 12, 2019

Owner

I doubt we need set() here.

This comment has been minimized.

@dgasmith

dgasmith Feb 12, 2019

Author Contributor

If we do not convert it to a set (there is no copy for dict keys), we rely on the fact that dict keys are special sets which return LHS references if modified in place. Seems a bit ugly, when we can explicitly copy the data here and move on with things.

This comment has been minimized.

@samuelcolvin

samuelcolvin Feb 12, 2019

Owner

ok, that's fine.

keys = set(self.__values__.keys())

if include:
keys &= set(include)

This comment has been minimized.

@samuelcolvin

samuelcolvin Feb 12, 2019

Owner

I doubt we need set() here.

This comment has been minimized.

@dgasmith

dgasmith Feb 12, 2019

Author Contributor

I was erring on the side of safety, not everyone uses mypy. But happy to remove it.

This comment has been minimized.

@samuelcolvin

samuelcolvin Feb 12, 2019

Owner

I think this one really can be removed. If someone passes something weird here they should get an error.

yield k, self._get_value(v, by_alias=by_alias)
yield k, self._get_value(v, by_alias=by_alias, skip_defaults=skip_defaults)

def _calculate_keys(

This comment has been minimized.

@samuelcolvin

samuelcolvin Feb 12, 2019

Owner

I think we can probably do this in a more efficient way, eg. if include=None, exclude=set() and skip_defaults=False (all the defaults), then we just take all keys.

Can't we take care of that case but just returning None or something here and not checking key containment?

This comment has been minimized.

@dgasmith

dgasmith Feb 12, 2019

Author Contributor

Sure, I can work on optimizing that out.

This comment has been minimized.

@dgasmith

dgasmith Feb 12, 2019

Author Contributor

On this note, the include/exclude defaults are not consistent (copy for example has them both defaulting to None). I can clean this up as well.

This comment has been minimized.

@samuelcolvin
Show resolved Hide resolved HISTORY.rst

samuelcolvin and others added some commits Feb 12, 2019

Update pydantic/main.py
Co-Authored-By: dgasmith <dgasmith@icloud.com>
Update pydantic/main.py
Co-Authored-By: dgasmith <dgasmith@icloud.com>
Update HISTORY.rst
Co-Authored-By: dgasmith <dgasmith@icloud.com>
@dgasmith

This comment has been minimized.

Copy link
Contributor Author

dgasmith commented Feb 12, 2019

Timings:

defaultfields  BENCHMARK_REPEATS=20 make benchmark-pydantic
python benchmarks/run.py pydantic-only
                                pydantic time=0.940s, success=48.45%
...
                                pydantic best=0.886s, avg=0.911s, stdev=0.017s

                                pydantic best=7.387μs/iter avg=7.592μs/iter stdev=0.138μs/iter
master  BENCHMARK_REPEATS=20 make benchmark-pydantic
python benchmarks/run.py pydantic-only
                                pydantic time=0.919s, success=48.45%
...
                                pydantic best=0.888s, avg=0.914s, stdev=0.020s

                                pydantic best=7.397μs/iter avg=7.620μs/iter stdev=0.165μs/iter

Not entirely sure why. Could depend heavily on benchmark construction and a bit of extra set manipulation over raw lists.

@samuelcolvin

This comment has been minimized.

Copy link
Owner

samuelcolvin commented Feb 12, 2019

Tried it myself and I get a ~5% time increase with this PR. Not ideal.

To put this in proportion, this increase is roughly the same as the increase which was initially introduced by the typing PR #373, I spent a couple of hours fighting with it to mostly eliminate that performance deterioration. I'm therefore somewhat reticent about introducing that slowdown here.

Get the PR ready and I'll see what I can do.

@samuelcolvin

This comment has been minimized.

Copy link
Owner

samuelcolvin commented on pydantic/main.py in 493dbb6 Feb 12, 2019

better to add

if TYPE_CHECKING:
    ...
    SetOrKeys = Union[Set[str], KeysView[str]]

At the beginning.

But I'm not sure, if return self.__values__.keys() below is the best approach: wouldn't you bet better to return None and not check containment at all if the result is None?

@dgasmith

This comment has been minimized.

Copy link
Contributor Author

dgasmith commented Feb 12, 2019

Odd on timings, those were 3.6. For 3.7 I see:

this branch               pydantic best=6.871μs/iter avg=7.165μs/iter stdev=0.169μs/iter
master                    pydantic best=6.841μs/iter avg=7.238μs/iter stdev=0.170μs/iter

Agreed on the keys comment, patched.

dgasmith and others added some commits Feb 12, 2019

@samuelcolvin

This comment has been minimized.

Copy link
Owner

samuelcolvin commented Feb 13, 2019

For me this is now ready to be merged. Please check my changes and let me know if you disagree with anything or have any questions.

We might need to update history to state that contruct's signature has changed.

@dgasmith

This comment has been minimized.

Copy link
Contributor Author

dgasmith commented Feb 13, 2019

Looks good! I was hesitant to change constructor, but this is much better and is extensible if there are other slots needed in the future. Thanks for working through this and the great feedback.

@samuelcolvin samuelcolvin merged commit 96e3e74 into samuelcolvin:master Feb 13, 2019

3 checks passed

codecov/project 100% (+0%) compared to baade9a
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
deploy/netlify Deploy preview ready!
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.