-
-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Add model signature generation #1034
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add model signature generation #1034
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1034 +/- ##
=======================================
Coverage 99.94% 99.94%
=======================================
Files 21 21
Lines 3699 3699
Branches 728 728
=======================================
Hits 3697 3697
Misses 1 1
Partials 1 1 Continue to review full report at Codecov.
|
AttributeError: attribute '__code__' of 'cython_function_or_method' objects is not writable Possible there's other attrs that cannot be set in cython, but can't check myself yet
pydantic/main.py
Outdated
| orig_posonlycount = orig_code.co_posonlyargcount if PY38 else 0 | ||
| orig_argnames = orig_code.co_varnames[: orig_code.co_argcount + orig_code.co_kwonlyargcount + orig_posonlycount] | ||
| orig_argnames_set = set(orig_argnames) | ||
| fields_names = tuple([name for name in fields if name not in orig_argnames_set]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this might not maintain order of orig_argnames I believe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn’t do anything with orig_argnames, but excluding them from fields. This is actually done to maintain their order.
tests/test_main.py
Outdated
| @@ -1022,3 +1023,72 @@ class NewModel(DerivedModel, something=2): | |||
| something = 1 | |||
|
|
|||
| assert NewModel.something == 2 | |||
|
|
|||
|
|
|||
| def test_init_signature(capsys): | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new test file please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
tests/test_main.py
Outdated
| assert param.kind is kind | ||
| assert param.default == default | ||
|
|
||
| assert Model.__init__.__name__ == '__init__' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we add
assert help(Model.__init__) == """
...
"""Or equivalent to provide an easy to read check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ve done in first commit, but because of different formatting, tests on CI failed. There were inconsistency with spaces between arguments. Probably we can fix it by comparing strings without spaces
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or different tests for different versions of python.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
pydantic/main.py
Outdated
| orig_init.__globals__, | ||
| name=orig_init.__name__, | ||
| ) | ||
| fake_init.__doc__ = new_init.__doc__ = orig_init.__doc__ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be a good idea to add a helpful docstring to the real __init__ which by default appears in help(MyModel.__init__)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s not bad idea for sure. What should be written there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, something like:
Create a new model by parsing and validating input data from keyword arguments, a
ValidationErrorwill be raised if the input data cannot be parsed to for a valid model.
Or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also adding info about original signature to docstring on generation
pydantic/main.py
Outdated
| orig_code = orig_init.__code__ | ||
| # cannot modify cls.__init__ directly, | ||
| # as it shared between children classes, so we copy it | ||
| cls.__init__ = new_init = FunctionType( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could we move from here on into a separate function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, good idea. Where do you think is the best place for it? main, utils?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't really mind, either typing or utils I guess.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
Moved tests to separate file Moved __init__ creation to utils
Co-Authored-By: Samuel Colvin <samcolvin@gmail.com>
pydantic/utils.py
Outdated
|
|
||
| doc = orig_init.__doc__ or '' | ||
| real_signature = inspect.signature(orig_init) | ||
| new_doc = f'Signature is generated based on model fields. Real signature:\n {real_signature}\n' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure adding the real signature here is very helpful.
I think we should just use orig_init.__doc__? Perhaps with ...\n(signature auto generated from model fields) appended or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
tests/test_init_signature.py
Outdated
| sig = signature(Model.__init__) | ||
| assert sig != signature(BaseModel.__init__) | ||
|
|
||
| params = sig.parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this would be much easier to read and debug with something like
assert {n: str(p) for n, p in s.parameters.items()} = {
...
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, thats' right. But I'd went with
assert [str(p) for p in sig.parameters.values()] == [
'self',
'id: int = 1',
'bar=2',
'baz: Any',
"name: str = 'John Doe'",
'**data',
]to maintain the order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
tests/test_init_signature.py
Outdated
| assert m.baz == 'Ok!' | ||
| assert m.name == 'John Doe' | ||
|
|
||
| sig = signature(MyModel.__init__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done!
|
|
pydantic/utils.py
Outdated
| orig_code, orig_init.__globals__, closure=orig_init.__closure__, name=orig_init.__name__ | ||
| ) | ||
| else: | ||
| orig_init = getattr(current_init, '__origin_init__', current_init) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this fork not work if not compiled? Just wondering why we can't use the same logic in both cases (it's totally fine if we have to, I'm just trying to understand)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess it would work, but it also will add redundant call to a stack which may cause confusion on step into model creation while debugging and will be also a little overhead. Current behavior just will lead python right into original init and won’t add any redundant calls, so I decided to stick with it, when possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see where you are coming from.
Possible counter-point: this also means that there could be bugs that show up in the compiled version and not the un-compiled version, or vice versa. I think it would be extremely annoying to figure out what was going wrong in a reported issue if it dependended on the compiled status. Obviously that's always a possibility, but the chance seems considerably higher given an explicit logic fork on that condition. Do you still think it makes sense to keep the fork given that consideration?
@samuelcolvin I'd also be interested to get your opinion here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Personally I don't see a problem here, but this is surely needs to be considered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My problem here is not just about different routes, but also about different tracebacks:
Running the following code:
from datetime import datetime
from typing import List
from pydantic import BaseModel
class User(BaseModel):
id: int
name = 'John Doe'
signup_ts: datetime = None
friends: List[int] = []
user = User()With pydantic not compiled I get:
Traceback (most recent call last):
File "test.py", line 14, in <module>
user = User()
File "/home/samuel/code/pydantic/pydantic/main.py", line 293, in __init__
raise validation_error
pydantic.error_wrappers.ValidationError: 1 validation error for User
id
field required (type=value_error.missing)
But when compiled I get:
Traceback (most recent call last):
File "test.py", line 14, in <module>
user = User()
File "pydantic/utils.py", line 169, in pydantic.utils.generate_typed_init.__init__
orig_init(*args, **kwargs)
File "pydantic/main.py", line 293, in pydantic.main.BaseModel.__init__
raise validation_error
pydantic.error_wrappers.ValidationError: 1 validation error for User
id
field required (type=value_error.missing)
I think it's important these two tracebacks look the same and I think the traceback should not have the extra pydantic.utils.generate_typed_init.__init__... line.
I see three possible solutions to this:
- Find a way to re arrange this function to avoid this extra step. maybe not possible.
- Find another work around for this problem, e.g.:
- just modify the docstring to detail the arguments
- implement our own custom
signature()method that provides a sensible signature for a model, without changing the actual model code
- Manually customise the traceback so it's always the same, and perhaps even more simplified, e.g. just
Traceback (most recent call last):
File "test.py", line 14, in <module>
user = User()
pydantic.error_wrappers.ValidationError: 1 validation error for User
id
field required (type=value_error.missing)
or something.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should look here for reasons why in cpython we cannot achieve same traceback as in python:
https://stackoverflow.com/questions/59072875/how-to-clone-function-in-cython-i-getting-systemerror-unknown-opcode/59087135#59087135
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, i see that, I tried this myself I got something to work see 5230383 or the init-signature-sandbox branch. But I think that approach has other problems.
I think the best solution is our own utility function to generate the signature for an init function. That inspect.signature() would be the same as now, but pydantic.inspect.signature() (or whatever) would give the new improved signature. Would that work for the original use case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get how exactly pydantic.inspect.signature() would help us solve copying problem.
Also, it seems like your approach would break code like this:
class MyModelsBase(BaseModel):
def __init__(self, **data):
super().__init__(self, **data)
print(f'Initialized {self} with {data}')
class MyModel(MyModelsBase):
a: int
MyModel(a=a) # no output, as on MyModel creation '__init__' was not in namespace and was replacedThere was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2. Find another work around for this problem, e.g.:
- just modify the docstring to detail the arguments
- implement our own custom
signature()method that provides a sensible signature for a model, without changing the actual model code
Even if we modify docstring to detail the arguments, or implement our own argument, they still will be modified on the same __init__ of a base model, so every declared model will rewrite it and eventually it'll end up with the docstring and signature of the last model created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also thought about declaring __annotations__, __kwdefaults__, __wrapped__ and __signature__ as python property so they will give signatures of current models, but that doesn't seem possible too, because WhateverModel.__init__ would still be bound to BaseModel and I don't see any way how these properties could know what model's signature to give on call
pydantic/utils.py
Outdated
| orig_othernames = orig_code.co_varnames[orig_allargscount:] | ||
|
|
||
| orig_argnames_set = set(orig_argnames) # just for fast lookups | ||
| valid_fields_args = tuple([name for name in fields if name not in orig_argnames_set and name.isidentifier()]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this generate the right signature if a field has an alias and doesn't have allow_population_by_field_name=True?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, aliases are not considered at all.
Can you please give me example of such model and what signature should look like?
And thanks for that question, I found a bug in current implementation: models with fields declared in namespace would have this in signature:
__init__(__pydantic_self__, *, a=FieldInfo(default=1, alias='A', extra={}), **data: Any) -> NoneUpd. Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
class A(BaseModel):
a: int = Field(alias='b')It looks like your generated signature will be (ignoring any annotations hints):
__init__(__pydantic_self__, a)but if you call __init__(a=1) you'll get an error. I would think the signature *should * be __init__(__pydantic_self__, b) in this case. This is what I meant by aliases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, note that if you are using the type annotation from the field, the generated signature will actually also be wrong for any type that is not idempotent. In particular, at least currently, if you have a SecretStr field, you have to pass a str to the __init__ function and not a SecretStr. I opened an issue to fix this for SecretStr, but there are other types for which this is also an issue.
(Admittedly, this is also a problem in the PyCharm plugin and, in strict mode, the mypy plugin too.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, now I see. Aliases should be considered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As for SecretStr and other custom types: I don't see any problems here since these annotations will be available only at runtime and won't bother type-checkers. Until some code checks incoming types at runtime and don't know how to deal with pydantic, I don't think having pydantic specific types in signature would be a problem. For example, FastAPI understands such annotations just fine:
class GetUserFilters(BaseModel):
name: Optional[constr(strip_whitespace=True, max_length=70)]
email: Optional[SecretStr]
@app.get('/users')
def get_users(params=Depends(GetUserFilters)):
...Upd.
Having thoughts about this, maybe these types could have special attribute for actual type of expected value, that would be added to the field. This also probably could help PyCharm plugin to reveal correct types of expected values.
I'll open separate issue where we can discuss that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree they won't bother type checkers, but it still feels bad for it to be generated "wrong". I think there is a tradeoff here between it faithfully representing the valid signature of the method (i.e., including a type annotation of Any or similar to allow for parsing), and providing useful information.
In general, I would prefer to err on the side of "correctness" (e.g., generating an __init__ signature with Any as the annotation), but from a practical perspective I'm okay with it being occasionally wrong in a minor way like this if it is substantially more useful (and preferred in practice).
pydantic/utils.py
Outdated
| orig_code, orig_init.__globals__, closure=orig_init.__closure__, name=orig_init.__name__ | ||
| ) | ||
| else: | ||
| orig_init = getattr(current_init, '__origin_init__', current_init) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My problem here is not just about different routes, but also about different tracebacks:
Running the following code:
from datetime import datetime
from typing import List
from pydantic import BaseModel
class User(BaseModel):
id: int
name = 'John Doe'
signup_ts: datetime = None
friends: List[int] = []
user = User()With pydantic not compiled I get:
Traceback (most recent call last):
File "test.py", line 14, in <module>
user = User()
File "/home/samuel/code/pydantic/pydantic/main.py", line 293, in __init__
raise validation_error
pydantic.error_wrappers.ValidationError: 1 validation error for User
id
field required (type=value_error.missing)
But when compiled I get:
Traceback (most recent call last):
File "test.py", line 14, in <module>
user = User()
File "pydantic/utils.py", line 169, in pydantic.utils.generate_typed_init.__init__
orig_init(*args, **kwargs)
File "pydantic/main.py", line 293, in pydantic.main.BaseModel.__init__
raise validation_error
pydantic.error_wrappers.ValidationError: 1 validation error for User
id
field required (type=value_error.missing)
I think it's important these two tracebacks look the same and I think the traceback should not have the extra pydantic.utils.generate_typed_init.__init__... line.
I see three possible solutions to this:
- Find a way to re arrange this function to avoid this extra step. maybe not possible.
- Find another work around for this problem, e.g.:
- just modify the docstring to detail the arguments
- implement our own custom
signature()method that provides a sensible signature for a model, without changing the actual model code
- Manually customise the traceback so it's always the same, and perhaps even more simplified, e.g. just
Traceback (most recent call last):
File "test.py", line 14, in <module>
user = User()
pydantic.error_wrappers.ValidationError: 1 validation error for User
id
field required (type=value_error.missing)
or something.
What do you think?
Co-Authored-By: Samuel Colvin <samcolvin@gmail.com>
Co-Authored-By: Samuel Colvin <samcolvin@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really looking forward to this - people have been requesting pydantic support in Hypothesis, and this PR will mean we get it for free 🎉
Co-Authored-By: Samuel Colvin <samcolvin@gmail.com>
docs/usage/models.md
Outdated
| To be included in signature, field alias or name must be a valid python identifier. *pydantic* prefer aliases over names, but may use field name if alias is not suitable. | ||
| If any field's alias and name are both invalid identifiers, `**data` argument will be added. | ||
|
|
||
| Also, `**data` argument is always present in signature if `Config.extra` is `Extra.allow` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe should also mention this here?
| Also, `**data` argument is always present in signature if `Config.extra` is `Extra.allow` | |
| Also, `**data` argument is always present in signature if `Config.extra` is `Extra.allow`. | |
| !!! note | |
| Types in signature are same as declared in model annotations. They are not necessarily the ones that can be parsed. That may change once #1055 solved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@samuelcolvin, after this one gets resolved, I guess, we're ready to go 🎉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've rewritten the docs to improve the english, and incorporated this suggestion.
|
For me this is ready to merge if you agree @MrMrRobat? |
|
@samuelcolvin, is there any reason for changes in af3dd4d? Added I think that it may lead to confusion in such cases as: class Model(BaseModel):
a: float
b: int = 2
def __init__(self, a: float, b: int):
super().__init__(a=a, b=b)
class Config:
extra = 'allow'Even though |
pydantic/utils.py
Outdated
| @@ -149,7 +149,7 @@ def generate_model_signature( | |||
|
|
|||
| present_params = signature(init).parameters.values() | |||
| merged_params: Dict[str, Parameter] = {} | |||
| var_kw = None | |||
| var_kw = Parameter('data', Parameter.VAR_KEYWORD, annotation=Any) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My thoughts on this: #1034 (comment)
tests/test_model_signature.py
Outdated
| @@ -57,3 +57,14 @@ def test_invalid_identifiers_signature(): | |||
| assert _equals(str(signature(model)), '(*, valid_identifier: int = 123, yeah: int = 0) -> None') | |||
| model = create_model('Model', **{'123 invalid identifier!': 123, '!': Field(0, alias='yeah')}) | |||
| assert _equals(str(signature(model)), '(*, yeah: int = 0, **data: Any) -> None') | |||
|
|
|||
|
|
|||
| def test_kwargs(): | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I said in #1034 (comment), this test would already work without changes in this commit
|
Yes sorry, maybe you're right. Feel free to revert this change or I will. Sorry to mess with your PR, just trying to get it merged as I know I'm slow to respond here |
|
That's totally fine, I don't mind any edits from your side. |
|
Why not make it an error to specify both a custom init method and config for extras-allowed? As far as I can see it's always redundant (just check for a VAR_KW param), and may be inconsistent (usually indicating a bug). |
| if allow_names and field_name.isidentifier(): | ||
| param_name = field_name | ||
| else: | ||
| use_var_kw = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
coverage is being weird, but it looks like this line might not be coverd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like it's covered in test_invalid_identifiers_signature, maybe coverage struggles to detect it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it's covered, don't worry.
Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

Change Summary
Generating model signature based on its fields
Respects custom-defined init signature:
Related issue number
Closes #1032, timothycrosley/hypothesis-auto#10
Relates to fastapi/fastapi#318
ToDo:
__init__Checklist
changes/<pull request or issue id>-<github username>.mdfile added describing change(see changes/README.md for details)