fix Implement static checking for model_validate in pydantic #1123 #3390
asukaminato0721 wants to merge 4 commits into
Pull request overview
This PR adds a synthesized model_validate classmethod to Pydantic dataclass-style models in pyrefly's static checker, and adds call-time constraint checking that walks dict literals (and same-scope literal initializers) passed to Model.model_validate(...) to flag values that violate field constraints such as Field(ge=...). It also extends emit_pydantic_argument_constraint to recurse through union types so each member is checked.
Changes:
- Synthesize a `model_validate` classmethod on every Pydantic model whose own class doesn't already define one, with a typed-dict-shaped `obj` parameter that recursively expands nested models, lists, and dicts.
- Add `check_pydantic_model_validate_constraints`, invoked from the generic call path; it inspects dict literals (and traces a name's same-scope initializer) and the inferred typed-dict shape for `ge`/`le`/etc. violations on nested fields.
- Add four new pydantic tests covering nested-field range checking, `from_attributes`, custom `model_validate` preservation, and a `super().model_validate(...)` flow.
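To make the dict-literal walk concrete, here is a minimal Python sketch of the idea. It is not pyrefly's implementation (which is written in Rust and works on its own bindings); it uses Python's `ast` module, and the `GE_BOUNDS` table and the `find_ge_violations` helper are hypothetical stand-ins for `Field(ge=...)` metadata and the checker.

```python
import ast

# Hypothetical constraint table standing in for Field(ge=...) metadata:
# field name -> minimum allowed value.
GE_BOUNDS = {"quantity": 0}


def find_ge_violations(source):
    """Return (line, message) for each literal dict entry below its bound."""
    violations = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Dict):
            continue
        for key, value in zip(node.keys, node.values):
            # `key` is None for `**spread` entries; skip non-string keys too.
            if not (isinstance(key, ast.Constant) and isinstance(key.value, str)):
                continue
            bound = GE_BOUNDS.get(key.value)
            if bound is None:
                continue
            try:
                literal = ast.literal_eval(value)
            except (ValueError, TypeError):
                continue  # not a literal, so nothing can be decided statically
            is_number = isinstance(literal, (int, float)) and not isinstance(literal, bool)
            if is_number and literal < bound:
                violations.append(
                    (value.lineno, f"{key.value}={literal} violates ge={bound}")
                )
    return violations


example = 'Inventory.model_validate({"items": [{"quantity": 5}, {"quantity": -3}]})'
print(find_ge_violations(example))
```

The sketch checks every dict literal in the source regardless of which call it feeds, which is deliberately cruder than the PR's approach of matching literals against the inferred model shape.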
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| pyrefly/lib/alt/class/pydantic.rs | Implements synthesized model_validate builder, dict-shape type construction, and the recursive expr/type constraint checkers; adds union recursion to emit_pydantic_argument_constraint. |
| pyrefly/lib/alt/class/dataclass.rs | Wires the synthesized model_validate into Pydantic dataclass synthesized fields, suppressed only when the current class itself defines model_validate. |
| pyrefly/lib/alt/call.rs | Invokes the new constraint checker on every call expression. |
| pyrefly/lib/test/pydantic/field.rs | Adds tests for nested constraint detection, from_attributes, custom-method preservation, and super().model_validate(...). |
Comments suppressed due to low confidence (5)
pyrefly/lib/alt/class/pydantic.rs:995
- When `expected` is a `Type::Union`, this iterates each member and recursively checks the same `actual` against every member, emitting a constraint violation for any union member whose nested-model shape disagrees. For a value typed as `int | str` this is harmless because non-pydantic members are no-ops, but for `ItemA | ItemB` (both pydantic models) the same actual will be validated against both models and may produce duplicate or false-positive errors when the value is only intended to satisfy one branch of the union. Consider only emitting an error if all members fail (i.e. "best-match" or "any-passes" semantics) for nested model unions.
Type::Union(union) => {
for expected in union.members.iter() {
self.check_pydantic_field_expr_constraints(
expected, actual, range, errors, seen,
);
}
}
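The "any-passes" semantics suggested above could look roughly like the following Python sketch. It is only an illustration of the proposed rule, not code from the PR: the bounds dictionaries, `ge_violations`, and `union_violations` are hypothetical names, and each union member is reduced to a table of `ge` bounds.

```python
def ge_violations(bounds, value):
    """Return the fields of `value` that fall below their bound in `bounds`."""
    return [name for name, low in bounds.items() if name in value and value[name] < low]


def union_violations(member_bounds, value):
    """Report violations only if every union member rejects `value`."""
    per_member = [ge_violations(bounds, value) for bounds in member_bounds]
    if not per_member or any(not found for found in per_member):
        return []  # at least one branch accepts the value (or there is no branch)
    # Every branch fails: report the closest match, i.e. the fewest violations.
    return min(per_member, key=len)


# Hypothetical stand-ins for ItemA and ItemB with different bounds on one field.
ITEM_A = {"quantity": 0}
ITEM_B = {"quantity": -10}

print(union_violations([ITEM_A, ITEM_B], {"quantity": -3}))   # accepted by ITEM_B
print(union_violations([ITEM_A, ITEM_B], {"quantity": -20}))  # rejected by both
```

Checking each member independently, as the current loop does, would report the first value against `ITEM_A` even though `ITEM_B` accepts it.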
pyrefly/lib/alt/class/pydantic.rs:1084
- Same union false-positive concern as in `check_pydantic_field_expr_constraints`: when the expected type is a union of multiple pydantic models, the actual value is validated against every member and any failing branch will report a constraint error, even if another branch in the union accepts the value.
match (expected, actual) {
(Type::Union(expected), _) => {
for expected in expected.members.iter() {
self.check_pydantic_field_type_constraints(
expected, actual, range, errors, seen,
);
}
}
pyrefly/lib/alt/class/pydantic.rs:832
- When the user writes `bad = {"items": [..., {"quantity": -3}]}` followed by `Inventory.model_validate(bad)`, this code recurses into the initializer expression of `bad` and emits a constraint violation at the literal's range, but the assignment `bad = {...}` is, by itself, not a pydantic call. The error then surfaces on the assignment line even when the user could plausibly use `bad` for unrelated purposes, and the diagnostic gives no indication that the violation comes from a downstream `model_validate` call. Consider either reporting the error at the `model_validate` call's `range` (with a note pointing back to the literal) or qualifying the message so it isn't confusing on the assignment line.
if let Some(initializer) = self.pydantic_name_initializer(obj) {
self.check_pydantic_model_validate_expr_constraints(
&cls,
dataclass,
initializer,
range,
errors,
&mut seen,
);
seen.clear();
}
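One way to picture the first suggestion (report at the call, with a note back to the literal) is the small Python sketch below. The `Diagnostic` shape and `report_at_call` helper are hypothetical and do not mirror pyrefly's error types; they only show a primary location plus a secondary note.

```python
from dataclasses import dataclass, field


@dataclass
class Diagnostic:
    """Hypothetical diagnostic: a primary location plus related notes."""
    line: int
    message: str
    notes: list = field(default_factory=list)  # list of (line, text) pairs


def report_at_call(call_line, literal_line, detail):
    """Attach the error to the model_validate call and point back at the literal."""
    return Diagnostic(
        line=call_line,
        message=f"argument to model_validate violates a field constraint: {detail}",
        notes=[(literal_line, "value originates from this literal")],
    )


# `bad = {...}` on line 3, `Inventory.model_validate(bad)` on line 7.
diag = report_at_call(call_line=7, literal_line=3, detail="quantity=-3, expected ge=0")
print(diag.line, diag.message, diag.notes)
```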
pyrefly/lib/alt/class/pydantic.rs:288
- Recursion into nested container types only handles `list` and `dict`; nested pydantic models inside `tuple`, `set`, `frozenset`, `Sequence`, `Mapping`, etc. will not have their dict-shape input form added to `obj_ty`. This makes the accepted argument type for `model_validate` inconsistent with what Pydantic actually accepts at runtime (it coerces dict shapes through any iterable container, not just `list`/`dict`).
} else if cls.class_object() == self.stdlib.list_object()
&& let [elem] = cls.targs().as_slice()
{
self.heap.mk_class_type(
self.stdlib
.list(self.pydantic_model_validate_type(elem.clone(), seen)),
)
} else if cls.class_object() == self.stdlib.dict_object()
&& let [key, value] = cls.targs().as_slice()
{
self.heap.mk_class_type(self.stdlib.dict(
key.clone(),
self.pydantic_model_validate_type(value.clone(), seen),
))
} else {
Type::ClassType(cls)
}
}
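As a rough illustration of what wider container coverage would mean, the Python sketch below recurses through a few more generic containers using `typing.get_origin` and `typing.get_args`. It is an assumption-laden toy: `Model` is a hypothetical stand-in for a pydantic base class, `accepts_dict_shape` is an invented helper, and abstract containers such as `Sequence` and `Mapping` are left out for brevity.

```python
from typing import get_args, get_origin


class Model:
    """Hypothetical stand-in for a pydantic model base class."""


class Item(Model):
    pass


def accepts_dict_shape(tp):
    """True if a dict-shaped model input could appear somewhere inside `tp`."""
    origin = get_origin(tp)
    if origin is None:
        # Plain (non-generic) class: only model classes take a dict shape.
        return isinstance(tp, type) and issubclass(tp, Model)
    args = get_args(tp)
    if origin in (list, set, frozenset, tuple):
        # `tuple[Item, ...]` carries an Ellipsis argument, which is skipped.
        return any(accepts_dict_shape(arg) for arg in args if arg is not Ellipsis)
    if origin is dict:
        # Only the value type is expanded, mirroring the snippet above.
        return len(args) == 2 and accepts_dict_shape(args[1])
    return False


print(accepts_dict_shape(tuple[Item, ...]))
print(accepts_dict_shape(list[int]))
```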
pyrefly/lib/alt/class/pydantic.rs:881
- `pydantic_name_initializer` only resolves `Binding::Expr`/`Binding::NameAssign`/`Binding::AnnotatedType` chains and ignores re-assignments. If the user writes `bad = {"quantity": 5}; bad = {"quantity": -3}; Inventory.model_validate(bad)`, only one of these initializers will be inspected (whichever the binding key resolves to), and the analysis may either miss real violations or report them on a stale initializer that doesn't actually flow into the call. Consider documenting this best-effort behavior in a comment, or restricting it to bindings that are provably the unique reaching definition.
fn pydantic_name_initializer<'b>(&'b self, actual: &Expr) -> Option<&'b Expr> {
let Expr::Name(name) = actual else {
return None;
};
let key = Key::BoundName(ShortIdentifier::expr_name(name));
let idx = self.bindings().key_to_idx_hashed_opt(Hashed::new(&key))?;
let mut idx = idx;
let mut gas = Gas::new(100);
while !gas.stop() {
match self.bindings().get(idx) {
Binding::Forward(forward)
| Binding::PromoteForward(forward)
| Binding::ForwardToFirstUse(forward) => idx = *forward,
binding => return Self::pydantic_initializer_from_binding(binding),
}
}
None
}
fn pydantic_initializer_from_binding(binding: &Binding) -> Option<&Expr> {
match binding {
Binding::Expr(_, expr) => Some(expr),
Binding::NameAssign(assign) => Some(&assign.expr),
Binding::AnnotatedType(_, inner) => Self::pydantic_initializer_from_binding(inner),
_ => None,
}
}
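The "unique reaching definition" restriction can be approximated very conservatively by tracing a name only when it is assigned exactly once. The Python sketch below shows that rule with the `ast` module; `unique_initializer` is a hypothetical helper, and it ignores augmented assignments, loop targets, and scoping, so it is a simplification rather than a real reaching-definitions analysis.

```python
import ast
from typing import Optional


def unique_initializer(source: str, name: str) -> Optional[ast.expr]:
    """Return the initializer of `name` only if it is assigned exactly once."""
    initializers = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Assign):
            continue
        for target in node.targets:
            if isinstance(target, ast.Name) and target.id == name:
                initializers.append(node.value)
    # More than one assignment: give up instead of guessing which one flows in.
    return initializers[0] if len(initializers) == 1 else None


single = 'bad = {"quantity": -3}\nInventory.model_validate(bad)'
reassigned = 'bad = {"quantity": 5}\nbad = {"quantity": -3}\nInventory.model_validate(bad)'

print(type(unique_initializer(single, "bad")).__name__)
print(unique_initializer(reassigned, "bad"))
```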
if metadata.is_pydantic_model()
&& !self
.get_class_fields(cls)
.is_some_and(|fields| fields.contains(&Name::new_static("model_validate")))
{
let root_model_type = self
.get_pydantic_root_model_type_via_mro(cls, &metadata)
.map(|(ty, _)| ty);
fields.insert(
Name::new_static("model_validate"),

self.check_pydantic_model_validate_constraints(
ty,
&x.arguments.args,
&x.arguments.keywords,
x.arguments.range,
errors,
);

ClassSynthesizedField::new(self.heap.mk_function(Function {
signature: Callable::list(
ParamList::new(params),
self.heap.mk_self_type(self.as_class_type_unchecked(cls)),
),
metadata,
}))
According to mypy_primer, this change doesn't affect type check results on a corpus of open source code. ✅
Summary
Fixes #1123
Added synthesized model_validate support for Pydantic models including nested model/list dict input shapes.
Wired the synthesized method into Pydantic dataclass fields.
Added call-time constraint checking for model_validate, including inline literals and same-scope literal initializers.
Test Plan
Added tests.