change regex to use re.search or re.fullmatch #1631
Comments
This is not a bug, but a suggested change. Happy to change this in v2; personally I think [...]. However, it looks like JSON Schema is closer to [...].
There are multiple aspects here.
I agree with all of this. Happy to accept a PR to correct the documentation. In v2 we have the chance to make breaking changes and get things right. I personally think using [...]. We could either switch to [...]. I will accept whichever seems most popular here. @yurikhan, you might be the only person other than me who cares enough to think about this; in that case I'll accept your preference.
Doc patch will follow. Not immediately, but possibly on the weekend. +1 to [...]. As for behavior in v2, my vote is for [...].
"by adjusting expectations, such as documenting Pydantic’s use of re.match, calling out the difference in behavior with JSON Schema validators and suggesting that users who need schema interoperability always make sure their regexen are either explicitly anchored with ^ or explicitly unanchored with .*"
It's actually a bit worse than that. I'm not sure it's possible to get schema interoperability. Because of the use of [...] Java regexes have [...].
Luckily, with pydantic v2 we don't have much choice without harming performance: we have to use the regex crate's [...].
That seems to have [...].
If enough people really cared, we could use Python's regex library, or even make it configurable, but I'm -1 based on faff and performance.
I can certainly live with Rust's regexes. I'm generating JSON schemas from some of my models for validation in other languages, so Python-specific syntax doesn't help me much anyway. If it was going to be configurable, then maybe it'd be nice if the regex parameter accepted the result of [...].
Well, you can do that easily using validator functions, and even customize the schema as you wish.
I think the original issue is resolved. In the current version of V2 the example given is "Valid" for both validation libraries.
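For readers following the anchoring discussion in the comments above, here is a minimal sketch using only Python's standard `re` module. It shows how `re.search`, `re.match`, and `re.fullmatch` treat the same pattern, and how explicit anchoring makes the first two agree. The helper name `match_modes` and the sample pattern are illustrative assumptions, not part of Pydantic.

```python
import re


def match_modes(pattern: str, value: str) -> dict:
    """Hypothetical helper: report which stdlib matching functions accept `value`."""
    return {
        "search": re.search(pattern, value) is not None,
        "match": re.match(pattern, value) is not None,
        "fullmatch": re.fullmatch(pattern, value) is not None,
    }


# Unanchored pattern: search finds "bbb" in the middle, match and fullmatch do not.
print(match_modes("b+", "abbbc"))

# Explicitly anchored with ^: search and match now agree (both reject).
print(match_modes("^b+", "abbbc"))

# Explicitly unanchored with a leading .*: search and match agree again (both accept).
print(match_modes(".*b+", "abbbc"))
```

This only covers Python's engine with default flags. A leading `.*` does not cross newlines unless `re.DOTALL` is set, and other regex engines may differ in syntax and flags, so treat the sketch as an illustration rather than a guarantee of interoperability.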
Bug
Output of `python -c "import pydantic.utils; print(pydantic.utils.version_info())"`: [...]
Related: #1396
In JSON Schema, all versions so far, the regular expression in the `pattern` keyword is treated as unanchored at both ends, i.e. `re.search` behavior rather than `re.match` or `re.fullmatch`.

Pydantic uses `re.match` to validate strings with `Field` `regex` argument, as explained in #1396.

When constructing a JSON Schema from a model, Pydantic generates a `pattern` keyword from a `Field` `regex` argument, without any regex postprocessing. Thus, the resulting JSON Schema validates differently from the original Pydantic model.

I strongly feel Pydantic should follow suit and use `re.search` to validate fields with `regex` argument, and explicitly call this out in the documentation. (Anchors in example usage are not sufficient.)

Alternatively, if you are concerned with backward compatibility, I propose the following long-term solution:
- [...] `Field` (and `constr`) argument, perhaps named `pattern` after the JSON Schema keyword, mutually exclusive with the existing `regex` argument. This is API extension, thus, backward compatible.
- [...] `pattern` using `re.search`.
- [...] `BaseModel.schema` to copy the `pattern` `Field` argument to the `pattern` schema keyword as is if present. This way, new users of the `pattern` argument get correct schema generation.
- [...] `BaseModel.schema` so that a `regex` argument with value `REGEX` produces a schema `pattern` keyword with value `^(REGEX)`. (The grouping is necessary because the original regex may contain alternatives.) This way, users of the `regex` argument get schemas that behave consistently with the original Pydantic model.
- [...] `regex`, suggesting `pattern` with explicit anchoring.
- [...] `regex` argument is used to generate a JSON Schema. This way, users of the `regex` argument who are likely to be adversely affected by the inconsistency get a heads-up.
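The `^(REGEX)` rewrite proposed above can be sketched with the standard `re` module. The helper `to_schema_pattern` is a hypothetical name used only for illustration; it is not a Pydantic function. The sketch shows why the grouping matters when the original regex contains alternatives.

```python
import re


def to_schema_pattern(regex: str) -> str:
    """Hypothetical helper: wrap a re.match-style regex so that an
    unanchored (search-style) validator behaves the same way."""
    return f"^({regex})"


regex = "cat|dog"
value = "hotdog"

# re.match only tries the pattern at position 0, so "hotdog" is rejected.
assert re.match(regex, value) is None

# Naive anchoring yields "^cat|dog", which parses as (^cat)|(dog):
# the anchor binds only to the first alternative, so "dog" is still found.
naive = "^" + regex
assert re.search(naive, value) is not None

# With the grouping, the anchor applies to every alternative.
grouped = to_schema_pattern(regex)
assert re.search(grouped, value) is None
assert re.search(grouped, "dogma") is not None
```

One design note, stated as an assumption rather than as part of the proposal: a capturing group shifts the numbering of any backreferences in the original regex, so a non-capturing group `^(?:REGEX)` might be preferable where the target engine supports it.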