Description
Bug
Output of python -c "import pydantic.utils; print(pydantic.utils.version_info())":
pydantic version: 1.5.1
pydantic compiled: False
install path: /home/khan/.local/lib/python3.6/site-packages/pydantic
python version: 3.6.9 (default, Nov 7 2019, 10:44:02) [GCC 8.3.0]
platform: Linux-4.15.0-88-generic-x86_64-with-Ubuntu-18.04-bionic
optional deps. installed: ['typing-extensions']
Related: #1396
import jsonschema
import pydantic

class Foo(pydantic.BaseModel):
    bar: str = pydantic.Field(..., regex='baz')

try:
    Foo(bar='bar baz quux')
except pydantic.ValidationError as e:
    print(e)
    # ValidationError: 1 validation error for Foo
    # bar
    #   string does not match regex "baz" (type=value_error.str.regex; pattern=baz)
else:
    print('Valid')

try:
    jsonschema.validate({'bar': 'bar baz quux'}, Foo.schema())
except jsonschema.ValidationError as e:
    print(e)
else:
    print('Valid')
    # Valid

In all versions of JSON Schema so far, the regular expression in the pattern keyword is treated as unanchored at both ends, i.e. it has re.search behavior rather than re.match or re.fullmatch behavior.
Pydantic, however, uses re.match to validate strings for fields with a regex argument, so the regex is implicitly anchored at the start of the string, as explained in #1396.
When constructing a JSON Schema from a model, Pydantic generates a pattern keyword from a Field regex argument, without any regex postprocessing. Thus, the resulting JSON Schema validates differently from the original Pydantic model.
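The difference between the three matching modes can be reproduced with the standard re module alone, using the regex and value from the example above. This is only an illustration of the semantics, not Pydantic's or jsonschema's actual validation code:

```python
import re

pattern = re.compile('baz')
value = 'bar baz quux'

# re.search scans the whole string; this is how JSON Schema treats "pattern".
found_by_search = pattern.search(value) is not None

# re.match only tries at position 0; this is how the regex argument behaves.
found_by_match = pattern.match(value) is not None

# re.fullmatch requires the entire string to match.
found_by_fullmatch = pattern.fullmatch(value) is not None

print(found_by_search, found_by_match, found_by_fullmatch)
# True False False
```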
I strongly feel Pydantic should follow suit and use re.search to validate fields with a regex argument, and explicitly call this out in the documentation. (Anchors in example usage are not sufficient.)
Alternatively, if you are concerned with backward compatibility, I propose the following long-term solution:
- Add a new Field (and constr) argument, perhaps named pattern after the JSON Schema keyword, mutually exclusive with the existing regex argument. This is an API extension and thus backward compatible.
- Implement validation for pattern using re.search.
- Change the behavior of BaseModel.schema to copy the pattern Field argument to the pattern schema keyword as is, if present. This way, new users of the pattern argument get correct schema generation.
- One or more of:
  - Fix the behavior of BaseModel.schema so that a regex argument with value REGEX produces a schema pattern keyword with value ^(REGEX). (The grouping is necessary because the original regex may contain alternatives.) This way, users of the regex argument get schemas that behave consistently with the original Pydantic model.
  - Deprecate regex, suggesting pattern with explicit anchoring.
  - Emit a warning if a model containing fields with the regex argument is used to generate a JSON Schema. This way, users of the regex argument who are likely to be adversely affected by the inconsistency get a heads-up.
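To show why the ^(REGEX) rewrite needs the grouping, here is a minimal sketch using only the re module. The helper name regex_to_schema_pattern is hypothetical and not part of Pydantic, and the sketch ignores any differences between Python's regex dialect and the one a JSON Schema validator uses:

```python
import re

def regex_to_schema_pattern(regex: str) -> str:
    """Hypothetical helper: anchor a re.match-style regex for re.search-style use."""
    return '^(' + regex + ')'

regex = 'foo|bar'
value = 'xbar'

# re.match semantics: neither alternative matches at position 0.
match_result = re.match(regex, value)

# Naive anchoring binds ^ to the first alternative only,
# so 'bar' is still found in the middle of the string.
naive_result = re.search('^' + regex, value)

# Grouped anchoring reproduces the re.match result under re.search.
grouped_result = re.search(regex_to_schema_pattern(regex), value)

print(match_result is None, naive_result is None, grouped_result is None)
# True False True
```

One caveat of this sketch: wrapping the regex in a capturing group shifts the numbering of any groups inside REGEX, so a numbered backreference such as \1 would point at the new outer group. A non-capturing group, ^(?:REGEX), would avoid that.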