StringConstraints: Bug #9381

Closed
SDAravind opened this issue May 3, 2024 · 11 comments · Fixed by #9623

@SDAravind

I see the argument for changing it, but it's a very annoying breaking change to understand, so my instinct is we don't change this, even in a major update.

As for the reason - my guess is that the best explanation is "because that's how it was first done" and there hasn't been a strong (enough) argument to change it.

See below: to_lower (or to_upper) is set to True and strip_whitespace is set to True. Interestingly, only the second part of the email (the domain) comes back lowercase. This is very strange behavior. I can drop strip_whitespace, though, since EmailStr already takes care of stripping whitespace.

I'm a bit confused about when to use strip_whitespace and when not to. Should we use strip_whitespace together with the other parameters?

from pydantic import BaseModel, EmailStr, StringConstraints
from typing import Annotated

EStr = Annotated[EmailStr, StringConstraints(to_lower=True, strip_whitespace=True)]


class Foo(BaseModel):
    bar: EStr


print(repr(Foo(bar="uSeR@ExAmPlE.com")))
#> Foo(bar='uSeR@example.com')

Originally posted by @SDAravind in #8577 (comment)

@RajatRajdeep

The email validation process relies on the email_validator package.

The package's validate_email function returns an object holding a normalized form of the email address, in which the domain is lowercased.

parts = email_validator.validate_email(email, check_deliverability=False)
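To make the normalization above concrete, here is a minimal sketch in plain Python. The helper name normalize_email_sketch is hypothetical and only imitates the one effect described here (lowercasing the domain while leaving the local part alone); it does not reproduce the real checks that email_validator performs.

```python
def normalize_email_sketch(raw: str) -> str:
    """Simplified stand-in for email normalization (hypothetical helper).

    Strips surrounding whitespace and lowercases only the domain,
    leaving the local part (everything before the last '@') untouched.
    """
    email = raw.strip()
    local, sep, domain = email.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a valid email address: {raw!r}")
    return f"{local}@{domain.lower()}"


print(normalize_email_sketch("uSeR@ExAmPlE.com"))
#> uSeR@example.com
```

Under that assumption, the mixed-case local part in the reported output is what the normalization step alone would leave behind when no lowercasing constraint runs afterwards.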

@SDAravind
Author

It's a StringConstraints issue: after EmailStr validation it should apply lower/upper case as requested, but setting strip_whitespace stops the other parameters in StringConstraints from working.

@austinyu

austinyu commented May 9, 2024

Another example that might be relevant

from pydantic import BaseModel, StringConstraints, NameEmail 
from typing import Annotated

EStr = Annotated[StringConstraints(to_lower=True), NameEmail]

class User(BaseModel):
    email: EStr

user = User(email='Fred Bloggs <FREG.bloggs@example.com>')
print(user)

Result

# Expected
email=NameEmail(name='Fred Bloggs', email='freg.bloggs@example.com')
# Actual
email=NameEmail(name='Fred Bloggs', email='FREG.bloggs@example.com')
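For reference, the expected value can be derived with the standard library alone. This sketch uses email.utils.parseaddr rather than pydantic's NameEmail, so it only illustrates the lowercase result the example is asking for, not how pydantic gets there.

```python
from email.utils import parseaddr

# Split the display name from the address, then lowercase only the address.
name, address = parseaddr("Fred Bloggs <FREG.bloggs@example.com>")
expected = (name, address.lower())
print(expected)
#> ('Fred Bloggs', 'freg.bloggs@example.com')
```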

sydney-runkle added the bug V2 label (Bug related to Pydantic V2) on May 14, 2024
sydney-runkle added this to the 2.7 fixes milestone on May 16, 2024
@sydney-runkle
Member

@SDAravind,

Thanks for reporting this in a separate issue. I've added this to our upcoming milestone!

sydney-runkle changed the milestone from 2.7 fixes to v2.8.0 on May 16, 2024
sydney-runkle self-assigned this on Jun 10, 2024
@sydney-runkle
Member

I'll note, I think this is related to #8577 (where this issue was originally discussed).

Eventually we want to support customizing the order in which string constraints are applied. Notably, that will be easier with our recently merged pipeline API pattern (#9459)!
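As a rough illustration of why the order of application matters, the sketch below runs the same steps in two different orders using plain Python callables. The helpers max_length and run_steps are hypothetical and are not part of pydantic or its pipeline API.

```python
from typing import Callable, Sequence

Step = Callable[[str], str]


def max_length(limit: int) -> Step:
    """Build a step that rejects strings longer than `limit`."""
    def check(value: str) -> str:
        if len(value) > limit:
            raise ValueError(f"{value!r} is longer than {limit} characters")
        return value
    return check


def run_steps(value: str, steps: Sequence[Step]) -> str:
    """Apply each step to the value, in the order given."""
    for step in steps:
        value = step(value)
    return value


# Stripping first lets the padded input pass the length check.
print(run_steps("  AbC  ", [str.strip, str.lower, max_length(3)]))
#> abc

# Checking the length first rejects the same input.
try:
    run_steps("  AbC  ", [max_length(3), str.strip, str.lower])
except ValueError as exc:
    print("rejected:", exc)
```

The same input is accepted or rejected depending only on where the length check sits, which is the kind of choice a configurable ordering would expose.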

@sydney-runkle
Member

from typing import Annotated


from pydantic import BaseModel, EmailStr
from pydantic.experimental.pipeline import validate_as

EStr = Annotated[EmailStr, validate_as(str).str_strip().validate_as(...).str_lower()]


class Foo(BaseModel):
    bar: EStr


print(repr(Foo(bar="uSeR@ExAmPlE.com")))
# Foo(bar='user@example.com')

@adriangb
Member

adriangb commented Jun 10, 2024

Nice! In the future I'd hope we can get Annotated[EmailStr, validate_as(...).str_strip().str_lower()] to work.

Does this end up with a string or an EmailStr?

@sydney-runkle
Member

@adriangb,

That produces a string, but EmailStr inherently just returns a string.

This probably makes the most sense contextually:

validate_as(str).str_strip().str_lower().validate_as(...)

@sydney-runkle
Member

Interesting case showcasing the bug:

from pydantic import BaseModel, EmailStr, StringConstraints
from typing import Annotated


class Foo(BaseModel):
    foo: EmailStr
    bar: Annotated[EmailStr, StringConstraints(to_lower=True)]
    baz: Annotated[EmailStr, StringConstraints(to_lower=True, strip_whitespace=True)]
    # foo: NEmail


print(repr(
    Foo(
        foo=" uSeR@ExAmPlE.com ",
        bar=" uSeR@ExAmPlE.com ",
        baz=" uSeR@ExAmPlE.com  "
    )
))
#> Foo(foo='uSeR@example.com', bar='user@example.com', baz='uSeR@example.com')

@sydney-runkle
Member

sydney-runkle commented Jun 10, 2024

Aha, the culprit!

You can see in the core schema for baz, only one of the string constraints is applied to the schema!

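# Private helper; the import path is internal and may differ between pydantic versions.
from pydantic._internal._core_utils import pretty_print_core_schema

pretty_print_core_schema(Foo.__pydantic_core_schema__)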
"""
{
    'type': 'model',
    'cls': <class '__main__.Foo'>,
    'schema': {
        'type': 'model-fields',
        'fields': {
            'foo': {
                'type': 'model-field',
                'schema': {
                    'function': {'type': 'no-info', 'function': <bound method EmailStr._validate of <class 'pydantic.networks.EmailStr'>>},
                    'schema': {'type': 'str'},
                    'type': 'function-after'
                }
            },
            'bar': {
                'type': 'model-field',
                'schema': {
                    'type': 'chain',
                    'steps': [
                        {
                            'function': {'type': 'no-info', 'function': <bound method EmailStr._validate of <class 'pydantic.networks.EmailStr'>>},
                            'schema': {'type': 'str'},
                            'type': 'function-after'
                        },
                        {'type': 'str', 'to_lower': True}
                    ]
                }
            },
            'baz': {
                'type': 'model-field',
                'schema': {
                    'type': 'chain',
                    'steps': [
                        {
                            'function': {'type': 'no-info', 'function': <bound method EmailStr._validate of <class 'pydantic.networks.EmailStr'>>},
                            'schema': {'type': 'str'},
                            'type': 'function-after'
                        },
                        {'type': 'str', 'strip_whitespace': True}
                    ]
                }
            }
        },
        'model_name': 'Foo'
    },
    'ref': '__main__.Foo:4822756464'
}
"""
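The effect of that dropped option can be modelled in a few lines of plain Python. This is only a sketch: lowercase_domain, apply_str_step and run_chain are hypothetical stand-ins for the EmailStr step, the trailing 'str' step and the 'chain' schema, and they do not reflect how pydantic-core actually executes a schema.

```python
from typing import Any, Callable, Dict, List, Union

Step = Union[Callable[[str], str], Dict[str, Any]]


def lowercase_domain(value: str) -> str:
    """Stand-in for the EmailStr step: strip, then lowercase only the domain."""
    local, _, domain = value.strip().rpartition("@")
    return f"{local}@{domain.lower()}"


def apply_str_step(value: str, step: Dict[str, Any]) -> str:
    """Apply only the options that are present in a 'str' step."""
    if step.get("strip_whitespace"):
        value = value.strip()
    if step.get("to_lower"):
        value = value.lower()
    return value


def run_chain(value: str, steps: List[Step]) -> str:
    """Run function steps and dict-described 'str' steps in order."""
    for step in steps:
        value = step(value) if callable(step) else apply_str_step(value, step)
    return value


raw = " uSeR@ExAmPlE.com  "

# Trailing step as printed for `baz`: to_lower is missing.
print(run_chain(raw, [lowercase_domain, {"type": "str", "strip_whitespace": True}]))
#> uSeR@example.com

# Trailing step carrying both options, as the annotation requested.
print(run_chain(raw, [lowercase_domain, {"type": "str", "strip_whitespace": True, "to_lower": True}]))
#> user@example.com
```

In this model the first result matches the value reported for baz, and the second is what one would expect once both constraints reach the final step.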

@sydney-runkle
Member

I've opened a PR that fixes this issue. Specifically, the first commit is where the fix is :).
