Skip to content

Conversation

sydney-runkle
Copy link
Contributor

@sydney-runkle sydney-runkle commented Nov 4, 2024

This new implementation sort of starts to follow the pattern we'd use for RustModel like classes by accessing the pydantic-core structure from a wrapper pydantic type.

With this new pattern, isinstance checks work properly (they don't in the most recent version of pydantic):

from pydantic import HttpUrl, TypeAdapter

http_url = TypeAdapter(HttpUrl).validate_python('http://example.com/something')
print(http_url)
#> http://example.com/something
print(repr(http_url))
#> HttpUrl(http://example.com/something)
print(http_url.__class__)
#> <class 'pydantic.networks.HttpUrl'>
print(http_url.host)
#> example.com

I think some more refactoring should be done in pydantic-core, but this serves as the pydantic end of things, for the most part. In core (future PR), the main thing we can do is simplify the Url and MultiHostUrl classes (remove docstrings, etc).

Eventually, we should get rid of validation on init, that's pretty gnarly - but that would constitute a breaking change. Should we do this now?

@github-actions github-actions bot added the relnotes-fix Used for bugfixes. label Nov 4, 2024
Copy link

cloudflare-workers-and-pages bot commented Nov 4, 2024

Deploying pydantic-docs with  Cloudflare Pages  Cloudflare Pages

Latest commit: b7c8485
Status: ✅  Deploy successful!
Preview URL: https://19f0349f.pydantic-docs.pages.dev
Branch Preview URL: https://url-fix-i-think.pydantic-docs.pages.dev

View logs

@sydney-runkle
Copy link
Contributor Author

sydney-runkle commented Nov 4, 2024

The one testing issue that I'm running into currently is that now HttpUrl, etc aren't direct subclasses of Url, so isinstance checks are failing validation...

Maybe, this can be fixed with a union against an isinstance schema, but I hate using unions for internally defined schemas.
We could also subclass from Url and MultiHostUrl, but that seems quite messy?

Copy link

codspeed-hq bot commented Nov 4, 2024

CodSpeed Performance Report

Merging #10766 will not alter performance

Comparing url-fix-i-think (b7c8485) with main (678ec30)

Summary

✅ 44 untouched benchmarks

Copy link
Contributor

github-actions bot commented Nov 4, 2024

Coverage report

This PR does not seem to contain any modification to coverable code.

@sydney-runkle
Copy link
Contributor Author

I've fixed the above concerns with a wrap validator, though this admittedly slows validation significantly (see regression above).

Copy link
Contributor

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To summarise:

  • We've made the decision to move to a Python object hierarchy, because this will produce results which match more what users expect
  • We keep the Rust state as an internal detail
  • We use a wrap validator so that core can handle these types appropriately

I feel that one day we might be able to do away with this Python wrapper if we wanted to squeeze every last drop of performance, but I also am happy with this 👍

We should be wary and expect there may be subtle breakages caused by this, and hence call this out in update notes.

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think constraints have to be enforced by __init__.

My suggestion would be a lazy type adapter (via functools.cache) which is called in __init__.

Then in core the functional validator, we do:

core_url = handler(input)
instance = source.__new__(source)
instance._url = core_url

@sydney-runkle
Copy link
Contributor Author

My suggestion would be a lazy type adapter (via functools.cache) which is called in init.

Does this mean we can rip out validation logic on __init__ in pydantic-core? @samuelcolvin @davidhewitt

@sydney-runkle
Copy link
Contributor Author

Happy to fix validation on init in this pr, hopefully that moves this into the "don't need to deal with for a while" state.

@sydney-runkle
Copy link
Contributor Author

We might be able to pull this off with just an after validator, which could be good...

@davidhewitt
Copy link
Contributor

My suggestion would be a lazy type adapter (via functools.cache) which is called in init.

Does this mean we can rip out validation logic on __init__ in pydantic-core? @samuelcolvin @davidhewitt

Probably not, because it currently needs to assume valid state internally. Maybe in the future, but I wouldn't rush to do it.

@samuelcolvin
Copy link
Member

I agree with @davidhewitt, and I don't see any particular advantage of ripping out that logic right now.

@sydney-runkle
Copy link
Contributor Author

Probably not, because it currently needs to assume valid state internally. Maybe in the future, but I wouldn't rush to do it.

I think this effectively results in double validation for some cases (see perf degradation above), but I'm ok with this as an intermediate state for now.

@samuelcolvin
Copy link
Member

we should definitely avoid double validation, I think that's doable.

@sydney-runkle
Copy link
Contributor Author

Here's where I'm having trouble with double validation:

from pydantic import TypeAdapter, HttpUrl

ta = TypeAdapter(HttpUrl)
http_url = HttpUrl('http://example.com/something')

assert str(http_url) == 'http://example.com/something'
assert repr(http_url) == "HttpUrl('http://example.com/something')"
assert http_url.__class__ == HttpUrl
assert http_url.host == 'example.com'
assert http_url.path == '/something'
assert http_url.username is None
assert http_url.password is None

http_url_validated = ta.validate_python(http_url)
"""
pydantic_core._pydantic_core.ValidationError: 1 validation error for function-wrap[wrap_val()]
  URL input should be a string or URL [type=url_type, input_value=HttpUrl('http://example.com/something'), input_type=HttpUrl]
    For further information visit https://errors.pydantic.dev/2.10/v/url_type
"""

So, I think we need to have some sort of isinstnace check in the wrap validator. This is unfortunate though, as we'll no longer respect config.revalidate_instances...

I think this is where having a cls passed to pydantic-core's url_schema could be helpful to help with this isinstance check that's failing, but I'd really prefer to keep that out of the scope of this fix, and make that change in v2.11. We already tried that once, and things went awry with duplicated validation.


@cached_property
def _validator(self) -> TypeAdapter:
return TypeAdapter(self.__class__)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It feels like it could be both simpler and more efficient if here this TypeAdapter just built the underlying rust Url, with appropriate constraints? Is there a reason why that might be problematic for subclasses? 🤔

Copy link
Contributor Author

@sydney-runkle sydney-runkle Nov 5, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I don't think there's an easy way to do this - you can't easily create a TypeAdapter with just a core schema. Even if we could hack that together, should we?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well we could use a SchemaValidator instead of a TypeAdapter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Found a relatively simple way with TypeAdapter. That's easiest for now, as we already have the isinstance conditional in the wrap validator in the custom core schema.

Down the line, maybe it does make sense to just use SchemaValidator inside these types to perform internal validation. I'm going to write up an issue with next steps based on the feedback here, and I'll include this 👍

Copy link
Member

@samuelcolvin samuelcolvin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work, LGTM.

@sydney-runkle
Copy link
Contributor Author

Thanks for the abundance of iterations here @samuelcolvin @davidhewitt.

I think we're moving in the right direction here, next steps include improving performance and optimizing / standardizing how we validate types like this on __init__.

@sydney-runkle sydney-runkle enabled auto-merge (squash) November 5, 2024 17:47
@sydney-runkle sydney-runkle merged commit fba836a into main Nov 5, 2024
52 checks passed
@sydney-runkle sydney-runkle deleted the url-fix-i-think branch November 5, 2024 17:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
relnotes-fix Used for bugfixes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants