Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

type Serializer as generic #374

Merged
merged 1 commit into from Apr 13, 2024
Merged

type Serializer as generic #374

merged 1 commit into from Apr 13, 2024

Conversation

davidism
Copy link
Member

@davidism davidism commented Apr 13, 2024

Serializer now takes one generic parameter, either str or bytes, to define what dumps returns rather than a str | bytes union.

The default_serializer attribute and serializer argument are a generic protocol with the same parameter. The protocol requires an object that has a loads method and a dumps method that returns the appropriate type.

URLSafeSerizlizer is typed to always return str. This makes it more convenient as the user code no longer needs to do extra work to satisfy the type checker.

s = URLSafeSerializer()
data = s.dumps(...)

fixes #347

@davidism
Copy link
Member Author

Hmm, mypy somehow knows that Serializer().dumps() returns str, even without an annotation. pyright does not though:

from itsdangerous import Serializer

s = Serializer("secret")
data = s.dumps("test")
reveal_type(data)

Mypy reveals str, pyright reveals Unknown unless you do s: Serializer[str] = Serializer().

@davidism
Copy link
Member Author

OK, figured out how to get mypy and pyright to understand that the default is Serializer[str] if no serializer argument is provided. Used an overload that specified the type of self. Now both tools reveal str in the example above.

@davidism
Copy link
Member Author

davidism commented Apr 13, 2024

This doesn't quite make things nice for typing users, as now json and pickle are not "valid" as values to serializer since neither mypy or pyright treat modules with functions as fulfilling a protocol. Additionally, pyright is strict about the argument names for protocols. So you're basically forced to write a wrapper to convince the tools that a module or conforms to the protocol, or use an ignore and an explicit type parameter on assignment.

@davidism davidism merged commit 385c0eb into main Apr 13, 2024
11 checks passed
@davidism davidism deleted the generic-serializer branch April 13, 2024 20:55
Copy link
Contributor

@Daverball Daverball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Already looks pretty good, it's mostly some subtle variance issues, that weren't quite right.

@@ -10,13 +10,20 @@
from .signer import _make_keys_list
from .signer import Signer

_TAnyStr = t.TypeVar("_TAnyStr", str, bytes, covariant=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_TAnyStr = t.TypeVar("_TAnyStr", str, bytes, covariant=True)

I would get rid of this, just use t.AnyStr, this actually needs to be invariant. Alternatively you could replace it with a typing_extensions.TypeVar and set default=str instead of covariant=True, that may lead to more robust results compared to relying on just the overloads on __init__.

Copy link
Member Author

@davidism davidism Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got errors from mypy that it had to be covariant for class Serializer(t.Generic[t.AnyStr])

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was because you weren't using it for the payload argument.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After testing again I'm not getting a covariance error from mypy anymore when using AnyStr, not sure what was going on before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Protocol mypy does a variance calculation and gives an error if the variance doesn't match. Since you previously only used _TAnyStr in return type annotations it was calculated as covariant, as soon as you added an argument that uses the same TypeVar the variance calculation changed to invariant.

That's why it's sometimes dangerous to re-use the same TypeVar between a Protocol and a Generic, since the variance may be wrong for the Generic, but mypy does not do a variance calculation for Generic. PEP-695 should make this less of a problem in the future, since type vars will generally be scoped to the generic itself, rather than in global scope and all they use auto-variance by default, so you don't have to think about it.

Comment on lines +16 to +18
class _PDataSerializer(t.Protocol[_TAnyStr]):
def loads(self, payload: str | bytes) -> t.Any: ...
def dumps(self, obj: t.Any, **kwargs: t.Any) -> _TAnyStr: ...
Copy link
Contributor

@Daverball Daverball Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class _PDataSerializer(t.Protocol[_TAnyStr]):
def loads(self, payload: str | bytes) -> t.Any: ...
def dumps(self, obj: t.Any, **kwargs: t.Any) -> _TAnyStr: ...
class _PDataSerializer(t.Protocol[t.AnyStr]):
def loads(self, payload: t.AnyStr, /) -> t.Any: ...
def dumps(self, obj: object, /) -> t.AnyStr: ...

You are providing an upper bound here, so you don't want to make the argument list too strict.

For loads you can't use str | bytes because parameters are contravariant, so that would require each serializer to accept both str and bytes. Exactly because parameters are contravariant you can satisfy a function that expects str with a supertype of str, such as a union with bytes.

As for dumps, IIRC unless you specifically use *args: Any, **kwargs: Any neither mypy nor pyright will treat this as a gradual signature, that allows serializers with arbitrary extra arguments, it will instead require each serializer function to take arbitrary arguments. This means you will have to ignore a type error when you pass in serializer_kwargs, but that's a hit you will have to take until we have something like a TypeVarMapping to pass just kwargs along, ParamSpec doesn't really work here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, after making these changes mypy does accept json and pickle as valid.

Copy link
Member Author

@davidism davidism Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pyright is still not happy though. It still insists that modules do not satisfy the protocol.

def dumps(self, obj: t.Any, **kwargs: t.Any) -> _TAnyStr: ...


def is_text_serializer(serializer: _PDataSerializer[t.Any]) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could make this a TypeGuard, although it's probably better to wait for TypeIs support to arrive, since that will yield better results. Although you could make it TypeGuard[_PDataSerializer[str]] for now and add a comment to change it to TypeIs later.

@@ -71,7 +78,7 @@ class Serializer:
#: The default serialization module to use to serialize data to a
#: string internally. The default is :mod:`json`, but can be changed
#: to any object that provides ``dumps`` and ``loads`` methods.
default_serializer: t.Any = json
default_serializer: _PDataSerializer[_TAnyStr] = json # type: ignore[assignment]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
default_serializer: _PDataSerializer[_TAnyStr] = json # type: ignore[assignment]
default_serializer: _PDataSerializer[str] = json

This seems more sane to me, if you're worried about subclasses providing a bytes serializer as default, you could change it to _PDataSerializer[t.Any] or a very long-winded t.Union[_PDataSerializer[str], _PDataSerializer[bytes]]. As a class attribute it shouldn't really be generic, that's prone to weird errors.

@@ -135,7 +174,9 @@ def secret_key(self) -> bytes:
"""
return self.secret_keys[-1]

def load_payload(self, payload: bytes, serializer: t.Any | None = None) -> t.Any:
def load_payload(
self, payload: bytes, serializer: _PDataSerializer[_TAnyStr] | None = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self, payload: bytes, serializer: _PDataSerializer[_TAnyStr] | None = None
self, payload: bytes, serializer: _PDataSerializer[t.Any] | None = None

The type of the passed in serializer doesn't matter, since it is re-checked.

Comment on lines +20 to +21
_TAnyStr = t.TypeVar("_TAnyStr", str, bytes, covariant=True)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_TAnyStr = t.TypeVar("_TAnyStr", str, bytes, covariant=True)

Same thing here

@@ -166,7 +168,7 @@ def validate(self, signed_value: str | bytes, max_age: int | None = None) -> boo
return False


class TimedSerializer(Serializer):
class TimedSerializer(Serializer[_TAnyStr]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class TimedSerializer(Serializer[_TAnyStr]):
class TimedSerializer(Serializer[t.AnyStr]):

@Daverball
Copy link
Contributor

These variance issues are why you're getting complaints for json and pickle.

@davidism davidism mentioned this pull request Apr 13, 2024
@davidism
Copy link
Member Author

Continued in #376.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 28, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consider making Serializer generic in t.AnyStr for type checking to avoid overly ambiguous return types
2 participants