Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add typing with mypy #186

Merged
merged 1 commit into from Aug 31, 2020
Merged

add typing with mypy #186

merged 1 commit into from Aug 31, 2020

Conversation

davidism
Copy link
Member

@davidism davidism commented Aug 28, 2020

Implementation notes:

  • Didn't import types individually, used import typing as _t to shorten things.
  • Common types are aliased in a if TYPE_CHECKING: block and referenced as string names.
  • Only a few types (really just str_bytes) were common between modules, didn't bother with a common _typing module.
  • All generics that aren't in the common block are strings to avoid runtime cost. This won't be necessary once we drop 3.6.
  • The return_timestamp parameter of TimestampSigner.unsign changes the return type. To distinguish these, @overload is used, but because the method takes some other optional parameters, many overloads are needed to cover every combination. I added the overloads that matter, as mypy does use that to figure out a type elsewhere, but ignored the finding about incompatible overlap.
  • Flake8 has a special case so @typing.overload doesn't trigger a redefinition error, but it has to be literally typing.overload, _t.overload isn't recognized. So had to ignore that Flake8 finding.
  • Mypy doesn't allow assignment of modules or classes for Protocol, so Serializer.serializer has the Any type for now. See Allow using modules as subtypes of protocols python/mypy#5018.

Findings and future work:

  • TimedSerializer.loads and loads_unsafe have incompatible signatures with Serializer.loads because extra parameters were added before the salt parameter. This violates the Liskov substitution principle, and should probably be migrated with *args and a deprecation warning at some point. I added a TODO in the code.

  • Between removing Python 2 compat helpers and adding typing, I'm more convinced that accepting bytes and str interchangeably everywhere is not good. Python 3 emphasizes understanding the boundary between the two.

    Because pretty much every single point in the ItsDangerous API accepts either, want_bytes is called over and over again, even where it's redundant because an earlier function already called it. I already moved wants_bytes around to get a few spots that were missed. This isn't a huge deal in ItsDangerous, but it's probably hurting performance in Werkzeug where it happens much more often.

    It's still probably useful to accept both as the data passed to Serializer.loads, Signer.sign, and Singer.unsign, since you might be signing either bytes or text, and received data to be loaded might be bytes or text (ASGI vs WSGI, for example). Everything else should probably be bytes only since that's how they're used.

cc @pgjones

@davidism davidism added this to the 2.0.0 milestone Aug 28, 2020
from .signer import Signer

if _t.TYPE_CHECKING:
_t_str_bytes = _t.Union[str, bytes]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be better to stick commonly used types in a separate module/file?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this, but ultimately decided it wasn't worth the further indirection.

Only a few types (really just str_bytes) were common between modules, didn't bother with a common _typing module.

For a bigger library it might make more sense.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense. I find that it helps to stick them in the same file and for all usages to reference a single definition because it encourages consistency.

Copy link

@teymour-aldridge teymour-aldridge left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A note on possibly reorganising things.
Github made me write this comment to submit the review.

@teymour-aldridge
Copy link

Oh, looks like that comment wasn't needed???

if isinstance(s, str):
s = s.encode(encoding, errors)

return s


def base64_encode(string):
def base64_encode(string: "_t_str_bytes") -> bytes:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely a stylistic question, but I find this tricky to read and understand. The type hints for me are much better when explicit for documentation purposes.

Also is AnyStr not appropriate for this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got tired of writing the full syntax over and over again. This one is fairly simple to replace, but since I was making aliases anyway for more complex types, and this was used so much, I made it an alias too. If you think this is too distracting I can change it, although I'd want to know what the cutoff is for using an alias or not. Note that PyCharm and Mypy both report the real type, not the alias.

AnyStr is a TypeVar, so its intended use is for ensuring consistency across arguments/return. The Union alias is more "anything goes", which represents how ItsDangerous works right now. See my discussion on str and bytes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the types are an additional piece of documentation (they also show up to developers, mostly when using IDEs and sometimes in documentation) so it's nice if the names are meaningful.

Copy link
Member Author

@davidism davidism Aug 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, Sphinx does not dereference the aliases, so it does look weird there. Maybe there's a Sphinx plugin for this?

Thinking about annotating Flask.make_response, I really wouldn't want to inline the huge annotation for what it accepts, especially since that would be useful as an annotation for view functions in user code. So hopefully there's a way to make the Sphinx output clear.

@davidism
Copy link
Member Author

Moving the docs discussion out of review. It seems that Sphinx autodoc relies on typing.get_type_hints(), which fails since the aliases are guarded in a TYPE_CHECKING block. That was me just experimenting with forward references before we drop 3.6, but I don't have to do that, I'll change it to use runtime annotations.

@davidism
Copy link
Member Author

davidism commented Aug 28, 2020

The parameter annotations look really messy:

image

I think I'm going to set autodoc_typehints = 'none'.

@teymour-aldridge
Copy link

I think we should have the aliases (e.g. for Union[bytes, str] and the like) but they should be given more intelligible names (such as "StrOrBytes" or something similar) rather than "_t_str_bytes" which is harder to understand.
My guess is that the use of a secret key shows up in multiple places (I confess I have limited understanding of this library) so perhaps creating an alias to be used universally would be useful?

@davidism
Copy link
Member Author

The _t_ prefix is there for the same reason I used import typing as _t, to use a short name and keep everything related to types organized under it. The name is only referenced internally, it's not for public consumption.

Didn't import types individually, used import typing as _t to shorten things.

@teymour-aldridge
Copy link

I'd argue that the types are public because they show up to people using them in development. Keeping a distinction sounds good though.

@davidism
Copy link
Member Author

davidism commented Aug 28, 2020

The typing information is public in that it's now available when a project uses a type checker, but the exact aliases we used aren't. If there was some alias that we expected users to use in their own code, then we would make it a public name. For example, the argument to Flask.make_response would be a publicly importable alias. Type checkers like Mypy and PyCharm show the real types, not the internal aliases.

@davidism
Copy link
Member Author

davidism commented Aug 28, 2020

Sphinx 3.0 added autodoc_typehints = 'description' which looks much nicer than 'signature'. Still messy for long types, but better organization.

Although for some reason it skips __init__ annotations, which makes it less useful. Had to set autoclass_content = "both" since the default used the class doc and ignored the init doc.

@jab
Copy link
Member

jab commented Aug 28, 2020

Just in case, does https://github.com/agronholm/sphinx-autodoc-typehints help with anything here?

@davidism
Copy link
Member Author

davidism commented Aug 28, 2020

I saw it but didn't try it. autodoc_typehints = 'description' seems to do the same thing and is built-in now.

@davidism
Copy link
Member Author

davidism commented Aug 30, 2020

Since we were asking about it, I tried replacing the Union[str, bytes] with AnyStr, but it failed. The problem is that a TypeVar is collapsed to a single type that all other uses of it in the same context must match, and that's too strict for what ItsDangerous does. Unfortunately, when you do have something that uses AnyStr, Mypy doesn't recognize it as valid to pass to Union[str, bytes]. No clue what to do about that. Hey look, I ran across Teymour's question about it 😅 https://stackoverflow.com/questions/62673777/how-to-pass-a-value-of-type-unionbytes-str-to-function-taking-anystr

I'm trying to add types to the tests, but it's fairly frustrating as I do a lot of little weird things in the tests. For example, even if you annotate serializer: Serializer, Mypy thinks it doesn't have a signer attribute, even though that's annotated and works elsewhere. I have been thinking of writing a more flat test layout that doesn't rely on class fixtures and patching so much, maybe this is an indication that's a good idea. I'll try a bit more on the tests then just add ignores for now.

@teymour-aldridge
Copy link

I think it's a bug in mypy. There's a few of them in mypy which have been around for a really long time (the inability to make typed NamedTuple's generic is another one).
Hopefully they'll fix them 🤞🏼 – fixing them seems a bit beyond me at the monent 😄.

@davidism davidism force-pushed the typing branch 2 times, most recently from 21719b3 to fc948ac Compare August 31, 2020 18:30
@davidism
Copy link
Member Author

davidism commented Aug 31, 2020

I ignored the jws module since it's deprecated, and configured tests to check untyped functions without requiring types. Added a py.typed file.

@davidism davidism marked this pull request as ready for review August 31, 2020 18:50
@davidism davidism merged commit 4559e76 into master Aug 31, 2020
@davidism davidism deleted the typing branch August 31, 2020 18:50
@davidism
Copy link
Member Author

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 9, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants