PEP-484 "numeric tower" approach makes it hard/impossible to specify contracts in documentation #91390
Comments
Here is a major general problem with python-static-typing as it is I would like to clarify one thing in advance: this is a real problem """ So, "code should be easy to reason about". Now, let us look at this function - I am here (mostly) following the === Example 1, original form === def middle_mean(xs):
"""Compute the average of the nonterminal elements of `xs`. Args: Returns: Raises: Let's not discuss performance, or whether it makes sense to readily Given the function as it is above, I can make statements that are === Theorem 1 ===
...then we are guaranteed that
Now, following PEP-484, we would want to re-write our function, adding type annotations. === Example 1, with mechanically added type information === def middle_mean(xs: List[float]) -> float:
"""Compute the average of the nonterminal elements of `xs`. Args: Returns: Raises: (We are also deliberately not discussing another question here: given So, given the above form, we now find that there seems to be quite a === Example 1, "cleaned up" === def middle_mean(xs: List[float]) -> float:
"""Compute the average of the nonterminal elements of `xs`. Args: Returns: Raises: But now, what does this change mean for the contract? Part of the "If === Theorem 1b ===
...then we are guaranteed that
...but the actual situation is: a) This is not guaranteed. Specifically, this Python3+ example breaks it - note that PEP-484 >>> middle_mean([0.0, 1.25, 10**1000, 0.0])
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 15, in middle_mean
OverflowError: int too large to convert to float So, even the "this function call evaluates to..." is violated here - One option to address this would be to stick with the code as in Now, while this may superficially look like a weird edge case, that is "Smashing The Stack For Fun And Profit" (aleph1@underground.org, Phrack #49) "Delivering Signals for Fun and Profit" (Michal Zalewski) CAPEC-71: Using Unicode Encoding to Bypass Validation Logic So, clearly, we want to not only maintain the ability to reason about |
Thanks for your report, but I would appreciate a more concise explanation. Let me try to rephrase the problem. Given this function: def mean(x: list[float]) -> float:
return sum(x) / len(x) We want to provide a guarantee that if x is a nonempty list containing only floats, the function returns successfully and returns a float. But the type system currently doesn't give this guarantee, because PEP-484 specifies that ints are compatible with floats, and --- We generally discuss issues with the general type system over at https://github.com/python/typing/issues, but here are a few thoughts:
|
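For illustration, the rephrased problem can be reproduced with the `mean` function exactly as given in the comment above. This is only a minimal sketch, assuming CPython 3.9 or later; the input list and the `outcome` variable are made up for the demonstration and do not come from the thread.

```python
def mean(x: list[float]) -> float:
    return sum(x) / len(x)


# PEP 484 treats int as compatible with float, so a static checker
# accepts this list even though one element is a huge int.
values = [1.5, 10**1000]

try:
    mean(values)
except OverflowError as exc:
    # Adding the float and the huge int forces an int-to-float conversion.
    outcome = f"OverflowError: {exc}"
else:
    outcome = "returned normally"

print(outcome)
```

The call type-checks, yet it does not return a float: the addition inside `sum` has to convert the large int to a float and fails.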
This is a partial duplicate of an issue you already filed: https://bugs.python.org/issue47121 where math.isfinite(10**1000) raises an OverflowError even though it type checks. Here was one of the comments: |
This is not a partial duplicate of https://bugs.python.org/issue47121 about math.isfinite(). The problem here is: There is a semantic discrepancy between what the issubclass(type(x), float) (I am deliberately writing it that way, given that isinstance() can, in general [but actually not for float], lie.) and what the term 'float' means in a statically-checkable type annotation like: def f(x: float) -> ... : ... ...and this causes headaches. The specific example ('middle_mean') illustrates the sort of weird So, basically, there is a choice to make between these options: Option A: Give up on the idea that "we want to be able to reason with Option B: Accept the discrepancy and tell people that they have to be Option C: Realizing that having "float" mean different things for Also, there is Option D: PEP-484 has quite a lot of other problems Basically, Option B would spell out as: 'We expect users who use def foo(x: float) -> float:
"""Returns the foo of the number `x`. Args: Returns: ...which actually is shorthand for...: def foo(x: float # Note: means float-or-int
) -> float # Note: means float-or-int
:
"""Returns the foo of the number `x`. Args: Returns: Option C (and perhaps D) appear - to me - to be the only viable Option C would amount to changing the meaning of...: def foo(x: float) -> float:
"""Returns the foo of the number `x`. Args: Returns: to "static type annotation float really means instance-of-float here" (a) ("this is supposed to strictly operate on float") Args: Returns: (b) ("this will eat any kind of real number") def foo(x: numbers.Real) -> numbers.Real:
"""Returns the foo of the number `x`. Args: Returns: (c) ("this will eat any kind of real number, but the result will always be float") def foo(x: numbers.Real) -> float:
"""Returns the foo of the number `x`. Args: Returns: (d) ("this will eat int or float, but the result will always be float") def foo(x: Union[int, float]) -> float:
"""Returns the foo of the number `x`. Args: Returns: (e) ("this will eat int or float, and the result will also be of that type") def foo(x: Union[int, float]) -> Union[int, float]:
"""Returns the foo of the number `x`. Args: Returns: (f) ("this method maps a float to a real number, but subclasses can def myfoo(self, x: float) -> numbers.Real:
"""Returns the foo of the number `x`. Args: Returns: |
Please try to make your messages more concise. |
Re AlexWaygood: If these PEP-484 related things were so obvious that they would admit a compact description of the problem in 2-3 lines, these issues would likely have been identified much earlier. We would not be seeing them now, given that Python by and large is a somewhat mature language. |
Closing this as being unspecific and unactionable. To develop the ideas further consider launching a discussion on python-ideas or the typing-sig. If something matures, it would likely require a PEP because it changes decisions made in earlier typing PEPs. One cannot just wish away the differences between reals in mathematics, Real in numbers.py, floats as used in typing, and the concrete float type implemented in the core language. There are practical limits on how much they can be harmonized without breaking existing code and tools. |
(I would like to comment on this claim: "One cannot just wish away the differences between reals in mathematics, Real in numbers.py, floats as used in typing, and the concrete float type implemented in the core language. There are practical limits on how much they can be harmonized without breaking existing code and tools." Right now, we are in a situation where this function...: def float_identity(x: float) -> float: ...fails to give two important guarantees:
Note that this is a Python-specific problem: Python actually also has a notion of IEEE-754-1985 binary64 compliant floating point numbers - these are objects satisfying issubclass(type(x), float), which includes instances of subclasses such as numpy.float64. Python is special - and indeed unique - in that it has a second notion of Ad: "To develop the ideas further consider launching a discussion on python-ideas or the typing-sig" - such procedural detail changes nothing about the observation that this quite clearly is a relevant open bug. Python is a popular programming language in higher education. One important aspect of teaching computing is teaching the skill to reason about the behavior of code. With discrepancies such as this that convey a message along the lines of "reasoning about code is difficult, so let's just forget about that", Python unnecessarily positions itself as unsuitable for teaching computing. |
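A minimal sketch of the mismatch discussed in the comment above. The body of `float_identity` is not shown in the thread, so this assumes it simply returns its argument; the `result` variable is likewise only for illustration.

```python
def float_identity(x: float) -> float:
    # Assumed body: the thread only shows the signature.
    return x


# Accepted by PEP 484 checkers, because int is compatible with float.
result = float_identity(10**1000)

# At runtime the value is still an int, not an instance of float.
print(type(result).__name__)
print(issubclass(type(result), float))
```

The annotation says `float` on both sides, but the runtime check `issubclass(type(result), float)` does not hold for the returned value.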
Thomas, I agree with you that the current situation is broken by design.
At the time it was suggested, I had grave misgivings about the wisdom of
treating float as an alias for Union[float, int] instead of creating a
new static type, but I got seduced by the convenience of not having to
import from typing.
It means that if a value type-checks statically as a float, you cannot
tell if it has float methods fromhex() and is_integer(). You cannot even
pass such a "static float" to float() safely.
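A small sketch of the two points above, assuming CPython 3; the helper name `probe` and its return format are hypothetical and chosen only for this illustration.

```python
def probe(x: float) -> dict:
    """Report whether a statically typed 'float' behaves like a float."""
    info = {"has_fromhex": hasattr(x, "fromhex")}
    try:
        float(x)  # can fail for ints too large for a binary64 float
        info["float_ok"] = True
    except OverflowError:
        info["float_ok"] = False
    return info


print(probe(1.5))
print(probe(10**1000))  # type-checks as float under PEP 484
```

Both calls satisfy the annotation, but only the first value has `fromhex()` and survives the conversion with `float()`.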
But it is very hard to change the status quo in Python, even when it is
broken. Some people will dispute that it is broken. Expect to be accused
of prioritising theoretical purity over practicality. You cannot counter
that with theoretical arguments. You have to find practical, real-world
examples.
More reasonably, we can't easily break backwards compatibility for those
who are currently relying on small ints to statically type check as
floats. Even if we all agreed with you that this was a problem that
needs fixing, we can't easily fix it.
What can be done? On the bug tracker, nothing. If you wish to persevere,
I recommend you start with the Typing SIG mailing list. If you can get
the typing community, or the numpy/scipy/pandas community, to agree that
this design is a mistake, and agree that it should be changed, then we
can move forward with a PEP to deprecate the current behaviour and
eventually change it.
Without agreement from the typing and/or numpy communities, I will say
you have no hope of getting change here.
|
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.