Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Exponentiation should follow Python semantics (or the current semantic difference should be documented) #4936

Closed
goldenrockefeller opened this issue Jul 30, 2022 · 15 comments · Fixed by #5016

Comments

@goldenrockefeller
Copy link
Contributor

goldenrockefeller commented Jul 30, 2022

Is your feature request related to a problem? Please describe.
Exponentiation is common in scientific code. Currently, the way that Cython processes exponentiation is to have the result type match the operand type. This issue has been brought up many times before (see #2133 for example). This can be surprise to those who choose to compile their code to gain a speed boost, even in Pure Python mode. For example:

def sqrt(x: double):
  return x ** 0.5
print(sqrt(-1)) # -> Prints (6.123233995736766e-17+1j) with Python (3.10.5), and nan with Cython (3.0.0a10 on Windows)

As previously mentioned, this issue have been closed before, but the recommended remedy (which is to add documentation to the Caveats page) has not been done

Describe the solution you'd like
Personally, I would like Cython to compile Pure Python code to match Python semantics as much as reasonable, and I think, since Cython 3 is open to breaking changes, that following Python semantics for exponentiation should be the default action moving forward.

Describe alternatives you've considered
An alternative that I am also happy is to keep the current semantics, but to provide a note in the "Caveats" section of the documentation, in addition to providing a clear compile-time warning whenever the Cython compiler can not guarantee adherence to Python exponentiation semantics.

Additional context
I am willing to make a PR to the Caveats documentation if that is the route the core Cython devs want to go. But, as an aside, my current PR for documentation on __cinit__ has yet to be approved.

@scoder scoder added this to the 3.0 milestone Jul 30, 2022
@scoder
Copy link
Contributor

scoder commented Jul 30, 2022

Phew, this seems difficult to argue with these days. The intention of Cython language semantics is "Python by default, unless you opt out into C". In your example, you kind of opt out into C via typing as double, but we also require @cython.cdivision, so why not require an @cython.cpow as well?

We infer Python float as C double when possible, so having such a semantic difference between the two would render that hard to understand for users. However, ** could be considered an unsafe operation for C double, which would change the inference to apply Python semantics and return a Python float, unless @cpow is enabled. (IIRC, we only do that for integers currently and not for floats.)

I agree that Cython 3.0 is the right time to change this.

@scoder
Copy link
Contributor

scoder commented Jul 31, 2022

Marking as blocker since this is a behavioural change. It should at least be looked at again before 3.0 final is released.

@da-woods
Copy link
Contributor

It seems reasonable to me.

The one thing we might do to keep compatibility is use the existing behaviour if the output is typed (but not inferred) as a C double (which I think would be a bit unusual because normally the output of a function wouldn't depend on its return type. Possibly with a warning to use cpow if it seems like too much of an odd rule.

@da-woods
Copy link
Contributor

Thinking about this more,

The one thing we might do to keep compatibility is use the existing behaviour if the output is typed (but not inferred) as a C double (which I think would be a bit unusual because normally the output of a function wouldn't depend on its return type

Not actually sure this'd work too well, since most of the time power probably appears in more complex expressions

Also, could we make a C "complex or float" type as its return type, that's represented by a C complex, but coerces freely to double if it has 0 imaginary part.

@scoder
Copy link
Contributor

scoder commented Aug 1, 2022

Especially within non-trivial expressions, coercion to Python probably isn't intended and thus not the main issue.

It also feels like a complex result, while mathematically correct, is rather unlikely to be intended (or even expected) when calculating the power of a C double. Although I can imagine that being highly domain dependent.

I agree that changing the output type of an operator based on the values (and not their type) seems unusual. We are lucky that division always goes to floats in Python, so that's different. The power operator is definitely more complex to deal with (we'll keep running into that pun, I guess).

Another issue is the difference between these:

# values from somewhere
base: cython.double = 1234.5
exp: cython.double = 2.0

v1 = 1234.5 ** 2.0
v2 = base ** 2.0

v3 = 1234.5 ** exp
v4 = base ** exp

In the first two cases, it's clear that the result is always a C double, whereas in the expressions with variable exponents, we'd have to handle the complex result case, although the values are exactly the same. That's something that I really dislike about this change. Imagine users moving an exponent into a global C constant for clarity, and getting totally different C code from it.

I'm still not entirely convinced that this is something we should change in Cython. We had cases in the past where we decided for Python correctness even (somewhat) against practicality. But deciding that would need more user input. Maybe something to ask on cython-users.

@h-vetinari
Copy link
Contributor

h-vetinari commented Aug 3, 2022

Imagine users moving an exponent into a global C constant for clarity, and getting totally different C code from it.

Personally, I think this would be completely fine, especially if there'd be some sort of:

Warning: Based on the input types in the exponentiation on line #xyz, Cython cannot rule out that a
         complex result may appear (e.g. for a negative base and non-integral exponent). This means
         that Cython has to add checks to handle this possibility, which will make the code less efficient.
         - If the exponent will never be a non-integer, use an integer type or cast to it
         - If the base will never be negative, do <this>
         - Use types (or constants) that make it possible to infer that the result will never be complex
         - If complex results need to be handled, silence this warning by using cpow

Possible candidates for:

  • <this>: add abs() around the base? Idea by @0dminnimda
    • previously: add assert base >= 0 before -> would require a big new feature ("optimizations based on asserts")

Not sure how hard it would be to do it, but I think it would be a nice usability improvement. Exponentiation is definitely not a rare task IMO, quite the opposite (e.g. calculating distances/approximations for almost anything + moments, integration, power series, etc. etc. etc.).

@da-woods
Copy link
Contributor

da-woods commented Aug 3, 2022

add assert exp >= 0 before

Cython doesn't currently use assert statements to infer which optimizations can happen, so this'd be a new (and potentially tricky) feature (if this is what you're implying should happen)

@h-vetinari
Copy link
Contributor

Cython doesn't currently use assert statements to infer which optimizations can happen, so this'd be a new (and potentially tricky) feature (if this is what you're implying should happen)

Yeah, I expected as much, because that gets tricky. If it's not something that can be done easily (either assert or something else to opt into the non-negative case), it might be worth just documenting things as is.

@0dminnimda
Copy link
Contributor

A little hack may be to enforce abs(exp) as an indication of the exp >= 0
It's already tanslated to fabs/fabsf so shouldn't be very costly, but seems to be easier to implement right now

@h-vetinari
Copy link
Contributor

h-vetinari commented Sep 11, 2022

Marking as blocker since this is a behavioural change. It should at least be looked at again before 3.0 final is released.

Since this is becoming relevant now that #4670 (a.k.a. the previously last-known critical blocker for 3.0) is in, I wondered about the best approach here again, and I still think the warning approach might be the best option.

I updated the comment to include @0dminnimda's idea with abs, though notably this is more important around the base than the exponent, as the critical combination that potentially might produce complex numbers is: negative base & non-integral exponent.

For the purposes of not burdening 3.0.0 with another big feature or overhaul, if the abs idea is not easily doable, I also think that just adding a warning (that using a constant base / an integer exponent / ... would produce more efficient code), would already be good enough.

@0dminnimda
Copy link
Contributor

0dminnimda commented Sep 11, 2022

I don't quite like the @cpow, since it is propogated on a chunks and not signgle expressions e.g. functions or while modules, other idea may be to make ** always use python power and just supply cpow() function which will always use c power
It is least hacky of a solution, but also it is less pretty

@da-woods
Copy link
Contributor

da-woods commented Sep 11, 2022

I don't quite like the @cpow, since it is propogated on a chunks and not signgle expressions e.g. functions or while modules

I don't quite follow this. I think it'd be like cdivision which you could use as locally as you want (with cpow).

(Sorry to Chris Power, who i inadvertently tagged....)

@0dminnimda
Copy link
Contributor

I don't quite like the @cpow, since it is propogated on a chunks and not signgle expressions e.g. functions or while modules

I don't quite follow this. I think it'd be like cdivision which you could use as locally as you want (with cpow).

(Sorry to Chris Power, who i inadvertently tagged....)

Oh, yeah, I forgot about possibility to use cdivision in with statement, sorry

@da-woods
Copy link
Contributor

I'm going to give this a go today though and see if I can get something working... I'm optimistic that it won't be a huge change so can be dealt with fairly quickly

@0dminnimda
Copy link
Contributor

It still will be prettier to have a cpow() function IMHO, but now it's more about preferences and beauty of the code

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants