Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lambdify not working with constant expression #24036

Closed
callegar opened this issue Sep 9, 2022 · 10 comments
Closed

lambdify not working with constant expression #24036

callegar opened this issue Sep 9, 2022 · 10 comments

Comments

@callegar
Copy link

callegar commented Sep 9, 2022

Lamdify gives wrong result when called on an expression that is constant with respect to the given argument. Namely:

import numpy as np
import sympy as sym

x=sym.symbols('x')
expr = x/x
f = sym.lambdify(x, expr, modules='numpy')
f(np.array([1,2,3]))

returns 1 rather than array([1, 1, 1]).

In other words, if the expression is constant in the arguments passed to lambdify (and you may not know in advance), then the function produced by lamdify does not return an item sized as its input as expected.

@ThePauliPrinciple
Copy link
Contributor

I don't think it is very well defined what lambdify should and should not do.

Consider for example this use-case:

from sympy *
x, i = symbols("x i")
s = Sum(Indexed('x', i), (i, 0, 2))
print(lambdify(x, s, modules='numpy')([1,2,3]))
expr = s/s
print(lambdify(x, expr, modules='numpy')([1,2,3]))

Here one may perfectly well expect just a single number (the s/s division might come from some simplification). There would not be a way for lambdify to know which case it is dealing with (yours or the above) because all the information is lost by the time it arrives there.

Maybe this helps for your particular case:

from sympy.abc import x,y
from sympy import Mul, Pow, lambdify
import numpy as np

expr = Mul(x, Pow(x,-1), evaluate = False)
f = lambdify(x, expr, modules='numpy')
f(np.array([1,2,3]))

@callegar
Copy link
Author

callegar commented Sep 9, 2022

If lambdify is meant to work with modules='numpy' then I think that it should play as consistently as possible with what numpy does for its vectorized functions. In numpy if you define

import numpy as np

@np.vectorize
def f(x):
    return 1 

that is if you try to vectorize a function whose return value is based on an expression that is constant wrt the function argument and then you look for f([1,2,3]) you get array([1, 1, 1])

Furthermore, having the output shape depending on the expression, where the expression may be unknown at the time of coding (i.e., dynamically derived by the code execution depending on its inputs) adds complexity because it compels you to check the result of the invocation of the lamdified function before using it (you cannot be sure whether it will be a scalar or an array). Keeping things unsimplified to solve the issue seems rather inefficient in many cases (unless some trick is used during the lambdification to avoid the overhead of computing a more complex expression than needed).

@bjodah
Copy link
Member

bjodah commented Sep 9, 2022

This is a recurring issue, since lambdify is akin to a swiss army knife, it's hard to get it to play nicely/intuitively with all inputs. If you manually call numpy.vectorize on the result of lambdify, do you achieve what you wanted?

I agree that we need a tool (multiple tools probably) that perform less magic compared to lambdify, I think it's mostly a matter of someone stepping up and writing those.

@callegar
Copy link
Author

callegar commented Sep 9, 2022

lambdify is akin to a swiss army knife, it's hard to get it to play nicely/intuitively with all inputs

I agree to this and understand the difficulty. Still the status quo seems to me not really a good compromise, because when expr is generated dynamically, there may be a requirement to explicitly check the output for being scalar/vector, while always having an output like the input would make the behavior simpler to understand and deal with.

@bjodah
Copy link
Member

bjodah commented Sep 9, 2022

Sure, I'm not against fixing this. Have you seen #5642?

Feel free to open a pull request based on one of the suggested fixes there. It's often easier to discuss possible remedies when we have some code to look at. I wouldn't be surprised if we need to require modules='numpy-broadcast' or something similar to make sure we don't break backwards compatibility.

@asmeurer
Copy link
Member

This is a duplicate of #5642. See the discussion there for more details about this issue and potential fixes.

@bjodah
Copy link
Member

bjodah commented Oct 11, 2022 via email

@asmeurer
Copy link
Member

I wouldn't be surprised if we need to require
modules='numpy-broadcast' or something similar to make sure we don't
break backwards compatibility.

I'm more concerned about performance than backwards compatibility. Although we can symbolically detect when it is needed and the situations are rare (only when a symbol doesn't appear in an expression, or when the expression is an array-of-arrays type construct).

@bjodah
Copy link
Member

bjodah commented Oct 12, 2022

wow, that email reply I made took almost a month to trickle through github's servers...

Yes, handling broadcasting in Python code can be surprisingly expensive, but I guess we would need benchmarks either way to say for sure.

@asmeurer
Copy link
Member

The experience with #14671 has made me very cautious. Even calling array on all inputs caused a major performance regression. I expect calling broadcast_arrays will be even worse.

Also, I just checked, and Numba doesn't support np.broadcast_arrays. So if we include it, the lambdified functions won't be jittable anymore.

So I'm definitely starting to favor the idea of adding this as a separate, non-default option/module/printer (not sure which).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants