New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lambdify of constant functions #5642
Comments
I can confirm this issue. I think we could fix the issue better than using the ways that you suggest. Remember that you can edit the lambdify code to fix this (that's why you reported the issue to us, isn't it). lambdify(x, 2) basically returns a wrapper around lambda x: 2 (see lambdastr in sympy.utilities.lambdify). Thus, I think we could add vectorization in the constant case to the lambda expression itself. Something like
where printer=numpy_lambdastr would be sent to lambdastr when numpy is used in lambdify. I hope I am using vectorize properly here. It seems to work:
numpy_lambdastr would recursively call the normal lambdastr for the internal lambda. I don't know much about numpy, so you or someone else should look to see if this looks like a good solution. Also, I admit I don't know much about lambdify (I basically just read the code for the first time just now, and I've never used it in practice). Would it be better to have a lambda inside the vectorize (outer) lambda, or to have a lambdified function in there? The latter would involve recursively calling lambdify() instead of recursively calling lambdastr(). Status: Accepted Original comment: http://code.google.com/p/sympy/issues/detail?id=2543#c1 |
So I had a little chat with some people here at the SciPy conference, one of whom is a NumPy developer, and their opinion is that you should take advantage of this trivial case and of broadcasting and use the scalar answer returned to your advantage (it has less memory). Original comment: http://code.google.com/p/sympy/issues/detail?id=2543#c2 |
Original comment: http://code.google.com/p/sympy/issues/detail?id=2543#c3 |
I am affected by this behavior since years. Should I try to submit a PR? I documented my perspective to the problem: |
Interesting. How would you propose to fix the problem. I wonder if there is a better numpy function you could use than the |
This looks relevant. |
That link solved my problem. Thank you! Still I consider this as workaround. IMHO it would be cleaner if lambdify (optionally) would return a function which is vectorized independently of the expression (which might depend on runtime data). Idea (not veryfied):
.
|
@bjodah @moorepants what do you think? Have you ever been affected by this issue? |
Kind of, if But consider also that |
Is there a recommended workaround for this issue? I'm attempting to plot some vector fields, and |
Here is a simple example, for reference >>> lambdify(Matrix([x, y]), Matrix([x, y]))(np.array([1, 2]), np.array([2, 3]))
array([[[1, 2]],
[[2, 3]]])
>>> lambdify(Matrix([x, y]), Matrix([x, 2]))(np.array([1, 2]), np.array([2, 3]))
array([[array([1, 2])],
[2]], dtype=object) |
I think to fix this we just need to modify the NumPyPrinter for Matrix. Right now it just prints it directly as an array: sympy/sympy/printing/pycode.py Lines 405 to 409 in 09a1cb3
But it needs to be smarter. I'm not sure what the proper way to print this in general should be. As a workaround, you might be able to use a wrapper. See the above comments. |
Unfortunately, I'm not able to get a jagged array out of lambdify in order to do the broadcasting recommended earlier, as it crashes during the evaluation. Here's a short working example: import sympy as sp
import numpy as np
import matplotlib.pyplot as plt
def draw_vector_field(f, x, **kwargs):
sim_dim = 20j
X2, X1 = np.mgrid[-2:2:sim_dim, -2:2:sim_dim]
# Create a computational function from the symbolic one
f_comp = sp.lambdify(x, f)
# Compute the numerical values of the function
F = f_comp(X1, X2)
print(F.shape)
F1, F2 = F[0], F[1]
# Eliminate excess dimension of size 1
F1 = np.squeeze(F1, axis=0)
F2 = np.squeeze(F2, axis=0)
fig0, ax0 = plt.subplots()
ax0.quiver(X1, X2, F1, F2, **kwargs)
x = sp.Matrix(sp.symbols('x_0, x_1'))
f1 = sp.Matrix([[x[0]],[x[1]]])
f2 = sp.Matrix([[0],[-x[1]]])
# This works
draw_vector_field(f1, x)
plt.show()
# This fails
draw_vector_field(f2, x)
plt.show() Outputs:
I guess one workaround might be to loop through all entries of |
I met this problem today, and the search lead me here. Is there any conclusion? thanks import numpy as np
import sympy as sp
def lambdify(symbols, expression):
"""Specialized lambdify routine avoiding the unused argument issue.
A common sympy/python lambda expression 'f = lambda x: 2' has the
problem that f(x) returns a float 2 even if x was a numpy array.
Thus we loose all information on the shape of x.
"""
if type(symbols) not in [type((1,)), type([])]:
symbols = [symbols]
f = sp.lambdify(symbols, expression, "numpy")
if not all([ i in expression for i in symbols ]):
f = np.vectorize(f)
return f
t = sp.symbols('t', real=True)
x = sp.cos(t)
y = sp.sin(t)
z = t - t
R = sp.Matrix([x, y, z])
lambdaR = lambdify(t, R)
s = np.arange(0, 2*np.pi, np.pi/4)
cor = lambdaR(s)
print(cor) says Traceback (most recent call last):
File "test_lambdify.py", line 34, in <module>
cor = lambdaR(s)
File "E:\prg\py\Anaconda3_64\lib\site-packages\numpy\lib\function_base.py", line 2576, in __call__
return self._vectorize_call(func=func, args=vargs)
File "E:\prg\py\Anaconda3_64\lib\site-packages\numpy\lib\function_base.py", line 2655, in _vectorize_call
res = array(outputs, copy=False, subok=True, dtype=otypes[0])
ValueError: setting an array element with a sequence. |
I got caught by this issue recently myself. I used my_grad = sympy.zeros(len(params), 1)
for i in range(len(params)):
my_grad[i] = sympy.diff(my_func, params[i]) If all elements of |
Judging from the comments here (over many years) it seems that this is an ongoing problem for many users. I don't really know what lambdify is or what it is used for though. The docs say
https://docs.sympy.org/latest/modules/utilities/lambdify.html#sympy.utilities.lambdify.lambdify However @bjodah's comment above makes it seem like we can't just assume we are dealing with numpy which makes it difficult to implement a numpy-specific fix. Does anyone have a clear idea of what the fix should be? |
This is also related to #18824 |
I wonder if lambdify should use a wrapper like the one given by @rouckas automatically whenever numpy is a module. |
I got caught by this issue as well. With
doing
while doing
Any suggested workaround? |
@EmilMadsen90 The problem in your case is in the definition of
I assume that by |
Thanks for the fast answer. I see that my examble was maybe a bit too simple based on what I'm actually doing. Sorry for that.
|
If I understand your description, you want to evaluate I see two possible solutions: first, you can apply the "hack" I suggested before to each element of N as follows: or, I would suggest reorganizing the arrays so that f and g are always applied to pairs of values and use numpy.vectorize to do the vectorization.
now
so the first matrix is at index |
I guess that the reason this issue remains unresolved is that it is not generally possible to apply numpy's broadcasting rules in lambdify. Otherwise the broadcast suggestion about would be the obvious approach: def broadcast(fun):
return lambda *x: numpy.broadcast_arrays(fun(*x), *x)[0] That should solve all the simple cases where lambdify is used to apply a scalar function to an array of values efficiently which I think is what most complaints above are about. The problem though is that lambdify is not always used to broadcast a scalar function and we can have things like: In [17]: X = IndexedBase('X')
In [18]: expr = X[1] + X[2]
In [19]: expr
Out[19]: X[1] + X[2]
In [20]: f = lambdify(X, expr, modules='numpy')
In [21]: A = np.array([1, 2, 3])
In [22]: f(A)
Out[22]: 5 Here although the input is an array of shape In [31]: B = MatrixSymbol('B', 3, 1)
In [32]: expr = B.T * B
In [33]: f = lambdify(B, expr, modules='numpy')
In [34]: f(np.array([1, 2, 3]))
Out[34]: 14
In [35]: f(np.array([[1], [2], [3]]))
Out[35]: array([[14]]) This means that where we allow an array to be substituted for an This mixed interpretation of array inputs as sometimes being for broadcasting a scalar expression and other times to replace a non-scalar component of the expression seems quite poorly defined to me. I'm sure that there must be other edge cases where it isn't really possible to come up with a unique sensible choice of output. To me it would seem cleaner if there were separate functions for broadcasting scalar expressions vs substituting arrays for non-scalars rather than having both actions in the same function |
The matrix question has come up before #19259. I wonder if there is something we can do to improve that, or at least document the correct way to achieve it. |
With my use of lambdify in the past years I'm surprised not to have ran into this issue sooner. Now that I have, and find no practical workaround (for my application), I am extremely frustrated. |
Does the broadcast workaround not work? def broadcast(fun):
return lambda *x: numpy.broadcast_arrays(fun(*x), *x)[0] |
Maybe if all symbols given to |
Since I've been living with this issue for a little bit, I thought I'd share one of my work-arounds for the matrix case as a third alternative to the two already shared by @rouckas def lambdify_matrix(M, syms, xs):
assert(isinstance(M, Matrix))
return np.block([[[np.broadcast_to(lambdify(syms, M[i,j])(xs), xs.shape)]
for j in range(M.cols)] for i in range(M.rows)]) The second of @rouckas isn't really usable for me due to the performance impact of using |
Thanks @oscarbenjamin for this quick answer. Here is a minimal example for my application:
Here it just builds a collection of parametric matrices, which I would ultimately want in a (n,m,nx) or (nx,n,m) array. As written here, I end up with an array of objects containing some integers and some arrays of floats. The simple broadcast workaround fails when the function is called because of dimensions mismatch. |
It should be possible to make this work by modifying the lambdify printer for Matrix so it gives @akirakyle's code (except just include the recursive lambdified printed code for each element). |
I tried to implement a class like I suggested 20 days ago, but the result is disappointing. It doesn't really scale to something with more than one variable, and may become complicated.
This approach seems to work and appears to be easier to implement for a flexible context. |
I'm not sure I understand what you mean. The title of the issue is about constant functions like |
I think the point is that in practice, people are using It's true if you do something like |
The primary issue here, as I understand it, is in lambdifying matrices, where the variables in the matrix are themselves are arrays. When >>> import numpy as np
>>> np.array([np.array([1, 2]), np.array([3, 4])])
array([[1, 2],
[3, 4]])
>>> np.array([np.array([1, 2]), 3])
<mypython-5>:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
np.array([np.array([1, 2]), 3])
array([array([1, 2]), 3], dtype=object) When SymPy lambdifies something like >>> import inspect
>>> print(inspect.getsource(lambdify(x, Matrix([[x, 1], [1, 0]]))))
def _lambdifygenerated(x):
return (array([[x, 1], [1, 0]])) What we really want is to broadcast each list element so that it treats it in the first way. I think the issue here is the So we should either:
Personally I think the former option is better for a fix in SymPy, although for a user workaround until this is fixed, the latter is the best approach. SymPy also is capable of detecting if the inputs are constant or not with respect to inputs, so it can simplify the broadcasting calls to only those entries that need it. For example, for array([[x, full_like(x, 1)], [full_like(x, 1), full_like(x, 0)]]) In general, it would need to broadcast the inputs to get the shape for the constant expressions like shape = broadcast([x1, x2, ...]).shape # x1, x2, ... are the inputs to the lambdified function
return array(...) # Where ... uses full(..., shape) for constant inputs, or broadcast_to for inputs that use only a subset of the input variables Things to get more complicated when you have multiple variables, because some expressions might have only a subset of them, but those expressions should still be broadcast to the shape for all the input variables. One potential issue I see with this is that it assumes that the input always must be broadcast to a common shape. But I'm not sure if there might be cases when you wouldn't want this. For example, if the input is a completely constant matrix, it might be surprising if (*) Except for the n x m matrices created in this way, you end up with shapes like |
Hi @asmeurer , and thank you for all the effort you spend on this issue.
This currently works great as long as all symbols appear in the expression. But this situation a numpy user would expect I hope this helps.
You make a good point here. I think this currently ends up |
The reason I don't see this as an issue is that in this case, things should broadcast correctly. The resulting shapes of I suppose you are expecting here for lambdify to act like a >>> from sympy.utilities.autowrap import ufuncify
>>> ufunc = ufuncify((x, y, z), x*z)
>>> ufunc(xs[:,None,],ys[None,:],zs).shape
(10, 20)
I would expect printing the appropriate |
This works around an issue with sympy where its lambda returns a scalar instead of an array when the expression evaluates to a scalar: sympy/sympy#5642
* Symbolic parametric pulse This PR makes following changes. 1. add `_define` method to ParametricPulse. This method is expected to be implemented by each ParametricPulse subclass. Subclass must return Sympy or symengine symbolic equation that can be serialized. This method is called once in the constructor to fill `.definition` attribute of the instance. This definition is used for QPY serialization. 2. change behavior of `get_waveform` This method is originally implemented as abstractmethod that calls another callback that generates numpy array of samples (thus not serializable). Now this is a baseclass method that generates waveform array from the symbolic equation. 3. minor updates Now pulse parameters are not direct class attribute. Parameter names are defined as class attribute `PARAMS_DEF` and values are stored as instance variable `param_values`. This is good for writing serializer since it doesn't need to know attribute of each subclass, but just can call name def and values to get data to serialize. * Improve performance of parametrized pulse evaluation * Implement Constant pulse using Piecewise This works around an issue with sympy where its lambda returns a scalar instead of an array when the expression evaluates to a scalar: sympy/sympy#5642 * Fall back to sympy for parametric pulse evaluation * Use sympy Symbol with sympy lambdify * WIP: cache symbolic pulse lambda functions * move lambdify to init subclass for called once * turn parameter properties into attribute and remove slots * remove attribute and add __getattr__ and readd slots for better performance. * move symbolic funcs directly to _define * Convert numerical_func to staticmethod Co-authored-by: Will Shanks <willshanks@us.ibm.com> * remove symbolic add constraints, numerical func is renamed to definition for better integration with circuit instruction * revert changes to parametric pulse and create new symbolic pulse file * add ITE-like program * sympy runtime import * update test notebook * add attribute docs * fix non-constraints * remove pulse type instance variable * use descriptor * remove redundant comment * Update - Add symbolic pulse to assemble - Add symbplic pulse to parameter manager - Remove risefall ratio from GaussianSquare paramters * keep raw data for QPY encoding * update unittest * update helper function name * add reno * remove notebook * fix lint * fix unittest and logic * add more docs * review comments * lint fix * fix documentation * minor drawer fix * documentation upgrade Co-authored-by: Daniel J. Egger <38065505+eggerdj@users.noreply.github.com> * review comment misc Co-authored-by: Will Shanks <willshanks@us.ibm.com> * remove abstract class methods This commit removes class methods for symbolic expression so that SymbolicPulse can be instantiated by itself. And descriptors only saves Dict[Expr, Callable], i.e. lambda cache, rather than enforcing name-based mapping. * add error handling for amplitude * treat amp as a special parameter * Remove expressions for amplitude validation * support symengine - symengine doesn't support lamdify of complex expression, e.g. DRAG pulse - symengine doesn't support Less expression with complex value substitution - symengine Lambda function doesn't support args in mixture of complex and float object To overcome these, envelope expressions are separately stored for I, Q channel, and evaluation of 'amp' is excluded from symbolic operation. Thus in this implementation symbols are all real numbers. * use real=False option * review comment Co-authored-by: Will Shanks <willshanks@us.ibm.com> * - fix attribute - duration and amp are required - instantiate subclass with expression - remove __dict__ usage in __getattr__ and readd __slots__ - fix documentation - update private attirbute names * undo change to requirements-dev * fix __getattr__ mechanism Directly accessing to the instance variable crashes copy and deepcopy because when the object is copied the __init__ is not called before the first getattr is called. Then copied instance tries to find missing (not initialized) attribute in recursive way. use of __getattribute__ fixes this problem. * fix type hint reference * simplification * move amp from constructor to `parameters` dict * review comment Co-authored-by: Will Shanks <willshanks@us.ibm.com> * fix bug - use getattr in __eq__; e.g. Waveform doesn't have envelope - fix parameter maanger in addition, this replaces _param_vals and _param_names tuple with _params dictionary because overhead of generating a dict is significant. * fix typo * add eval_conditions to skip waveform generation * fall back to sympy lamdify when function is not supported * documentation update Co-authored-by: Will Shanks <willshanks@us.ibm.com> Co-authored-by: Daniel Egger <38065505+eggerdj@users.noreply.github.com> * replace eval_conditions with valid_amp_conditions Co-authored-by: Will Shanks <willshanks@us.ibm.com> * update hashing and equality, redefine expressions more immutably * add error message for missing parameter * cleanup * check parameter before hashing * move amp check to constructor * add envelope to hash * update docs * Update qiskit/pulse/library/symbolic_pulses.py Co-authored-by: Matthew Treinish <mtreinish@kortar.org> * lint Co-authored-by: Will Shanks <willshanks@us.ibm.com> Co-authored-by: Daniel J. Egger <38065505+eggerdj@users.noreply.github.com> Co-authored-by: Matthew Treinish <mtreinish@kortar.org>
Using sympy.lambdify together with numpy arrays
has a severe issue when the lambdified function
is a constant. Let me show an example,
first I show what I would like to have
in any case:
Now assume we want to evaluate f on a grid u:
where u.shape is (10,) and the result
f(u).shape has shape (10,) too. This is what I
expect when doing numerics. Now the example where
it goes wrong:
So the function is essentially a constant independent of x.
And this is the cause for the issue:
g(u) return a single float 2 and not a numpy array anymore!
This is not what I expect in numerics, although it's obvious
from a programmers point of view. (And the same issue is present
in plain python lambda expressions.)
Possible solutions:
h(u) then gives
array([2, 2, 2, 2, 2, 2, 2, 2, 2, 2])
Drawbacks: vectorize is SLOW, see also http://www.mail-archive.com/numpy-discussion@scipy.org/msg00587.html We should really only use it when absolutly necessary.
use an additional pseudo-dependence:
This does not work because sympy simplifies the 0*x away
A possible starting point for a patch:
Original issue for #5642: http://code.google.com/p/sympy/issues/detail?id=2543
Original author: https://code.google.com/u/107490137238222069432/
The text was updated successfully, but these errors were encountered: