Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use sum/fsum & prod/fprod in some python related printers #24685

Closed
wants to merge 5 commits into from

Conversation

bjodah
Copy link
Member

@bjodah bjodah commented Feb 8, 2023

Tentative fix for recusion error when using lambdify.

References to other Issues or PRs

Fixes #24673

Brief description of what is fixed or changed

Other comments

Perhaps this needs to be opt-in with some printer setting for backwards compatibility. (lambdify can then use that setting by default). Also, I have found functools.reduce(operators.add, [...]) to be a better replacement for sum in generic python code (but of course I can't remember an example right now).

Release Notes

NO ENTRY

@sympy-bot
Copy link

sympy-bot commented Feb 8, 2023

Hi, I am the SymPy bot. I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

  • No release notes entry will be added for this pull request.
Click here to see the pull request description that was parsed.
Tentative fix for recusion error when using lambdify.

#### References to other Issues or PRs
Fixes #24673

#### Brief description of what is fixed or changed


#### Other comments
Perhaps this needs to be opt-in with some printer setting for backwards compatibility. (`lambdify` can then use that setting by default). Also, I have found `functools.reduce(operators.add, [...])` to be a better replacement for `sum` in generic python code (but of course I can't remember an example right now).

#### Release Notes

<!-- Write the release notes for this release below between the BEGIN and END
statements. The basic format is a bulleted list with the name of the subpackage
and the release note for this PR. For example:

* solvers
  * Added a new solver for logarithmic equations.

* functions
  * Fixed a bug with log of integers. Formerly, `log(-x)` incorrectly gave `-log(x)`.

* physics.units
  * Corrected a semantical error in the conversion between volt and statvolt which
    reported the volt as being larger than the statvolt.

or if no release note(s) should be included use:

NO ENTRY

See https://github.com/sympy/sympy/wiki/Writing-Release-Notes for more
information on how to write release notes. The bot will check your release
notes automatically to see if they are formatted correctly. -->

<!-- BEGIN RELEASE NOTES -->
NO ENTRY
<!-- END RELEASE NOTES -->

@@ -63,6 +63,14 @@ def _print_seq(self, seq):
delimiter=', '
return '({},)'.format(delimiter.join(self._print(item) for item in seq))

def _print_Add(self, e):
return '{}({})'.format(self._module_format(self._module + '.sum'),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't this have the problem with #5642?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe? would reduce(add, ...)) make a difference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not for the particular issue

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@asmeurer I tried to figure out what a regression test would look like. You don't happen to able to construct one?

@oscarbenjamin
Copy link
Contributor

Also, I have found functools.reduce(operators.add, [...]) to be a better replacement for sum in generic python code (but of course I can't remember an example right now).

In a situation where x + 0 would give a TypeError you would also get that error from sum([x, ...]) because it implicitly adds everything starting from zero. Using reduce is equivalent to sum(items[1:], items[0]) so it does not try to add the first item to 0.

@ThePauliPrinciple
Copy link
Contributor

I think it would make sense if a setting is added to the python printer which holds the value of the recursion limit and then the behaviour is switched based on whether you would hit that limit. (then you can change that limit depending on what you want)

Not sure how easy that would be though. E.g. is the limit reached also when you have a combination of multiplications and additions?

As a side node, sum also seems to work correctly on something like this:

>>> import numpy as np
>>> A = np.array([1,2])
>>> sum([1, A])
array([2, 3])
>>> 1 + A
array([2, 3])

@bjodah
Copy link
Member Author

bjodah commented Feb 8, 2023

In a situation where x + 0 would give a TypeError you would also get that error from sum([x, ...]) because it implicitly adds everything starting from zero. Using reduce is equivalent to sum(items[1:], items[0]) so it does not try to add the first item to 0.

Right, my point is that we cannot replace our current generated x + y + z with sum([x, y, z]). So we either need to use reduce(add, [x, y, z]) or like you propose(?): sum([y, z], x)

Here's an example adapted from SO:

In [1]: class p():
   ...:     def __init__(self, x, y):
   ...:         self.x=x ; self.y=y
   ...:     def __repr__(self):
   ...:         return "(%r,%r)"%(self.x,self.y)
   ...:     def __add__(self, P):
   ...:         return p(self.x+P.x, self.y+P.y)
   ...:     

In [2]: pts=[p(1,0), p(2,1), p(-3,4)]

In [3]: pts[0] + pts[1] + pts[2]
Out[3]: (0,5)

In [4]: sum(pts)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-4-45922890d470> in <module>()
----> 1 sum(pts)

TypeError: unsupported operand type(s) for +: 'int' and 'instance'

In [5]: import operator

In [6]: import functools

In [7]: functools.reduce(operator.add, pts)
Out[7]: (0,5)

In [8]: sum(pts[1:], pts[0])
Out[8]: (0,5)

@bjodah
Copy link
Member Author

bjodah commented Feb 8, 2023

I think it would make sense if a setting is added to the python printer which holds the value of the recursion limit and then the behaviour is switched based on whether you would hit that limit. (then you can change that limit depending on what you want)

It sounds expensive to try/catch a RecursionError, no?

Another option might be to add a limit on the number of arguments, above which Add is printed using sum(...) instead of a + b + .... The question is then what a reasonable default for this value would be? 2? 3? 10? 100?

@oscarbenjamin
Copy link
Contributor

I don't think RecursionError should be caught. Using reduce seems fine to me though.

@oscarbenjamin
Copy link
Contributor

The question is then what a reasonable default for this value would be?

Timings with numpy arrays:

In [32]: a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z = [np.ones(100)] * 26

In [33]: %timeit a+b+c+d+e+f+g+h+i+j+k+l+m+n+o+p+q+r+s+t+u+v+w+x+y+z
20.2 µs ± 280 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [34]: %timeit reduce(add, [a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z])
22.6 µs ± 217 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [35]: %timeit sum([b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z], a)
20 µs ± 190 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [36]: %timeit a+b+c
1.71 µs ± 4.94 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [37]: %timeit reduce(add, [a,b,c])
2.38 µs ± 21.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

In [38]: %timeit sum([b,c], a)
1.99 µs ± 20.8 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

With individual floats:

In [39]: a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z = [1.0] * 26

In [40]: %timeit a+b+c+d+e+f+g+h+i+j+k+l+m+n+o+p+q+r+s+t+u+v+w+x+y+z
903 ns ± 5.48 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [41]: %timeit reduce(add, [a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z])
1.59 µs ± 8.42 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [42]: %timeit sum([b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z], a)
829 ns ± 3.05 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [43]: %timeit a+b+c
117 ns ± 0.29 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [44]: %timeit reduce(add, [a,b,c])
361 ns ± 1.96 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

In [45]: %timeit sum([b,c], a)
233 ns ± 1.07 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

I guess that with enough terms sum becomes faster than just using +. It looks like reduce is always slower.

@bjodah bjodah changed the title Use sum/fsum in some python related printers Use sum/fsum & prod/fprod in some python related printers Feb 8, 2023
@asmeurer
Copy link
Member

asmeurer commented Feb 8, 2023

This is the sort of issue you will run into with the NumPy printer:

>>> np.array([1, 2]) + np.array([[1, 2], [1, 3]])
array([[2, 4],
       [2, 5]])
>>> np.sum([np.array([1, 2]), np.array([[1, 2], [1, 3]])])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<__array_function__ internals>", line 180, in sum
  File "/Users/aaronmeurer/anaconda3/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 2296, in sum
    return _wrapreduction(a, np.add, 'sum', axis, dtype, out, keepdims=keepdims,
  File "/Users/aaronmeurer/anaconda3/lib/python3.9/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
ValueError: could not broadcast input array from shape (2,2) into shape (2,)

NumPy does implicitly treat a list of arrays as a larger array, but it doesn't implicitly broadcast those arrays. This is essentially the same problem as #5642 where we do this with matrices. It most often occurs with scalar values because those sort of happen automatically when an expression doesn't have any variables in it, but it applies to any broadcasting.

Even when it sometimes does work, you get VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray. from NumPy.

By the way, the code for NumPy should be passing axis=0 to get the equivalent result as +.

>>> np.array([1, 2]) + np.array([1, 2])
array([2, 4])
>>> np.sum((np.array([1, 2]), np.array([1, 2])))
6

I think we can do this in this case by calling broadcast_arrays on the input. That's been an issue for #5642 because it's overhead to do that in every single lambdify call, but here we would only be doing it for sums that are above a certain threshold of terms.

def __exit__(self, exc_t, exc_v, exc_tb):
sys.setrecursionlimit(self._ori_limit)

# TODO, reproduce using a smaller expression. (this takes too long...)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something like this works:

lambdify(x, Add(*[x**i for i in range(10000)]))

@asmeurer
Copy link
Member

asmeurer commented Feb 8, 2023

mpmath has fsum https://mpmath.org/doc/current/general.html#fsum. According to its docstring it's faster and more accurate for more than 2 terms.

@ThePauliPrinciple
Copy link
Contributor

This is the sort of issue you will run into with the NumPy printer:

Can't you avoid this by simply using the python buildin sum instead of the numpy one?

>>> sum([np.array([1, 2]), np.array([[1, 2], [1, 3]])])
array([[2, 4],
       [2, 5]])

@@ -94,6 +94,8 @@ class AbstractPythonCodePrinter(CodePrinter):
fully_qualified_modules=True,
contract=False,
standard='python3',
num_terms_sum=42,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

honestly, 42 sounds pretty reasonable here, but it could also be just the actual limit such that the printer writes exactly the same code now as it did before except when that code would not be valid

@asmeurer
Copy link
Member

asmeurer commented Feb 8, 2023

Can't you avoid this by simply using the python buildin sum instead of the numpy one?

Sure but sum is just going to call + a bunch under the hood. np.sum actually is more efficient, but it just needs to be done in a way that properly mimics + in all cases.

# is already better served by the mpmath printer. And finally, subclassing and
# overriding this remains an option.
if len(expr.args) >= self._settings['num_terms_sum']:
return 'sum([%s], start=%s)' % (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should use module_format and builtins.sum like here:

def _print_Sum(self, expr):
loops = (
'for {i} in range({a}, {b}+1)'.format(
i=self._print(i),
a=self._print(a),
b=self._print(b))
for i, a, b in expr.limits)
return '(builtins.sum({function} {loops}))'.format(
function=self._print(expr.function),
loops=' '.join(loops))

(e.g. clashing import names/definitions)

Actually, now that I look at it, that code looks bugged since it doesn't use module_format :)

@asmeurer
Copy link
Member

asmeurer commented Feb 8, 2023

Can't you avoid this by simply using the python buildin sum instead of the numpy one?

Sure but sum is just going to call + a bunch under the hood. np.sum actually is more efficient, but it just needs to be done in a way that properly mimics + in all cases.

Although to be honest, the difference isn't as much as I expected, even for a huge number of arrays

In [15]: arrs = [np.empty((100,)) for i in range(10000)]

In [16]: %timeit sum(arrs)
4.91 ms ± 11.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [17]: %timeit np.sum(arrs, axis=0)
3.66 ms ± 52 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

@bjodah
Copy link
Member Author

bjodah commented Feb 13, 2023

I'm not sure what to do with numpy.sum:

>>> (arr_5_3_4 + arr_4 + arr_3_4).shape
(5, 3, 4)
>>> numpy.sum([arr_5_3_4, arr_4, arr_3_4], axis=0).shape
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (3,) + inhomogeneous part.

I tried wrapping the arguments with a numpy.broadcast_arrays call inside numpy.sum(...) but I didn't manage to reproduce the operator.add behavior.

EDIT: numpy.sum(numpy.broadcast_arrays(...)) seems to work in cd83991.

@bjodah
Copy link
Member Author

bjodah commented Feb 13, 2023

mpmath has fsum https://mpmath.org/doc/current/general.html#fsum. According to its docstring it's faster and more accurate for more than 2 terms.

It's already there, unless I missed something?

@bjodah
Copy link
Member Author

bjodah commented Feb 20, 2023

I won't have the time to pursue fixing this PR for the foreseeable future.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Recursion error when lambdifying an expression with 108k operations
5 participants