Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hangs: integrate((3*x**3-x**2+2*x-4)/sqrt(x**2-3*x+2), (x, 0, 1)) #22863

Closed
certik opened this issue Jan 14, 2022 · 13 comments · Fixed by #23608
Closed

Hangs: integrate((3*x**3-x**2+2*x-4)/sqrt(x**2-3*x+2), (x, 0, 1)) #22863

certik opened this issue Jan 14, 2022 · 13 comments · Fixed by #23608

Comments

@certik
Copy link
Member

certik commented Jan 14, 2022

The Integral(...).n() works, but the same integrate(...) hangs:

In [1]: Integral((3*x**3-x**2+2*x-4)/sqrt(x**2-3*x+2), (x, 0, 1))
Out[1]: 
1                       
⌠                       
⎮    3    2             
⎮ 3⋅x  - x  + 2⋅x - 4   
⎮ ─────────────────── dx
⎮     ______________    
⎮    ╱  2               
⎮  ╲╱  x  - 3⋅x + 2     
⌡                       
0                       

In [2]: i = Integral((3*x**3-x**2+2*x-4)/sqrt(x**2-3*x+2), (x, 0, 1))

In [3]: i.n()
Out[3]: -2.98126694400554

In [4]: integrate((3*x**3-x**2+2*x-4)/sqrt(x**2-3*x+2), (x, 0, 1))
[HANGS]
@oscarbenjamin
Copy link
Contributor

It hangs in manulintegrate somewhere.

@smichr
Copy link
Member

smichr commented Jan 15, 2022

Perhaps assumption related? Using var('x', nonnegative=True) quickly returns an unevaluated integral.

skirpichev added a commit to skirpichev/diofant that referenced this issue Jan 16, 2022
@geekboi777
Copy link

geekboi777 commented Jan 18, 2022

#22863
Well, I believe that this may be due to the fact that SymPy initially takes up a lot of time finding for elementary anti-derivatives using the reisch algorithm or other available algorithms before it tries for the most suitable ones for definite integrals.

Because, using the Meijer G methods actually prevented the hanging and gave the correct output.

Screenshot 2022-01-19 at 2 54 30 AM

Well its one speculation from my side. There may be other reasons for the extra time consumption.

@asmeurer
Copy link
Member

It's a known issue that heurisch can hang, but manualintegrate shouldn't hang. It should be investigated. manualintegrate is done before meijerg because it's generally faster. It's likely it is stuck in an infinite loop of sort.

@asmeurer
Copy link
Member

Actually for me in master, the integral returns unevaluated in about 20 seconds. @certik were you using master or a release version of SymPy?

The fact that it can't do the integral isn't too surprising. SymPy isn't very good at algebraic integrals.

@certik
Copy link
Member Author

certik commented Jan 18, 2022

I was using c907524. You are right, it returns an unevaluated integral in about 10s. I thought it hung for some reason.

@oscarbenjamin
Copy link
Contributor

It still shouldn't be this slow.

@certik
Copy link
Member Author

certik commented Jan 19, 2022

Indeed it shouldn't. We can slowly over time build some of these algorithms in SymEngine, which can represent things very efficiently.

@asmeurer
Copy link
Member

I'm guessing it's slow because of heurisch, which is a known issue. Pretty much any integral that can't be computed in closed form is slow there. Part of it is slow symbolics (especially slow multivariate polys and slow matrices), but I also wonder if we can't somehow make the algorithm do better at failing faster.

I would close this as a duplicate of many other issues, that is, unless the integral does in fact have a closed form, in which case we should leave it open until we are able to compute it.

@asmeurer
Copy link
Member

I guess the indefinite integral is indeed computable in closed-form, so we should leave this open at least for that https://www.wolframalpha.com/input/?i=integrate+%283*x**3-x**2%2B2*x-4%29%2Fsqrt%28x**2-3*x%2B2%29&dataset=. If anything, we can add it to the failing integrals test file.

@oscarbenjamin
Copy link
Contributor

I would close this as a duplicate of many other issues

I think it would be reasonable to collect together many integrals into a single issue rather than just close as a duplicate. For example an issue with many different cases where heurisch is slow. I do intend to make heurisch faster and the parts of it that are slow are not always the same so having more examples in issues is helpful.

The things that can be slow in heurisch are:

  1. Solving a big system of linear equations. This used to be the slowest part in most cases but has been made a lot faster since it uses DomainMatrix. However I think it is still slow in some cases due to using the dense DomainMatrix routines rather than the sparse Matrix.rref.
  2. Computing derivatives. This is slow due to using Expr which has slow differentation in many cases. The slowness here is primarily due to automatic evaluation. This would be made faster by computing derivatives in the poly representation. Apparently that isn't done already because of polynomial gcd being slow.
  3. Converting to and from Expr. This is sometimes the main slow part. The conversion is done so that derivatives and cancellation of rational functions can be done with Expr in between solving the linear systems of equations. The code notes that it is better to do these conversions because polynomial gcd is slow.

Improving polynomial gcd and working purely in the poly representation might solve both 2 and 3 immediately. Some work can be done to improve 1 as well but in most cases that would only make a noticeable difference if 2 and 3 were made faster first.

In any case the issue here is caused by slowness in manualintegrate. You can check this with the profiler from isympy like:

In [1]: %prun -s cumulative integrate((3*x**3-x**2+2*x-4)/sqrt(x**2-3*x+2), (x, 0, 1))

         67007033 function calls (61730753 primitive calls) in 52.704 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   2567/1    0.208    0.000   52.802   52.802 {built-in method builtins.exec}
        1    0.000    0.000   52.671   52.671 integrals.py:1400(integrate)
      6/1    0.000    0.000   52.671   52.671 integrals.py:380(doit)
      6/1    0.001    0.000   52.541   52.541 integrals.py:814(_eval_integral)
        1    0.000    0.000   49.555   49.555 manualintegrate.py:1636(manualintegrate)
    197/1    0.008    0.000   49.552   49.552 manualintegrate.py:1229(integral_steps)
    383/1    0.002    0.000   49.529   49.529 core.py:80(do_one_rl)
   1099/4    0.003    0.000   49.529   12.382 core.py:61(null_safe_rl)
     66/1    0.003    0.000   48.243   48.243 manualintegrate.py:338(_alternatives)
     66/5    0.001    0.000   45.965    9.193 manualintegrate.py:1112(substitution_rule)
   131/22    0.001    0.000   40.285    1.831 core.py:90(switch_rl)
   264/44    0.001    0.000   35.949    0.817 core.py:28(conditioned_rl)
   264/44    0.002    0.000   35.554    0.808 manualintegrate.py:298(_rewriter)
       11    0.000    0.000   34.009    3.092 manualintegrate.py:521(add_rule)
       11    0.000    0.000   33.996    3.091 manualintegrate.py:523(<listcomp>)
1678301/1191372    2.380    0.000   29.783    0.000 cache.py:67(wrapper)
   111/55    0.001    0.000   29.348    0.534 manualintegrate.py:527(mul_rule)
22743/2314    0.234    0.000   24.537    0.011 basic.py:1642(match)
28743/2217    0.790    0.000   24.097    0.011 operations.py:177(_matches_commutative)
20639/6186    0.059    0.000   22.695    0.004 mul.py:1039(matches)
8104/2804    0.016    0.000   21.466    0.008 add.py:546(matches)
4290/1956    0.028    0.000   17.347    0.009 power.py:1572(matches)
38996/27124    0.548    0.000   16.030    0.001 operations.py:46(__new__)
       71    0.003    0.000   15.055    0.212 manualintegrate.py:771(quadratic_denom_rule)
      131    0.016    0.000   14.891    0.114 manualintegrate.py:416(special_function_rule)
967617/326122    0.631    0.000   14.637    0.000 assumptions.py:477(getit)
133704/111944    0.206    0.000   14.430    0.000 decorators.py:224(_func)
276719/7419    1.808    0.000   14.337    0.002 assumptions.py:489(_ask)
19646/14900    0.690    0.000   14.321    0.001 mul.py:191(flatten)
130046/111759    0.162    0.000   14.030    0.000 decorators.py:99(binary_op_wrapper)
4565/2790    0.043    0.000   10.963    0.004 function.py:452(__new__)
34960/31562    0.079    0.000   10.836    0.000 expr.py:233(__truediv__)
4566/2791    0.064    0.000   10.624    0.004 function.py:272(__new__)
26428/23184    0.180    0.000   10.163    0.000 simplify.py:350(signsimp)
    19548    0.389    0.000    9.722    0.000 mul.py:1205(_combine_inverse)
       66    0.002    0.000    9.330    0.141 manualintegrate.py:205(find_substitutions)
      290    0.006    0.000    8.374    0.029 manualintegrate.py:208(test_subterm)
      855    0.027    0.000    7.853    0.009 polytools.py:6649(cancel)
      557    0.005    0.000    6.783    0.012 expr.py:3727(cancel)
13530/10758    0.021    0.000    6.306    0.001 expr.py:831(_eval_is_negative)

The call to heurisch is way down the list in this case:

        6    0.000    0.000    1.035    0.173 heurisch.py:109(heurisch_wrapper)
     12/6    0.001    0.000    1.035    0.172 heurisch.py:290(heurisch)

@oscarbenjamin
Copy link
Contributor

The closed form antiderivative from WA is

In [10]: integrand = (3*x**3-x**2+2*x-4)/sqrt(x**2-3*x+2)

In [11]: anti = (2*sqrt(x**2 - 3*x + 2)*(8*x**2 + 26*x + 101) - 135*atanh((3 - 2*x)/(2*sqrt(x**2 - 3*x + 2)))) / 16

In [12]: cancel(integrand - anti.diff())
Out[12]: 0

@asmeurer
Copy link
Member

In any case the issue here is caused by slowness in manualintegrate. You can check this with the profiler from isympy like:

Indeed. You can also confirm this by trying the integral with manual=True and with heurisch=True. The former takes about as long as the integral with no flags. heurisch=True only takes about 150 ms. So we should investigate this further, as performance issues with manualintegrate aren't as well known as with heurisch. Based on pyinstrument, it seems to be trying a lot of different rules, and also spending some time in matching, which could probably be faster. The manualintegrate code is very recursive, so it's hard to profile effectively.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants