fix: treat floats like rationals in as_coeff_mul #26573

oscarbenjamin · 2024-05-07T15:56:16Z

References to other Issues or PRs

Fixes gh-26571

Brief description of what is fixed or changed

Treat floats like rationals in as_coeff_mul by default. This mirrors the behaviour of as_coeff_Mul.

Other comments

Release Notes

core
- The as_coeff_mul method now treats floats like rationals by default. This fixes some integration bugs relating to integrals with floats.

integrate((x**8+1)**(-1/2), x)

sympy-bot · 2024-05-07T15:56:23Z

✅

Hi, I am the SymPy bot. I'm here to help you write a release notes entry. Please read the guide on how to write release notes.

Your release notes are in good order.

Here is what the release notes will look like:

core
- The as_coeff_mul method now treats floats like rationals by default. This fixes some integration bugs relating to integrals with floats. (#26573 by @oscarbenjamin)

This will be added to https://github.com/sympy/sympy/wiki/Release-Notes-for-1.13.

Click here to see the pull request description that was parsed.

<!-- Your title above should be a short description of what
was changed. Do not include the issue number in the title. -->

#### References to other Issues or PRs
<!-- If this pull request fixes an issue, write "Fixes #NNNN" in that exact
format, e.g. "Fixes #1234" (see
https://tinyurl.com/auto-closing for more information). Also, please
write a comment on that issue linking back to this pull request once it is
open. -->

Fixes gh-26571

#### Brief description of what is fixed or changed

Treat floats like rationals in as_coeff_mul by default. This mirrors the behaviour of as_coeff_Mul.

#### Other comments


#### Release Notes

<!-- Write the release notes for this release below between the BEGIN and END
statements. The basic format is a bulleted list with the name of the subpackage
and the release note for this PR. For example:

* solvers
  * Added a new solver for logarithmic equations.

* functions
  * Fixed a bug with log of integers. Formerly, `log(-x)` incorrectly gave `-log(x)`.

* physics.units
  * Corrected a semantical error in the conversion between volt and statvolt which
    reported the volt as being larger than the statvolt.

or if no release note(s) should be included use:

NO ENTRY

See https://github.com/sympy/sympy/wiki/Writing-Release-Notes for more
information on how to write release notes. The bot will check your release
notes automatically to see if they are formatted correctly. -->

<!-- BEGIN RELEASE NOTES -->
* core
   * The `as_coeff_mul` method now treats floats like rationals by default. This fixes some integration bugs relating to integrals with floats.
<!-- END RELEASE NOTES -->

github-actions · 2024-05-07T21:13:50Z

Benchmark results from GitHub Actions

Lower numbers are good, higher numbers are bad. A ratio less than 1
means a speed up and greater than 1 means a slowdown. Green lines
beginning with + are slowdowns (the PR is slower then master or
master is slower than the previous release). Red lines beginning
with - are speedups.

Significantly changed benchmark results (PR vs master)

Significantly changed benchmark results (master vs previous release)

| Change   | Before [2487dbb5]    | After [26584616]    |   Ratio | Benchmark (Parameter)                                                |
|----------|----------------------|---------------------|---------|----------------------------------------------------------------------|
| -        | 70.1±0.9ms           | 44.2±0.3ms          |    0.63 | integrate.TimeIntegrationRisch02.time_doit(10)                       |
| -        | 68.3±1ms             | 42.9±0.2ms          |    0.63 | integrate.TimeIntegrationRisch02.time_doit_risch(10)                 |
| +        | 18.2±0.1μs           | 30.4±0.2μs          |    1.67 | integrate.TimeIntegrationRisch03.time_doit(1)                        |
| -        | 5.36±0.06ms          | 2.89±0.02ms         |    0.54 | logic.LogicSuite.time_load_file                                      |
| -        | 73.9±0.7ms           | 28.4±0.2ms          |    0.38 | polys.TimeGCD_GaussInt.time_op(1, 'dense')                           |
| -        | 25.6±0.1ms           | 16.8±0.03ms         |    0.66 | polys.TimeGCD_GaussInt.time_op(1, 'expr')                            |
| -        | 74.4±0.4ms           | 28.7±0.07ms         |    0.39 | polys.TimeGCD_GaussInt.time_op(1, 'sparse')                          |
| -        | 256±2ms              | 123±0.8ms           |    0.48 | polys.TimeGCD_GaussInt.time_op(2, 'dense')                           |
| -        | 256±1ms              | 124±0.8ms           |    0.49 | polys.TimeGCD_GaussInt.time_op(2, 'sparse')                          |
| -        | 652±7ms              | 373±0.9ms           |    0.57 | polys.TimeGCD_GaussInt.time_op(3, 'dense')                           |
| -        | 660±5ms              | 372±0.9ms           |    0.56 | polys.TimeGCD_GaussInt.time_op(3, 'sparse')                          |
| -        | 492±1μs              | 287±1μs             |    0.58 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(1, 'dense')            |
| -        | 1.80±0.04ms          | 1.05±0ms            |    0.58 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(2, 'dense')            |
| -        | 5.78±0.09ms          | 3.09±0.01ms         |    0.53 | polys.TimeGCD_LinearDenseQuadraticGCD.time_op(3, 'dense')            |
| -        | 452±2μs              | 231±2μs             |    0.51 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(1, 'dense')               |
| -        | 1.47±0.01ms          | 670±3μs             |    0.46 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(2, 'dense')               |
| -        | 4.88±0.03ms          | 1.63±0.02ms         |    0.33 | polys.TimeGCD_QuadraticNonMonicGCD.time_op(3, 'dense')               |
| -        | 376±3μs              | 205±1μs             |    0.55 | polys.TimeGCD_SparseGCDHighDegree.time_op(1, 'dense')                |
| -        | 2.44±0ms             | 1.25±0.01ms         |    0.51 | polys.TimeGCD_SparseGCDHighDegree.time_op(3, 'dense')                |
| -        | 10.2±0.1ms           | 4.47±0.02ms         |    0.44 | polys.TimeGCD_SparseGCDHighDegree.time_op(5, 'dense')                |
| -        | 363±0.4μs            | 173±1μs             |    0.48 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(1, 'dense')            |
| -        | 2.52±0.02ms          | 912±8μs             |    0.36 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(3, 'dense')            |
| -        | 9.66±0.02ms          | 2.66±0.02ms         |    0.28 | polys.TimeGCD_SparseNonMonicQuadratic.time_op(5, 'dense')            |
| -        | 1.04±0.01ms          | 426±3μs             |    0.41 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'dense')           |
| -        | 1.73±0ms             | 510±3μs             |    0.29 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(3, 'sparse')          |
| -        | 5.85±0.04ms          | 1.81±0.02ms         |    0.31 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'dense')           |
| -        | 8.35±0.04ms          | 1.51±0.01ms         |    0.18 | polys.TimePREM_LinearDenseQuadraticGCD.time_op(5, 'sparse')          |
| -        | 289±1μs              | 65.9±0.3μs          |    0.23 | polys.TimePREM_QuadraticNonMonicGCD.time_op(1, 'sparse')             |
| -        | 3.48±0.04ms          | 391±3μs             |    0.11 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'dense')              |
| -        | 4.09±0.02ms          | 279±0.7μs           |    0.07 | polys.TimePREM_QuadraticNonMonicGCD.time_op(3, 'sparse')             |
| -        | 7.18±0.03ms          | 1.28±0.01ms         |    0.18 | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'dense')              |
| -        | 8.87±0.04ms          | 844±7μs             |    0.1  | polys.TimePREM_QuadraticNonMonicGCD.time_op(5, 'sparse')             |
| -        | 5.24±0.1ms           | 2.98±0.02ms         |    0.57 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(2, 'sparse') |
| -        | 12.0±0.1ms           | 6.59±0.06ms         |    0.55 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'dense')  |
| -        | 22.4±0.1ms           | 9.09±0.02ms         |    0.41 | polys.TimeSUBRESULTANTS_LinearDenseQuadraticGCD.time_op(3, 'sparse') |
| -        | 5.24±0.02ms          | 872±7μs             |    0.17 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(1, 'sparse')    |
| -        | 12.6±0.04ms          | 7.07±0.04ms         |    0.56 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(2, 'sparse')    |
| -        | 103±1ms              | 25.7±0.2ms          |    0.25 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'dense')     |
| -        | 169±2ms              | 54.0±0.3ms          |    0.32 | polys.TimeSUBRESULTANTS_QuadraticNonMonicGCD.time_op(3, 'sparse')    |
| -        | 176±0.8μs            | 111±0.8μs           |    0.63 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'dense')      |
| -        | 364±1μs              | 216±3μs             |    0.59 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(1, 'sparse')     |
| -        | 4.28±0.02ms          | 843±5μs             |    0.2  | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'dense')      |
| -        | 5.36±0.1ms           | 381±2μs             |    0.07 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(3, 'sparse')     |
| -        | 20.5±0.05ms          | 2.78±0.01ms         |    0.14 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'dense')      |
| -        | 23.2±0.1ms           | 627±1μs             |    0.03 | polys.TimeSUBRESULTANTS_SparseGCDHighDegree.time_op(5, 'sparse')     |
| -        | 482±4μs              | 139±2μs             |    0.29 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(1, 'sparse') |
| -        | 4.90±0.08ms          | 621±2μs             |    0.13 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'dense')  |
| -        | 5.31±0.03ms          | 138±1μs             |    0.03 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(3, 'sparse') |
| -        | 13.4±0.1ms           | 1.29±0ms            |    0.1  | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'dense')  |
| -        | 14.1±0.08ms          | 142±1μs             |    0.01 | polys.TimeSUBRESULTANTS_SparseNonMonicQuadratic.time_op(5, 'sparse') |
| -        | 133±1μs              | 77.2±0.3μs          |    0.58 | solve.TimeMatrixOperations.time_rref(3, 0)                           |
| -        | 250±0.8μs            | 89.7±0.8μs          |    0.36 | solve.TimeMatrixOperations.time_rref(4, 0)                           |
| -        | 24.1±0.08ms          | 10.2±0.03ms         |    0.42 | solve.TimeSolveLinSys189x49.time_solve_lin_sys                       |
| -        | 28.0±0.1ms           | 15.5±0.09ms         |    0.55 | solve.TimeSparseSystem.time_linsolve_Aaug(20)                        |
| -        | 54.5±0.2ms           | 24.8±0.2ms          |    0.46 | solve.TimeSparseSystem.time_linsolve_Aaug(30)                        |
| -        | 27.9±0.2ms           | 15.0±0.08ms         |    0.54 | solve.TimeSparseSystem.time_linsolve_Ab(20)                          |
| -        | 54.1±0.3ms           | 24.4±0.2ms          |    0.45 | solve.TimeSparseSystem.time_linsolve_Ab(30)                          |

Full benchmark results can be found as artifacts in GitHub Actions
(click on checks at the top of the PR).

smichr · 2024-05-08T21:57:19Z

sympy/core/mul.py

@@ -2156,8 +2162,12 @@ def _keep_coeff(coeff, factors, clear=True, sign=False):
        return coeff
    if coeff is S.One:
        return factors
-    elif coeff is S.NegativeOne and not sign:
+    elif coeff.is_Float and equal_valued(coeff, S.One):
+        return Add(*[coeff*term for term in Add.make_args(factors)])


Suggested change

return Add(*[coeff*term for term in Add.make_args(factors)])

if (c,a:=factors.as_coeff_add())[0].is_Number and c:

return coeff*c + a

return return Add(*[coeff*term for term in Add.make_args(factors)])

And perhaps similar for below. I don't see the need to store 1.0 as a coefficient of every term in an expression that looks like x + y + 3. Isn't x + y + 3.0 sufficient?

Also, as_coeff_add should be modified consistent with as_coeff_mul, perhaps:

>>> (x+1.).as_coeff_add() (0, (1.00000000000000, x)) >>> (x+1.).as_coeff_Add() (1.00000000000000, x)

I actually thought this would break things, however. I am surprised that it didn't.

It did break some things. Hence the other changes.

It probably fixes lots of untested float-related bugs though. We don't make it easy for ourselves to handle floats in high-level routines by arbitrarily treating them differently in low-level ones.

smichr · 2024-05-09T13:56:04Z

I was distracted by the other changes the first time I looked at this. Now I am seeing that as_coeff_add doesn't have a flag to allow Rational discrimination while as_coeff_mul does. And all that you are doing is changing the default value of the flag. I would prefer to leave the flag unchanged since the notion of Rational coefficient (or slot 0 is for the Rational) is pretty well baked into SymPy (I think). All that needs to be done without breaking the behavior is to use the flag rational=True within the integration routines that are failing.

If you think the default should be changed, then as_coeff_add should be changed, too, to see how significant a change that will be in terms of codebase expectations. If it doesn't make a difference there, then perhaps my misgivings are unfounded. But if it does make a difference then we will have painted ourselves into an awkward corner, having as_coeff_mul behaving differently than as_coeff_add.

smichr · 2024-05-09T14:08:01Z

sympy/solvers/ode/tests/test_single.py

@@ -1825,7 +1825,10 @@ def _get_examples_ode_sol_separable():

    'separable_26': {
        'eq': f1 - k * (v(t) ** 2) - m * Derivative(v(t)),
-        'sol': [Eq(v(t), -68.585712797928991/tanh(C1 - 0.14288690166235204*t))],
+        'sol': [Eq(v(t), (
+            -1.0*exp(0.2857738033247041*C1 - 0.2857738033247041*t) - 1.0)


This just looks bad for a CAS. Perhaps this should be handled at some low level, but if there are already floats in a factor or term, then floats that are equivalent to ints can be shown as ints, e.g.

1.23*x + 1 not 1.23*x + 1.0
exp(1.23*x) + 1 not 1.0*exp(1.23*x) + 1.0 or 1.0*(exp(1.23*x) + 1)

Perhaps this is too fine a detail to worry about.

oscarbenjamin · 2024-05-09T14:46:33Z

But if it does make a difference then we will have painted ourselves into an awkward corner, having as_coeff_mul behaving differently than as_coeff_add.

We have already painted ourselves into an awkward corner by handling floats differently from rationals for no reason in such widely used basic routines.

There are endless float bugs in situations where a float should really be no different from a rational. A primary source is the fact that low-level functions like as_coeff_mul that are used throughout the codebase handle rationals differently for no reason. That is why I have changed the default value for the flag rather than overriding the default at particular call sites. Anywhere that uses as_coeff_mul should probably treat floats and rationals the same way.

oscarbenjamin · 2024-05-09T15:21:16Z

I just realised this needs to change the default in Mul.as_coeff_mul as well.

smichr · 2024-05-10T22:22:48Z

We have already painted ourselves into an awkward corner

The onus is on the user to select the correct flags for the function being used. We are not yet in a corner because the user can ask for the Rational coefficient if desired. Anyone writing a low level routine should be informed about the methods being used.

I suspect it will just be a matter of time before changing the default starts causing new bugs to arise because the routine thought it was handling a Rational coefficient when it was actually a Float. A case in point (as I review how as_coeff_mul is used) is in trigsimp.py where the following code appears:

trigterms = [(g.args[0].as_coeff_mul(), g.func) for g in gens
                     if g.func in allfuncs]

That is supposed to break up something like sin(3*x) into ((3,(x,)), sin) and igcd will be run on the coefficients. If those coefficients are Floats that are not equal to their int counterparts (e.g. sin(3.1 x)), that is going to raise an error. You did not run into this in tests, probably because test weren't written for cases that didn't apply -- i.e. thought being something like,

if as_coeff_mul()[0] only returns Rational then we don't have to test for Floats; and we already tested as_coeff_mul in test_expr to verify that that is the case

The seemingly innocuous change that this PR makes to test_expr is, in fact, a major change to the expectation of that low level routine. This has heretofore been an invariant: coeff will be Rational unless explicitly indicated otherwise. I do not see a good reason to change this when the routines that are failing can simply allow non-Rational to be returned instead.

oscarbenjamin added 2 commits May 7, 2024 16:43

fix: handle floats like rationals in as_coeff_mul

58d0281

integrals: add test for sympygh-26571

d1eedd0

integrate((x**8+1)**(-1/2), x)

oscarbenjamin added core floats integrals.meijerg labels May 7, 2024

oscarbenjamin added 3 commits May 7, 2024 19:01

fix(core): apply ndiv to floats as well

53f91da

test: update dsolve tests

45eee96

flake8: remove unneeded import

e011d90

smichr reviewed May 8, 2024

View reviewed changes

oscarbenjamin added 2 commits May 8, 2024 23:20

fix(core): avoid distributing float if possible

a5bf7d2

fix(core): avoid distributing -1.0 if possible

911afef

smichr reviewed May 9, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix: treat floats like rationals in as_coeff_mul #26573

fix: treat floats like rationals in as_coeff_mul #26573

oscarbenjamin commented May 7, 2024

sympy-bot commented May 7, 2024

github-actions bot commented May 7, 2024 •

edited

smichr May 8, 2024

smichr May 8, 2024

oscarbenjamin May 8, 2024

smichr commented May 9, 2024

smichr May 9, 2024

oscarbenjamin commented May 9, 2024

oscarbenjamin commented May 9, 2024

smichr commented May 10, 2024

-        return Add(*[coeff*term for term in Add.make_args(factors)])
+        if (c,a:=factors.as_coeff_add())[0].is_Number and c:
+            return coeff*c + a
+        return return Add(*[coeff*term for term in Add.make_args(factors)])

fix: treat floats like rationals in as_coeff_mul #26573

Are you sure you want to change the base?

fix: treat floats like rationals in as_coeff_mul #26573

Conversation

oscarbenjamin commented May 7, 2024

References to other Issues or PRs

Brief description of what is fixed or changed

Other comments

Release Notes

sympy-bot commented May 7, 2024

github-actions bot commented May 7, 2024 • edited

smichr May 8, 2024

Choose a reason for hiding this comment

smichr May 8, 2024

Choose a reason for hiding this comment

oscarbenjamin May 8, 2024

Choose a reason for hiding this comment

smichr commented May 9, 2024

smichr May 9, 2024

Choose a reason for hiding this comment

oscarbenjamin commented May 9, 2024

oscarbenjamin commented May 9, 2024

smichr commented May 10, 2024

github-actions bot commented May 7, 2024 •

edited