Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

factor produces wrong output #26497

Closed
iarayad opened this issue Apr 13, 2024 · 17 comments · Fixed by #26515
Closed

factor produces wrong output #26497

iarayad opened this issue Apr 13, 2024 · 17 comments · Fixed by #26515
Labels

Comments

@iarayad
Copy link

iarayad commented Apr 13, 2024

sympy.factor(expr) produces the wrong output if an underscore is used in the name of one of the symbols in expr.
EDIT: The issue is more general. The correctness of the factorization output depends on the names of the variables. Using symbol names with more than one character also changes the output, for example.

Observed behavior:
For example,

a_n, m = sympy.symbols('a_n m', real=True, positive=True)
expr = (-a_n**4 *m**2 / (a_n**2 + 1) - 2 * a_n**4 *m / (a_n**2 + 1) - a_n**4 - a_n**4 / (a_n**2 + 1) + 2 * sympy.I * a_n**3 *m**2 / (a_n**2 + 1) + 4 * sympy.I * a_n**3 *m / (a_n**2 + 1) + 2 * sympy.I * a_n**3 + 2 * sympy.I * a_n**3 / (a_n**2 + 1) + 2 * sympy.I * a_n *m**2 / (a_n**2 + 1) + 4 * sympy.I * a_n *m / (a_n**2 + 1) + 2 * sympy.I * a_n + 2 * sympy.I * a_n / (a_n**2 + 1) +m**2 / (a_n**2 + 1) + 2 *m / (a_n**2 + 1) + 1 + 1 / (a_n**2 + 1))

gives different numerical results if I factor expr and replace a_n and m with numbers.
expr.subs({a_n: 1, m: 1}) produces 12i, whereas expr.factor().subs({a_n: 1, m: 1}) produces -6.
A correct behavior would result in 12i, or equivalent numerical expressions, in both cases.

On the other hand, defining

a_n, m = sympy.symbols('a m', real=True, positive=True)

correctly produces the equivalent outputs 12i in both cases.

Sympy version
I observe this in the stable branch, but also in 1.12 and 1.10. Didn't check other versions.

Other modifications that produce the same error
Using a_n, m = sympy.symbols('an m', real=True, positive=True), so a symbol with a name of more than one character, also reproduces the error.

skirpichev added a commit to skirpichev/diofant that referenced this issue Apr 13, 2024
@akhmerov
Copy link
Contributor

akhmerov commented Apr 14, 2024

The bug seems to be related to factoring of multivariate expressions including sympy.I. A shorter reproduction example is

import sympy
a, b = sympy.symbols("a b")
((b - sympy.I)**2 * (b + sympy.I) * (a + 1)).expand().factor()

which produces an incorrect expression $(a+1)(b^2 + 1)$. The originally reported role of variable naming seems to be related to the ordering of terms or factors: swapping $a$ and $b$ in the expression above makes the output correct.

skirpichev added a commit to diofant/diofant that referenced this issue Apr 14, 2024
@oscarbenjamin
Copy link
Contributor

This is a serious bug.

It has been around since SymPy started factorising such polynomials automatically: d3eab46.

Before that you would need to use extension=True to factorise this. Now all mehods give the same:

In [1]: import sympy
   ...: a, b = sympy.symbols("a b")
   ...: e = ((b - sympy.I)**2 * (b + sympy.I) * (a + 1))

In [6]: e.expand().factor()
Out[6]: 
        ⎛ 2    ⎞
(a + 1)⋅⎝b  + 1In [7]: e.expand().factor(extension=True)
Out[7]: 
        ⎛ 2    ⎞
(a + 1)⋅⎝b  + 1In [8]: e.expand().factor(domain=QQ_I)
Out[8]: 
        ⎛ 2    ⎞
(a + 1)⋅⎝b  + 1In [11]: e.expand().factor(domain=QQ.algebraic_field(I))
Out[11]: 
        ⎛ 2    ⎞
(a + 1)⋅⎝b  + 1

There must be a bug in the code for factorisation over algebraic fields:

def dmp_ext_factor(f, u, K):
"""Factor multivariate polynomials over algebraic number fields. """
if not u:
return dup_ext_factor(f, K)
lc = dmp_ground_LC(f, u, K)
f = dmp_ground_monic(f, u, K)
if all(d <= 0 for d in dmp_degree_list(f, u)):
return lc, []
f, F = dmp_sqf_part(f, u, K), f
s, g, r = dmp_sqf_norm(f, u, K)
factors = dmp_factor_list_include(r, u, K.dom)
if len(factors) == 1:
factors = [f]
else:
H = dmp_raise([K.one, s*K.unit], u, 0, K)
for i, (factor, _) in enumerate(factors):
h = dmp_convert(factor, u, K.dom, K)
h, _, g = dmp_inner_gcd(h, g, u, K)
h = dmp_compose(h, H, u, K)
factors[i] = h
return lc, dmp_trial_division(F, factors, u, K)

@oscarbenjamin
Copy link
Contributor

oscarbenjamin commented Apr 16, 2024

The bug is in dmp_sqf_list:

In [25]: sqf(expand(((y - I)**2 * (y + I) * (z + 1))))
Out[25]: 
       2        
(y - ) ⋅(y + )

In [26]: sqf(expand(((y - I)**2 * (y + I) * (x + 1))))
Out[26]: x + 1

Both of those answers are wrong. The factors for one variable are missing in each case.

The bug is here:

result, i = [], 1
h = dmp_diff(f, 1, u, K)
g, p, q = dmp_inner_gcd(f, h, u, K)
while True:
d = dmp_diff(p, 1, u, K)
h = dmp_sub(q, d, u, K)
if dmp_zero_p(h, u):
result.append((p, i))
break
g, p, q = dmp_inner_gcd(p, h, u, K)
if all or dmp_degree(g, u) > 0:
result.append((g, i))
i += 1
return coeff, result

That seems to use the same logic as for the univariate case that you can compute the square free factorisation by differentiating and then computing the gcd. The difference though is that in the multivariate case there is more than one choice of variable to differentiate with respect to and this seems to make an arbitrary choice. We might then end up with a factor like x + 1 and then differentiate wrt y and then exit the loop without collecting all factors.

What should probably happen is that once p gets down to degree 0 in the first variable it should recurse like it would at the start if the polynomial was already degree 0 in the first variable at the start:

elif deg == 0:
coeff2, factors = dmp_sqf_list(dmp_LC(f, u), u-1, K, all=all)
factors = [([fac], exp) for fac, exp in factors]
return coeff*coeff2, factors

The implemented algorithm seems to be Yun's algorithm but applying the univariate version of the algorithm naively to multivariate polynomials. In Yun's paper it is stated that the algorithm requires the multivariate polynomials to be primitive in the given varaible that is used for differentiation:

On square free decomposition algorithms, Yun, 1976
https://dl.acm.org/doi/10.1145/800205.806320

I think this means that to apply this algorithm in the multivariate case you would first need to use dmp_primitive to factor out the content of the coefficients wrt x and then also compute the square free factorisation of the content recursively. Currently it uses dmp_ground_primitive which is not the same e.g.:

In [12]: p = expand((y - I)**2 * (y + I) * (z + 1))

In [13]: p
Out[13]: 
 3      3      2        2                    
yz + y  - yz - y  + yz + y - z - 

In [14]: p.as_poly().primitive()
Out[14]: (1, Poly(y**3*z + y**3 - I*y**2*z - I*y**2 + y*z + y - I*z - I, y, z, domain='ZZ_I'))

In [15]: p.as_poly(y).primitive()
Out[15]: (z + 1, Poly(y**3 - I*y**2 + y - I, y, domain='ZZ_I[z]'))

@oscarbenjamin
Copy link
Contributor

The bug for sqf does not require there to be any complex numbers i.e. it happens with ZZ as well as ZZ_I:

In [1]: sqf(expand(((y - 2)**2 * (y + 2) * (x + 1))))
Out[1]: x + 1

In [2]: sqf(expand(((y - 2)**2 * (y + 2) * (z + 1))))
Out[2]: 
       2        
(y - 2) ⋅(y + 2)

I think that the difference in the case of factor is just that for ZZ_I it uses dmp_sqf_list whereas for ZZ it does not.

To be clear the expected output for sqf is like:

In [3]: sqf(expand((x-1)*(x-2)*(x-3)**2*(x-4)**5))
Out[3]: 
       5        22          ⎞
(x - 4) ⋅(x - 3) ⋅⎝x  - 3x + 2

It is supposed to be a factorisation of the expression like with factor except that it is not a full factorisation into irreducibles.

@oscarbenjamin
Copy link
Contributor

This fixes the bug in dmp_sqf_list makes all examples above with sqf work as intended:

diff --git a/sympy/polys/sqfreetools.py b/sympy/polys/sqfreetools.py
index e11eebc7b7..2fd4f0b8ca 100644
--- a/sympy/polys/sqfreetools.py
+++ b/sympy/polys/sqfreetools.py
@@ -23,7 +23,7 @@
 from sympy.polys.euclidtools import (
     dup_inner_gcd, dmp_inner_gcd,
     dup_gcd, dmp_gcd,
-    dmp_resultant)
+    dmp_resultant, dmp_primitive)
 from sympy.polys.galoistools import (
     gf_sqf_list, gf_sqf_part)
 from sympy.polys.polyerrors import (
@@ -401,30 +401,36 @@ def dmp_sqf_list(f, u, K, all=False):
     deg = dmp_degree(f, u)
     if deg < 0:
         return coeff, []
-    elif deg == 0:
-        coeff2, factors = dmp_sqf_list(dmp_LC(f, u), u-1, K, all=all)
-        factors = [([fac], exp) for fac, exp in factors]
-        return coeff*coeff2, factors
 
-    result, i = [], 1
+    content, f = dmp_primitive(f, u, K)
 
-    h = dmp_diff(f, 1, u, K)
-    g, p, q = dmp_inner_gcd(f, h, u, K)
+    result = []
 
-    while True:
-        d = dmp_diff(p, 1, u, K)
-        h = dmp_sub(q, d, u, K)
+    if deg != 0:
 
-        if dmp_zero_p(h, u):
-            result.append((p, i))
-            break
+        h = dmp_diff(f, 1, u, K)
+        g, p, q = dmp_inner_gcd(f, h, u, K)
 
-        g, p, q = dmp_inner_gcd(p, h, u, K)
+        i = 1
 
-        if all or dmp_degree(g, u) > 0:
-            result.append((g, i))
+        while True:
+            d = dmp_diff(p, 1, u, K)
+            h = dmp_sub(q, d, u, K)
 
-        i += 1
+            if dmp_zero_p(h, u):
+                result.append((p, i))
+                break
+
+            g, p, q = dmp_inner_gcd(p, h, u, K)
+
+            if all or dmp_degree(g, u) > 0:
+                result.append((g, i))
+
+            i += 1
+
+    coeff_content, result_content = dmp_sqf_list(content, u-1, K, all=all)
+    coeff *= coeff_content
+    result += [([fac], exp) for fac, exp in result_content]
 
     return coeff, result

Unfortunately that does not fix the original problem. It turns out that dmp_sqf_list is not even being called but rather dmp_sqf_part which does seem to return correct output:

In [3]: a, b = symbols("a b")

In [4]: e = ((b - I)**2 * (b + I) * (a + 1))

In [5]: e
Out[5]: 
               2        
(a + 1)⋅(b - ) ⋅(b + )

In [6]: expand(e)
Out[6]: 
   3        2                3      2        
ab  - ab  + ab - a + b  - b  + b - 

In [7]: sqf_part(expand(e))
Out[7]: 
   2        2    
ab  + a + b  + 1

In [8]: sqf_part(expand(e)).factor()
Out[8]: 
        ⎛ 2    ⎞
(a + 1)⋅⎝b  + 1In [9]: expand(e).factor() # wrong
Out[9]: 
        ⎛ 2    ⎞
(a + 1)⋅⎝b  + 1

Here the output from sqf_part is correct for sqf_part but the same output is not correct for factor. The problem then is that factor is returning the sqf_part.

The bug is somewhere in dmp_ext_factor:

def dmp_ext_factor(f, u, K):
"""Factor multivariate polynomials over algebraic number fields. """
if not u:
return dup_ext_factor(f, K)
lc = dmp_ground_LC(f, u, K)
f = dmp_ground_monic(f, u, K)
if all(d <= 0 for d in dmp_degree_list(f, u)):
return lc, []
f, F = dmp_sqf_part(f, u, K), f
s, g, r = dmp_sqf_norm(f, u, K)
factors = dmp_factor_list_include(r, u, K.dom)
if len(factors) == 1:
factors = [f]
else:
H = dmp_raise([K.one, s*K.unit], u, 0, K)
for i, (factor, _) in enumerate(factors):
h = dmp_convert(factor, u, K.dom, K)
h, _, g = dmp_inner_gcd(h, g, u, K)
h = dmp_compose(h, H, u, K)
factors[i] = h
return lc, dmp_trial_division(F, factors, u, K)

I'm not familiar with the algorithm in use here. Again it looks very similar to the corresponding univariate algorithm. In the debugger we already have a wrong output from dmp_factor_list_include because we only get two factors when there should be three. The output of dmp_factor_list is correct for the given input so the problem needs to be earlier. The input f seems correct and dmp_sqf_part gets the right square free part.

@oscarbenjamin
Copy link
Contributor

Maybe the bug is triggered because the square free part has only real coefficients i.e. we're trying to factorise a polynomial in QQ(I)[a,b] but its square free part is in QQ[a,b]:

In [11]: e
Out[11]: 
               2        
(a + 1)⋅(b - ) ⋅(b + )

In [12]: sqf_part(e)
Out[12]: 
   2        2    
ab  + a + b  + 1

It should be possible for dmp_trial_division to use the factors that are found to find all of the factors e.g.:

In [20]: quo(e, sqf_part(expand(e)))
Out[20]: b - 

I'm not sure that's how the algorithm is supposed to work though. I don't actually know where this particular algorithm comes from.

@oscarbenjamin
Copy link
Contributor

The algorithm in dup_ext_factor seems to be algorithm 3.6.4 in Cohen's "A course in computational number theory":
http://tomlr.free.fr/Math%E9matiques/Math%20Complete/Number%20theory/A%20course%20in%20computational%20algebraic%20number%20theory%20-%20Cohen%20H..pdf

That algorithm is written for univariate polynomials though. I'm not sure if dmp_ext_factor is a valid generalisation of the algorithm to the multivariate case.

@oscarbenjamin
Copy link
Contributor

Possibly the dmp_ext_factor algorithm is from:

Algebraic factoring and rational function integration
Barry M. Trager 1976

https://dl.acm.org/doi/10.1145/800205.806338

@oscarbenjamin
Copy link
Contributor

The problem with sqf was fixed in gh-26514 but the problem with factor (dmp_ext_factor) remains.

@oscarbenjamin
Copy link
Contributor

The algorithm in dmp_ext_factor is the one in the Trager paper looks correct per the paper. It also calls dmp_sqf_norm which is also defined in the paper and also looks correct. However dmp_sqf_norm calls dmp_sqf_p to check if the polynomial is square free and that seems to have a similar bug to the one that dmp_sqf_list had:

def dmp_sqf_p(f, u, K):
"""
Return ``True`` if ``f`` is a square-free polynomial in ``K[X]``.
Examples
========
>>> from sympy.polys import ring, ZZ
>>> R, x,y = ring("x,y", ZZ)
>>> R.dmp_sqf_p(x**2 + 2*x*y + y**2)
False
>>> R.dmp_sqf_p(x**2 + y**2)
True
"""
if dmp_zero_p(f, u):
return True
else:
return not dmp_degree(dmp_gcd(f, dmp_diff(f, 1, u, K), u, K), u)

It is not enough just to differentiate wrt one arbitrarily chosen variable in the multivariate case.

Hence:

In [22]: p = x**2*y**4 + 2*x**2*y**2 + x**2 + 2*x*y**4 + 4*x*y**2 + 2*x + 2*y**4 + 4*y**2 + 2

In [23]: p
Out[23]: 
 2  4      2  2    2        4        2            4      2    
xy  + 2xy  + x  + 2xy  + 4xy  + 2x + 2y  + 4y  + 2

In [24]: p.factor()
Out[24]: 
        22    ⎞  ⎛ 2          ⎞
⎝y  + 1⎠ ⋅⎝x  + 2x + 2In [25]: Poly(p).is_sqf  # wrong
Out[25]: True

I think that wrong result from dmp_sqf_p leads to the wrong result in dmp_sqf_norm and in turn dmp_ext_factor.

@oscarbenjamin
Copy link
Contributor

We can fix dmp_sqf_p like this:

diff --git a/sympy/polys/sqfreetools.py b/sympy/polys/sqfreetools.py
index e11eebc7b7..e00907b84d 100644
--- a/sympy/polys/sqfreetools.py
+++ b/sympy/polys/sqfreetools.py
@@ -10,7 +10,7 @@
 from sympy.polys.densebasic import (
     dup_strip,
     dup_LC, dmp_ground_LC,
-    dmp_zero_p,
+    dmp_zero_p, dmp_ground_p,
     dmp_ground, dmp_LC,
     dup_degree, dmp_degree,
     dmp_raise, dmp_inject,
@@ -70,8 +70,20 @@ def dmp_sqf_p(f, u, K):
     """
     if dmp_zero_p(f, u):
         return True
-    else:
-        return not dmp_degree(dmp_gcd(f, dmp_diff(f, 1, u, K), u, K), u)
+
+    for i in range(u+1):
+
+        fp = dmp_diff_in(f, 1, i, u, K)
+
+        if dmp_zero_p(fp, u):
+            continue
+
+        gcd = dmp_gcd(f, fp, u, K)
+
+        if not dmp_ground_p(gcd, None, u):
+            return False
+
+    return True
 
 
 def dup_sqf_norm(f, K):

I am wondering though whether there should actually be a distinction between the existing function that basically determines if a polynomial in K[x][y] is square free in y over K[x] and this fixed version that determines whether a polynomial in K[x,y] is square free in the multivariate polynomial ring over K.

In any case fixing dmp_sqf_p like this prevents the wrong results seen above but instead sends dmp_sqf_norm into an infinite loop. Looking at it dmp_sqf_norm comes from the Trager paper which claims

This paper .presents a new, simple, and efficient algorithm for factoring polynomials in several variables over an algebraic number field.

However the algorithm presented as alg_factor is clearly only described for univariate polynomials. The sqfr_norm algorithm assumes that we can make substitutions like f(x) -> f(x-a) to find a square free result. We have several variables though like f(x, y) and it is not clear if we should use f(x-a, y) or f(x, y-a) or whether both need to be tried. I am not completely sure if the basic approach described is still valid for multivariate polynomials at all or if it needs some significant modification.

The simplest fix I can imagine is to try shifts in turn like:

f(x,y) -> f(x-a,y)
f(x,y) -> f(x,y-a)

We would then need to return a list of shifts for s rather than a single integer. I am not sure if this approach would satisfy the termination conditions for the algorithm though.

@oscarbenjamin
Copy link
Contributor

It might just be sufficient to do this repeatedly:

f(x,y) -> f(x-a,y-a)

Possibly that is even what the paper is describing...

@oscarbenjamin
Copy link
Contributor

I believe all issues are fixed by gh-26515.

The OP example gives:

In [2]: import sympy

In [3]: a_n, m = sympy.symbols('a_n m', real=True, positive=True)
   ...: expr = (-a_n**4 *m**2 / (a_n**2 + 1) - 2 * a_n**4 *m / (a_n**2 + 1) - a_n**4 - a_n**4 / (a_n**2 +
   ...:  1) + 2 * sympy.I * a_n**3 *m**2 / (a_n**2 + 1) + 4 * sympy.I * a_n**3 *m / (a_n**2 + 1) + 2 * sy
   ...: mpy.I * a_n**3 + 2 * sympy.I * a_n**3 / (a_n**2 + 1) + 2 * sympy.I * a_n *m**2 / (a_n**2 + 1) + 4
   ...:  * sympy.I * a_n *m / (a_n**2 + 1) + 2 * sympy.I * a_n + 2 * sympy.I * a_n / (a_n**2 + 1) +m**2 /
   ...:  (a_n**2 + 1) + 2 *m / (a_n**2 + 1) + 1 + 1 / (a_n**2 + 1))

In [4]: expr.subs({a_n: 1, m: 1})
Out[4]: 12

In [5]: expr.factor().subs({a_n: 1, m: 1})
Out[5]: 
          3        
-3⋅(1 - ) ⋅(1 + )

In [6]: expr.factor().subs({a_n: 1, m: 1}).expand()
Out[6]: 12

The PR still needs tests and things to finish off but I don't have time right now. At least I think that the problem is identified and we have a fix.

If anyone wants to finish the PR then go ahead.

@oscarbenjamin
Copy link
Contributor

It might just be sufficient to do this repeatedly:

f(x,y) -> f(x-a,y-a)

This is the fix I went with. I think it's valid. I think there are only finitely many bad shifts a as long as we avoid a variable that is of degree 0 in the polynomial.

@oscarbenjamin
Copy link
Contributor

After fixing this in gh-26515 I find the following for the OP example:

In [16]: expr
Out[16]: 
    4  2        4               4           3  2         3                     3           2aₙm     2aₙm     4     aₙ      2aₙm    4aₙm         3   2aₙ    2aₙm    4⋅ ↪
- ─────── - ─────── - aₙ  - ─────── + ────────── + ───────── + 2aₙ  + ─────── + ───────── + ──── ↪
    2         2               2          2            2                    2          2          2aₙ  + 1   aₙ  + 1         aₙ  + 1    aₙ  + 1      aₙ  + 1              aₙ  + 1    aₙ  + 1    aₙ   ↪

↪                              2aₙm            2aₙ      m         2m            1   
↪ ──── + 2aₙ + ─────── + ─────── + ─────── + 1 + ───────
↪                   2         2         2             2+ 1             aₙ  + 1   aₙ  + 1   aₙ  + 1       aₙ  + 1

In [17]: expr.factor()
Out[17]: 
         32    2-(aₙ - ) ⋅(aₙ + )⋅⎝aₙ  + m  + 2m + 2⎠ 
─────────────────────────────────────────
                   2                     
                 aₙ  + 1                 

In [18]: expr.cancel()
Out[18]: 
    4         3     2  2       2       2           2                        2          
- aₙ  + 2aₙ  - aₙm  - 2aₙm - aₙ  + 2aₙm  + 4aₙm + 4aₙ + m  + 2m + 2

In [19]: expr.cancel().factor()
Out[19]: 
         22    2-(aₙ - ) ⋅⎝aₙ  + m  + 2m + 2

I believe that all of these expressions are equivalent but ideally expr.cancel.factor() and expr.factor() would give the same result. In other words factor should make cancel redundant and both should be able to eliminate any cancelable demoninators. The fact that factor does not remove the denominator is because the numerator and denominator are not being factored over the same field. Otherwise an**2 + 1 in the denominator would factor:

In [21]: factor(a_n**2 + 1)
Out[21]: 
  2    
aₙ  + 1

In [22]: factor(a_n**2 + 1, domain=QQ_I)
Out[22]: (aₙ - )⋅(aₙ + )

This is also a bug although much less serious.

@oscarbenjamin
Copy link
Contributor

This reference gives a multivariate form of Yun's algorithm on page 8:

M M-D Lee, Factorization of multivariate polynomials, Ph.D. thesis, University of Kaiserslautern, 2013.

https://d-nb.info/1036637972/34

That looks possibly better than the fix I used in #26514

@oscarbenjamin
Copy link
Contributor

I have completed the PR #26515 that fixes this. All issues identified should be fixed I believe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants