Factor only some terms of an Add using connected components #19386

oscarbenjamin · 2020-05-20T22:08:47Z

I keep seeing the same problem come up: I want to "factor" an expression, but one extra term gets in the way, e.g.:

In [130]: e = expand((x+y)**3 + z)                                                                                                

In [131]: e                                                                                                                       
Out[131]: 
 3      2          2    3    
x  + 3⋅x ⋅y + 3⋅x⋅y  + y  + z

In [132]: factor(e)                                                                                                               
Out[132]: 
 3      2          2    3    
x  + 3⋅x ⋅y + 3⋅x⋅y  + y  + z

The z-term is preventing factoring so if we temporarily remove it then factoring becomes possible:

In [133]: factor(e-z) + z                                                                                                         
Out[133]: 
           3
z + (x + y)

In general it would be nice if there was a way to infer which terms of an Add could be factored together and factor those. I've made a quick function using connected components that does this. Terms are grouped together based on having symbols in common. Then the separate groups are factored:

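from sympy import Add, factor
from sympy.utilities.iterables import connected_components

def factor_components(expr):
    # Only a sum can be split into groups of terms
    assert isinstance(expr, Add)

    # Vertices are the terms; an edge joins two terms that share a symbol
    V = expr.args
    E = []
    for n1, t1 in enumerate(V):
        for t2 in V[:n1]:
            if t1.free_symbols & t2.free_symbols:
                E.append((t1, t2))

    components = connected_components((V, E))

    # Factor each group of connected terms separately and add the results
    return Add(*(factor(Add(*c)) for c in components))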

With that we have:

In [134]: factor_components(e)                                                                                                    
Out[134]: 
           3
z + (x + y)

The factor_components function seems to work well when the subterms you want to factor all contain symbols, so that there is no constant term. When there is a constant term it is not clear which component it should be factored with.
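
To make the constant-term ambiguity concrete, here is a minimal sketch that lists the groups produced by the same edge rule as factor_components. The helper name `symbol_components` and the example expression are made up for illustration. A constant has no free symbols, so it shares none with any other term and ends up in a group of its own:

```python
from sympy import Add, expand, symbols
from sympy.utilities.iterables import connected_components

x, y, z = symbols("x y z")

def symbol_components(expr):
    """Group the terms of an Add by shared free symbols (illustrative helper)."""
    V = expr.args
    # An edge joins two terms whenever they have a symbol in common
    E = [(t1, t2)
         for n1, t1 in enumerate(V)
         for t2 in V[:n1]
         if t1.free_symbols & t2.free_symbols]
    return connected_components((V, E))

# Expanding gives x**2 + 2*x*y + y**2 + z**2 + 2*z + 1
e = expand((x + y)**2 + (z + 1)**2)
groups = symbol_components(e)

# Print each group as a sum: the x/y terms, the z terms, and the lone constant
for g in groups:
    print(Add(*g))
```

Because the 1 sits alone, the z terms get factored as z*(z + 2) on their own instead of being completed to (z + 1)**2, which is the case where the grouping has no way to decide where the constant belongs.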

A contrived bigger example:

In [144]: e = expand((x+y)**10 + z*(t+1)**20)                                                                                     

In [145]: e                                                                                                                       
Out[145]: 
 20         19          18           17           16            15            14            13             12             11    
t  ⋅z + 20⋅t  ⋅z + 190⋅t  ⋅z + 1140⋅t  ⋅z + 4845⋅t  ⋅z + 15504⋅t  ⋅z + 38760⋅t  ⋅z + 77520⋅t  ⋅z + 125970⋅t  ⋅z + 167960⋅t  ⋅z +

         10             9             8            7            6            5           4           3          2               
 184756⋅t  ⋅z + 167960⋅t ⋅z + 125970⋅t ⋅z + 77520⋅t ⋅z + 38760⋅t ⋅z + 15504⋅t ⋅z + 4845⋅t ⋅z + 1140⋅t ⋅z + 190⋅t ⋅z + 20⋅t⋅z + x

10       9         8  2        7  3        6  4        5  5        4  6        3  7       2  8         9    10    
   + 10⋅x ⋅y + 45⋅x ⋅y  + 120⋅x ⋅y  + 210⋅x ⋅y  + 252⋅x ⋅y  + 210⋅x ⋅y  + 120⋅x ⋅y  + 45⋅x ⋅y  + 10⋅x⋅y  + y   + z

In [146]: factor(e)                                                                                                               
Out[146]: 
 20         19          18           17           16            15            14            13             12             11    
t  ⋅z + 20⋅t  ⋅z + 190⋅t  ⋅z + 1140⋅t  ⋅z + 4845⋅t  ⋅z + 15504⋅t  ⋅z + 38760⋅t  ⋅z + 77520⋅t  ⋅z + 125970⋅t  ⋅z + 167960⋅t  ⋅z +

         10             9             8            7            6            5           4           3          2               
 184756⋅t  ⋅z + 167960⋅t ⋅z + 125970⋅t ⋅z + 77520⋅t ⋅z + 38760⋅t ⋅z + 15504⋅t ⋅z + 4845⋅t ⋅z + 1140⋅t ⋅z + 190⋅t ⋅z + 20⋅t⋅z + x

10       9         8  2        7  3        6  4        5  5        4  6        3  7       2  8         9    10    
   + 10⋅x ⋅y + 45⋅x ⋅y  + 120⋅x ⋅y  + 210⋅x ⋅y  + 252⋅x ⋅y  + 210⋅x ⋅y  + 120⋅x ⋅y  + 45⋅x ⋅y  + 10⋅x⋅y  + y   + z

In [147]: factor_components(e)                                                                                                    
Out[147]: 
         20          10
z⋅(t + 1)   + (x + y)

Most often when I see this issue coming up there is just a single additional term preventing factorisation.


oscarbenjamin · 2020-05-21T22:37:27Z

Here's a trickier example:

In [17]: e = c1**2*r1**2 + 2*c1**2*r1*r2 + c1**2*r2**2 - 2*c1*c2*r1*r2 + 2*c1*c2*r2**2 + c2**2*r2**2                              

In [18]: e                                                                                                                        
Out[18]: 
  2   2       2           2   2                             2     2   2
c₁ ⋅r₁  + 2⋅c₁ ⋅r₁⋅r₂ + c₁ ⋅r₂  - 2⋅c₁⋅c₂⋅r₁⋅r₂ + 2⋅c₁⋅c₂⋅r₂  + c₂ ⋅r₂ 

In [19]: collect(e, c1)                                                                                                           
Out[19]: 
  2 ⎛  2               2⎞      ⎛                     2⎞     2   2
c₁ ⋅⎝r₁  + 2⋅r₁⋅r₂ + r₂ ⎠ + c₁⋅⎝-2⋅c₂⋅r₁⋅r₂ + 2⋅c₂⋅r₂ ⎠ + c₂ ⋅r₂ 

In [20]: t1, t2, t3 = collect(e, c1).args                                                                                         

In [21]: t1p = factor_terms(t1)                                                                                                   

In [22]: t2p = factor(t2)                                                                                                         

In [23]: e2 = t1p + t2p + t3                                                                                                      

In [24]: e2                                                                                                                       
Out[24]: 
  2          2                             2   2
c₁ ⋅(r₁ + r₂)  + 2⋅c₁⋅c₂⋅r₂⋅(-r₁ + r₂) + c₂ ⋅r₂ 

In [25]: expand(e2) == e                                                                                                          
Out[25]: True

The factor_components function above doesn't handle this because all terms are connected.
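
A quick check of that claim, as a sketch that rebuilds the same symbol-sharing graph for this expression and counts its components (the variable names here are only for illustration):

```python
from sympy import symbols
from sympy.utilities.iterables import connected_components

c1, c2, r1, r2 = symbols("c1 c2 r1 r2")

e = (c1**2*r1**2 + 2*c1**2*r1*r2 + c1**2*r2**2
     - 2*c1*c2*r1*r2 + 2*c1*c2*r2**2 + c2**2*r2**2)

# Same graph as in factor_components: terms are vertices,
# and two terms are joined when they share a free symbol
V = e.args
E = [(t1, t2)
     for n1, t1 in enumerate(V)
     for t2 in V[:n1]
     if t1.free_symbols & t2.free_symbols]

components = connected_components((V, E))
print(len(components))  # prints 1: every term is reachable from every other
```

Every term except the last contains c1, and the last one shares r2 with c1**2*r2**2, so the graph has a single component and factor_components reduces to a plain factor(e).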

This kind of factorisation isn't unique so there can be better ways of factoring the expression.
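
The manual steps above can be roughly automated. The following is only a sketch: `factor_collected` is a hypothetical helper, not an existing SymPy function, and it applies factor to every coefficient rather than mixing factor_terms and factor as in the session above. It assumes the input is already expanded. Collecting in different symbols gives different but equivalent forms, which illustrates the non-uniqueness:

```python
from sympy import Add, collect, expand, factor, symbols

c1, c2, r1, r2 = symbols("c1 c2 r1 r2")

def factor_collected(expr, sym):
    """Collect expr in powers of sym, then factor each coefficient (hypothetical helper)."""
    # evaluate=False returns a dict mapping each power of sym to its coefficient
    coeffs = collect(expr, sym, evaluate=False)
    return Add(*(power * factor(coeff) for power, coeff in coeffs.items()))

e = (c1**2*r1**2 + 2*c1**2*r1*r2 + c1**2*r2**2
     - 2*c1*c2*r1*r2 + 2*c1*c2*r2**2 + c2**2*r2**2)

by_c1 = factor_collected(e, c1)  # grouped by powers of c1
by_r2 = factor_collected(e, r2)  # grouped by powers of r2

print(by_c1)
print(by_r2)

# Both forms expand back to the original expression
print(expand(by_c1) == e, expand(by_r2) == e)
```

Which symbol to collect in still has to be chosen by hand here; picking it automatically is the open part of the problem.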

oscarbenjamin mentioned this issue May 22, 2020

[GSOC] Constant coefficient non-homogeneous system of ODEs solver #19341

Merged

sylee957 added the polys label May 23, 2020

oscarbenjamin mentioned this issue Aug 24, 2021

cse improvement possibilities #11577

Open

oscarbenjamin mentioned this issue Dec 17, 2023

How to convince factor to factorize -- factor(a**2 − 2*a + b**2 + 1) does not return (a - 1)**2 + b**2 #25995

Closed
