Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Factor only some terms of an Add using connected components #19386

Open
oscarbenjamin opened this issue May 20, 2020 · 1 comment
Open

Factor only some terms of an Add using connected components #19386

oscarbenjamin opened this issue May 20, 2020 · 1 comment
Labels

Comments

@oscarbenjamin
Copy link
Contributor

I keep seeing the same problem come up that I want to "factor" something that has an extra term e.g.:

In [130]: e = expand((x+y)**3 + z)                                                                                                

In [131]: e                                                                                                                       
Out[131]: 
 3      2          2    3    
x  + 3x y + 3xy  + y  + z

In [132]: factor(e)                                                                                                               
Out[132]: 
 3      2          2    3    
x  + 3x y + 3xy  + y  + z

The z-term is preventing factoring so if we temporarily remove it then factoring becomes possible:

In [133]: factor(e-z) + z                                                                                                         
Out[133]: 
           3
z + (x + y) 

In general it would be nice if there was a way to infer which terms of an Add could be factored together and factor those. I've made a quick function using connected components that does this. Terms are grouped together based on having symbols in common. Then the separate groups are factored:

from sympy.utilities.iterables import connected_components

def factor_components(expr):
    assert isinstance(expr, Add)

    V = expr.args
    E = []
    for n1, t1 in enumerate(V):
        for t2 in V[:n1]:
            if t1.free_symbols & t2.free_symbols:
                E.append((t1, t2))

    components = connected_components((V, E))

    return Add(*(factor(Add(*c)) for c in components))

With that we have:

In [134]: factor_components(e)                                                                                                    
Out[134]: 
           3
z + (x + y) 

Thefactor_components function seems to work well where you want to factor subterms that all have symbols so that there is no constant term. When there is a constant term it is not clear which component it should be factored with.

A contrived bigger example:

In [144]: e = expand((x+y)**10 + z*(t+1)**20)                                                                                     

In [145]: e                                                                                                                       
Out[145]: 
 20         19          18           17           16            15            14            13             12             11    
t  z + 20t  z + 190t  z + 1140t  z + 4845t  z + 15504t  z + 38760t  z + 77520t  z + 125970t  z + 167960t  z +

         10             9             8            7            6            5           4           3          2               
 184756t  z + 167960t z + 125970t z + 77520t z + 38760t z + 15504t z + 4845t z + 1140t z + 190t z + 20tz + x

10       9         8  2        7  3        6  4        5  5        4  6        3  7       2  8         9    10    
   + 10x y + 45x y  + 120x y  + 210x y  + 252x y  + 210x y  + 120x y  + 45x y  + 10xy  + y   + z

In [146]: factor(e)                                                                                                               
Out[146]: 
 20         19          18           17           16            15            14            13             12             11    
t  z + 20t  z + 190t  z + 1140t  z + 4845t  z + 15504t  z + 38760t  z + 77520t  z + 125970t  z + 167960t  z +

         10             9             8            7            6            5           4           3          2               
 184756t  z + 167960t z + 125970t z + 77520t z + 38760t z + 15504t z + 4845t z + 1140t z + 190t z + 20tz + x

10       9         8  2        7  3        6  4        5  5        4  6        3  7       2  8         9    10    
   + 10x y + 45x y  + 120x y  + 210x y  + 252x y  + 210x y  + 120x y  + 45x y  + 10xy  + y   + z

In [147]: factor_components(e)                                                                                                    
Out[147]: 
         20          10
z(t + 1)   + (x + y)  

Most often when I see this issue coming up there is just a single additional term preventing factorisation.

@oscarbenjamin
Copy link
Contributor Author

Here's a trickier example:

In [17]: e = c1**2*r1**2 + 2*c1**2*r1*r2 + c1**2*r2**2 - 2*c1*c2*r1*r2 + 2*c1*c2*r2**2 + c2**2*r2**2                              

In [18]: e                                                                                                                        
Out[18]: 
  2   2       2           2   2                             2     2   2
c₁ r₁  + 2c₁ r₁r₂ + c₁ r₂  - 2c₁c₂r₁r₂ + 2c₁c₂r₂  + c₂ r₂ 

In [19]: collect(e, c1)                                                                                                           
Out[19]: 
  22               2⎞      ⎛                     22   2
c₁ ⎝r₁  + 2r₁r₂ + r₂ ⎠ + c₁-2c₂r₁r₂ + 2c₂r₂ ⎠ + c₂ r₂ 

In [20]: t1, t2, t3 = collect(e, c1).args                                                                                         

In [21]: t1p = factor_terms(t1)                                                                                                   

In [22]: t2p = factor(t2)                                                                                                         

In [23]: e2 = t1p + t2p + t3                                                                                                      

In [24]: e2                                                                                                                       
Out[24]: 
  2          2                             2   2
c₁ (r₁ + r₂)  + 2c₁c₂r₂(-r₁ + r₂) + c₂ r₂ 

In [25]: expand(e2) == e                                                                                                          
Out[25]: True

The factor_components function above doesn't handle this because all terms are connected.

This kind of factorisation isn't unique so there can be better ways of factoring the expression.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants