Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

the solver doesn't work if some equations are multiplied by -1 #25507

Open
wsdookadr opened this issue Aug 14, 2023 · 5 comments
Open

the solver doesn't work if some equations are multiplied by -1 #25507

wsdookadr opened this issue Aug 14, 2023 · 5 comments

Comments

@wsdookadr
Copy link
Contributor

wsdookadr commented Aug 14, 2023

Someone on HN drew my attention to this.
If one multiplies some equations by -1, the solver doesn't work anymore.
This happens if e = P[i]-P[j] is rewritten as e = P[j]-P[i].

According to Wikipedia this is a Type 2 elementary row operation and shouldn't affect the result.

The system of equations is built by the following program.
It should have a unique solution with yh=9.
But after the change mentioned above it finds yh=-3.

Tested this on the SymPy 1.12 release.

# o,y,p,b,g
from sympy import *
ow,oh,yw,yh,pw,ph,bw,bh,gw,gh = symbols('o_w o_h y_w y_h p_w p_h b_w b_h g_w g_h')
u1 = ow+bw
u2 = yw+pw+bw
u3 = oh+yh
u4 = bh
u5 = oh+gh+ph
u6 = yw+gw+bw
P = [ow*oh,yw*yh,pw*ph,bw*bh,gw*gh]
U=[u1,u2,u3,u4,u5,u6]
l=len(U)
E=[]
E.append(Eq(oh,3))
for i in range(l):
    for j in range(i):
        e = U[i]-U[j]
        E.append(Eq(e,0))
l=len(P)
for i in range(l):
    for j in range(i):
        e = P[i]-P[j]
        E.append(Eq(e,0))
S = solve(E)
print(S)
L = 3 + S[0][yh]
print("L=",L)
@wsdookadr wsdookadr changed the title multiplying some equations by -1 in a system the solver doesn't work if some equations are multiplied by -1 Aug 14, 2023
@wsdookadr
Copy link
Contributor Author

Please disregard my bug report. I just realized/remembered this is not a linear system.

@oscarbenjamin
Copy link
Contributor

Testing with master I get a single solution with yh=9 (I guess you mean L=12?) and it seems to be a valid solution:

In [9]: syms = ow,oh,yw,yh,pw,ph,bw,bh,gw,gh

In [10]: [sol] = solve(E, syms, dict=True)

In [11]: sol
Out[11]: {bₕ: 12, b_w: 12/5, gₕ: 9/2, g_w: 32/5, oₕ: 3, o_w: 48/5, pₕ: 9/2, p_w: 32/5, yₕ: 9, y_w: 16/5}

In [12]: [e.subs(sol) for e in E]
Out[12]: 
[True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True]

Solving the Groebner basis also reveals a degenerate case:

In [16]: [sol1, sol2] = solve(groebner(E, syms), syms, dict=True)

In [17]: sol1
Out[17]: {bₕ: 0, b_w: 0, gₕ: -pₕ - 3, g_w: 0, oₕ: 3, o_w: 0, p_w: 0, yₕ: -3, y_w: 0}

In [18]: sol2
Out[18]: {bₕ: 12, b_w: 12/5, gₕ: 9/2, g_w: 32/5, oₕ: 3, o_w: 48/5, pₕ: 9/2, p_w: 32/5, yₕ: 9, y_w: 16/5}

In [19]: [e.subs(sol1) for e in E]
Out[19]: 
[True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, Tr
ue, True, True, True, True, True, True, True, True, True]

In [20]: [e.subs(sol2) for e in E]
Out[20]: 
[True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, True, Tr
ue, True, True, True, True, True, True, True, True, True]

Changing to e = P[j]-P[i] gives the same solutions as for the Groebner basis.

I'm not sure why the degenerate case is not picked up by default.

It should have a unique solution with yh=12.

I don't think this is true:

In [22]: groebner(E, syms).is_zero_dimensional
Out[22]: False

The fact that the degenerate case can be missed by solve is an issue though so after this code:

# o,y,p,b,g
from sympy import *
ow,oh,yw,yh,pw,ph,bw,bh,gw,gh = symbols('o_w o_h y_w y_h p_w p_h b_w b_h g_w g_h')
u1 = ow+bw
u2 = yw+pw+bw
u3 = oh+yh
u4 = bh
u5 = oh+gh+ph
u6 = yw+gw+bw
P = [ow*oh,yw*yh,pw*ph,bw*bh,gw*gh]
U=[u1,u2,u3,u4,u5,u6]
l=len(U)
E=[]
E.append(Eq(oh,3))
for i in range(l):
    for j in range(i):
        e = U[i]-U[j]
        E.append(Eq(e,0))
l=len(P)
for i in range(l):
    for j in range(i):
        e = P[i]-P[j]
        E.append(Eq(e,0))
S = solve(E)
print(S)
L = 3 + S[0][yh]
print("L=",L)

I think that these should should all print 2 (1 is incorrect but I have not verified that 2 is correct - there could be more cases):

In [24]: len(solve(E, syms))
Out[24]: 1

In [25]: len(solve(E))
Out[25]: 1

In [26]: len(solve(groebner(E)))
Out[26]: 2

In [27]: len(solve(groebner(E), syms))
Out[27]: 2

With the terms swapped i.e. e = P[j]-P[i] we have:

In [30]: len(solve(E, syms))
Out[30]: 3

In [31]: len(solve(E))
Out[31]: 3

In [32]: len(solve(groebner(E), syms))
Out[32]: 2

In [33]: len(solve(groebner(E)))
Out[33]: 2

So now we can get 3 cases:

In [36]: [s1,s2,s3] = solve(E, syms, dict=True)

In [37]: s1
Out[37]: {bₕ: 0, b_w: 0, gₕ: -pₕ - 3, g_w: 0, oₕ: 3, o_w: 0, p_w: 0, yₕ: -3, y_w: 0}

In [38]: s2
Out[38]: {bₕ: 0, b_w: 0, gₕ: -9/2, g_w: 0, oₕ: 3, o_w: 0, pₕ: 3/2, p_w: 0, yₕ: -3, y_w: 0}

In [39]: s3
Out[39]: {bₕ: 12, b_w: 12/5, gₕ: 9/2, g_w: 32/5, oₕ: 3, o_w: 48/5, pₕ: 9/2, p_w: 32/5, yₕ: 9, y_w: 16/5}

This second solution is correct but redundant as it is a special case of the first:

In [47]: {s: e.subs(ph,Rational(3,2)) for s, e in s1.items()} | {ph:Rational(3,2)} == s2
Out[47]: True

In any case these should all return the same result so somewhere there is a bug.

Let's try to verify what the solution should be:

In [50]: gb = groebner(E, syms)

In [51]: for e in gb: pprint(e.factor())
-3g_w + 2o_w
oₕ - 3
-g_w + 2y_w
-bₕ + yₕ + 3
-g_w + p_w
-bₕ + gₕ + pₕ + 3
-2bₕ + 2b_w + 3g_w
    2         
2bₕ  - 45g_w
g_w⋅(bₕ - 12)
g_w⋅(5g_w - 32)
g_w⋅(2gₕ - 9)

The final 3 equations factor giving two cases g_w = 0 and g_w != 0. Each case is then linear. This is the first case:

In [52]: sys1 = [e.subs(gw,0) for e in gb]

In [53]: for e in sys1: pprint(e)
2o_w
oₕ - 3
2y_w
-bₕ + yₕ + 3
p_w
-bₕ + gₕ + pₕ + 3
-2bₕ + 2b_w
    2
2bₕ 
0
0
0

In [54]: solve(sys1, syms)
Out[54]: [(0, 3, 0, -3, 0, pₕ, 0, 0, g_w, -pₕ - 3)]

This is the second case:

In [57]: sys2 = gb[:-3] + [cancel(e/gw) for e in gb[-3:]]

In [58]: for e in sys2: pprint(e)
-3g_w + 2o_w
oₕ - 3
-g_w + 2y_w
-bₕ + yₕ + 3
-g_w + p_w
-bₕ + gₕ + pₕ + 3
-2bₕ + 2b_w + 3g_w
    2         
2bₕ  - 45g_w
bₕ - 12
5g_w - 32
2gₕ - 9

In [59]: solve(sys2, syms)
Out[59]: [(48/5, 3, 16/5, 9, 32/5, 9/2, 12/5, 12, 32/5, 9/2)]

All returned solutions are correct in all cases but some cases miss the degenerate solution and others are redundant.

@oscarbenjamin
Copy link
Contributor

Also nonlinsolve misses the zero dimensional solution:

In [61]: nonlinsolve(E, syms)
Out[61]: {(0, 3, 0, -3, 0, pₕ, 0, 0, 0, -pₕ - 3)}

@oscarbenjamin
Copy link
Contributor

I have not verified that 2 is correct - there could be more cases

I wrote that before writing the part below. The demonstration with factorising the Groebner basis proves that there should be two solutions returned in all cases.

@wsdookadr
Copy link
Contributor Author

So this multiplying by scalar should work for systems of polynomial equations too. Okay.
I don't know Groebner basis so can't comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants