Large floating point differences in results when using xreplace() and autowrap() #20340

Open
moorepants opened this issue Oct 26, 2020 · 18 comments


@moorepants (Member)

I have a large matrix of expressions with 800k operations. I've stored it in this file:

large_matrix.txt [10 Mb]

The matrix is generated in the code in this PR: pydy/pydy#122

If I use xreplace() to evaluate this matrix with floating point values I get a result. If I then use autowrap() to numerically evaluate the matrix I get a different result. I would expect the results to agree to within roughly machine precision, but not all of them do: some entries have errors as large as 1e-01, while others are at the expected 1e-14 to 1e-17 level. Here are the results from the script below:

In [1]: run compare_xreplace_and_autowrap.py
Loading the large matrix.
Xreplacing.
Autowrapping with C
Comparison of autowrapped C and xreplace
[[ 0.00000000e+00 -2.88657986e-15 -2.77555756e-17 -1.06581410e-14
  -1.15463195e-14 -1.38777878e-17]
 [ 2.63929901e-02 -3.21755654e-02  3.18323442e-04 -6.14592598e-02
  -1.27230835e-01 -3.76451545e-04]
 [ 1.21555293e-03 -1.48187464e-03  1.46606728e-05 -2.83056156e-03
  -5.85973070e-03 -1.73378150e-05]
 [-1.38777878e-17  0.00000000e+00 -6.93889390e-18  0.00000000e+00
  -1.42247325e-16 -5.55111512e-17]
 [ 1.90819582e-17  0.00000000e+00 -2.77555756e-17  0.00000000e+00
   1.11022302e-16 -1.38777878e-17]
 [-2.77555756e-17  0.00000000e+00 -1.38777878e-17  2.77555756e-17
   0.00000000e+00  2.77555756e-17]]
Autowrapping with Fortran
Comparison of autowrapped Fortran and xreplace
[[ 0.00000000e+00 -2.88657986e-15 -2.77555756e-17 -1.06581410e-14
  -1.15463195e-14 -1.38777878e-17]
 [ 2.63929901e-02 -3.21755654e-02  3.18323442e-04 -6.14592598e-02
  -1.27230835e-01 -3.76451545e-04]
 [ 1.21555293e-03 -1.48187464e-03  1.46606728e-05 -2.83056156e-03
  -5.85973070e-03 -1.73378150e-05]
 [-1.38777878e-17  0.00000000e+00 -6.93889390e-18  0.00000000e+00
  -1.42247325e-16 -5.55111512e-17]
 [ 1.90819582e-17  0.00000000e+00 -2.77555756e-17  0.00000000e+00
   1.11022302e-16 -1.38777878e-17]
 [-2.77555756e-17  0.00000000e+00 -1.38777878e-17  2.77555756e-17
   0.00000000e+00  2.77555756e-17]]
Comparison of autowrapped Fortran and autowrapping C
[[0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]]

With other large matrices, for example those from pydy.models.n_link_pendulum_with_cart, I don't get discrepancies between xreplace and autowrap. I understand that floating point errors can occur and compound for specific operations, and that could be an explanation. But I don't know exactly what xreplace() does, and I can't compare it to evalf() or subs()*. I'd really like to be able to evaluate this expression to 1e-17 precision with SymPy's arbitrary precision capabilities so that I have the correct answer, and then compare the C/Fortran results to see what kind of floating point errors arise.

*Note that .evalf(subs={}) does not complete in a reasonable amount of time. Neither does subs(); I've yet to wait either out. Trying lambdify() kills my Python interpreter. I'm not sure Python lambda functions can handle such large expressions.

**It is worth noting that CSE will reduce the number of operations to something like 1000, but I also get discrepancies between xreplace() and the CSE'd expressions. This is done with PyDy's CythonMatrixGenerator(), which CSEs the expressions and generates C code. Autowrap doesn't CSE yet.

Questions:

  • What does xreplace() do and should I expect it to not generate the same floating point errors as the C and Fortran code?
  • Is there any way to evaluate this expression to an arbitrary precision (in a reasonable amount of time)?
  • Are there good ways to examine these expressions to find out if there are (sub)expressions that will result in large floating point errors?

Here is the script to reproduce the results:

import sympy as sm
from sympy.utilities.autowrap import autowrap

rf, rr = sm.symbols('rf rr')
d1, d2, d3 = sm.symbols('d1 d2 d3')
l1, l2, l3, l4 = sm.symbols('l1 l2 l3 l4')
g = sm.symbols('g')
mc, md, me, mf = sm.symbols('mc md me mf')
ic11, ic22, ic33, ic31 = sm.symbols('ic11 ic22 ic33 ic31')
id11, id22 = sm.symbols('id11 id22')
ie11, ie22, ie33, ie31 = sm.symbols('ie11 ie22 ie33 ie31')
if11, if22 = sm.symbols('if11 if22')
q1, q2, q3, q4, q5, q6, q7, q8 = sm.symbols('q1, q2, q3, q4, q5, q6, q7, q8')

float_subs = {
    d1: 0.9534570696121849,
    d2: 0.2676445084476887,
    d3: 0.03207142672761929,
    g: 9.81,
    ic11: 7.178169776497895,
    ic22: 11.0,
    ic31: 3.8225535938357873,
    ic33: 4.821830223502103,
    id11: 0.0603,
    id22: 0.12,
    ie11: 0.05841337700152972,
    ie22: 0.06,
    ie31: 0.009119225261946298,
    ie33: 0.007586622998470264,
    if11: 0.1405,
    if22: 0.28,
    l1: 0.4707271515135145,
    l2: -0.47792881146460797,
    l3: -0.00597083392418685,
    l4: -0.3699518200282974,
    mc: 85.0,
    md: 2.0,
    me: 4.0,
    mf: 3.0,
    q1: 0.0,
    q2: -0.17447337661787718,
    q3: 0.0,
    q4: 0.6206670416476966,
    q5: 0.3300446174593725,
    q6: 0.0,
    q7: -0.2311385135743,
    q8: 0.0,
    rf: 0.35,
    rr: 0.3,
}

print('Loading the large matrix.')
with open('large_matrix.txt') as f:
    M = sm.sympify(f.read())

print('Xreplacing.')
M_from_xreplace = sm.matrix2numpy(M.xreplace(float_subs), dtype=float)

print('Autowrapping with C')
eval_autowrapped_c = autowrap(M,
                              language='C',
                              backend='cython',
                              args=float_subs.keys())
M_from_autowrap_c = eval_autowrapped_c(*float_subs.values())

print('Comparison of autowrapped C and xreplace')
print(M_from_autowrap_c - M_from_xreplace)

print('Autowrapping with Fortran')
eval_autowrapped_fortran = autowrap(M,
                                    language='F95',
                                    backend='f2py',
                                    args=float_subs.keys())
M_from_autowrap_f = eval_autowrapped_fortran(*float_subs.values())

print('Comparison of autowrapped Fortran and xreplace')
print(M_from_autowrap_f - M_from_xreplace)
@moorepants (Member, Author) commented Oct 26, 2020

evalf(subs=float_subs) does eventually finish; it completed after some hours of computing. Interestingly, xreplace(float_subs) and evalf(subs=float_subs) don't return the same results to machine precision. Here are the results:

moorepants@nandi:bicycle(whipple)$ SYMPY_CACHE_SIZE=5000 ipython
Python 3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08) 
Type 'copyright', 'credits' or 'license' for more information
IPython 7.18.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: run compare_xreplace_and_autowrap.py
Loading the large matrix.
Xreplacing.
Subsing.
Comparison of xreplace and subs
[[ 1.42108547e-014  6.66133815e-016  2.77555756e-017  7.10542736e-015
   3.99680289e-015  0.00000000e+000]
 [-2.63929901e-002  3.21755654e-002 -3.18323442e-004  6.14592598e-002
   1.27230835e-001  3.76451545e-004]
 [-1.22460222e-003  1.49290659e-003 -1.47698155e-005  2.85163393e-003
   5.90335399e-003  1.74668879e-005]
 [-5.90910632e-126  0.00000000e+000  3.46944695e-018 -5.55111512e-017
   2.08166817e-017  0.00000000e+000]
 [-3.77189949e-017  0.00000000e+000 -1.38777878e-017  1.11022302e-016
  -1.11022302e-016  1.38777878e-017]
 [ 1.38777878e-017  0.00000000e+000  0.00000000e+000 -2.77555756e-017
  -1.11022302e-016 -1.47727658e-126]]
Autowrapping with C
Comparison of autowrapped C and xreplace
[[ 0.00000000e+00 -2.88657986e-15 -2.77555756e-17 -1.06581410e-14
  -1.15463195e-14 -1.38777878e-17]
 [ 2.63929901e-02 -3.21755654e-02  3.18323442e-04 -6.14592598e-02
  -1.27230835e-01 -3.76451545e-04]
 [ 1.21555293e-03 -1.48187464e-03  1.46606728e-05 -2.83056156e-03
  -5.85973070e-03 -1.73378150e-05]
 [-1.38777878e-17  0.00000000e+00 -6.93889390e-18  0.00000000e+00
  -1.42247325e-16 -5.55111512e-17]
 [ 1.90819582e-17  0.00000000e+00 -2.77555756e-17  0.00000000e+00
   1.11022302e-16 -1.38777878e-17]
 [-2.77555756e-17  0.00000000e+00 -1.38777878e-17  2.77555756e-17
   0.00000000e+00  2.77555756e-17]]
Autowrapping with Fortran
Comparison of autowrapped Fortran and xreplace
[[ 0.00000000e+00 -2.88657986e-15 -2.77555756e-17 -1.06581410e-14
  -1.15463195e-14 -1.38777878e-17]
 [ 2.63929901e-02 -3.21755654e-02  3.18323442e-04 -6.14592598e-02
  -1.27230835e-01 -3.76451545e-04]
 [ 1.21555293e-03 -1.48187464e-03  1.46606728e-05 -2.83056156e-03
  -5.85973070e-03 -1.73378150e-05]
 [-1.38777878e-17  0.00000000e+00 -6.93889390e-18  0.00000000e+00
  -1.42247325e-16 -5.55111512e-17]
 [ 1.90819582e-17  0.00000000e+00 -2.77555756e-17  0.00000000e+00
   1.11022302e-16 -1.38777878e-17]
 [-2.77555756e-17  0.00000000e+00 -1.38777878e-17  2.77555756e-17
   0.00000000e+00  2.77555756e-17]]
Comparison of autowrapped Fortran and autowrapping C
[[0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]]

In [2]: M_from_subs
Out[2]: 
array([[ 8.11335130e+001,  7.28710462e-001,  1.56113195e-001,
         2.44463494e+001,  2.95440245e+000,  6.92079181e-002],
       [ 1.91518220e+000,  9.73185031e+000,  7.65801142e-003,
         1.79212064e+001,  2.93558364e+001,  2.61750220e-001],
       [ 3.92853788e+000,  3.40955206e-001,  1.96725156e-001,
         2.85774351e+000,  1.41590405e+000,  2.71934957e-002],
       [ 5.90910632e-126, -3.00000000e-001, -2.73277450e-002,
        -3.55097215e-002, -2.06516701e-002,  3.38776131e-001],
       [-4.01714365e-014,  0.00000000e+000, -1.05294641e-001,
         9.73805398e-001,  5.66343720e-001, -8.79245864e-002],
       [ 3.55097215e-002,  0.00000000e+000, -6.37686991e-002,
         1.18182126e-125, -7.92181636e-001,  1.47727658e-126]])

In [3]: M_from_subs - M_from_autowrap_c
Out[3]: 
array([[-1.42108547e-014,  2.22044605e-015,  0.00000000e+000,
         3.55271368e-015,  7.54951657e-015,  1.38777878e-017],
       [ 1.99840144e-015,  3.55271368e-015,  3.03576608e-017,
         3.55271368e-015,  1.06581410e-014, -5.55111512e-017],
       [ 9.04929222e-006, -1.10319480e-005,  1.09142686e-007,
        -2.10723681e-005, -4.36232879e-005, -1.29072910e-007],
       [ 1.38777878e-017,  0.00000000e+000,  3.46944695e-018,
         5.55111512e-017,  1.21430643e-016,  5.55111512e-017],
       [ 1.86370367e-017,  0.00000000e+000,  4.16333634e-017,
        -1.11022302e-016,  0.00000000e+000,  0.00000000e+000],
       [ 1.38777878e-017,  0.00000000e+000,  1.38777878e-017,
         1.18182126e-125,  1.11022302e-016, -2.77555756e-017]])

In [4]: M_from_subs - M_from_autowrap_f
Out[4]: 
array([[-1.42108547e-014,  2.22044605e-015,  0.00000000e+000,
         3.55271368e-015,  7.54951657e-015,  1.38777878e-017],
       [ 1.99840144e-015,  3.55271368e-015,  3.03576608e-017,
         3.55271368e-015,  1.06581410e-014, -5.55111512e-017],
       [ 9.04929222e-006, -1.10319480e-005,  1.09142686e-007,
        -2.10723681e-005, -4.36232879e-005, -1.29072910e-007],
       [ 1.38777878e-017,  0.00000000e+000,  3.46944695e-018,
         5.55111512e-017,  1.21430643e-016,  5.55111512e-017],
       [ 1.86370367e-017,  0.00000000e+000,  4.16333634e-017,
        -1.11022302e-016,  0.00000000e+000,  0.00000000e+000],
       [ 1.38777878e-017,  0.00000000e+000,  1.38777878e-017,
         1.18182126e-125,  1.11022302e-016, -2.77555756e-017]])

The results from autowrap are closer to the results from evalf than they are to xreplace. But the third row of the matrix still has precision loss.

@isuruf (Member) commented Oct 26, 2020

Can you print relative errors instead of absolute errors?
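For reference, a relative-error comparison could look like the sketch below (my own illustration with made-up numbers; `M_ref` and `M_approx` are hypothetical stand-ins for whichever two result matrices are being compared):

```python
import numpy as np

# Hypothetical reference and approximate results, standing in for the
# xreplace/autowrap matrices in this issue.
M_ref = np.array([[1.0, 2.0], [3.0, 4.0]])
M_approx = M_ref + np.array([[1e-16, 2e-3], [0.0, 4e-16]])

# Relative rather than absolute error; the maximum in the denominator
# guards against division by zero for exactly-zero reference entries.
rel_err = np.abs(M_approx - M_ref) / np.maximum(np.abs(M_ref), 1e-300)
print(rel_err)
```

A relative view makes it clear whether the 1e-02-sized absolute differences occur in large entries (harmless) or small ones (serious).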

@isuruf (Member) commented Oct 26, 2020

It's definitely numerical issues.

Here's what I did,

import symengine as se
M2 = se.sympify(M)

rat_subs = {}
for k, v in float_subs.items():
    v = se.Integer(int(v * 10**18))/10**18
    rat_subs[k] = v

M3 = M2.xreplace(rat_subs).applyfunc(lambda x: x.n(100, real=True))

print((M3 - M2.xreplace(float_subs))/M3)

@isuruf (Member) commented Oct 26, 2020

You can increase the number of bits above from 100 to 1000 and symengine can easily do the computation. I would take the values of M3 above as the reference and compare it with others.

@oscarbenjamin (Contributor)

Here's a simpler reproducer of the possible problem:

from sympy import *
import sympy as sm
from sympy.utilities.autowrap import autowrap

rf, rr = sm.symbols('rf rr')
d1, d2, d3 = sm.symbols('d1 d2 d3')
l1, l2, l3, l4 = sm.symbols('l1 l2 l3 l4')
g = sm.symbols('g')
mc, md, me, mf = sm.symbols('mc md me mf')
ic11, ic22, ic33, ic31 = sm.symbols('ic11 ic22 ic33 ic31')
id11, id22 = sm.symbols('id11 id22')
ie11, ie22, ie33, ie31 = sm.symbols('ie11 ie22 ie33 ie31')
if11, if22 = sm.symbols('if11 if22')
q1, q2, q3, q4, q5, q6, q7, q8 = sm.symbols('q1, q2, q3, q4, q5, q6, q7, q8')

float_subs = {
    d1: 0.9534570696121849,
    d2: 0.2676445084476887,
    d3: 0.03207142672761929,
    g: 9.81,
    ic11: 7.178169776497895,
    ic22: 11.0,
    ic31: 3.8225535938357873,
    ic33: 4.821830223502103,
    id11: 0.0603,
    id22: 0.12,
    ie11: 0.05841337700152972,
    ie22: 0.06,
    ie31: 0.009119225261946298,
    ie33: 0.007586622998470264,
    if11: 0.1405,
    if22: 0.28,
    l1: 0.4707271515135145,
    l2: -0.47792881146460797,
    l3: -0.00597083392418685,
    l4: -0.3699518200282974,
    mc: 85.0,
    md: 2.0,
    me: 4.0,
    mf: 3.0,
    q1: 0.0,
    q2: -0.17447337661787718,
    q3: 0.0,
    q4: 0.6206670416476966,
    q5: 0.3300446174593725,
    q6: 0.0,
    q7: -0.2311385135743,
    q8: 0.0,
    rf: 0.35,
    rr: 0.3,
}

expr = 1/(d1*sin(q4)*cos(q5) + (-d2*sin(q7) - rf*((-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*cos(q5)*cos(q7) + (sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))*sin(q7)/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))**2))*(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7)) + (d2*cos(q7) + rf*((-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*cos(q5)*cos(q7) + (sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))*cos(q7)/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))**2))*(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4)) + (d3*cos(q7) + rf*(-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) +
sin(q4)*sin(q7)*cos(q5)**2)*cos(q7)/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))**2))*sin(q4)*cos(q5) - (-d1*sin(q5) - rr +
(d2*sin(q7) + rf*((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7)
+ (sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))*sin(q7)/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))**2))*sin(q7)*cos(q5) + (d2*cos(q7) +
rf*((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))*cos(q7)/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))**2))*cos(q5)*cos(q7) + (-d3*cos(q7) -
rf*(-(-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*sin(q5) +
sin(q4)*sin(q7)*cos(q5)**2)*cos(q7)/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))**2))*sin(q5))*(d1*sin(q4)**2*cos(q5) +
d1*cos(q4)**2*cos(q5) + (d2*(sin(q4)*cos(q7) + sin(q5)*sin(q7)*cos(q4)) +
rf*(sin(q4)*cos(q7) + sin(q5)*sin(q7)*cos(q4))*((-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*cos(q5)*cos(q7) + (sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))**2))*(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4)) + (d3*(sin(q4)*cos(q7) + sin(q5)*sin(q7)*cos(q4)) +
rf*(-(-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*sin(q5) +
sin(q4)*sin(q7)*cos(q5)**2)*(sin(q4)*cos(q7) +
sin(q5)*sin(q7)*cos(q4))/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))**2))*sin(q4)*cos(q5) +
(-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*(-d2*(sin(q4)*sin(q7) -
sin(q5)*cos(q4)*cos(q7)) + d3*cos(q4)*cos(q5) + rf*(-(-sin(q4)*sin(q5)*sin(q7)
+ cos(q4)*cos(q7))*sin(q5) +
sin(q4)*sin(q7)*cos(q5)**2)*cos(q4)*cos(q5)/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) + sin(q7)*cos(q4))*sin(q7)*cos(q5))**2) -
rf*(sin(q4)*sin(q7) - sin(q5)*cos(q4)*cos(q7))*((-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*cos(q5)*cos(q7) + (sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))**2)))/(-d1*sin(q4)*sin(q5) - rr*sin(q4) +
(d2*(sin(q4)*cos(q7) + sin(q5)*sin(q7)*cos(q4)) + rf*(sin(q4)*cos(q7) +
sin(q5)*sin(q7)*cos(q4))*((-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*cos(q5)*cos(q7) + (sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))**2))*cos(q5)*cos(q7) + (-d3*(sin(q4)*cos(q7)
+ sin(q5)*sin(q7)*cos(q4)) - rf*(-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)*(sin(q4)*cos(q7) +
sin(q5)*sin(q7)*cos(q4))/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) + sin(q7)*cos(q4))*sin(q7)*cos(q5))**2))*sin(q5) +
(d2*(sin(q4)*sin(q7) - sin(q5)*cos(q4)*cos(q7)) - d3*cos(q4)*cos(q5) -
rf*(-(-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*sin(q5) +
sin(q4)*sin(q7)*cos(q5)**2)*cos(q4)*cos(q5)/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) + sin(q7)*cos(q4))*sin(q7)*cos(q5))**2) +
rf*(sin(q4)*sin(q7) - sin(q5)*cos(q4)*cos(q7))*((-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*cos(q5)*cos(q7) + (sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))/sqrt((-(-sin(q4)*sin(q5)*sin(q7) +
cos(q4)*cos(q7))*sin(q5) + sin(q4)*sin(q7)*cos(q5)**2)**2 +
((-sin(q4)*sin(q5)*sin(q7) + cos(q4)*cos(q7))*cos(q5)*cos(q7) +
(sin(q4)*sin(q5)*cos(q7) +
sin(q7)*cos(q4))*sin(q7)*cos(q5))**2))*sin(q7)*cos(q5)))


print(expr.xreplace(float_subs))
print(expr.evalf(subs=float_subs))

The output is

1114062987599.38
1115848243936.78

It is of course possible that xreplace is suffering from some cancellation somewhere. The result from evalf should be accurate unless there is a bug.

@moorepants (Member, Author)

@isuruf You say "I would take the values of M3 above as the reference and compare it with others.". Just to be clear, take the values from your symengine calc as the reference?

@isuruf (Member) commented Oct 26, 2020

Just to be clear, take the values from your symengine calc as the reference?

Yes, they are calculated with high enough precision (change 100->1000)

@moorepants (Member, Author)

Ok, I'll give that a try to check things further in the PyDy PR.

I'm still confused by these numerical errors, because this particular problem has been benchmarked numerically against independent algorithms and CAS systems: several different independent formulations, on different machines, match to 64-bit machine precision without any special attention to numerical error. I wonder if my algorithm formulates the symbolics in a way that is more prone to these floating point errors.

@oscarbenjamin (Contributor)

If you look at the expression I just showed, it is Pow(a, -1) where a is around 1e-12. The expression a is an Add whose args are:

In [42]: a = expr.args[0]                                                                                                                                                         

In [43]: type(a)                                                                                                                                                                  
Out[43]: sympy.core.add.Add

In [44]: a.evalf(subs=float_subs)                                                                                                                                                 
Out[44]: 8.96179212033297e-13

In [45]: [arg.evalf(subs=float_subs) for arg in a.args]                                                                                                                           
Out[45]: [-0.00163173331378261, 0.110925590339285, 0.524581526210368, -0.0675316631618426, -0.566343720073132]

Adding these large numbers that cancel to a small number leads to a loss of precision. The evalf routine can overcome this: it understands about catastrophic cancellation so it increases the precision internally until the overall result is computed to the desired precision. That's why increasing the requested precision adds new digits but does not normally change the digits that are also shown at a lower precision:

In [47]: a.evalf(5, subs=float_subs)                                                                                                                                              
Out[47]: 8.9618e-13

In [48]: a.evalf(10, subs=float_subs)                                                                                                                                             
Out[48]: 8.961792120e-13

In [49]: a.evalf(20, subs=float_subs)                                                                                                                                             
Out[49]: 8.9617921203329702100e-13

In [50]: a.evalf(100, subs=float_subs)                                                                                                                                            
Out[50]: 0.0000000000008961792120332970209953859969957426259471149722386434155515462179021971187602703650011160171218489552

Of course evalf is slower, but that's the price paid for this level of accuracy. By contrast, naively evaluating at a fixed precision of p digits will generally give fewer than p correct digits. Simply summing the numbers in a different order can give a different result:

In [51]: nums = [arg.evalf(subs=float_subs) for arg in a.args]                                                                                                                    

In [52]: sum(nums)                                                                                                                                                                
Out[52]: 8.96172025477426e-13

In [53]: sum(nums[::-1])                                                                                                                                                          
Out[53]: 8.96164869743088e-13

Python provides math.fsum to minimise catastrophic cancellation when adding more than two 64-bit floats:

In [55]: import math                                                                                                                                                              

In [56]: math.fsum(nums)                                                                                                                                                          
Out[56]: 8.961926253187036e-13

That's still 0.001% out and it's as close as you can get when evaluating the Add in this form without using higher precision arithmetic to compute its arguments.
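A self-contained illustration of the effect fsum protects against, using made-up numbers rather than the ones from this matrix:

```python
import math

# 1.0 is completely absorbed when added to 1e16 in double precision
# (the spacing between adjacent doubles near 1e16 is 2), so a naive
# left-to-right running sum loses it entirely.
nums = [1e16, 1.0, -1e16]

naive = sum(nums)        # the 1.0 is rounded away: result is 0.0
exact = math.fsum(nums)  # fsum tracks the lost low-order bits: 1.0
print(naive, exact)
```

fsum returns the correctly rounded sum of the given doubles; what it cannot fix, as noted above, is error already baked into the arguments themselves.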

It is often possible to perform an algebraic rearrangement that will avoid catastrophic cancellation. A classic demonstration is a transform like replacing 1 - cos(x)**2 with sin(x)**2. When x is close to zero sin(x)**2 can be computed much more accurately:

In [107]: rep = {x: 1e-6}                                                                                                                                                         

In [108]: (1 - cos(x)**2).subs(rep)                                                                                                                                               
Out[108]: 1.00008890058234e-12

In [109]: (sin(x)**2).subs(rep)
Out[109]: 9.99999999999667e-13

In [110]: (1 - cos(x)**2).evalf(subs=rep)                                                                                                                                         
Out[110]: 9.99999999999667e-13

In [111]: (sin(x)**2).evalf(subs=rep)
Out[111]: 9.99999999999667e-13

There are algorithms for optimising the form of an expression for numerical accuracy but they aren't implemented in sympy. It is still possible that you could take a different approach for generating the equations to reduce these effects.
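As a sketch of such a rearrangement (my own toy example, not drawn from this matrix): sqrt(x + 1) - sqrt(x) cancels catastrophically for large x, but multiplying through by the conjugate gives an algebraically identical form with no subtraction of nearly equal values:

```python
import math

x = 1e12  # large enough that sqrt(x + 1) and sqrt(x) agree to ~9 digits

# Direct form: subtracting two nearly equal ~1e6 values leaves only a
# handful of correct bits in the ~5e-7 difference.
naive = math.sqrt(x + 1) - math.sqrt(x)

# Conjugate form: 1 / (sqrt(x + 1) + sqrt(x)) is mathematically equal
# but involves only an addition and a division, so no cancellation.
stable = 1.0 / (math.sqrt(x + 1) + math.sqrt(x))

print(naive, stable)
```

Here `stable` is accurate to full double precision while `naive` is off in roughly the fifth significant digit; the same idea (conjugates, trig identities, factoring) is what an accuracy-oriented rewriting of the bicycle expressions would need to do.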

I expect that it's also possible to get these expressions into a simpler form in general. A perhaps unrelated point but are you generating these with the Cholesky decomposition? You might find that the LDL decomposition gives something that simplifies better because it avoids the square roots.

@moorepants (Member, Author)

Thanks @oscarbenjamin, very helpful. I've got some ideas about the derivation of the expressions that may be helpful now. I can use these methods for examining it carefully.

I can see that using evalf and @isuruf's manual evaluation both result in higher precision, which I will use, but it still seems like xreplace() should at least do the same thing as autowrap() for this particular issue. Shouldn't xreplace() be executing float operations with Python floats?

@isuruf (Member) commented Oct 27, 2020

Shouldn't xreplace() be executing float operations with python floats?

It depends. As @oscarbenjamin said, the order of operations matters. The C/Fortran code generator might be generating code that changes the operation order (SymPy doesn't care about the order of operations), and therefore the results might change.
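The dependence on operation order can be seen with just three floats (a minimal sketch): floating point addition is not associative, so any reordering by a code generator can legitimately change a result's last bits, or more when there is cancellation:

```python
a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c   # 0.0 + 1.0  -> 1.0
right = a + (b + c)  # b + c rounds back to -1e16, so a + (-1e16) -> 0.0

print(left, right)  # 1.0 0.0
```

Both groupings are valid evaluations of a + b + c; IEEE 754 only guarantees that each individual operation is correctly rounded, not that reassociated sums agree.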

@moorepants (Member, Author)

If I use the symengine computed values with the higher precision, I get the expected results from the model. That bug has been hanging in that PR in PyDy for like 5 years. So, my dynamics model formulation is correct, i.e. symbolic results are correct, but they are apparently in a form that is susceptible to floating point arithmetic errors, as Oscar showed. This really narrowed things down. I was looking in all the wrong places because this model has been formulated with numerous symbolic systems now and I've never seen any floating point issues associated with it. Thanks so much for the help. Now at least I know where to focus attention to getting the floating point errors resolved.

@isuruf I can't find any documentation for symengine and thus don't understand completely what your code is doing. For the .n() function what is the first argument? number of decimal digits or number of bits as you mentioned above? And why do you choose 18 when converting the floats to ratios of integers?

Thanks for the help.

@moorepants (Member, Author)

I expect that it's also possible to get these expressions into a simpler form in general. A perhaps unrelated point but are you generating these with the Cholesky decomposition? You might find that the LDL decomposition gives something that simplifies better because it avoids the square roots.

The square roots present come from calculating magnitudes of 3D vectors. Solving a linear system isn't involved in getting the provided matrix.

@isuruf (Member) commented Oct 30, 2020

For the .n() function what is the first argument? number of decimal digits or number of bits as you mentioned above?

Number of bits

And why do you choose 18 when converting the floats to ratios of integers?

Should have been 17. https://stackoverflow.com/questions/6118231/why-do-i-need-17-significant-digits-and-not-16-to-represent-a-double/

@isuruf (Member) commented Oct 30, 2020

To be precise, you can do se.Integer(int(v * 2**52))/2**52, where 52 is the number of explicit mantissa bits in an IEEE double.
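As an aside (my own note, not part of the thread's original suggestion): in Python the digit/bit count can be avoided entirely, because `fractions.Fraction` converts a float to the exact binary rational the double stores, and SymPy's `Rational` behaves similarly for floats. A sketch:

```python
from fractions import Fraction

v = 0.3300446174593725  # the q5 value from this issue
exact = Fraction(v)     # the exact value of the double; no precision choice

# The conversion is exact, so it round-trips to the identical double,
# and the denominator in lowest terms is always a power of two.
assert float(exact) == v
print(exact)
```

Feeding such exact rationals into a high-precision evaluation sidesteps any question of whether 17 digits or 52 bits suffice.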

@moorepants (Member, Author)

Should I leave this issue open to address whether xreplace should have the same order of operations as autowrap? Otherwise, I can close.

@oscarbenjamin (Contributor)

I don't know anything about autowrap so I can't judge whether this is an issue or not. In general though xreplace should be expected to be subject to the kinds of numerical problems mentioned above.

@moorepants (Member, Author)

autowrap, ufuncify, and lambdify (using modules='numpy') all effectively do the same thing: generate code and evaluate with floats. So you can ask the same question for lambdify: should xreplace use the same order of operations for floating point evaluation as lambdify?
