test3D fails on several arches #1674

Open
giallu opened this issue Nov 24, 2017 · 10 comments
Comments

@giallu
Member

giallu commented Nov 24, 2017

Currently, test3D fails on the ppc64, ppc64le, aarch64 and s390x arches.
The log from the test is:
test 65
Start 65: test3D

65: Test command: /home/fedora/giallu/rdkit/Code/GraphMol/Descriptors/test3D
65: Test timeout computed to be: 9.99988e+06
65: [11:17:31] -------------------------------------
65: [11:17:31] Basic PMI tests.
65: [11:17:31] done
65: [11:17:31] -------------------------------------
65: [11:17:31] More PMI/NPR tests.
65: [11:17:31] done
65: [11:17:31] -------------------------------------
65: [11:17:31] Basic NPR tests.
65: [11:17:31] done
65: [11:17:31] -------------------------------------
65: [11:17:31] PMI edge cases.
65: [11:17:31] done
65: [11:17:31] -------------------------------------
65: [11:17:31] NPR edge cases.
65: [11:17:31] done
65: [11:17:31] -------------------------------------
65: [11:17:31] 3D descriptors.
65: [11:17:31] done
65: [11:17:31] -------------------------------------
65: [11:17:31] 3D descriptor edge cases.
65: [11:17:31]
65:
65: ****
65: Test Assert
65: Expression Failed:
65: Violation occurred on line 456 in file /home/fedora/giallu/rdkit/Code/GraphMol/Descriptors/test3D.cpp
65: Failed Expression: fabs(val) < 1e-4
65: ****
65:
65: terminate called after throwing an instance of 'Invar::Invariant'
65: what(): Test Assert
1/1 Test #65: test3D ...........................***Exception: Other 0.70 sec

0% tests passed, 1 tests failed out of 1

Total Test time (real) = 0.72 sec

The following tests FAILED:
65 - test3D (OTHER_FAULT)

@giallu
Member Author

giallu commented Nov 24, 2017

This is actually pretty weird, since a debug session does not reveal anything suspicious:

Breakpoint 1, test3DEdges () at /home/fedora/giallu/rdkit/Code/GraphMol/Descriptors/test3D.cpp:455
455         val = RDKit::Descriptors::eccentricity(m);
(gdb) step
RDKit::Descriptors::eccentricity (mol=..., confId=-1, useAtomicMasses=true)
    at /home/fedora/giallu/rdkit/Code/GraphMol/Descriptors/PMI.cpp:184
184       PRECONDITION(mol.getNumConformers() >= 1, "molecule has no conformers");
(gdb) next
183     double eccentricity(const ROMol& mol, int confId, bool useAtomicMasses) {
(gdb) next
184       PRECONDITION(mol.getNumConformers() >= 1, "molecule has no conformers");
(gdb) next
186       if (!getMoments(mol, confId, useAtomicMasses, pm1, pm2, pm3)) {
(gdb) next
190       if (pm3 < 1e-4) {
(gdb) next
194         return sqrt(pm3 * pm3 - pm1 * pm1) / pm3;
(gdb) print pm1
$6 = 4.032
(gdb) print pm2
$7 = 4.032
(gdb) print pm3
$8 = 4.032
(gdb) print sqrt(pm3 * pm3 - pm1 * pm1) / pm3
$9 = 0
(gdb) next
196     }
(gdb) next
test3DEdges () at /home/fedora/giallu/rdkit/Code/GraphMol/Descriptors/test3D.cpp:456
456         TEST_ASSERT(fabs(val) < 1e-4);
(gdb) print (fabs(val) < 1e-4)
$10 = true
(gdb) next
[12:08:19] 

****
Test Assert
Expression Failed: 
Violation occurred on line 456 in file /home/fedora/giallu/rdkit/Code/GraphMol/Descriptors/test3D.cpp
Failed Expression: fabs(val) < 1e-4
****

terminate called after throwing an instance of 'Invar::Invariant'
  what():  Test Assert

Program received signal SIGABRT, Aborted.

Maybe the problem lies with the TEST_ASSERT macro? If so, I can't see how this could be specific to some platforms and not others.

@Clyde-fare

Hitting the same problem on PPC64le

@greglandrum
Member

Two suggestions to try here (I don't have access to a machine that has the problem, so I can't do this myself):

  1. try adding:
std::cerr<<fabs(val)<<std::endl;

just above line 456 and replacing the 1e-4 on line 456 with 1e-3 (or 1e-2 or whatever).

  2. try changing fabs to std::abs. That shouldn't make any difference...
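
A note on the first suggestion: gdb printed 4.032 for all three moments, but default formatting can hide last-bit differences and does not make a NaN stand out. The following is a minimal sketch of a diagnostic print that shows full precision, the exact bit pattern, and a NaN flag. The helper name `describe_value` is hypothetical and is not part of RDKit.

```cpp
#include <cassert>
#include <cmath>
#include <iomanip>
#include <ios>
#include <limits>
#include <sstream>
#include <string>

// Hypothetical helper (not an RDKit function): formats a double so that
// last-bit differences and NaN values are visible in the test log.
std::string describe_value(const char* name, double v) {
  assert(name != nullptr);
  std::ostringstream out;
  out << name << " = "
      // max_digits10 significant digits are enough to round-trip a double
      << std::setprecision(std::numeric_limits<double>::max_digits10) << v
      // hexfloat shows the exact binary value, with no decimal rounding
      << " (hex " << std::hexfloat << v << std::defaultfloat << ")"
      // prints 1 for NaN, 0 otherwise
      << " isnan=" << std::isnan(v);
  return out.str();
}
```

Writing `std::cerr << describe_value("val", val) << std::endl;` just above line 456, and the same for pm1 and pm3 inside eccentricity, would show whether the moments really are bit-identical on the failing platforms.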

@smoors

smoors commented Mar 13, 2020

I get the same failure on a Skylake node when setting -march=native.
The fix suggested by @greglandrum does not work, as fabs(val) returns nan.
With -mavx2, all tests pass.
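
The nan result explains why loosening the tolerance cannot help: every ordered comparison involving NaN is false, so `fabs(val) < tol` fails for any `tol`. The sketch below shows that behaviour together with a NaN-aware check that separates the two failure modes. The names `passes_tolerance`, `Check` and `classify` are hypothetical; the real test uses TEST_ASSERT directly.

```cpp
#include <cassert>
#include <cmath>

// Same shape as the check on test3D.cpp line 456: true only when |val| < tol.
// Hypothetical helper name, used here for illustration only.
bool passes_tolerance(double val, double tol) {
  assert(tol > 0.0);
  // If val is NaN, fabs(val) is NaN and the comparison is false for every tol.
  return std::fabs(val) < tol;
}

// A NaN-aware variant that reports "not a number" separately from "too large".
enum class Check { Ok, TooLarge, NotANumber };

Check classify(double val, double tol) {
  if (std::isnan(val)) return Check::NotANumber;
  return std::fabs(val) < tol ? Check::Ok : Check::TooLarge;
}
```

With a check like `classify`, the test log would distinguish a NaN from a value that is merely above the threshold.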

@bp-kelley
Contributor

@smoors what is the compiler you are using? There was a reported clang bug where it couldn't pick up the right arch.

@smoors

smoors commented Mar 13, 2020

@bp-kelley GCC-8.3.0 on CentOS 7.7
for completeness, these are all the flags I'm using:

-O2 -ftree-vectorize -march=native -fno-math-errno

@greglandrum
Member

Just to be sure: do you have the same problem when you just use -O2 as the flags (i.e. skip all the architecture-dependent stuff)?

@greglandrum
Member

greglandrum commented Mar 16, 2020

FWIW: my new linux box has a Coffee Lake CPU and I see the error when I compile with:

-O3 -march=native -DNDEBUG 

but I do not see it with

-O3 -DNDEBUG 

Since I can reproduce things now, it means that I at least have a chance to try and track this down/fix it.
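
One hypothesis that fits the reports so far, though nothing in this thread confirms it, is fused multiply-add contraction. ppc64, ppc64le, aarch64 and s390x all have hardware FMA, and -march=native on Skylake or Coffee Lake enables FMA while -mavx2 on its own does not. If the compiler contracts `pm3 * pm3 - pm1 * pm1` (PMI.cpp line 194) into a single FMA, only one of the two squares is rounded. For pm1 == pm3 the result is then the rounding error of the square rather than exactly 0, and if that error is negative, sqrt returns NaN. The sketch below emulates the contraction explicitly with `std::fma` and shows one possible guard. `fused_difference` and `guarded_eccentricity` are hypothetical names, and the guard is an assumption about a possible fix, not the change made in RDKit.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Emulates what a contracting compiler may emit for pm3*pm3 - pm1*pm1:
// pm1*pm1 is rounded to double first, pm3*pm3 is kept exact inside the FMA.
// For pm1 == pm3 this returns the rounding error of the square, which may be
// zero, slightly positive, or slightly negative.
double fused_difference(double pm1, double pm3) {
  const double rounded_sq = pm1 * pm1;
  return std::fma(pm3, pm3, -rounded_sq);
}

// Hypothetical guarded variant of the expression on PMI.cpp line 194:
// clamp the difference at zero so sqrt never sees a negative argument.
double guarded_eccentricity(double pm1, double pm3) {
  assert(pm3 > 0.0);
  const double diff = std::max(0.0, pm3 * pm3 - pm1 * pm1);
  return std::sqrt(diff) / pm3;
}
```

If this hypothesis holds, building with `-ffp-contract=off` on an affected machine should also make the test pass, which would be a quick way to confirm or rule it out.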

@e-kwsm
Contributor

e-kwsm commented Mar 16, 2020

GCC 8.3.0 says -march=avx2 is invalid:

$ g++ -march=avx2 -xc++ /dev/null
cc1plus: error: bad value ('avx2') for '-march=' switch
cc1plus: note: valid arguments to '-march=' switch are: nocona core2 nehalem corei7 westmere sandybridge corei7-avx ivybridge core-avx-i haswell core-avx2 broadwell skylake skylake-avx512 cannonlake icelake-client icelake-server bonnell atom silvermont slm knl knm x86-64 eden-x2 nano nano-1000 nano-2000 nano-3000 nano-x2 eden-x4 nano-x4 k8 k8-sse3 opteron opteron-sse3 athlon64 athlon64-sse3 athlon-fx amdfam10 barcelona bdver1 bdver2 bdver3 bdver4 znver1 btver1 btver2 native

g++ -c -Q -march=native --help=target prints details (https://wiki.gentoo.org/wiki/GCC_optimization#-march) but there are many options...

@smoors

smoors commented Mar 16, 2020

@e-kwsm my bad, I actually used -mavx2, sorry for the confusion.


6 participants