TestNan and TestInf fail on -O3 #3944

Closed
DeannaGelbart opened this issue Feb 19, 2020 · 7 comments

@DeannaGelbart

To reproduce:



cd src

CXXFLAGS='-g -O3' ./configure 
make clean; make depend
cd util
make 
make test

Result:


Running text-utils-test .../bin/bash: line 1: 33755 Segmentation fault      (core dumped) ./$x > $x.testlog 2>&1
 0s... FAIL text-utils-test

Testlog is empty:

 ll text-utils-test.testlog
-rw-r--r-- 1 dee dee 0 Feb 19 12:21 text-utils-test.testlog




gdb output:



$ gdb text-utils-test                                                                                                                              
(gdb) r
Starting program: /mnt/disk1/dee/github/kaldi/src/util/text-utils-test
warning: Error disabling address space randomization: Operation not permitted
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Program received signal SIGSEGV, Segmentation fault.
0x0000562a8de396b5 in std::char_traits<char>::copy (__n=4, __s2=0x7ffca9c92d30 "-nan", __s1=<optimized out>) at /usr/include/c++/7/bits/char_traits.h:350
350             return static_cast<char_type*>(__builtin_memcpy(__s1, __s2, __n));
(gdb) thread apply all bt
Thread 1 (Thread 0x7f3f5d1da740 (LWP 34589)):
#0  0x0000562a8de396b5 in std::char_traits<char>::copy (__n=4, __s2=0x7ffca9c92d30 "-nan", __s1=<optimized out>) at /usr/include/c++/7/bits/char_traits.h:350
#1  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_copy (__n=4, __s=0x7ffca9c92d30 "-nan", __d=<optimized out>) at /usr/include/c++/7/bits/basic_string.h:340
#2  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_S_copy_chars (__k2=0x7ffca9c92d34 "", __k1=0x7ffca9c92d30 "-nan", __p=<optimized out>)
    at /usr/include/c++/7/bits/basic_string.h:382
#3  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*> (__end=0x7ffca9c92d34 "", __beg=0x7ffca9c92d30 "-nan", this=0x7ffca9c92fc0)
    at /usr/include/c++/7/bits/basic_string.tcc:225
#4  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct_aux<char*> (__end=0x7ffca9c92d34 "", __beg=0x7ffca9c92d30 "-nan", this=0x7ffca9c92fc0)
    at /usr/include/c++/7/bits/basic_string.h:236
#5  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::_M_construct<char*> (__end=0x7ffca9c92d34 "", __beg=0x7ffca9c92d30 "-nan", this=0x7ffca9c92fc0)
    at /usr/include/c++/7/bits/basic_string.h:255
#6  std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string<char*, void> (__a=..., __end=0x7ffca9c92d34 "", __beg=0x7ffca9c92d30 "-nan", this=0x7ffca9c92fc0)
    at /usr/include/c++/7/bits/basic_string.h:607
#7  __gnu_cxx::__to_xstring<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, char> (__convf=<optimized out>, __n=328, __fmt=0x562a8de54904 "%f", __fmt=0x562a8de54904 "%f",
    __n=328, __convf=<optimized out>) at /usr/include/c++/7/ext/string_conversions.h:115
#8  0x0000562a8de4414c in std:: (__val=-nan(0x8000000000000)) at /usr/include/c++/7/bits/basic_string.h:6462
#9  kaldi::TestNan<float> () at text-utils-test.cc:238
#10 0x00007ffca9c92de0 in ?? ()
#11 0x00007ffca9c92e58 in ?? ()
#12 0x00007ffca9c92d00 in ?? ()
#13 0x00007f3f5ca00b50 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#14 0x00007f3f5c73c501 in std::ios_base::~ios_base() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#15 0x0000562a8de508cf in std::basic_ios<char, std::char_traits<char> >::~basic_ios (this=0x7ffca9c930f8, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/basic_ios.h:282
#16 std::__cxx11::basic_istringstream<char, std::char_traits<char>, std::allocator<char> >::~basic_istringstream (this=0x7ffca9c92de0, __in_chrg=<optimized out>, __vtt_parm=<optimized out>)
    at /usr/include/c++/7/sstream:446
#17 kaldi::ConvertStringToReal<double> (str=..., out=0x7ffca9c92fb8) at text-utils.cc:240
#18 0xfff8000000000000 in ?? ()
#19 0x007920302e312078 in ?? ()
#20 0x000031342e362079 in ?? ()
#21 0x00007ffca9c92fd0 in ?? ()
#22 0x0000562a0000000a in ?? ()
#23 0x00007f006e616e2d in ?? ()
#24 0x0000000000000000 in ?? ()
(gdb) quit 


valgrind output:


$  valgrind ./text-utils-test
==34652== Memcheck, a memory error detector
==34652== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==34652== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==34652== Command: ./text-utils-test
==34652==
==34652== Invalid write of size 8
==34652==    at 0x10E6B5: copy (char_traits.h:350)
==34652==    by 0x10E6B5: _S_copy (basic_string.h:340)
==34652==    by 0x10E6B5: _S_copy_chars (basic_string.h:382)
==34652==    by 0x10E6B5: _M_construct<char*> (basic_string.tcc:225)
==34652==    by 0x10E6B5: _M_construct_aux<char*> (basic_string.h:236)
==34652==    by 0x10E6B5: _M_construct<char*> (basic_string.h:255)
==34652==    by 0x10E6B5: basic_string<char*> (basic_string.h:607)
==34652==    by 0x10E6B5: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > __gnu_cxx::__to_xstring<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, char>(int (*)(char*, unsigned long, char const*, __va_list_tag*), unsigned long, char const*, ...) [clone .constprop.100] (string_conversions.h:115)
==34652==    by 0x11914B: to_string (basic_string.h:6462)
==34652==    by 0x11914B: void kaldi::TestNan<float>() (text-utils-test.cc:238)
==34652==  Address 0x1fff001000 is not stack'd, malloc'd or (recently) free'd
==34652==
==34652==
==34652== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==34652==  Access not within mapped region at address 0x1FFF001000
==34652==    at 0x10E6B5: copy (char_traits.h:350)
==34652==    by 0x10E6B5: _S_copy (basic_string.h:340)
==34652==    by 0x10E6B5: _S_copy_chars (basic_string.h:382)
==34652==    by 0x10E6B5: _M_construct<char*> (basic_string.tcc:225)
==34652==    by 0x10E6B5: _M_construct_aux<char*> (basic_string.h:236)
==34652==    by 0x10E6B5: _M_construct<char*> (basic_string.h:255)
==34652==    by 0x10E6B5: basic_string<char*> (basic_string.h:607)
==34652==    by 0x10E6B5: std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > __gnu_cxx::__to_xstring<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, char>(int (*)(char*, unsigned long, char const*, __va_list_tag*), unsigned long, char const*, ...) [clone .constprop.100] (string_conversions.h:115)
==34652==    by 0x11914B: to_string (basic_string.h:6462)
==34652==    by 0x11914B: void kaldi::TestNan<float>() (text-utils-test.cc:238)
==34652==  If you believe this happened as a result of a stack
==34652==  overflow in your program's main thread (unlikely but
==34652==  possible), you can try to increase the size of the
==34652==  main thread stack using the --main-stacksize= flag.
==34652==  The main thread stack size used in this run was 8388608.
==34652==
==34652== HEAP SUMMARY:
==34652==     in use at exit: 0 bytes in 0 blocks
==34652==   total heap usage: 2,092 allocs, 2,092 frees, 203,287 bytes allocated
==34652==
==34652== All heap blocks were freed -- no leaks are possible
==34652==
==34652== For counts of detected and suppressed errors, rerun with: -v
==34652== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
@danpovey
Contributor

danpovey commented Feb 20, 2020 via email

@DeannaGelbart
Author

The error still happens if I configure with CXXFLAGS='-g -O3 -fno-fast-math' ./configure, which results in compilations like this:

g++ -std=c++11 -I.. -isystem /mnt/disk1/dee/git/kaldi/tools/openfst-1.6.7/include -O1  -Wall -Wno-sign-compare -Wno-unused-local-typedefs -Wno-deprecated-declarations -Winit-self -DKALDI_DOUBLEPRECISION=0 -DHAVE_EXECINFO_H=1 -DHAVE_CXXABI_H -DHAVE_MKL -I/opt/intel/mkl/include -m64 -msse -msse2 -pthread -g -DHAVE_CUDA -I/usr/local/cuda/include -fPIC -pthread -isystem /mnt/disk1/dee/git/kaldi/tools/openfst-1.6.7/include -g -O3 -fno-fast-math   -c -o text-utils-test.o text-utils-test.cc

@danpovey
Contributor

I assume either you failed to do make clean or -fno-fast-math is somehow failing to override -O3.

@DeannaGelbart
Author

I did the clean. Should I take the test failure as a warning not to use kaldi on this optimization setting? Thanks.

@danpovey
Contributor

danpovey commented Feb 20, 2020 via email

@entn-at
Contributor

entn-at commented Feb 20, 2020

Oddly enough, I'm using -O3 without problems:

Running const-integer-set-test ... 0s... SUCCESS const-integer-set-test
Running stl-utils-test ... 1s... SUCCESS stl-utils-test
Running text-utils-test ... 0s... SUCCESS text-utils-test
Running edit-distance-test ... 0s... SUCCESS edit-distance-test
Running hash-list-test ... 0s... SUCCESS hash-list-test
Running kaldi-io-test ... 2s... SUCCESS kaldi-io-test
Running parse-options-test ... 0s... SUCCESS parse-options-test
Running kaldi-table-test ... 2s... SUCCESS kaldi-table-test
Running simple-options-test ... 0s... SUCCESS simple-options-test
Running kaldi-thread-test ... 0s... SUCCESS kaldi-thread-test

I think the problematic options, such as -ffast-math, get enabled when using -Ofast (https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html).

@jtrmal
Contributor

jtrmal commented Feb 20, 2020 via email
