dataframe_performance stack overflow #68

Closed
young66 opened this issue May 11, 2020 · 7 comments


young66 commented May 11, 2020

When I run the test sample ./dataframe_performance, it crashes.
The call stack looks like this:

Program received signal SIGSEGV, Segmentation fault.
hmdf::DateTime::maketime_ (this=<error reading variable: Cannot access memory at address 0xffffff7ffff8>, ltime=<error reading variable: Cannot access memory at address 0xffffff7ffff0>)
at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:973
973 DateTime::EpochType DateTime::maketime_ (struct tm &ltime) const noexcept {
#0 hmdf::DateTime::maketime_ (this=<error reading variable: Cannot access memory at address 0xffffff7ffff8>,
ltime=<error reading variable: Cannot access memory at address 0xffffff7ffff0>) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:973
#1 0x000000000041f3a4 in hmdf::DateTime::time (this=0xffffffffe5c8) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:662
#2 0x000000000041f260 in hmdf::DateTime::sec (this=0xffffffffe5c8) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:615
#3 0x00000000004200a4 in hmdf::DateTime::maketime_ (this=0xffffffffe5c8, ltime=...) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:975
#4 0x000000000041f3a4 in hmdf::DateTime::time (this=0xffffffffe5c8) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:662
#5 0x000000000041f260 in hmdf::DateTime::sec (this=0xffffffffe5c8) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:615
#6 0x00000000004200a4 in hmdf::DateTime::maketime_ (this=0xffffffffe5c8, ltime=...) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:975
................repeat.................
#130963 0x000000000041f3a4 in hmdf::DateTime::time (this=0xffffffffe5c8) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:662
#130964 0x000000000041e95c in hmdf::DateTime::compare (this=0xffffffffe5c8, rhs=...) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:348
#130965 0x0000000000405870 in hmdf::operator< (lhs=..., rhs=...) at /home/user/dataframe/DataFrame-1.9.0/include/DataFrame/Utils/DateTime.h:343
#130966 0x00000000004074e8 in hmdf::DataFrame<long, hmdf::HeteroVector>::gen_datetime_index (start_datetime=0x420c58 "01/01/1970", end_datetime=0x420c48 "08/15/2019",
t_freq=hmdf::time_frequency::secondly, increment=1, tz=hmdf::DT_TIME_ZONE::LOCAL) at /home/user/dataframe/DataFrame-1.9.0/include/DataFrame/Internals/DataFrame_set.tcc:250
#130967 0x0000000000403fd8 in main (argc=1, argv=0xffffffffe988) at /home/user/dataframe/DataFrame-1.9.0/test/dataframe_performance.cc:45

(gdb) f 0
#0 hmdf::DateTime::maketime_ (this=<error reading variable: Cannot access memory at address 0xffffff7ffff8>,
ltime=<error reading variable: Cannot access memory at address 0xffffff7ffff0>) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:973
973 DateTime::EpochType DateTime::maketime_ (struct tm &ltime) const noexcept {
(gdb) f 1
#1 0x000000000041f3a4 in hmdf::DateTime::time (this=0xffffffffe5c8) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:662
662 const_cast<DateTime *>(this)->time_ = maketime_ (ltime);
(gdb) f 2
#2 0x000000000041f260 in hmdf::DateTime::sec (this=0xffffffffe5c8) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:615
615 const_cast<DateTime *>(this)->breaktime_ (this->time (), nanosec ());
(gdb) f 3
#3 0x00000000004200a4 in hmdf::DateTime::maketime_ (this=0xffffffffe5c8, ltime=...) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:975
975 ltime.tm_sec = sec ();
(gdb) f 4
#4 0x000000000041f3a4 in hmdf::DateTime::time (this=0xffffffffe5c8) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:662
662 const_cast<DateTime *>(this)->time_ = maketime_ (ltime);
(gdb) f 130966
#130966 0x00000000004074e8 in hmdf::DataFrame<long, hmdf::HeteroVector>::gen_datetime_index (start_datetime=0x420c58 "01/01/1970", end_datetime=0x420c48 "08/15/2019",
t_freq=hmdf::time_frequency::secondly, increment=1, tz=hmdf::DT_TIME_ZONE::LOCAL) at /home/user/dataframe/DataFrame-1.9.0/include/DataFrame/Internals/DataFrame_set.tcc:250
250 while (start_di < end_di)
(gdb) l
245 break;
246 default:
247 throw NotFeasible ("ERROR: gen_datetime_index()");
248 }
249
250 while (start_di < end_di)
251 generate_ts_index(index_vec, start_di, t_freq, increment);
252
253 return (index_vec);
254 }
(gdb) f 130965
#130965 0x0000000000405870 in hmdf::operator< (lhs=..., rhs=...) at /home/user/dataframe/DataFrame-1.9.0/include/DataFrame/Utils/DateTime.h:343
343 return (lhs.compare (rhs) < 0);
(gdb) f 130964
#130964 0x000000000041e95c in hmdf::DateTime::compare (this=0xffffffffe5c8, rhs=...) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:348
348 const EpochType t = this->time() - rhs.time();
(gdb) f 130963
#130963 0x000000000041f3a4 in hmdf::DateTime::time (this=0xffffffffe5c8) at /home/user/dataframe/DataFrame-1.9.0/src/Utils/DateTime.cc:662
662 const_cast<DateTime *>(this)->time_ = maketime_ (ltime);
(gdb)

=========================
Some information about my server:

gcc -v

Using built-in specs.
COLLECT_GCC=gcc
COLLECT_LTO_WRAPPER=/usr/libexec/gcc/aarch64-linux-gnu/7.3.0/lto-wrapper
Target: aarch64-linux-gnu
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --enable-shared --enable-threads=posix --enable-checking=release -with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-linker-hash-style=gnu --enable-languages=c,c++,objc,obj-c++,fortran,lto --enable-plugin --enable-initfini-array --disable-libgcj --without-isl --without-cloog --enable-gnu-indirect-function --build=aarch64-linux-gnu --with-stage1-ldflags=' -Wl,-z,relro,-z,now' --with-boot-ldflags=' -Wl,-z,relro,-z,now' --with-multilib-list=lp64
Thread model: posix
gcc version 7.3.0 (GCC)

uname -a

Linux arm1 4.19.36-vhulk1907.1.0.h619.eulerosv2r8.aarch64 #1 SMP Mon Jul 22 00:00:00 UTC 2019 aarch64 aarch64 aarch64 GNU/Linux

ulimit -a

core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 2054314
max locked memory (kbytes, -l) 2097152
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 2054314
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited

cat /proc/cpuinfo

processor : 0
BogoMIPS : 200.00
cpu MHz : 2600.000
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
CPU implementer : 0x48
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd01
CPU revision : 0
........repeat......
processor : 95
BogoMIPS : 200.00
cpu MHz : 2600.000
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
CPU implementer : 0x48
CPU architecture: 8
CPU variant : 0x1
CPU part : 0xd01
CPU revision : 0

free -g

total used free shared buff/cache available
Mem: 501 279 217 0 4 215
Swap: 0 0 0

@hosseinmoein
Owner

The stack does not make sense to me. It points to blank lines in the source code, and I cannot reproduce the crash. I assume you are running this off of master.

But in general, dataframe_performance has a very large memory footprint. It is quite possible that you are running out of memory and things get messed up. Change dataframe_performance to use half of the interval it currently uses and see how that runs.

@young66
Author

young66 commented May 12, 2020

I don't think this is an out-of-memory problem: my server has 512GB of memory in total, with 217GB free.
The call stack is 130966 frames deep, so I think it ran into a recursive loop.
I have updated the code from 1.9.0 to the latest, and the stack is the same:

#130947 0x00000000004200a8 in hmdf::DateTime::maketime_ (this=0xffffffffe5c8, ltime=...) at DataFrame/src/Utils/DateTime.cc:1012
#130948 0x000000000041f3a8 in hmdf::DateTime::time (this=0xffffffffe5c8) at DataFrame/src/Utils/DateTime.cc:699
#130949 0x000000000041f264 in hmdf::DateTime::sec (this=0xffffffffe5c8) at DataFrame/src/Utils/DateTime.cc:652
#130950 0x00000000004200a8 in hmdf::DateTime::maketime_ (this=0xffffffffe5c8, ltime=...) at DataFrame/src/Utils/DateTime.cc:1012
#130951 0x000000000041f3a8 in hmdf::DateTime::time (this=0xffffffffe5c8) at DataFrame/src/Utils/DateTime.cc:699
#130952 0x000000000041f264 in hmdf::DateTime::sec (this=0xffffffffe5c8) at DataFrame/src/Utils/DateTime.cc:652
#130953 0x00000000004200a8 in hmdf::DateTime::maketime_ (this=0xffffffffe5c8, ltime=...) at DataFrame/src/Utils/DateTime.cc:1012
#130954 0x000000000041f3a8 in hmdf::DateTime::time (this=0xffffffffe5c8) at DataFrame/src/Utils/DateTime.cc:699
#130955 0x000000000041f264 in hmdf::DateTime::sec (this=0xffffffffe5c8) at DataFrame/src/Utils/DateTime.cc:652
#130956 0x00000000004200a8 in hmdf::DateTime::maketime_ (this=0xffffffffe5c8, ltime=...) at DataFrame/src/Utils/DateTime.cc:1012
#130957 0x000000000041f3a8 in hmdf::DateTime::time (this=0xffffffffe5c8) at DataFrame/src/Utils/DateTime.cc:699
#130958 0x000000000041f264 in hmdf::DateTime::sec (this=0xffffffffe5c8) at DataFrame/src/Utils/DateTime.cc:652
#130959 0x00000000004200a8 in hmdf::DateTime::maketime_ (this=0xffffffffe5c8, ltime=...) at DataFrame/src/Utils/DateTime.cc:1012
#130960 0x000000000041f3a8 in hmdf::DateTime::time (this=0xffffffffe5c8) at DataFrame/src/Utils/DateTime.cc:699
#130961 0x000000000041f264 in hmdf::DateTime::sec (this=0xffffffffe5c8) at DataFrame/src/Utils/DateTime.cc:652
#130962 0x00000000004200a8 in hmdf::DateTime::maketime_ (this=0xffffffffe5c8, ltime=...) at DataFrame/src/Utils/DateTime.cc:1012
#130963 0x000000000041f3a8 in hmdf::DateTime::time (this=0xffffffffe5c8) at DataFrame/src/Utils/DateTime.cc:699
#130964 0x000000000041e960 in hmdf::DateTime::compare (this=0xffffffffe5c8, rhs=...) at DataFrame/src/Utils/DateTime.cc:385
#130965 0x0000000000405870 in hmdf::operator< (lhs=..., rhs=...) at DataFrame/include/DataFrame/Utils/DateTime.h:343
#130966 0x00000000004074e8 in hmdf::DataFrame<long, hmdf::HeteroVector>::gen_datetime_index (start_datetime=0x420c58 "01/01/1970", end_datetime=0x420c48 "08/15/2019",
t_freq=hmdf::time_frequency::secondly, increment=1, tz=hmdf::DT_TIME_ZONE::LOCAL) at DataFrame/include/DataFrame/Internals/DataFrame_set.tcc:250
#130967 0x0000000000403fd8 in main (argc=1, argv=0xffffffffe988) at DataFrame/test/dataframe_performance.cc:45
(gdb) f 130962
#130962 0x00000000004200a8 in hmdf::DateTime::maketime_ (this=0xffffffffe5c8, ltime=...) at DataFrame/src/Utils/DateTime.cc:1012
1012 ltime.tm_sec = sec ();
(gdb) l
1007
1008 // ----------------------------------------------------------------------------
1009
1010 DateTime::EpochType DateTime::maketime_ (struct tm &ltime) const noexcept {
1011
1012 ltime.tm_sec = sec ();
1013 ltime.tm_isdst = -1;
1014 ltime.tm_min = minute ();
1015 ltime.tm_hour = hour ();
1016 ltime.tm_mday = dmonth ();
(gdb) f 130963
#130963 0x000000000041f3a8 in hmdf::DateTime::time (this=0xffffffffe5c8) at DataFrame/src/Utils/DateTime.cc:699
699 const_cast<DateTime *>(this)->time_ = maketime_ (ltime);
(gdb) l
694 struct tm ltime;
695
696 // It always makes me sad to use const_cast<>. But then I get
697 // over it.
698 //
699 const_cast<DateTime *>(this)->time_ = maketime_ (ltime);
700 }
701
702 return (time_);
703 }
(gdb) f 130964
#130964 0x000000000041e960 in hmdf::DateTime::compare (this=0xffffffffe5c8, rhs=...) at DataFrame/src/Utils/DateTime.cc:385
385 const EpochType t = this->time() - rhs.time();
(gdb) l
380
381 // ----------------------------------------------------------------------------
382
383 DateTime::EpochType DateTime::compare (const DateTime &rhs) const {
384
385 const EpochType t = this->time() - rhs.time();
386
387 return (t == 0 ? nanosec () - rhs.nanosec () : t);
388 }
389
(gdb) f 130965
#130965 0x0000000000405870 in hmdf::operator< (lhs=..., rhs=...) at DataFrame/include/DataFrame/Utils/DateTime.h:343
343 return (lhs.compare (rhs) < 0);
(gdb) l
338
339 // ----------------------------------------------------------------------------
340
341 inline bool operator < (const DateTime &lhs, const DateTime &rhs) noexcept {
342
343 return (lhs.compare (rhs) < 0);
344 }
345
346 // ----------------------------------------------------------------------------
347
(gdb) f 130966
#130966 0x00000000004074e8 in hmdf::DataFrame<long, hmdf::HeteroVector>::gen_datetime_index (start_datetime=0x420c58 "01/01/1970", end_datetime=0x420c48 "08/15/2019",
t_freq=hmdf::time_frequency::secondly, increment=1, tz=hmdf::DT_TIME_ZONE::LOCAL) at DataFrame/include/DataFrame/Internals/DataFrame_set.tcc:250
250 while (start_di < end_di)
(gdb) l
245 break;
246 default:
247 throw NotFeasible ("ERROR: gen_datetime_index()");
248 }
249
250 while (start_di < end_di)
251 generate_ts_index(index_vec, start_di, t_freq, increment);
252
253 return (index_vec);
254 }
(gdb) f 130967
#130967 0x0000000000403fd8 in main (argc=1, argv=0xffffffffe988) at DataFrame/test/dataframe_performance.cc:45
45 MyDataFrame::gen_datetime_index("01/01/1970",
(gdb) l
40 int main(int argc, char *argv[]) {
41
42 MyDataFrame df;
43 const size_t index_sz =
44 df.load_index(
45 MyDataFrame::gen_datetime_index("01/01/1970",
46 "08/15/2019",
47 time_frequency::secondly,
48 1));
49 RandGenParams p;

A debugging session is active.
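
As a rough cross-check of the stack-exhaustion reading, the numbers already posted in this thread can be combined: the unreadable address in frame #0, the `this` pointer of the object that lives in the outer frame, the frame count, and the 8192 KB limit from `ulimit -a`. The sketch below only does that arithmetic; the constant and function names are made up for illustration, and the addresses are treated as approximate stack positions.

```cpp
#include <cassert>
#include <cstdint>

// Values copied from the gdb output and `ulimit -a` in this thread.
// All names here are illustrative only.
constexpr std::uint64_t kOuterObjectAddr = 0xffffffffe5c8ULL;  // `this` (start_di, a local of gen_datetime_index)
constexpr std::uint64_t kFaultAddr       = 0xffffff7ffff8ULL;  // address gdb could not read in frame #0
constexpr std::uint64_t kFrames          = 130967;             // frames #0 through #130966
constexpr std::uint64_t kStackLimit      = 8192ULL * 1024;     // "stack size (kbytes, -s) 8192"

// Bytes of stack between the outer object and the faulting address.
// The stack grows downward, so the outer address is the larger one.
inline std::uint64_t stack_bytes_used() {
    assert(kOuterObjectAddr > kFaultAddr);
    return kOuterObjectAddr - kFaultAddr;
}

// Average stack consumption per frame, rounded down.
inline std::uint64_t average_frame_bytes() {
    return stack_bytes_used() / kFrames;
}

// True if the span is below the limit but within 1% of it.
inline bool within_one_percent_of_limit() {
    const std::uint64_t used = stack_bytes_used();
    return used <= kStackLimit && (kStackLimit - used) * 100 < kStackLimit;
}
```

With these inputs the span is 8381904 bytes against a limit of 8388608 bytes, or about 64 bytes per frame on average. That points to the stack limit rather than the amount of free RAM.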

@hosseinmoein
Owner

It is not possible for me to debug from here, since I cannot reproduce it. Your stack trace doesn't make sense to me, or I am not seeing it. Did you say you ran out of stack space? But there is no recursion anywhere.

Anyway, I think the best way is for you to edit dataframe_performance.cc, make the interval much smaller, and see if it runs. Then increase it incrementally.

Also, how do you compile this? Do you compile it with -g or -O?

@young66
Author

young66 commented May 13, 2020

It seems to be out of stack space.

When I compile the code with the default settings, dataframe_performance crashes, but I cannot see any source information.

I added SET(CMAKE_BUILD_TYPE "Debug") to CMakeLists.txt so that I can see the source information in the call stack.

I read through the code, and I think I have found the problem.
The problem is in this call chain in the class DateTime:
compare -> time -> maketime_ -> sec -> time -> maketime_ -> ... and so on, again and again.

In the gen_datetime_index function (frame #130966), I printed start_di and end_di.time_.
The result is:

(gdb) p start_di
$10 = {static TIMEZONES_ = 0x4502c8 hmdf::DateTime::TIMEZONES_, static INVALID_TIME_T_ = -1, static dt_init_ = {}, date_ = 4294967295, hour_ = 65535, minute_ = 65535,
second_ = 65535, nanosecond_ = 0, time_ = -1, week_day_ = hmdf::DT_WEEKDAY::BAD_DAY, time_zone_ = hmdf::DT_TIME_ZONE::LOCAL, static MONTH_ = 0x44f980 hmdf::DateTime::MONTH_,
static MONTH_BR_ = , static WDAY_ = ,
static WDAY_BR_ = }
(gdb) p end_di.time_
$3 = 1565798400
(gdb)

When time_ equals INVALID_TIME_T_ (-1):
DateTime::time (line 699) calls maketime_;
DateTime::maketime_ (line 1012) calls sec;
DateTime::sec calls time, because second_ equals 65535, which is (SecondType)-1.
So the calls loop forever, because time_ and second_ both hold their "invalid" value of -1.
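
The cycle described above can be modelled in a few lines. This is a toy model, not the library's code: `LazyClock`, `kMaxDepth`, and `recurses_forever` are hypothetical names, the "rebuild" logic is reduced to the seconds field only, and a depth guard stands in for the real stack limit so the example terminates instead of crashing.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Toy model of two lazily cached fields that each use "-1" to mean
// "not computed yet" and each recompute themselves from the other.
class LazyClock {
public:
    static constexpr std::int64_t  kInvalidTime = -1;
    static constexpr std::uint16_t kInvalidSec  = 65535;  // (uint16_t)-1
    static constexpr int           kMaxDepth    = 1000;   // stand-in for the stack limit

    LazyClock(std::int64_t t, std::uint16_t s) : time_(t), second_(s) {}

    // Epoch seconds; rebuilt from the broken-down fields if it looks unset.
    std::int64_t time() {
        enter();
        if (time_ == kInvalidTime) time_ = maketime();
        leave();
        return time_;
    }

    // Seconds field; rebuilt from the epoch value if it looks unset.
    std::uint16_t sec() {
        enter();
        if (second_ == kInvalidSec) {
            const std::int64_t t = time();
            second_ = static_cast<std::uint16_t>(((t % 60) + 60) % 60);
        }
        leave();
        return second_;
    }

private:
    // Toy version of maketime_: only the seconds field is consulted.
    std::int64_t maketime() {
        enter();
        const std::int64_t t = sec();
        leave();
        return t;
    }

    void enter() { if (++depth_ > kMaxDepth) throw std::runtime_error("call cycle"); }
    void leave() { assert(depth_ > 0); --depth_; }

    std::int64_t  time_;
    std::uint16_t second_;
    int           depth_ = 0;
};

// True if asking for time() never bottoms out (the guard trips).
inline bool recurses_forever(std::int64_t t, std::uint16_t s) {
    LazyClock c(t, s);
    try { c.time(); return false; }
    catch (const std::runtime_error &) { return true; }
}
```

In this model `recurses_forever(-1, 65535)` is true, while a clock with only one of the two fields at its sentinel (for example `(-1, 59)` or `(-28800, 65535)`) resolves normally. That matches the observation that the crash needs both fields to look unset at once.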

So I set a breakpoint in DataFrame::gen_datetime_index and traced the constructor of start_di with the parameter "01/01/1970".
I stepped through the code of DateTime::DateTime (line 243) and did not see any assignment to time_; by default its value is -1.
I think it should be assigned in the constructor of DateTime.

I also set watchpoints on start_di.second_ and start_di.time_.
I saw time_ change from -1 to -28800 in DateTime::time (DateTime.cc:699), because my time zone is UTC+8 and mktime returns -28800.
In the loop at DataFrame_set.tcc:251, time_ is incremented from -28800 up to -1.
At that point time_ and second_ are both -1 at the same time, which produces the stack above.

I set the time zone to New York, and the problem disappears:
ln /usr/share/zoneinfo/America/New_York /etc/localtime -s
I set the time zone to Berlin (+1), and the problem happens again:
ln /usr/share/zoneinfo/Europe/Berlin /etc/localtime -s
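
The time zone dependence follows from simple arithmetic, sketched below. Local midnight on 01/01/1970 maps to an epoch value of minus the zone's UTC offset, and a loop that counts up in 1-second steps passes through -1 only if it starts at or below -1. The function names are hypothetical. The +8 h and +1 h offsets come from this comment; the -5 h offset for New York in January is an assumption (standard time).

```cpp
#include <cassert>
#include <cstdint>

constexpr std::int64_t kInvalidTime = -1;          // INVALID_TIME_T_ in the gdb output
constexpr std::int64_t kEndEpoch    = 1565798400;  // end_di.time_ in the gdb output

// Epoch seconds of local midnight on 01/01/1970 for a zone whose clock is
// utc_offset_sec ahead of UTC (negative when it is behind UTC).
inline std::int64_t local_start_epoch(std::int64_t utc_offset_sec) {
    return -utc_offset_sec;
}

// True if counting up from `start` in 1-second steps produces the
// sentinel value before reaching kEndEpoch.
inline bool hits_sentinel(std::int64_t start) {
    assert(start <= kEndEpoch);
    return start <= kInvalidTime && kInvalidTime < kEndEpoch;
}

// Number of 1-second steps from `start` to the sentinel; only meaningful
// when hits_sentinel(start) is true.
inline std::int64_t steps_to_sentinel(std::int64_t start) {
    assert(hits_sentinel(start));
    return kInvalidTime - start;
}
```

For UTC+8 the start value is -28800 and the sentinel is reached after 28799 steps. For UTC+1 the start is -3600 and the sentinel is also reached. For UTC-5 the start is +18000, so the counter never equals -1.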

#130957 in hmdf::DateTime::time (this=0xffffffffe5c8) at src/Utils/DateTime.cc:699
#130958 in hmdf::DateTime::sec (this=0xffffffffe5c8) at src/Utils/DateTime.cc:652
#130959 in hmdf::DateTime::maketime_ (this=0xffffffffe5c8, ltime=...) at src/Utils/DateTime.cc:1012
#130960 in hmdf::DateTime::time (this=0xffffffffe5c8) at src/Utils/DateTime.cc:699
#130961 in hmdf::DateTime::sec (this=0xffffffffe5c8) at src/Utils/DateTime.cc:652
#130962 in hmdf::DateTime::maketime_ (this=0xffffffffe5c8, ltime=...) at src/Utils/DateTime.cc:1012
#130963 in hmdf::DateTime::time (this=0xffffffffe5c8) at src/Utils/DateTime.cc:699
#130964 0x000000000041e960 in hmdf::DateTime::compare (this=0xffffffffe5c8, rhs=...) at src/Utils/DateTime.cc:385

@hosseinmoein
Owner

Interesting, thanks for pointing it out. I have to fix that. What time zone do you run it in?
Also, you don't need to change the CMake file to switch between debug and release modes:

cmake -DCMAKE_BUILD_TYPE=[Debug | Release]

It is in the README file.

@hosseinmoein
Owner

@young66, I fixed the issue in master. Please try it again.
Thanks
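
The thread does not show what was actually changed in master. Purely as an illustration of one way to avoid this kind of sentinel collision, the sketch below keeps cache validity in a separate flag, so that -1 is an ordinary epoch value. `FlaggedClock` and `seconds_field_at_minus_one` are hypothetical names and this is not the library's DateTime. It is also simplified: the epoch value is always valid and only the seconds field is cached.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch: validity lives in a flag, not in a magic value,
// so every integer (including -1) is a legal epoch time.
class FlaggedClock {
public:
    explicit FlaggedClock(std::int64_t epoch_sec) : time_(epoch_sec) {}

    // Advance the clock; the derived field is marked stale instead of
    // being overwritten with a magic number.
    void add_seconds(std::int64_t n) {
        time_ += n;
        second_valid_ = false;
    }

    std::int64_t time() const { return time_; }

    // Seconds within the minute, recomputed from time_ on demand.
    int sec() {
        if (!second_valid_) {
            second_ = static_cast<int>(((time_ % 60) + 60) % 60);
            second_valid_ = true;
        }
        assert(second_ >= 0 && second_ < 60);
        return second_;
    }

private:
    std::int64_t time_;
    int          second_ = 0;
    bool         second_valid_ = false;
};

// Walk from `start` up to -1 in 1-second steps and return the seconds
// field at that point.
inline int seconds_field_at_minus_one(std::int64_t start) {
    assert(start <= -1);
    FlaggedClock c(start);
    while (c.time() < -1) c.add_seconds(1);
    return c.sec();
}
```

Starting from -28800, as in the UTC+8 case, the walk reaches an epoch value of -1 and `sec()` returns 59 without any recursion.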

@young66
Author

young66 commented May 14, 2020

I use the time zone /usr/share/zoneinfo/Asia/Shanghai.
I updated the code, and I think the problem is solved.
