GoAccess 0.9.3 crashed by Signal 11 when loading persisted data #296

Closed
TempleNode opened this issue Sep 8, 2015 · 10 comments

Comments

@TempleNode

==2644== GoAccess 0.9.3 crashed by Signal 11
==2644==
==2644== VALUES AT CRASH POINT
==2644==
==2644== Line number: 29224396
==2644== Offset: 29224396
==2644== Invalid data: 284 
==2644== Piping: 0
==2644== Response size: 4001711694 bytes
==2644==
==2644== STACK TRACE:
==2644==
==2644== 0 /usr/local/bin/goaccess(sigsegv_handler+0x12f) [0x40829f]
==2644== 1 /lib64/libc.so.6() [0x3e12a326a0]
==2644== 2 /usr/local/bin/goaccess() [0x41436a]
==2644== 3 /lib64/libc.so.6() [0x3e12a34ae4]
==2644== 4 /lib64/libc.so.6() [0x3e12a348b8]
==2644== 5 /lib64/libc.so.6() [0x3e12a348c8]
==2644== 6 /lib64/libc.so.6() [0x3e12a348b8]
==2644== 7 /lib64/libc.so.6() [0x3e12a348c8]
==2644== 8 /lib64/libc.so.6() [0x3e12a348b8]
==2644== 9 /lib64/libc.so.6() [0x3e12a348c8]
==2644== 10 /lib64/libc.so.6() [0x3e12a348b8]
==2644== 11 /lib64/libc.so.6() [0x3e12a348c8]
==2644== 12 /lib64/libc.so.6() [0x3e12a348b8]
==2644== 13 /lib64/libc.so.6() [0x3e12a348b8]
==2644== 14 /lib64/libc.so.6() [0x3e12a348b8]
==2644== 15 /lib64/libc.so.6() [0x3e12a348c8]
==2644== 16 /lib64/libc.so.6() [0x3e12a348b8]
==2644== 17 /lib64/libc.so.6() [0x3e12a348c8]
==2644== 18 /lib64/libc.so.6() [0x3e12a348c8]
==2644== 19 /lib64/libc.so.6() [0x3e12a348b8]
==2644== 20 /lib64/libc.so.6() [0x3e12a348c8]
==2644== 21 /lib64/libc.so.6() [0x3e12a348b8]
==2644== 22 /lib64/libc.so.6() [0x3e12a348b8]
==2644== 23 /lib64/libc.so.6(qsort_r+0x29c) [0x3e12a34f1c]
==2644== 24 /usr/local/bin/goaccess(sort_raw_data+0x19) [0x4145d9]
==2644== 25 /usr/local/bin/goaccess(parse_raw_data+0x5a) [0x419e3a]
==2644== 26 /usr/local/bin/goaccess() [0x40c4b3]
==2644== 27 /usr/local/bin/goaccess(main+0x18a) [0x40cf6a]
==2644== 28 /lib64/libc.so.6(__libc_start_main+0xfd) [0x3e12a1ed5d]
==2644== 29 /usr/local/bin/goaccess() [0x4062c9]
==2644==
==2644== Please report it by opening an issue on GitHub:
==2644== https://github.com/allinurl/goaccess/issues

Server is CentOS 6.

goaccess.conf

time-format %H:%M:%S
date-format %d/%b/%Y
log-format %h - %^[%d:%t %^] "%r" %s %b "%R" "%u"
config-dialog false
color-scheme 1
hl-header true
no-color false
no-column-names false
no-progress false
with-mouse false
no-csv-summary false
static-file .css
static-file .CSS
static-file .dae
static-file .DAE
static-file .eot
static-file .EOT
static-file .gif
static-file .GIF
static-file .ico
static-file .ICO
static-file .jpeg
static-file .JPEG
static-file .jpg
static-file .JPG
static-file .js
static-file .JS
static-file .map
static-file .MAP
static-file .mp3
static-file .MP3
static-file .pdf
static-file .PDF
static-file .png
static-file .PNG
static-file .svg
static-file .SVG
static-file .swf
static-file .SWF
static-file .ttf
static-file .TTF
static-file .txt
static-file .TXT
static-file .woff
static-file .WOFF


agent-list true
http-method true
http-protocol true
no-query-string false
no-term-resolver false
real-os true
with-output-resolver false
444-as-404 false
4xx-to-unique-count false
double-decode false
ignore-crawlers false
geoip-database /usr/share/GeoIP/GeoIP.dat
keep-db-files true
load-from-disk true
db-path /opt/goaccess-db/
@allinurl
Owner

allinurl commented Sep 8, 2015

A few questions for you.

  1. Are you using the on-disk store?
  2. Are you outputting to a terminal or a file? And did it break upon sorting a specific field?
  3. Are you able to replicate this with a smaller subset of requests/hits? e.g., 1000 requests from your access log?

Thanks

@TempleNode
Author

  1. Yes, I'm using the on-disk store.
  2. I have a bash script that runs goaccess on each domain and writes the output to an HTML file.
  3. From what I see, the problem appears only once the database has grown. The first or second run is fine, but after the database grows I get that error.
    If I set load-from-disk false, it runs OK.

@allinurl
Owner

allinurl commented Sep 9, 2015

I'm looking into this, but could you please try to replicate it in gdb, preferably with a smaller data set?

# ./configure --enable-debug --enable-utf8 --enable-geoip --enable-tcb=btree
# make
# gdb ./goaccess
// use args needed to replicate the issue
(gdb) set args -f access.log --keep-db-files --load-from-disk <enter>
(gdb) r <enter>
// after it crashes
(gdb) bt <enter>

@TempleNode
Author

Program received signal SIGSEGV, Segmentation fault.
0x000000000041480a in cmp_raw_num_desc ()
Missing separate debuginfos, use: debuginfo-install GeoIP-1.6.5-1.el6.x86_64 bzip2-libs-1.0.5-7.el6_0.x86_64 glibc-2.12-1.166.el6_7.1.x86_64 ncurses-libs-5.7-4.20090207.el6.x86_64 tokyocabinet-1.4.33-6.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0 0x000000000041480a in cmp_raw_num_desc ()
#1 0x0000003e12a34ae4 in msort_with_tmp () from /lib64/libc.so.6
#2 0x0000003e12a348c8 in msort_with_tmp () from /lib64/libc.so.6
#3 0x0000003e12a348c8 in msort_with_tmp () from /lib64/libc.so.6
#4 0x0000003e12a348b8 in msort_with_tmp () from /lib64/libc.so.6
#5 0x0000003e12a348b8 in msort_with_tmp () from /lib64/libc.so.6
#6 0x0000003e12a348b8 in msort_with_tmp () from /lib64/libc.so.6
#7 0x0000003e12a348b8 in msort_with_tmp () from /lib64/libc.so.6
#8 0x0000003e12a348c8 in msort_with_tmp () from /lib64/libc.so.6
#9 0x0000003e12a348b8 in msort_with_tmp () from /lib64/libc.so.6
#10 0x0000003e12a348c8 in msort_with_tmp () from /lib64/libc.so.6
#11 0x0000003e12a348b8 in msort_with_tmp () from /lib64/libc.so.6
#12 0x0000003e12a348b8 in msort_with_tmp () from /lib64/libc.so.6
#13 0x0000003e12a348c8 in msort_with_tmp () from /lib64/libc.so.6
#14 0x0000003e12a348b8 in msort_with_tmp () from /lib64/libc.so.6
#15 0x0000003e12a348b8 in msort_with_tmp () from /lib64/libc.so.6
#16 0x0000003e12a348b8 in msort_with_tmp () from /lib64/libc.so.6
#17 0x0000003e12a348b8 in msort_with_tmp () from /lib64/libc.so.6
#18 0x0000003e12a348b8 in msort_with_tmp () from /lib64/libc.so.6
#19 0x0000003e12a34f1c in qsort_r () from /lib64/libc.so.6
#20 0x0000000000414a79 in sort_raw_data ()
#21 0x000000000041a5ba in parse_raw_data ()
#22 0x000000000040c683 in allocate_holder ()
#23 0x000000000040d13d in main ()

@allinurl
Owner

I've been trying to reproduce this with a similar environment, CentOS 6.7, 2.6.32-573 x86_64 and using the same config file. No crashes so far. I assume your database files were created on the same architecture as the one reading them.

Does it break with a much smaller data set? e.g., creating new database files with 1000 log lines and then appending another 1000 to it?

Can you please give more info on this issue? I mean how did you configure/execute it and what steps did you follow to reach here? This will help in reproducing this bug.

Also, if you could try v0.9.4, it would be great! Thanks!
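
The smaller reproduction suggested above (create database files from 1000 log lines, then append another 1000) could be sketched roughly as follows. This is only an illustration: `make_slices`, the `LOG` and `OUT` variables, and the output file names are hypothetical, and it assumes the on-disk store starts out empty (for example, a fresh `db-path` in goaccess.conf). The goaccess flags are the ones already used in this thread.

```shell
#!/bin/sh
# Sketch: build two 1000-line slices of an access log so the crash can be
# tested against a small on-disk store. All file names are hypothetical.

# make_slices LOG OUTDIR
# Writes OUTDIR/part1.log (lines 1-1000) and OUTDIR/part2.log (lines 1001-2000).
make_slices() {
    log=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    sed -n '1,1000p' "$log" > "$outdir/part1.log"
    sed -n '1001,2000p' "$log" > "$outdir/part2.log"
}

LOG=${LOG:-access.log}
OUT=${OUT:-./goaccess-repro}

# Only run when the log exists and goaccess is installed.
if [ -f "$LOG" ] && command -v goaccess >/dev/null 2>&1; then
    make_slices "$LOG" "$OUT"
    # First run: parse the first slice and keep the database files on disk.
    goaccess -f "$OUT/part1.log" --keep-db-files > "$OUT/run1.html"
    # Second run: load the persisted data and add the second slice to it.
    goaccess -f "$OUT/part2.log" --keep-db-files --load-from-disk > "$OUT/run2.html"
fi
```

If the second run completes without a segfault, the slice size can be increased step by step to find roughly where the database becomes large enough to trigger the crash.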

@TempleNode
Author

I tried 0.9.4 and it is the same thing.
I have many domains on the server, and I use goaccess to create one HTML file for each domain:

/usr/local/bin/goaccess --html-report-title=domain -f /var/log/nginx/domain.cache.log.1 > /var/www/html/domain/index.html
The problem appears with a large database. With a smaller database it seems to work fine.
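
For reference, a per-domain wrapper around the command above might look something like this. It is a guess at the shape of such a script, not the reporter's actual one: the helper names `log_for` and `report_for` are hypothetical, the paths simply follow the pattern in the command above, and the output directory for each domain is assumed to exist already.

```shell
#!/bin/sh
# Sketch: run goaccess once per domain, reporting which domain fails.
# Usage (hypothetical): ./reports.sh example.com example.org

# Path helpers; the layout is assumed from the single command in the thread.
log_for()    { printf '/var/log/nginx/%s.cache.log.1\n' "$1"; }
report_for() { printf '/var/www/html/%s/index.html\n' "$1"; }

for domain in "$@"; do
    log=$(log_for "$domain")
    out=$(report_for "$domain")

    # Skip domains whose rotated log is missing or unreadable.
    if [ ! -r "$log" ]; then
        echo "skip $domain: $log not readable" >&2
        continue
    fi

    # Record the failing domain instead of stopping at the first crash.
    /usr/local/bin/goaccess --html-report-title="$domain" -f "$log" > "$out" \
        || echo "goaccess failed for $domain (exit $?)" >&2
done
```

Logging the failing domain and exit status makes it easier to tell whether the segfault is tied to one domain's database files or occurs for every large database.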

@allinurl
Owner

I'm starting to think this could be data corruption in one of the database files. I just tested this with two data sets, the first a log of 50M lines and the second of 30M, and it didn't crash.

BTW, does this happen only when parsing the same log file, or with any log? And have you tried deleting the db files and running everything again?

It would be great if you could post some additional data from gdb. However, you need to compile with debugging symbols. To do this you need to configure with --enable-debug:

# ./configure --enable-debug --enable-utf8 --enable-geoip --enable-tcb=btree
# make

then run it through gdb:

# gdb ./goaccess
(gdb) set args -f access.log --keep-db-files --load-from-disk <enter>
(gdb) r <enter>
// after it crashes
(gdb) bt <enter>
(gdb) info args <enter>
(gdb) info locals <enter>
 // select parse_raw_data frame #
(gdb) frame 21
(gdb) p raw_data->idx

Thanks!

@allinurl changed the title from "GoAccess 0.9.3 crashed by Signal 11" to "GoAccess 0.9.3 crashed by Signal 11 when loading persisted data" on Sep 13, 2015
@allinurl
Owner

@TempleNode I pushed a commit to ensure the raw data structure is not NULL upon sorting it.

Could you please pull the latest changes from upstream and see if that fixes the issue? Thanks.

@allinurl
Owner

Also, please make sure to specify a debug log file and post its contents here.

goaccess --debug-file=debug.log --html-report-title=domain -f /var/log/nginx/domain.cache.log.1

@allinurl
Owner

Closing this since I wasn't able to replicate it. Several changes have been made since v0.9.3.

Feel free to reopen it if needed. Thanks.
