Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coordinator crashed frequency #7093

Closed
zqhxuyuan opened this issue Oct 26, 2018 · 14 comments
Closed

coordinator crashed frequency #7093

zqhxuyuan opened this issue Oct 26, 2018 · 14 comments
Labels
1 Analyzing 2 Solved Resolution 3 Cluster 3 RocksDB Storage engine related

Comments

@zqhxuyuan
Copy link

zqhxuyuan commented Oct 26, 2018

Environment:

  • ArangoDB Version: 3.3.17
  • Storage Engine: RocksDB
  • Deployment Mode: cluster(3 node db-ssd/3 node coordinator-sata)
  • Configuration: -
  • Infrastructure: own(24core/64g mem)
  • Operating System: CentOS 6.9
  • Used Package: CentOS rpm

we use supervisor mode to start coordinator. and notice cooordinator node crash frequency.
we have tried coordinator version 3.3.17 and 3.3.19. our agency/dbserver version is 3.3.17.

# cat arangod.log|grep "Have fun"
2018-10-25T11:49:54 [21961] INFO ArangoDB (version 3.3.17 [linux]) is ready for business. Have fun!
2018-10-25T11:53:48 [24035] INFO ArangoDB (version 3.3.17 [linux]) is ready for business. Have fun!
2018-10-25T13:23:32 [12688] INFO ArangoDB (version 3.3.17 [linux]) is ready for business. Have fun!
2018-10-25T13:38:30 [19127] INFO ArangoDB (version 3.3.17 [linux]) is ready for business. Have fun!
2018-10-25T13:49:49 [24051] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T13:50:53 [24720] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T13:51:47 [25341] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T13:52:07 [25775] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T14:40:23 [45862] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T15:19:35 [13444] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T16:09:42 [34824] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T17:50:54 [28294] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T18:21:33 [41055] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T18:41:55 [942] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T20:16:24 [40379] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T20:27:31 [45498] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T20:42:11 [3612] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T20:44:11 [4552] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T21:25:47 [21976] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-25T22:50:50 [8678] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-26T00:31:28 [1726] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-26T02:11:04 [42847] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-26T04:33:45 [4821] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-26T04:44:39 [9343] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-26T04:53:57 [13312] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!

when coordinator restart by supervisor mode, the coordinator.log file have little useful information:

2018-10-26T04:44:39 [9343] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!

2018-10-26T04:53:57 [13312] INFO ArangoDB 3.3.19 [linux] 64bit, using jemalloc, build tags/v3.3.19-0-gfe9657c-dirty, VPack 0.1.30, RocksDB 5.6.0, ICU 58.1, V8 5.7.492.77, OpenSSL 1.0.1e-fips 11 Feb 2013
2018-10-26T04:53:57 [13312] INFO detected operating system: Linux version 2.6.32-696.6.3.el6.x86_64 (mockbuild@c1bl.rdu2.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC) ) #1 SMP Wed Jul 12 14:17:22 UTC 2017
2018-10-26T04:53:57 [13312] INFO using storage engine rocksdb
2018-10-26T04:53:57 [13312] INFO {cluster} Starting up with role COORDINATOR
2018-10-26T04:53:57 [13312] INFO {cluster} Waiting for DBservers to show up...
2018-10-26T04:53:57 [13312] INFO {cluster} Found 3 DBservers.
2018-10-26T04:53:57 [13312] INFO {syscall} file-descriptors (nofiles) hard limit is 655350, soft limit is 655350
2018-10-26T04:53:57 [13312] INFO {authentication} Authentication is turned on (system only), authentication for unix sockets is turned on
2018-10-26T04:53:57 [13312] WARNING found existing lockfile '/data/program/arangodb/coordinator8529/data/LOCK' of previous process with pid 9343, but that process seems to be dead already
2018-10-26T04:53:57 [13312] INFO {cluster} Cluster feature is turned on. Agency version: {"server":"arango","version":"3.3.17","license":"community"}, Agency endpoints: http+tcp://10.1.96.119:8531, http+tcp://10.1.96.116:8531, http+tcp://10.1.96.117:8531, server id: 'CRDN-cd435632-1ce8-4724-9402-264152cff855', internal address: tcp://10.1.128.69:8529, role: COORDINATOR
2018-10-26T04:53:57 [13312] INFO {cluster} using heartbeat interval value '1000 ms' from agency
2018-10-26T04:53:57 [13312] INFO using endpoint 'http+tcp://[::]:8529' for non-encrypted requests
2018-10-26T04:53:57 [13312] INFO bootstrapped coordinator CRDN-cd435632-1ce8-4724-9402-264152cff855
2018-10-26T04:53:57 [13312] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!

and arangod.log.supervisor log file:

2018-10-26T02:11:03 [21975] INFO {startup} supervisor has forked a child process with pid 42847
2018-10-26T04:33:44 [21975] INFO {startup} waitpid woke up with return value 42847 and status 139 and DONE = false
2018-10-26T04:33:44 [21975] ERROR {startup} child process 42847 terminated unexpectedly, signal 11 (SIGSEGV). will now start a new child process
2018-10-26T04:33:44 [21975] INFO {startup} supervisor has forked a child process with pid 4821
2018-10-26T04:44:38 [21975] INFO {startup} waitpid woke up with return value 4821 and status 139 and DONE = false
2018-10-26T04:44:38 [21975] ERROR {startup} child process 4821 terminated unexpectedly, signal 11 (SIGSEGV). will now start a new child process
2018-10-26T04:44:38 [21975] INFO {startup} supervisor has forked a child process with pid 9343
2018-10-26T04:53:57 [21975] INFO {startup} waitpid woke up with return value 9343 and status 139 and DONE = false
2018-10-26T04:53:57 [21975] ERROR {startup} child process 9343 terminated unexpectedly, signal 11 (SIGSEGV). will now start a new child process
2018-10-26T04:53:57 [21975] INFO {startup} supervisor has forked a child process with pid 13312

the process was killed by send signal 11.

so we tried start by normal mode(un-supervisor),when coordinator crashed, we analyze the core dump:

# gdb arangod core.45498

Core was generated by `/usr/sbin/arangod -c /data/program/arangodb/coordinator8529/arangod.conf --data'.
Program terminated with signal 11, Segmentation fault.
#0  0x0000000001c3361b in v8::internal::compiler::AccessInfoFactory::ComputePropertyAccessInfo(v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::compiler::AccessMode, v8::internal::compiler::PropertyAccessInfo*) ()
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.209.el6_9.2.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64 libcom_err-1.41.12-23.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64
(gdb) bt
#0  0x0000000001c3361b in v8::internal::compiler::AccessInfoFactory::ComputePropertyAccessInfo(v8::internal::Handle<v8::internal::Map>, v8::internal::Handle<v8::internal::Name>, v8::internal::compiler::AccessMode, v8::internal::compiler::PropertyAccessInfo*) ()
#1  0x0000000001a07b4c in v8::internal::compiler::JSNativeContextSpecialization::Reduce(v8::internal::compiler::Node*) ()
#2  0x00000000019c5c4b in v8::internal::compiler::GraphReducer::ReduceTop() ()
#3  0x00000000019c6083 in v8::internal::compiler::GraphReducer::ReduceNode(v8::internal::compiler::Node*) ()
#4  0x00000000013de501 in v8::internal::compiler::PipelineImpl::CreateGraph() ()
#5  0x00000000013df2a2 in v8::internal::compiler::PipelineCompilationJob::PrepareJobImpl() ()
#6  0x0000000001447714 in v8::internal::CompilationJob::PrepareJob() ()
#7  0x000000000144bd5d in v8::internal::(anonymous namespace)::GetOptimizedCode ()
#8  0x000000000144c0b1 in v8::internal::Compiler::CompileOptimized(v8::internal::Handle<v8::internal::JSFunction>, v8::internal::Compiler::ConcurrencyMode) ()
#9  0x0000000001ae6b1d in v8::internal::Runtime_CompileOptimized_Concurrent(int, v8::internal::Object**, v8::internal::Isolate*) ()
#10 0x00002100848040e7 in ?? ()
...
#35 0x00007f8064e03068 in ?? ()
#36 0x0000000000000002 in ?? () at /home/abuild/rpmbuild/BUILD/gcc-5.4.0/build/x86_64-redhat-linux-gnu/libstdc++-v3/include/bits/locale_facets.h:1083
#37 0x00000000000001a1 in ?? ()
...
#70 0x00007f8064e03190 in ?? ()
#71 0x0000000000000002 in ?? () at /home/abuild/rpmbuild/BUILD/gcc-5.4.0/build/x86_64-redhat-linux-gnu/libstdc++-v3/include/bits/locale_facets.h:1083
#72 0x0000000000000572 in ?? ()
..
#104 0x0000000000000000 in ?? ()
(gdb)

Thanks!

@dothebart
Copy link
Contributor

Hm,
this would be the V8 javascript engine crashing.
Could this be an OOM related crash? do you see it using much memory before these issues?

@zqhxuyuan
Copy link
Author

zqhxuyuan commented Oct 26, 2018

no, memory is normal, keep on 50%

# sar -r
Linux 2.6.32-696.6.3.el6.x86_64 (db-128-69-hzqsh.node.hzqsh.wacai.sdc) 	10/26/2018 	_x86_64_	(24 CPU)

12:00:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
12:10:01 AM  32811072  33245652     50.33     99124  31332632   1656400      2.22
12:20:01 AM  32845576  33211148     50.28     99796  31332892   1599992      2.15
12:30:01 AM  32936248  33120476     50.14    100508  31333176   1515992      2.04
12:40:01 AM  32509652  33547072     50.79    101352  31784192   1434508      1.93
12:50:01 AM  32516008  33540716     50.78    102024  31784464   1414204      1.90
01:00:01 AM  32504960  33551764     50.79    102688  31784720   1445160      1.94
01:10:01 AM  32613984  33442740     50.63    103328  31784980   1348216      1.81
01:20:01 AM  32482276  33574448     50.83    104000  31785248   1465820      1.97
01:30:01 AM  32571604  33485120     50.69    104860  31785524   1381448      1.86
01:40:01 AM  32503096  33553628     50.80    105480  31785788   1448484      1.95
01:50:01 AM  32586148  33470576     50.67    106084  31786052   1368548      1.84
02:00:01 AM  32603964  33452760     50.64    106672  31786308   1360840      1.83
02:10:01 AM  32522324  33534400     50.77    107428  31786564   1449900      1.95
02:20:01 AM  32061704  33995020     51.46    108152  32202840   1450640      1.95
02:30:01 AM  32009960  34046764     51.54    108864  32203172   1487100      2.00
02:40:01 AM  32040536  34016188     51.50    109572  32203440   1436524      1.93
02:50:01 AM  32116912  33939812     51.38    110268  32203696   1378060      1.85
03:00:01 AM  32005596  34051128     51.55    110976  32203952   1493204      2.01
03:10:01 AM  32080056  33976668     51.44    111604  32204216   1424172      1.91
03:20:01 AM  31984028  34072696     51.58    123512  32265928   1417600      1.90
03:30:01 AM  31946376  34110348     51.64    124328  32266268   1454180      1.95
03:40:01 AM  31968744  34087980     51.60    125076  32266528   1437616      1.93
03:50:01 AM  32007536  34049188     51.55    125852  32266780   1400080      1.88
04:00:01 AM  31996156  34060568     51.56    126584  32267036   1407284      1.89
04:10:01 AM  31996404  34060320     51.56    127136  32267308   1415900      1.90
04:20:01 AM  31995836  34060888     51.56    127752  32267556   1409880      1.89
04:30:01 AM  31916464  34140260     51.68    128532  32267852   1460116      1.96
04:40:01 AM  31571384  34485340     52.21    129188  32660168   1419528      1.91
04:50:01 AM  31226944  34829780     52.73    129736  33076136   1324168      1.78
05:00:01 AM  30742588  35314136     53.46    130360  33496880   1404832      1.89
05:10:01 AM  30702008  35354716     53.52    130960  33497144   1429708      1.92
05:20:01 AM  30647576  35409148     53.60    131568  33497404   1476684      1.98
05:30:01 AM  30810756  35245968     53.36    132168  33497692   1321188      1.77
05:40:02 AM  30714028  35342696     53.50    132684  33497952   1435672      1.93
05:50:01 AM  30663068  35393656     53.58    133240  33498216   1464116      1.97
06:00:01 AM  30709544  35347180     53.51    133796  33498476   1453984      1.95
06:10:01 AM  30638056  35418668     53.62    134376  33498740   1501952      2.02
06:20:01 AM  30702080  35354644     53.52    134896  33498992   1436144      1.93
06:30:01 AM  30693076  35363648     53.54    135600  33499276   1453216      1.95
06:40:01 AM  30657604  35399120     53.59    136188  33499544   1476512      1.98
06:50:01 AM  30704820  35351904     53.52    136668  33499796   1437360      1.93
07:00:01 AM  30741560  35315164     53.46    137204  33500060   1386380      1.86
07:10:01 AM  30734364  35322360     53.47    137788  33500324   1409792      1.89
07:20:01 AM  30678000  35378724     53.56    138364  33500576   1447740      1.94

07:20:01 AM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
07:30:01 AM  30633236  35423488     53.63    138964  33500868   1511996      2.03
07:40:01 AM  30762224  35294500     53.43    139604  33501132   1357068      1.82
07:50:01 AM  30730888  35325836     53.48    140240  33501392   1395304      1.87
08:00:01 AM  30691220  35365504     53.54    140812  33501656   1447272      1.94
08:10:01 AM  30624212  35432512     53.64    141376  33501912   1498968      2.01
08:20:01 AM  30647436  35409288     53.60    141884  33502172   1481412      1.99
08:30:01 AM  30626184  35430540     53.64    142504  33502464   1493928      2.01
08:40:01 AM  30672988  35383736     53.57    143080  33502728   1452076      1.95
08:50:01 AM  30703788  35352936     53.52    143604  33502984   1409972      1.89
09:00:01 AM  30612604  35444120     53.66    144072  33503256   1504616      2.02
09:10:01 AM  30710424  35346300     53.51    144508  33503516   1401824      1.88
09:20:01 AM  30602512  35454212     53.67    145092  33503772   1515224      2.04
09:30:01 AM  30740180  35316544     53.46    145720  33504044   1359968      1.83
09:40:01 AM  30643340  35413384     53.61    146216  33504860   1465552      1.97
09:50:01 AM  30761608  35295116     53.43    146672  33505112   1363308      1.83
10:00:01 AM  30479512  35577212     53.86    147228  33505892   1641200      2.20
10:10:01 AM  30445136  35611588     53.91    147732  33506156   1911316      2.57
10:20:01 AM  30320532  35736192     54.10    148168  33506428   2039160      2.74
10:30:01 AM  30514004  35542720     53.81    148732  33533708   1831416      2.46
10:40:01 AM  30483716  35573008     53.85    149164  33533972   1906196      2.56
10:50:01 AM  30680944  35375780     53.55    149708  33497120   1605224      2.16
11:00:01 AM  30494204  35562520     53.84    150204  33496884   1905020      2.56
11:10:01 AM  30391472  35665252     53.99    150728  33496916   1985016      2.67
11:20:01 AM  30544600  35512124     53.76    151196  33497176   1868356      2.51
11:30:01 AM  30260900  35795824     54.19    151912  33497488   2153920      2.89
11:40:01 AM  30628508  35428216     53.63    152512  33497744   1746020      2.35
11:50:01 AM  30395112  35661612     53.99    152948  33498004   1989496      2.67
12:00:01 PM  30432276  35624448     53.93    153448  33498268   1962464      2.64
12:10:01 PM  30628304  35428420     53.63    153804  33498532   1761392      2.37
12:20:01 PM  30474736  35581988     53.87    154168  33498792   1861060      2.50
12:30:01 PM  30447000  35609724     53.91    154600  33499092   1918196      2.58
12:40:02 PM  30526480  35530244     53.79    154956  33499348   1846196      2.48
12:50:01 PM  30599352  35457372     53.68    155464  33499608   1805396      2.43
01:00:01 PM  30441512  35615212     53.92    155908  33499868   1945372      2.61
01:10:01 PM  30477092  35579632     53.86    156320  33500128   1902076      2.55
01:20:01 PM  30547936  35508788     53.75    156792  33500392   1831084      2.46
01:30:01 PM  29891244  36165480     54.75    157268  34073604   1870336      2.51
01:40:01 PM  29911020  36145704     54.72    157680  34073876   1885636      2.53
01:50:01 PM  30216752  35839972     54.26    158236  34115476   1301000      1.75
02:00:01 PM  29867580  36189144     54.78    158760  34115716   1957856      2.63
02:10:01 PM  29808380  36248344     54.87    159156  34115972   2007756      2.70
02:20:01 PM  29127152  36929572     55.91    159660  34772044   2012088      2.70
02:30:01 PM  29091924  36964800     55.96    160116  34772344   2063588      2.77
02:40:01 PM  28911184  37145540     56.23    160532  34772596   2215656      2.98

02:40:01 PM kbmemfree kbmemused  %memused kbbuffers  kbcached  kbcommit   %commit
02:50:01 PM  29131688  36925036     55.90    160948  34772860   2022556      2.72
03:00:01 PM  29155612  36901112     55.86    161352  34773116   2010900      2.70
03:10:01 PM  28908276  37148448     56.24    161900  34773412   2258476      3.03
03:20:01 PM  28917008  37139716     56.22    162280  34773712   2237616      3.01
03:30:01 PM  28966512  37090212     56.15    162720  34774032   2218988      2.98
03:40:01 PM  28427720  37629004     56.96    163272  35518976   1830292      2.46
03:50:01 PM  27457560  38599164     58.43    163816  36388648   2026940      2.72
04:00:01 PM  27137144  38919580     58.92    166428  36408544   2342588      3.15
04:10:01 PM  27250272  38806452     58.75    166796  36416784   2210888      2.97
04:20:01 PM  26937936  39118788     59.22    167320  37182604   1637144      2.20
04:30:01 PM  26168824  39887900     60.38    167784  37303104   2618204      3.52
04:40:01 PM  26148424  39908300     60.42    168228  37496236   2490012      3.34
04:50:01 PM  25857388  40199336     60.86    172060  37813672   2507508      3.37
05:00:01 PM  25670852  40385872     61.14    172564  37917244   2426056      3.26
05:10:01 PM  25656048  40400676     61.16    172904  38015404   2369984      3.18
05:20:01 PM  22771884  43284840     65.53    183828  40480184   2676296      3.59
05:30:01 PM  22061036  43995688     66.60    184236  41358924   2462488      3.31
05:40:01 PM  21672324  44384400     67.19    184672  41752072   2461072      3.31
05:50:01 PM  21451584  44605140     67.53    185080  42841968    474672      0.64
06:00:01 PM  20814128  45242596     68.49    185444  42843836   1995508      2.68
Average:     30122540  35934184     54.40    141772  33943952   1693583      2.27

latest restart log(we tried 3.3.19/3.3.11, all encouter restart problem):

2018-10-26T00:31:28 [1726] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-26T02:11:04 [42847] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-26T04:33:45 [4821] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-26T04:44:39 [9343] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-26T04:53:57 [13312] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-26T10:49:58 [14512] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-26T13:24:23 [29898] INFO ArangoDB (version 3.3.19 [linux]) is ready for business. Have fun!
2018-10-26T13:50:00 [40810] INFO ArangoDB (version 3.3.11 [linux]) is ready for business. Have fun!
2018-10-26T14:16:13 [3280] INFO ArangoDB (version 3.3.11 [linux]) is ready for business. Have fun!
2018-10-26T15:37:38 [37513] INFO ArangoDB (version 3.3.11 [linux]) is ready for business. Have fun!
2018-10-26T15:48:20 [42093] INFO ArangoDB (version 3.3.11 [linux]) is ready for business. Have fun!
2018-10-26T16:14:32 [4962] INFO ArangoDB (version 3.3.11 [linux]) is ready for business. Have fun!
2018-10-26T16:19:47 [7405] INFO ArangoDB (version 3.3.11 [linux]) is ready for business. Have fun!
2018-10-26T17:51:12 [48140] INFO ArangoDB (version 3.3.11 [linux]) is ready for business. Have fun!

@dothebart
Copy link
Contributor

Can you try to increase the logging on the coordinator?

[log]
level = trace
line-number = true
force-direct = true

and run it interactive?
It would also be good to see the output of
thread apply all bt
if the crash re-occurs.

@betwjp
Copy link

betwjp commented Oct 29, 2018

i and zqhxuyuan are the same team.we had changed the log's config,before crash.the log is:

2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:144] no prefix handler found, trying catch all
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:148] found catch all handler '/'
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:201] found handler for path '/'
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:128] no direct handler found, trying prefixes
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:169] found prefix match '/_api/document'
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:201] found handler for path '/_api/document'
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:201] found handler for path '/_api/cursor'
2018-10-27T16:46:39 [23154] TRACE [findIndexHandleForAndNode@Methods.cpp:478] looking at index: index { id: 2, type: edge, collection: dashuju/uid2phone, unique: false, fields: [_to] }, isSorted: 0, isSparse: 0, fields: 1, supportsFilter: 1, supportsSort: 0, filterCost: 683.81, sortCost: 6441.81, totalCost: 683.81, isOnlyAttributeAccess: 0, isUnidirectional: 1, isOnlyEqualityMatch: 0, itemsInIndex: 684
2018-10-27T16:46:39 [23154] TRACE [createPlans@Optimizer.cpp:203] optimization ends with 1 plans
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:201] found handler for path '/_api/cursor'
2018-10-27T16:46:39 [23154] TRACE [findIndexHandleForAndNode@Methods.cpp:478] looking at index: index { id: 2, type: edge, collection: dashuju/uid2phone, unique: false, fields: [_to] }, isSorted: 0, isSparse: 0, fields: 1, supportsFilter: 1, supportsSort: 0, filterCost: 719.8, sortCost: 6834.13, totalCost: 719.8, isOnlyAttributeAccess: 0, isUnidirectional: 1, isOnlyEqualityMatch: 0, itemsInIndex: 720
2018-10-27T16:46:39 [23154] TRACE [createPlans@Optimizer.cpp:203] optimization ends with 1 plans
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:128] no direct handler found, trying prefixes
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:144] no prefix handler found, trying catch all
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:148] found catch all handler '/'
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:201] found handler for path '/'
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:128] no direct handler found, trying prefixes
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:169] found prefix match '/_api/document'
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:201] found handler for path '/_api/document'
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:201] found handler for path '/_api/cursor'
2018-10-27T16:46:39 [23154] TRACE [findIndexHandleForAndNode@Methods.cpp:478] looking at index: index { id: 2, type: edge, collection: dashuju/uid2phone, unique: false, fields: [_to] }, isSorted: 0, isSparse: 0, fields: 1, supportsFilter: 1, supportsSort: 0, filterCost: 323.91, sortCost: 2702.11, totalCost: 323.91, isOnlyAttributeAccess: 0, isUnidirectional: 1, isOnlyEqualityMatch: 0, itemsInIndex: 324
2018-10-27T16:46:39 [23154] TRACE [createPlans@Optimizer.cpp:203] optimization ends with 1 plans
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:128] no direct handler found, trying prefixes
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:144] no prefix handler found, trying catch all
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:148] found catch all handler '/'
2018-10-27T16:46:39 [23154] TRACE [createHandler@RestHandlerFactory.cpp:201] found handler for path '/'
2018-10-27T16:46:40 [38305] INFO [prepare@GreetingsFeature.cpp:38] ArangoDB 3.3.19 [linux] 64bit, using jemalloc, build tags/v3.3.19-0-gfe9657c-dirty, VPack 0.1.30, RocksDB 5.6.0, ICU 58.1, V8 5.7.492.77, OpenSSL 1.0.1e-fips 11 Feb 2013
2018-10-27T16:46:40 [38305] TRACE [create@Nonce.cpp:66] creating nonces with size: 4194304
2018-10-27T16:46:40 [38305] TRACE [prepare@PageSizeFeature.cpp:41] page size is 4096
2018-10-27T16:46:40 [38305] DEBUG [setJwtSecret@TokenCache.cpp:69] {authentication} Setting jwt secret /data/program/arangodb/coordinator8529/dingqin
2018-10-27T16:46:40 [38305] INFO [prepare@EnvironmentFeature.cpp:54] detected operating system: Linux version 2.6.32-696.6.3.el6.x86_64 (mockbuild@c1bl.rdu2.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-18) (GCC) ) #1 SMP Wed Jul 12 14:17:22 UTC 2017
2018-10-27T16:46:40 [38305] INFO [prepare@EngineSelectorFeature.cpp:91] using storage engine rocksdb
2018-10-27T16:46:40 [38305] TRACE [prepare@MMFilesLogfileManager.cpp:269] shutdown file found
2018-10-27T16:46:40 [38305] TRACE [readShutdownInfo@MMFilesLogfileManager.cpp:1877] previous shutdown was at '2018-10-25T13:24:06Z'
2018-10-27T16:46:40 [38305] TRACE [readShutdownInfo@MMFilesLogfileManager.cpp:1885] initial values for WAL logfile manager: tick: 357836768699, hlc: 1615303904293027840, lastCollected: 0, lastSealed: 0
2018-10-27T16:46:40 [38305] INFO [reportRole@ClusterFeature.cpp:223] {cluster} Starting up with role COORDINATOR

@dothebart
Copy link
Contributor

and after the last log line it will abort again with the same place as posted above?

@OmarAyo OmarAyo added the 3 RocksDB Storage engine related label Oct 29, 2018
@betwjp
Copy link

betwjp commented Oct 29, 2018

and after the last log line it will abort again with the same place as posted above?

yes. the same log,and the same place crash,the bt info is :
(gdb) bt
#0 0x0000000001c3361b in v8::internal::compiler::AccessInfoFactory::ComputePropertyAccessInfo(v8::internal::Handlev8::internal::Map, v8::internal::Handlev8::internal::Name, v8::internal::compiler::AccessMode, v8::internal::compiler::PropertyAccessInfo*) ()
#1 0x0000000001a07b4c in v8::internal::compiler::JSNativeContextSpecialization::Reduce(v8::internal::compiler::Node*) ()
#2 0x00000000019c5c4b in v8::internal::compiler::GraphReducer::ReduceTop() ()
#3 0x00000000019c6083 in v8::internal::compiler::GraphReducer::ReduceNode(v8::internal::compiler::Node*) ()
#4 0x00000000013de501 in v8::internal::compiler::PipelineImpl::CreateGraph() ()
#5 0x00000000013df2a2 in v8::internal::compiler::PipelineCompilationJob::PrepareJobImpl() ()
#6 0x0000000001447714 in v8::internal::CompilationJob::PrepareJob() ()
#7 0x000000000144bd5d in v8::internal::(anonymous namespace)::GetOptimizedCode ()
#8 0x000000000144c0b1 in v8::internal::Compiler::CompileOptimized(v8::internal::Handlev8::internal::JSFunction, v8::internal::Compiler::ConcurrencyMode) ()
#9 0x0000000001ae6b1d in v8::internal::Runtime_CompileOptimized_Concurrent(int, v8::internal::Object**, v8::internal::Isolate*) ()
#10 0x000002fbab9840e7 in ?? ()
#11 0x0000000600000000 in ?? ()
#12 0x000002fbab984001 in ?? ()
#13 0x00007fa5c14c5f50 in ?? ()
#14 0x0000000300000000 in ?? ()
#15 0x00007fa5c14c5fb0 in ?? ()
#16 0x000002fbab990f18 in ?? ()
#17 0x00003aac65dfb7d9 in ?? ()
#18 0x0000286815802241 in ?? ()
#19 0x00003aac65dfb7d9 in ?? ()
#20 0x0000000400000000 in ?? ()
#21 0x000002fbab990e61 in ?? ()
#22 0x0000000d00000000 in ?? ()
#23 0x00007fa5c14c6040 in ?? ()
#24 0x000002fbaba3e0ae in ?? ()
#25 0x000000ca00000000 in ?? ()
#26 0x0000143ff5b2ace1 in ?? ()
#27 0x00001255fa907b51 in ?? ()
#28 0x0000049b8431f769 in ?? ()
#29 0x0000286815802241 in ?? ()
#30 0x0000000000000016 in ?? ()
#31 0x0000001e00000000 in ?? ()
#32 0x00001a105120b931 in ?? ()
#33 0x0000071858847e91 in ?? ()
#34 0x000015dd8bb8e231 in ?? ()
#35 0x00007fa5c14c6068 in ?? ()
#36 0x0000000000000002 in ?? () at /home/abuild/rpmbuild/BUILD/gcc-5.4.0/build/x86_64-redhat-linux-gnu/libstdc++-v3/include/bits/locale_facets.h:1083

@dothebart
Copy link
Contributor

Did you already install the debug package? https://download.arangodb.com/arangodb33/CentOS_6/x86_64/arangodb3-debuginfo-3.3.17-1.x86_64.rpm
Can you install it (if not yet done) and attach the output of thread apply all bt full ?

@dothebart
Copy link
Contributor

Since you run on a relative old centos, which exact cpu architecture do you run on? Can you post one line each from /proc/cpuinfo of model name and flags ?

@betwjp
Copy link

betwjp commented Oct 30, 2018

the 3.1.17and 3.1.19 both has the same crash, I has installed the debug package, the info is :
rpm -qpl arangodb3-debuginfo-3.3.19-1.x86_64.rpm
warning: arangodb3-debuginfo-3.3.19-1.x86_64.rpm: Header V4 RSA/SHA1 Signature, key ID 6e751e9b: NOKEY
/usr/lib/debug/.build-id/0f
/usr/lib/debug/.build-id/0f/e9cc1399f9b653f83229b5fda85b801b414884.debug
/usr/lib/debug/.build-id/32
/usr/lib/debug/.build-id/32/adea9b6035823e9bc649d973e6108326bcde89.debug
/usr/lib/debug/.build-id/34
/usr/lib/debug/.build-id/34/511b0034930b854039307388bf82e392e10bd7.debug
/usr/lib/debug/.build-id/6e
/usr/lib/debug/.build-id/6e/1e363f3e818edeb13863823d97e0403888cc41.debug
/usr/lib/debug/.build-id/86
/usr/lib/debug/.build-id/86/44dee47eddff613cf0f5b9e4c3abdc2c257f26.debug
/usr/lib/debug/.build-id/97
/usr/lib/debug/.build-id/97/205d2126fc347c081e0a7376dc74994b28da57.debug
/usr/lib/debug/.build-id/a7
/usr/lib/debug/.build-id/a7/7b825f354c0aa4298a5447a325df17c027cc38.debug.

@betwjp
Copy link

betwjp commented Oct 30, 2018

the cpu info is:

cpu family : 6
model : 62
model name : Intel(R) Xeon(R) CPU E5-2620 v2 @ 2.10GHz
stepping : 4
microcode : 1064
cpu MHz : 2100.093
cache size : 15360 KB
physical id : 0
siblings : 12
core id : 5
cpu cores : 6
apicid : 11
initial apicid : 11
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm ida arat xsaveopt pln pts dtherm tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms

@betwjp
Copy link

betwjp commented Oct 30, 2018

maybe the arangodb3-debuginfo-3.3.19-1.x86_64.rpm miss the other ebugging symbols below,or i should install from other rpm package ?

Reading symbols from /usr/lib64/libssl.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libssl.so.10
Reading symbols from /usr/lib64/libcrypto.so.10...(no debugging symbols found)...done.
Loaded symbols for /usr/lib64/libcrypto.so.10
Reading symbols from /lib64/libresolv.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libresolv.so.2
Reading symbols from /lib64/librt.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/librt.so.1
Reading symbols from /lib64/libdl.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Loaded symbols for /lib64/libpthread.so.0
Reading symbols from /lib64/libm.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/libgssapi_krb5.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libgssapi_krb5.so.2
Reading symbols from /lib64/libkrb5.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkrb5.so.3
Reading symbols from /lib64/libcom_err.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libcom_err.so.2
Reading symbols from /lib64/libk5crypto.so.3...(no debugging symbols found)...done.
Loaded symbols for /lib64/libk5crypto.so.3
Reading symbols from /lib64/libz.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libz.so.1
Reading symbols from /lib64/ld-linux-x86-64.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Reading symbols from /lib64/libkrb5support.so.0...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkrb5support.so.0
Reading symbols from /lib64/libkeyutils.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libkeyutils.so.1
Reading symbols from /lib64/libselinux.so.1...(no debugging symbols found)...done.
Loaded symbols for /lib64/libselinux.so.1
Reading symbols from /lib64/libnss_files.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_files.so.2
Reading symbols from /lib64/libnss_dns.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib64/libnss_dns.so.2
Core was generated by `/usr/sbin/arangod --supervisor --pid-file /var/run/arangodb.pid -c /data/progra'.
Program terminated with signal 11, Segmentation fault.
#0 0x0000000001c3361b in v8::internal::compiler::AccessInfoFactory::ComputePropertyAccessInfo(v8::internal::Handlev8::internal::Map, v8::internal::Handlev8::internal::Name, v8::internal::compiler::AccessMode, v8::internal::compiler::PropertyAccessInfo*) ()
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.209.el6_9.2.x86_64 keyutils-libs-1.4-5.el6.x86_64 krb5-libs-1.10.3-65.el6.x86_64 libcom_err-1.41.12-23.el6.x86_64 libselinux-2.0.94-7.el6.x86_64 openssl-1.0.1e-57.el6.x86_64 zlib-1.2.3-29.el6.x86_64

@dothebart
Copy link
Contributor

hm, sorry my bad regarding the trace log, it should have been level = v8=trace

You need to install the debug symbol package matching your installed program package.

Another thing worth trying would be to fix the number of spawned V8 contexts to i.e. 8:
--javascript.v8-contexts 8

@betwjp
Copy link

betwjp commented Oct 31, 2018

I add the V8 contexts :--javascript.v8-contexts 8, the same crash happeped,
but I add the other config below:
--javascript.v8-contexts 16
--javascript.v8-contexts-minimum 8
the crash has not happend.the diff is I use the v8-contexts-minimum.
but I do not know the reason.

@dothebart
Copy link
Contributor

The release in question here is long EOLed.
ArangoDB 3.7 is available for download; closing as solved.

@dothebart dothebart added the 2 Solved Resolution label Sep 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1 Analyzing 2 Solved Resolution 3 Cluster 3 RocksDB Storage engine related
Projects
None yet
Development

No branches or pull requests

5 participants