arangod coredump when parallel aql running #9273

Closed
daoxian opened this issue Jun 17, 2019 · 3 comments

@daoxian
daoxian commented Jun 17, 2019

I set up a 3-node ArangoDB deployment in active failover mode. The processes look like this:

$ps afxwww|grep arango
76165 pts/1 S+ 0:00 _ grep --color=auto arango
74883 ? S 0:00 sudo nohup arangodb --starter.mode=activefailover --server.storage-engine=rocksdb --starter.data-dir=/home/admin/arangodb/data_3.4.5_activefailover --dbservers.rocksdb.block-cache-size=5000000000 --dbservers.rocksdb.enforce-block-cache-size-limit=true --dbservers.rocksdb.max-total-wal-size=838860800 --dbservers.rocksdb.max-write-buffer-number=20 --dbservers.rocksdb.min-write-buffer-number-to-merge=10 --dbservers.rocksdb.num-threads-priority-high=10 --dbservers.rocksdb.num-threads-priority-low=5 --dbservers.rocksdb.sync-interval=10000 --dbservers.rocksdb.table-block-size=163840 --dbservers.rocksdb.total-write-buffer-size=5000000000 --dbservers.rocksdb.write-buffer-size=200000000 --dbservers.server.maximal-queue-size 0 --starter.join 11.140.25.69,11.140.129.35,11.140.126.222
74903 ? Sl 0:00 _ arangodb --starter.mode=activefailover --server.storage-engine=rocksdb --starter.data-dir=/home/admin/arangodb/data_3.4.5_activefailover --dbservers.rocksdb.block-cache-size=5000000000 --dbservers.rocksdb.enforce-block-cache-size-limit=true --dbservers.rocksdb.max-total-wal-size=838860800 --dbservers.rocksdb.max-write-buffer-number=20 --dbservers.rocksdb.min-write-buffer-number-to-merge=10 --dbservers.rocksdb.num-threads-priority-high=10 --dbservers.rocksdb.num-threads-priority-low=5 --dbservers.rocksdb.sync-interval=10000 --dbservers.rocksdb.table-block-size=163840 --dbservers.rocksdb.total-write-buffer-size=5000000000 --dbservers.rocksdb.write-buffer-size=200000000 --dbservers.server.maximal-queue-size 0 --starter.join 11.140.25.69,11.140.129.35,11.140.126.222
74925 ? Sl 0:02 _ /usr/sbin/arangod -c /home/admin/arangodb/data_3.4.5_activefailover/agent8531/arangod.conf --database.directory /home/admin/arangodb/data_3.4.5_activefailover/agent8531/data --javascript.startup-directory /usr/share/arangodb3/js --javascript.app-path /home/admin/arangodb/data_3.4.5_activefailover/agent8531/apps --log.file /home/admin arangodb/data_3.4.5_activefailover/agent8531/arangod.log --log.force-direct false --javascript.copy-installation true --agency.activate true --agency.my-address tcp://11.140.25.69:8531 --agency.size 3 --agency.supervision true --foxx.queues false --server.statistics false --agency.endpoint tcp://11.140.126.222:8531 --agency.endpoint tcp://11.140.129.35:8531
75848 ? Sl 0:40 _ /usr/sbin/arangod -c /home/admin/arangodb/data_3.4.5_activefailover/resilientsingle8529/arangod.conf --database.directory /home/admin/arangodb/data_3.4.5_activefailover/resilientsingle8529/data --javascript.startup-directory /usr/share/arangodb3/js --javascript.app-path /home/admin/arangodb/data_3.4.5_activefailover/resilientsingle8529/apps --log.file /home/admin/arangodb/data_3.4.5_activefailover/resilientsingle8529/arangod.log --log.force-direct false --javascript.copy-installation true --foxx.queues true --server.statistics true --replication.automatic-failover true --cluster.my-address tcp://11.140.25.69:8529 --cluster.my-role SINGLE --cluster.agency-endpoint tcp://11.140.126.222:8531 --cluster.agency-endpoint tcp://11.140.25.69:8531 --cluster.agency-endpoint tcp://11.140.129.35:8531 --rocksdb.max-total-wal-size 838860800 --rocksdb.max-write-buffer-number 20 --rocksdb.min-write-buffer-number-to-merge 10 --rocksdb.num-threads-priority-low 5 --rocksdb.sync-interval 10000 --rocksdb.table-block-size 163840 --rocksdb.total-write-buffer-size 5000000000 --rocksdb.enforce-block-cache-size-limit true --server.maximal-queue-size 0 --rocksdb.write-buffer-size 200000000 --rocksdb.num-threads-priority-high 10 --rocksdb.block-cache-size 5000000000

After running for a while, arangod crashed and produced a core dump. The backtrace:

Core was generated by `/usr/sbin/arangod -c /home/admin/arangodb/data_3.4.5_activefailover/resilientsi'.
Program terminated with signal 11, Segmentation fault.
#0 0x000000000110cb09 in arangodb::rest::GeneralCommTask::executeRequest(std::unique_ptr<arangodb::GeneralRequest, std::default_delete<arangodb::GeneralRequest> >&&, std::unique_ptr<arangodb::GeneralResponse, std::default_delete<arangodb::GeneralResponse> >&&) ()
Missing separate debuginfos, use: debuginfo-install arangodb3-3.4.5-1.0.x86_64
(gdb) bt
#0 0x000000000110cb09 in arangodb::rest::GeneralCommTask::executeRequest(std::unique_ptr<arangodb::GeneralRequest, std::default_delete<arangodb::GeneralRequest> >&&, std::unique_ptr<arangodb::GeneralResponse, std::default_delete<arangodb::GeneralResponse> >&&) ()
#1 0x000000000106710e in arangodb::rest::HttpCommTask::processRequest(std::unique_ptr<arangodb::HttpRequest, std::default_delete<arangodb::HttpRequest> >) ()
#2 0x000000000106ee05 in arangodb::rest::HttpCommTask::processRead(double) ()
#3 0x00000000010e28bb in arangodb::rest::SocketTask::processAll() ()
#4 0x00000000010e2d85 in arangodb::rest::SocketTask::asyncReadSome() ()
#5 0x00000000008bc09e in asio::detail::completion_handler<arangodb::rest::Scheduler::post(asio::io_context::strand&, std::function<void ()>)::{lambda()#2}>::do_complete(void*, asio::detail::scheduler_operation*, std::error_code const&, unsigned long) ()
#6 0x00000000008c10a5 in asio::detail::strand_service::do_complete(void*, asio::detail::scheduler_operation*, std::error_code const&, unsigned long) ()
#7 0x00000000008c2e5b in arangodb::SchedulerThread::run() ()
#8 0x00000000011f7709 in arangodb::Thread::startThread(void*) ()
#9 0x00000000012b76e9 in ThreadStarter(void*) ()
#10 0x00000000027c5a9c in start ()
#11 0x0000000000000000 in ?? ()
@dothebart
Contributor

dothebart commented Jun 17, 2019

Hi, as the debugger stated, please install the debuginfo package so the debugger can print line numbers and other symbol details.
Hint: triple backticks make pasted output more readable; see my edits.
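
For reference, once the package named in the gdb hint (`debuginfo-install arangodb3-3.4.5-1.0.x86_64`) is installed, the backtrace can be collected again non-interactively. The following is only a sketch: the helper name `gdb_backtrace_command` and the core file path are placeholders, not something from this report.

```python
import shlex


def gdb_backtrace_command(binary, core_file):
    """Build a gdb invocation that prints a backtrace for every thread, then exits.

    `binary` must be the same arangod executable that produced the core dump.
    """
    return ["gdb", "--batch", "-ex", "thread apply all bt", binary, core_file]


if __name__ == "__main__":
    # "/path/to/core" is a placeholder; substitute the real core file location.
    cmd = gdb_backtrace_command("/usr/sbin/arangod", "/path/to/core")
    # Print the command so it can be reviewed or pasted into a shell.
    print(shlex.join(cmd))
```

Pasting the resulting output inside triple backticks keeps the frames aligned and the template arguments intact.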

@mpoeter
Member

mpoeter commented Jun 17, 2019

This seems to be a duplicate of #9266 (same backtrace and instruction pointer) where I have already posted my findings.

@dothebart
Contributor

As pointed out by @graetzer, this bug has been fixed in ArangoDB 3.4.6.1. Please upgrade your environment.
Closing as fixed.
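
A quick way to tell whether a given server is affected is to compare its version string against 3.4.6.1. The sketch below assumes purely numeric dotted versions (such as the `version` field reported by the server's `/_api/version` endpoint) and assumes that releases after 3.4.6.1 also carry the fix; the helper names are illustrative.

```python
# First release reported in this thread as containing the fix.
FIXED_IN = (3, 4, 6, 1)


def parse_version(text):
    """Turn a dotted numeric version such as '3.4.5' into a tuple of ints.

    Suffixed versions (for example release candidates) are not handled here.
    """
    return tuple(int(part) for part in text.strip().split("."))


def has_fix(version_text):
    """Return True if the version is 3.4.6.1 or later (assumed to include the fix)."""
    return parse_version(version_text) >= FIXED_IN


if __name__ == "__main__":
    for version in ("3.4.5", "3.4.6", "3.4.6.1"):
        print(version, "fixed" if has_fix(version) else "affected")
```

Tuple comparison treats `3.4.6` as older than `3.4.6.1`, which matches the order of the releases.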

@dothebart dothebart added the 2 Fixed Resolution label Jun 17, 2019