Ardb hangs after continual use for 48 hours #141
Comments
Is the server built with RocksDB as the storage engine? And which commands does your application use?
Yes, there is nothing custom about the setup, so RocksDB. I don't think we are sending it much load: mainly some scripts running against it, but it couldn't be more than 20-30 writes/second. Average write size is probably 1 KB.
What would be the config for the 2-port setup? Sounds like a fix in the meantime. Thanks.
Just add a 'listen' configuration in ardb.conf.
And can you attach gdb to the running process to get the stack trace the next time the server hangs?
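For anyone not familiar with gdb, the attach-and-dump step can be scripted. This is only a rough sketch: it assumes gdb is installed and that you have permission to attach to the process. The helper names `gdb_backtrace_command` and `capture_backtrace` are made up for illustration and are not part of ardb.

```python
import subprocess
from typing import List


def gdb_backtrace_command(pid: int) -> List[str]:
    """Build a gdb invocation that prints every thread's stack, then exits."""
    return [
        "gdb",
        "-p", str(pid),                # attach to the already running process
        "-batch",                      # run the -ex commands, then exit
        "-ex", "thread apply all bt",  # backtrace of every thread
    ]


def capture_backtrace(pid: int, timeout: float = 60.0) -> str:
    """Run gdb against `pid` and return whatever it printed to stdout."""
    result = subprocess.run(
        gdb_backtrace_command(pid),
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.stdout
```

Calling `capture_backtrace(pid)` with the server's process ID while it is hung should return the stacks of all threads, which is usually more useful than the single frame of the thread gdb happens to stop in.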
I'm experiencing the situation right now. I'm not that familiar with gdb. Here is the output when I attach to the process, but I'm not sure what to read into it:
Here is the backtrace:
Restarted, and an hour later there was a similar hang. Backtrace: #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
It's blocked in RocksDB. The backtrace suggests that RocksDB was serving too many writes at the time.
I checked top and -w is less than 2%. Then on iostat I checked %util for each drive and it was at or below 7% for several tests.
I checked RocksDB's code: a thread performing a write is blocked until the background compaction thread completes.
And is the disk an HDD? According to our tests, RocksDB performs worse than LevelDB on HDDs and much better on SSDs.
I seem to get around 24-28 hours out of our ardb before it slows to a halt. It still accepts connections but returns nothing, and eventually my client times out. This is despite not yet deploying it to production, so the query volume is very low. Database size is around 40GB.
When I go to ardb and type "info" it halts and nothing is displayed. I can resolve this by killing the server and bringing it back up again, but that is not sustainable in production.
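The symptom described above (connections are accepted but no reply ever comes back) can be detected from outside with a small probe. This is a minimal sketch, assuming the server speaks the Redis protocol on the port you pass in and does not require authentication. The function name `is_responsive` and the port 16379 in the usage note are illustrative assumptions.

```python
import socket


def is_responsive(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if the server answers PING within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"*1\r\n$4\r\nPING\r\n")  # PING in Redis protocol framing
            reply = conn.recv(64)
    except OSError:
        # Covers timeouts, refused connections, and resets.
        return False
    return reply.startswith(b"+PONG")
```

For example, a cron job could call `is_responsive("127.0.0.1", 16379)` and capture a gdb backtrace before restarting the server whenever it returns False. This only detects the hang; it does not address the cause.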
I previously saw this reported here, and that issue was closed. I am running the latest on the master branch.