"epoll_ctl failed: Bad file descriptor" when testing keydb cluster with multi-threading #125
Comments
@hengku Are you running v5.2? I could have sworn I fixed an issue like this already.
Hi @JohnSully, I retested with the latest source code from the RELEASE_5 branch (KeyDB 5.3). It crashed when I ran memtier_benchmark with AOF enabled (appendonly yes), and it worked well with AOF disabled. Below is the error message from when keydb-server crashed.
7439:M 12 Jan 2020 23:40:24.612 # Cluster state changed: ok
=== KEYDB BUG REPORT START: Cut & paste starting from here ===
------ STACK TRACE ------
Backtrace:
This should be fixed with the following change: aa6409f. That change was made 26 days ago and your last attempt was 25 days ago, but I don't remember when the change was pushed to GitHub, so it's hard to say whether your test actually ran this code. Could you give it one more try?
I added fix 16a019d, which handles the error case without crashing. Getting into the error case should be near impossible after the earlier change, but this makes certain it's non-fatal.
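For readers unfamiliar with this failure mode, here is a minimal sketch of what "non-fatal" handling of an `epoll_ctl` error can look like. It is not the code from 16a019d; `try_epoll_del` is a hypothetical helper name, and the sketch only assumes standard Linux `epoll_ctl` behavior, where operating on an already closed descriptor fails with `EBADF`.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper (not KeyDB's actual code): try to remove fd from the
 * epoll set and report a failure instead of aborting the process.
 * Returns 0 on success, -1 on failure with errno preserved for the caller. */
static int try_epoll_del(int epfd, int fd) {
    struct epoll_event ev = {0}; /* ignored for EPOLL_CTL_DEL; non-NULL for old kernels */
    if (epoll_ctl(epfd, EPOLL_CTL_DEL, fd, &ev) == -1) {
        int saved = errno; /* fprintf may clobber errno */
        fprintf(stderr, "epoll_ctl failed: %s\n", strerror(saved));
        errno = saved;
        return -1;
    }
    return 0;
}
```

A caller can then decide what to do with the error, for example treating `EBADF` as "the descriptor is already gone" and continuing with cleanup rather than asserting.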
I tested a KeyDB cluster consisting of 4 KeyDB servers (compiled under Ubuntu 18.04) running on 4 machines (56 cores, 384GB RAM). Each KeyDB server ran 4 threads bound to cores. Both RDB and AOF were enabled on each server. I observed "epoll_ctl failed: Bad file descriptor" messages many times from each server when I ran the following test:
memtier_benchmark -s -p -d 256 -t 50 -c 1 --key-pattern=P:P -n 50000000 --key-minimum=1 --key-maximum=2500000000 --ratio 1:0 --cluster-mode --hide-histogram
However, I didn't see those messages when I ran the same test against a Redis cluster in the same environment.
Here is the sample output:
Starting automatic rewriting of AOF on 101% growth
Background append only file rewriting started by pid 39718
AOF rewrite child asks to stop sending diffs.
Parent agreed to stop sending diffs. Finalizing AOF...
Concatenating 62.81 MB of AOF diff received from parent.
SYNC append only file rewrite performed
AOF rewrite: 337 MB of memory used by copy-on-write
Background AOF rewrite terminated with success
Residual parent diff successfully flushed to the rewritten AOF (29.51 MB)
Background AOF rewrite finished successfully
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor
epoll_ctl failed: Bad file descriptor