unkillable connections to ftp port 21 #1384
Comments
Thanks for the detailed bug report! For these scanner connections, can you tell if they are only sending the |
One way to tell about the above question is to use |
Sure -- I need to add |
Thanks! For that normal user session, did the client eventually send the |
For the normal user session w/ PID |
…gnals-issue1384 Issue #1384: I reviewed the codebase for cases where we might be blocking signals, then forgetting to unblock them in some error conditions; such cases _might_ cause this reported behavior.
…ue1384 Issue #1384: I reviewed the codebase for cases where some potentially long-running `while` loops might be lacking appropriate signal handling. I didn't find many, and none immediately stood out to me as "smoking guns" for the reported behavior. Still, I think it's worth addressing them.
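The first commit above concerns signals that are blocked and then never unblocked on an error path. A minimal sketch of the defensive pattern, using only POSIX `sigprocmask(2)`, is shown here. The helper names (`run_with_alarm_blocked`, `alarm_is_blocked`, `always_fail`, `report_blocked`) are hypothetical and are not taken from the ProFTPD source; the point is only that the saved mask is restored whether or not the work succeeds.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <signal.h>
#include <stddef.h>

/* Hypothetical callback type for the work done while SIGALRM is blocked. */
typedef int (*work_fn)(void *arg);

/* Returns 1 if SIGALRM is currently blocked, 0 if not, -1 on error. */
int alarm_is_blocked(void) {
  sigset_t cur;

  sigemptyset(&cur);

  /* A NULL `set` argument only queries the current mask. */
  if (sigprocmask(SIG_BLOCK, NULL, &cur) < 0) {
    return -1;
  }

  return sigismember(&cur, SIGALRM);
}

/* Runs `work` with SIGALRM blocked, and restores the previous mask on
 * every path, including the one where `work` reports failure.
 */
int run_with_alarm_blocked(work_fn work, void *arg) {
  sigset_t block, saved;
  int res;

  assert(work != NULL);

  sigemptyset(&block);
  sigaddset(&block, SIGALRM);

  if (sigprocmask(SIG_BLOCK, &block, &saved) < 0) {
    return -1;
  }

  res = work(arg);

  /* No early return above this point, so the mask is always restored. */
  if (sigprocmask(SIG_SETMASK, &saved, NULL) < 0) {
    return -1;
  }

  return res;
}

/* Sample work function that simulates an error condition. */
int always_fail(void *arg) {
  (void) arg;
  return -1;
}

/* Sample work function that records whether SIGALRM is blocked. */
int report_blocked(void *arg) {
  *(int *) arg = alarm_is_blocked();
  return 0;
}
```

Saving the old mask and restoring it with `SIG_SETMASK`, rather than unconditionally unblocking, also keeps the helper safe to call when the caller had already blocked the signal.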
OK. I'm poring over the code base, to see what else might be happening here. In the meantime, the next time you see one of these stuck/unkillable processes, can you try doing a |
We are also seeing this issue, though only from malicious connections so far. An
Our config:
The OS is Centos 7.9, kernel 3.10.0-1160.53.1.el7.x86_64. |
@jpfinger Thanks for that info, that helps! The next time you can, can you run |
Unfortunately I had to kill the existing stuck sessions we had and we haven't seen any more yet. This is still on my radar and will get an update when we see it again. |
Ok, we have one currently stuck at ~20 hours. Someone knows we're talking about them! I'll hold off on killing this one for a bit.
|
Thanks! From that trace output, we do see that the |
In fact we do in mod_ctrls:
|
OK, great. I'll work through the … In the meantime, for the next time you see such unkillable processes, in addition to the
and then search the generated |
Hmm; it turns out that there is another way that we might see these 5s … The Timer API is not entirely accurate, and in this case, it will default to checking any registered timer intervals/callbacks every 5s anyway. So that's possibly what's happening with this unkillable process. Unfortunately, that doesn't provide any clues yet about how the process got into this state. But the fact that it's "unkillable" does suggest that timers/alarms are blocked for that process somehow -- I'll be adding trace logging, for the |
@jpfinger for your stuck PID 28387, in the |
…y/tracing, and add trace logging of alarm blocking.
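To illustrate why trace logging of alarm blocking can help here, the following sketch models alarm blocking as a nesting counter and logs every change. It is an assumption-laden toy, not ProFTPD's Timer API: all `sketch_*` names are hypothetical. It shows how a single early return that skips the matching unblock leaves alarms undeliverable for the rest of the process's life, which would match a session that ignores its timeouts.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical nesting depth: alarms are deliverable only at depth 0. */
static unsigned int sketch_blocked_depth = 0;

void sketch_alarms_block(const char *who) {
  sketch_blocked_depth++;
  fprintf(stderr, "trace: alarms blocked by %s (depth now %u)\n", who,
    sketch_blocked_depth);
}

void sketch_alarms_unblock(const char *who) {
  if (sketch_blocked_depth == 0) {
    fprintf(stderr, "trace: unbalanced unblock by %s\n", who);
    return;
  }

  sketch_blocked_depth--;
  fprintf(stderr, "trace: alarms unblocked by %s (depth now %u)\n", who,
    sketch_blocked_depth);
}

int sketch_alarms_deliverable(void) {
  return sketch_blocked_depth == 0;
}

/* Deliberately buggy: the error path returns without unblocking. */
int sketch_leaky_operation(int fail) {
  sketch_alarms_block("sketch_leaky_operation");

  if (fail) {
    return -1;  /* BUG: skips the matching unblock */
  }

  sketch_alarms_unblock("sketch_leaky_operation");
  return 0;
}
```

With logging like this, a trace that ends in a "blocked" line with no later "unblocked" line points directly at the caller that leaked the block.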
@Castaglia I ran the strace against it for 30 minutes this morning and all I got were the quoted SIGALRM lines. |
no difference :/ compiled with |
@manuelm No worries. At least my PR didn't cause regressions; I'll merge it anyway. Now this means I'll need to work on reproducing this behavior locally even more. So I have some more questions; I know that you may not have answers, but as many details as you can provide will help:
We know (from other reports in this issue) that these "stuck" processes show as "authenticating" in
Each of these presents/exercises slightly different code paths, and I'll work on each one of them, locally, to see if I have any luck. Thanks in advance for all the info you've already provided! |
Issue #1384: One possible cause of the infinite loops reported in thi…
…ussed out one more use-after-free code path.
…ced that the Auth API creates `cmd_recs` whose `pool` and `tmp_pool` pointers point to the same pool, which is somewhat surprising. So let's use separate pools for those.
…ould _unmount_ its FS, rather than unregistering the FS and _assuming_ the unregistered FS is that from `mod_vroot`. There are interactions (such as failed authentications, or successful authentications where `mod_vroot` should not apply) where we may have been unregistering the system FS, leading to segfaults; see: proftpd/proftpd#1384
Issue #1384: While examining the reported stack trace closely, I noti…
@manuelm I think I was able to recreate this issue -- or something very like it! I was able to trigger a segfault, under
I'm not 100% convinced that this is the exact same issue, but it looks similar. The fix for this segfault of mine is here: Castaglia/proftpd-mod_vroot#38. It could be that some of my latest commits to ProFTPD, in conjunction with the above |
Actually the behavior is identical on my system. A successful login following an unsuccessful login runs into the known
0x0000563c45a50383 in pr_fsio_lseek (fh=0x563c4760bf48, offset=0, whence=0) at fsio.c:5340
5340 while (fs && fs->fs_next && !fs->lseek) {
(gdb) bt
#0 0x0000563c45a50383 in pr_fsio_lseek (fh=0x563c4760bf48, offset=0, whence=0) at fsio.c:5340
#1 0x0000563c45a8081b in af_setpwent (p=0x563c4760c420) at mod_auth_file.c:1003
#2 0x0000563c45a82671 in authfile_getpwnam (cmd=0x563c47607a08) at mod_auth_file.c:1172
#3 0x0000563c45a433ee in pr_module_call (m=0x563c45b86ea0 <auth_file_module>, func=0x563c45a82651 <authfile_getpwnam>, cmd=cmd@entry=0x563c47607a08)
at modules.c:59
#4 0x0000563c45a461db in dispatch_auth (cmd=cmd@entry=0x563c47607a08, match=match@entry=0x563c45b26344 "getpwnam", m=m@entry=0x7ffe00b11d50) at auth.c:425
#5 0x0000563c45a499b2 in pr_auth_getpwnam (p=p@entry=0x563c476079c0, name=name@entry=0x7ffe00b11e40 "ud_3") at auth.c:626
#6 0x0000563c45a4e36e in pr_fs_interpolate (path=path@entry=0x563c475aee3d "~/ftp.allow", buf=buf@entry=0x7ffe00b12060 "", buflen=buflen@entry=4096)
at fsio.c:2777
#7 0x0000563c45a4dcdb in pr_fs_interpolate (buflen=4096, buf=0x7ffe00b12060 "", path=0x563c475aee3d "~/ftp.allow") at fsio.c:2688
#8 pr_fs_resolve_partial (path=path@entry=0x563c475aee3d "~/ftp.allow", buf=buf@entry=0x7ffe00b170f0 "", buflen=buflen@entry=4096, op=op@entry=0)
at fsio.c:2829
#9 0x0000563c45a291e3 in dir_realpath (p=p@entry=0x563c4760e460, path=path@entry=0x563c475aee3d "~/ftp.allow") at support.c:612
#10 0x0000563c45a9e126 in filetab_open_cb (parent_pool=<optimized out>, srcinfo=0x563c475aee3d "~/ftp.allow") at mod_wrap2_file.c:263
#11 0x0000563c45a9f38e in wrap2_open_table (name=0x563c475aee38 "file") at mod_wrap2.c:190
#12 0x0000563c45a9f5d1 in wrap2_allow_access (conn=conn@entry=0x7ffe00b182b0) at mod_wrap2.c:1223
#13 0x0000563c45aa01ac in wrap2_pre_pass (cmd=0x563c475ca978) at mod_wrap2.c:1903
#14 0x0000563c45a433ee in pr_module_call (m=0x563c45b88620 <wrap2_module>, func=0x563c45a9fd24 <wrap2_pre_pass>, cmd=cmd@entry=0x563c475ca978)
at modules.c:59
#15 0x0000563c45a16e5b in _dispatch (cmd=cmd@entry=0x563c475ca978, cmd_type=cmd_type@entry=1, validate=validate@entry=0, match=0x563c475caa10 "PASS",
match@entry=0x0) at main.c:366
#16 0x0000563c45a19a6a in pr_cmd_dispatch_phase (cmd=cmd@entry=0x563c475ca978, phase=phase@entry=0, flags=flags@entry=3) at main.c:678
#17 0x0000563c45a19ca8 in pr_cmd_dispatch (cmd=0x563c475ca978) at main.c:801
#18 cmd_loop (server=<optimized out>, c=<optimized out>) at main.c:944
#19 0x0000563c45a17d87 in fork_server (fd=<optimized out>, l=<optimized out>, no_fork=<optimized out>) at main.c:1517
#20 0x0000563c45a18b1a in daemon_loop () at main.c:1766
#21 0x0000563c45a164c6 in standalone_main () at main.c:1967
#22 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at main.c:2792
With |
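Frame #0 of this backtrace sits in a `while` loop that walks a chain of FS objects until it finds one with an `lseek` callback. As a hedged illustration only: if a stale or freed pointer ever made that chain point back at itself, a loop of that shape would never terminate. The sketch below uses a hypothetical `struct fs_node` (not ProFTPD's real FS structure) and adds a step limit so that a suspected cycle is reported instead of spinning at 100% CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a layered FS object. */
struct fs_node {
  struct fs_node *next;
  int (*lseek_cb)(void);  /* NULL if this layer has no lseek handler */
};

/* Walks the chain with the same condition as the loop in the backtrace,
 * but gives up after `max_steps` hops. On giving up, sets
 * `*cycle_suspected` to 1 and returns NULL.
 */
const struct fs_node *find_lseek_handler(const struct fs_node *fs,
    size_t max_steps, int *cycle_suspected) {
  size_t steps = 0;

  assert(cycle_suspected != NULL);
  *cycle_suspected = 0;

  while (fs != NULL && fs->next != NULL && fs->lseek_cb == NULL) {
    if (++steps > max_steps) {
      *cycle_suspected = 1;
      return NULL;
    }

    fs = fs->next;
  }

  return fs;
}

/* Sample callback so that a node can advertise an lseek handler. */
int sample_lseek(void) {
  return 0;
}
```

A step limit is a diagnostic aid rather than a fix; the underlying cure is still to keep freed or unregistered objects out of the chain.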
@manuelm Hmm, OK, I'll include … I will also point out -- if you want to try the same thing -- that in my local setup, I'm configuring/compiling ProFTPD with |
Good point. I'm now testing with the latest commit. Still seeing a different backtrace:
(gdb) bt
#0 0x00007f1ba58724c0 in ?? () from target:/usr/lib64/libc.so.6
#1 0x00007f1ba582a02f in raise () from target:/usr/lib64/libc.so.6
#2 0x00007f1ba5815478 in abort () from target:/usr/lib64/libc.so.6
#3 0x00007f1ba586745d in ?? () from target:/usr/lib64/libc.so.6
#4 0x00007f1ba587b4f1 in ?? () from target:/usr/lib64/libc.so.6
#5 0x00007f1ba587cf67 in ?? () from target:/usr/lib64/libc.so.6
#6 0x00007f1ba587f9e2 in free () from target:/usr/lib64/libc.so.6
#7 0x0000560b8f0308ab in pool_release_free_block_list () at pool.c:475
#8 destroy_pool (p=0x560b90540a30) at pool.c:621
#9 0x0000560b8f031033 in destroy_pool (p=<optimized out>) at pool.c:591
#10 0x0000560b8f064ee0 in pr_fsio_close (fh=0x560b90540a78) at fsio.c:5127
#11 0x0000560b8f09604d in af_endpwent () at mod_auth_file.c:873
#12 authfile_endpwent (cmd=<optimized out>) at mod_auth_file.c:1156
[...]
(gdb) p ((pool *)0x560b90540a30)->tag
$1 = 0x560b8f13f656 "pr_fsio_open() subpool" |
Upon further investigation: It's the same backtrace. I didn't have any debug logs enabled, thus my process didn't print out the invalid pointer... and crashed a bit later. However, Castaglia/proftpd-mod_vroot#38 did not have any effect. |
The following
--- a/mod_vroot.c 2022-11-09 13:33:07.340457699 +0100
+++ b/mod_vroot.c 2022-11-09 13:32:35.770692966 +0100
@@ -598,8 +598,9 @@
MODRET vroot_post_pass_err(cmd_rec *cmd) {
if (vroot_engine == TRUE) {
+ int is_sftp = ((cmd->cmd_class & CL_SFTP) || (cmd->cmd_class & CL_SSH));
const void *hint;
-
+
/* Look for any notes/hints attached to this command which might indicate
* that it is not a real PASS command error, but rather a fake command
* dispatched for e.g. logging/handling by other modules. We pay attention
@@ -617,9 +618,9 @@
*/
#if PROFTPD_VERSION_NUMBER < 0x0001030707
- if (hint == NULL) {
+ if (!is_sftp || hint == NULL) {
#else
- if (hint != NULL) {
+ if (!is_sftp || hint != NULL) {
#endif /* ProFTPD 1.3.7b or later */
/* If not chrooted, unmount our vroot FS. */
      if (session.chroot_path == NULL) {
I have no idea how the |
I'm able to see the segfault you reported in |
…dereferences and use-after-frees, related to `mod_vroot`, for proftpd/proftpd#1384, I determined that these segfaults all occurred during the authentication process. With this change, `mod_vroot` now registers its custom FS closer to the (possible) `chroot(2)` system call in timing, and _only_ if authentication succeeds. Previously the module would use a PRE_CMD PASS handler to register the custom FS, then POST_CMD/POST_CMD_ERR PASS handlers to unregister the FS in case of issues. This, in turn, means that the custom FS is active (and used!) _during_ the authentication process, which appeared to be related to the above memory/reference issues. So I took a step back, and thought of the purpose of `mod_vroot` (to provide a custom "virtual chroot" for a session), and changed the timing of the custom FSIO registration to be closer to the point where it is actually needed/used.
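The timing change described in this commit message can be sketched as an event listener that registers the custom FS only when an authentication result reports success. Everything below is a self-contained toy under stated assumptions: the event name `sketch.auth-code`, the result codes, and the `sketch_*` functions are hypothetical and do not reproduce ProFTPD's Event API or `mod_vroot`'s actual code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_MAX_LISTENERS 8

/* Hypothetical authentication result codes. */
enum { SKETCH_AUTH_OK = 0, SKETCH_AUTH_BAD_PWD = -1 };

typedef void (*sketch_event_cb)(const void *event_data, void *user_data);

struct sketch_listener {
  const char *event;
  sketch_event_cb cb;
  void *user_data;
};

static struct sketch_listener sketch_listeners[SKETCH_MAX_LISTENERS];
static size_t sketch_nlisteners = 0;

/* Registers a callback for a named event; returns 0 on success. */
int sketch_event_register(const char *event, sketch_event_cb cb,
    void *user_data) {
  if (event == NULL || cb == NULL ||
      sketch_nlisteners == SKETCH_MAX_LISTENERS) {
    return -1;
  }

  sketch_listeners[sketch_nlisteners].event = event;
  sketch_listeners[sketch_nlisteners].cb = cb;
  sketch_listeners[sketch_nlisteners].user_data = user_data;
  sketch_nlisteners++;
  return 0;
}

/* Invokes every callback registered for the named event. */
void sketch_event_generate(const char *event, const void *event_data) {
  size_t i;

  for (i = 0; i < sketch_nlisteners; i++) {
    if (strcmp(sketch_listeners[i].event, event) == 0) {
      sketch_listeners[i].cb(event_data, sketch_listeners[i].user_data);
    }
  }
}

/* Hypothetical per-session state for the virtual chroot. */
struct sketch_vroot_state {
  int fs_registered;
};

/* Registers the custom FS only after a successful authentication, so it
 * is never active while credentials are still being checked.
 */
void sketch_on_auth_code(const void *event_data, void *user_data) {
  int code = *(const int *) event_data;
  struct sketch_vroot_state *state = user_data;

  if (code == SKETCH_AUTH_OK) {
    state->fs_registered = 1;
  }
}
```

Compared with registering in a pre-command handler and undoing it afterwards, this shape has no unregister path to get wrong when authentication fails.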
…auth.authentication-code" event, just like the core engine does.
OK, here's my next iteration at addressing this: Castaglia/proftpd-mod_vroot#39 Note that you will also need c30c4c6 present, so that |
First tests are looking good. I'll do some more and roll out the code on production servers tomorrow morning. So we can roll back in case something else breaks. |
…vent for anonymous logins as well; the `mod_vroot` integration tests caught this, with the moving of `mod_vroot`'s custom FS registration to this event.
@Castaglia looking good so far. No negative feedback from users and no looping processes. |
Excellent news @manuelm! Thanks for all of your help in this! I've published a new release of … I'll wait for another week or so, to see if there's any additional fallout from this issue. After that, I'll close this ticket. Thanks again! |
What I Did
I have been using proftpd 1.2.x in a production environment for several years and recently upgraded to the latest stable 1.3.7c, and now run proftpd inside a Docker container. Since the upgrade, security scanners like Qualys and random (malicious?) users create hanging connections stuck on status "Authenticating" that are unkillable with `ftpdctl` and that ignore the `/etc/proftpd.conf` settings "TimeoutNoTransfer", "TimeoutStalled", "TimeoutIdle". I am able to kill the connections by using the connection PID from `ftptop` and executing `kill -9 <PID>`. These hanging connections are causing 100% CPU usage.
Here's what I see in the docker logs for proftpd for one such malicious connection that was made on Jan 13 and remained in COMMAND (authenticating) for 270h!:
The attacker tries to log in with an invalid user `admin`, but this connection is never terminated until I discover it 1.5 weeks after Jan 13 and manually kill the PID.
What I Expected/Wanted
I expect that connections should be dropped after failing user/passwd auth and that the proftpd.conf settings for "TimeoutNoTransfer", "TimeoutStalled", "TimeoutIdle" should apply to all connections. I also expected that `ftpdctl kick <IP>` should work on stalled connections. Note that `ftpdctl kick` works fine on non-zombie/malicious ftp sessions.
ProFTPD Version and Configuration
Please help us reproduce the problem/issue you are encountering. To do this,
we need to know which version of ProFTPD you are using, how it was built,
etc. The following command is an easy way to get all of this information:
I am using the proftpd v1.3.7c package provided by Fedora 35.
In addition, we need to see all of the ProFTPD configuration files you
are using (minus any sensitive information like passwords, of course). Armed
with the version and configuration data, then, we can set up ProFTPD locally
using the same configuration, and see what happens.
proftpd conf files:
proftpd-bug-rpt.tar.gz
`ftptop` screenshots of zombie connections