
Docker container does not exit when clamd exits #45

Open
bnutzer opened this issue Mar 5, 2024 · 2 comments


bnutzer commented Mar 5, 2024

Hi,

after switching to clamav 1.3, we have seen clamd "disappear" for reasons yet to be researched. However, we are having a hard time troubleshooting that issue, because a dying clamd does not cause the container to exit immediately; the container just becomes unhealthy.

The container starts up to three "daemons" (freshclam, clamd, milter), and it is obviously not a trivial question which of them is central and which are not. Since we do not use the milter, I regard a running clamav container with a working clamd as "valid" even if the freshclam process has gone away.

Because of this, I'd prefer something along these lines (note that this patch misses the case where no daemons are started at all):

--- clamav/1.3/alpine/scripts/docker-entrypoint.sh
+++ clamav/1.3/alpine/scripts/docker-entrypoint.sh
@@ -61,6 +61,7 @@ else
                        unlink "/tmp/clamd.sock"
                fi
                clamd --foreground &
+               clamdpid=$!
                while [ ! -S "/run/clamav/clamd.sock" ] && [ ! -S "/tmp/clamd.sock" ]; do
                        if [ "${_timeout:=0}" -gt "${CLAMD_STARTUP_TIMEOUT:=1800}" ]; then
                                echo
@@ -80,7 +81,7 @@ else
        fi

        # Wait forever (or until canceled)
-       exec tail -f "/dev/null"
+       wait $clamdpid
 fi

 exit 0

(Unfortunately, the "wait -n" bashism appears to be broken in busybox, and thus in alpine; otherwise, collecting the (up to) three pids and using "wait -n" on all of them would probably be fine. I have filed an issue in the busybox bug tracker, but it will probably take some time until a fix reaches the alpine image.)
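As a workaround for the broken busybox "wait -n", the alpine entrypoint could simply poll the collected pids from plain POSIX sh. This is only a sketch, assuming each daemon's pid was saved via "$!" when it was started in the background; clamdpid matches the patch above, while freshclampid and milterpid are hypothetical names, and unset variables are skipped:

# Sketch only: poll the daemon pids instead of relying on busybox's broken "wait -n".
# Assumes clamdpid, freshclampid and milterpid (the latter two are made-up names)
# were saved with $! when the respective daemon was started in the background.
while true; do
        for pid in $clamdpid $freshclampid $milterpid; do
                if ! kill -0 "$pid" 2>/dev/null; then
                        echo "Daemon with pid $pid is gone, exiting container"
                        exit 1
                fi
        done
        sleep 5
done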

I'd be happy to provide a pull request on request. In that case: should the wait statement in the debian edition wait for all three pids, or should the alpine and debian versions of the script stay as close as possible? For the possible "no daemon" situation, I'd suggest using "sleep infinity" instead of the crude "tail" call.
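For the debian edition, assuming its entrypoint runs under bash (where "wait -n" does work), a sketch covering both the "wait for whichever daemon exits first" case and the "no daemon" case could look like this; the pid variable names are the same hypothetical ones as above:

# Sketch for the debian (bash) entrypoint: exit as soon as any background daemon exits.
if [ -n "${clamdpid}${freshclampid}${milterpid}" ]; then
        # the daemons are the only background children, so a plain "wait -n" suffices
        wait -n
else
        # no daemon was started at all; just block forever
        sleep infinity
fi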

@Newspaperman57

+1 on this. We've lost the clamd process a couple of times while it was reloading the database: an out-of-memory condition caused the kernel to kill it, but the container seemingly kept running. Clients trying to connect got "broken pipe" errors, and the log simply showed:

Sun May 12 11:21:14 2024 -> SelfCheck: Database status OK.
Sun May 12 11:31:15 2024 -> SelfCheck: Database status OK.
Sun May 12 11:41:17 2024 -> SelfCheck: Database status OK.
Sun May 12 11:51:18 2024 -> SelfCheck: Database status OK.
Sun May 12 12:01:19 2024 -> SelfCheck: Database status OK.
Sun May 12 12:11:20 2024 -> SelfCheck: Database status OK.
Sun May 12 12:21:22 2024 -> SelfCheck: Database status OK.
Received signal: wake up
ClamAV update process started at Sun May 12 12:23:33 2024
daily database available for update (local version: 27272, remote version: 27273)
Testing database: '/var/lib/clamav/tmp.4e593738ca/clamav-1886123323e735d67f816af8a3bdb7c7.tmp-daily.cld' ...
Database test passed.
daily.cld updated (version: 27273, sigs: 2061131, f-level: 90, builder: raynman)
main.cvd database is up-to-date (version: 62, sigs: 6647427, f-level: 90, builder: sigmgr)
bytecode.cld database is up-to-date (version: 335, sigs: 86, f-level: 90, builder: raynman)
Clamd successfully notified about the update.
Sun May 12 12:23:43 2024 -> Reading databases from /var/lib/clamav

This results in downtime and requires manual intervention to fix.

We're using the clamav-debian image on a ppc64le-based architecture.

@vienleidl


I had the same issue (Cisco-Talos/clamav#1282); I then increased RAM from 2 GiB to 4 GiB.
