How to do health checks and get alerted when ClamD process is NOT up #53

Open
vienleidl opened this issue Jun 20, 2024 · 2 comments

@vienleidl

I'm trying to set up health checks for a ClamAV container on Azure Container Apps. I've already tested the health probes in Container Apps, and they work as expected when I kill the clamd process.

But I'm not sure whether they also catch the case where the container fails to load the databases during a ClamAV update. I expect that case to write an ERROR message to the console log, which the log search alert rule would then pick up and use to send us an alert. How do you do health checks for ClamAV containers in that situation?

@eliottwiener

I am not familiar with Azure Containers, and I can't provide authoritative answers, but I may be able to provide some useful information for you.

How did you set up the health probes for clamd? Using clamdcheck.sh?

If your goal is to determine if the database used by clamd is properly updated, or is outdated, you can check the signature database version clamd is using. If you send a VERSION command to clamd, it will return a triplet of the form clamd_version/daily_cvd_version/datetime (I believe the datetime is when the db was loaded by clamd?). For example: ClamAV 1.3.1/27312/Thu Jun 20 08:34:55 2024. This shows that clamd is using the daily.cvd release 27312. You can compare the release number returned from that with the latest release number available, which is published via DNS:

$ dig +noall +answer current.cvd.clamav.net TXT
current.cvd.clamav.net.	139	IN	TXT	"0.103.11:62:27312:1718884801:1:90:49192:335"

The latest release number is the third item in this :-delimited string: 27312.
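
The comparison described above could be sketched roughly like this in Python. This is only an illustration under some assumptions: clamd is assumed to listen on TCP port 3310 on localhost, the function names are made up for this example, and the TXT record is assumed to be fetched separately (for instance with the dig command above), since the Python standard library has no TXT lookup.

```python
import socket
from typing import Optional


def query_clamd_version(host: str = "127.0.0.1", port: int = 3310,
                        timeout: float = 5.0) -> str:
    """Send the VERSION command to clamd and return the reply line.

    Host and port are assumptions; adjust them to your clamd.conf.
    """
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(b"nVERSION\n")  # 'n' prefix: newline-delimited command
        data = b""
        while not data.endswith(b"\n"):
            chunk = conn.recv(256)
            if not chunk:
                break
            data += chunk
    return data.decode("ascii", errors="replace").strip()


def daily_version_from_clamd(reply: str) -> Optional[int]:
    """Extract the daily database version from a VERSION reply.

    Expected shape: 'ClamAV 1.3.1/27312/Thu Jun 20 08:34:55 2024'.
    Returns None when the reply has no numeric database part, which this
    sketch treats as "no database loaded" (an assumption).
    """
    parts = reply.strip().split("/")
    if len(parts) < 2 or not parts[1].isdigit():
        return None
    return int(parts[1])


def daily_version_from_txt(record: str) -> int:
    """Extract the third ':'-delimited field of the DNS TXT record."""
    fields = record.strip().strip('"').split(":")
    return int(fields[2])


def versions_behind(clamd_reply: str, txt_record: str) -> Optional[int]:
    """Return how many daily releases clamd is behind, or None if unknown."""
    local = daily_version_from_clamd(clamd_reply)
    if local is None:
        return None
    return daily_version_from_txt(txt_record) - local
```

For the values shown above, versions_behind("ClamAV 1.3.1/27312/Thu Jun 20 08:34:55 2024", "0.103.11:62:27312:1718884801:1:90:49192:335") gives 0, meaning clamd is on the latest daily release. How far behind counts as a problem is a judgment call for your own alerting.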

There may be a more straightforward way to check this.

You can also have freshclam notify you of problems through its OnOutdatedExecute and OnErrorExecute hooks, but on their own these could be somewhat misleading compared with checking clamd's state directly.

@vienleidl

vienleidl commented Jun 21, 2024

@eliottwiener thanks for your comment! I ran into Cisco-Talos/clamav#1282 a few days ago and have been looking for a way to check the clamd process. If I run clamdcheck.sh manually, it prints "Clamd is up", and I can also see a health check running every 30 seconds, as shown below:

"Health": {
                "Status": "healthy",
                "FailingStreak": 0,
                "Log": [
                    {
                        "Start": "2024-06-21T03:16:03.751841901Z",
                        "End": "2024-06-21T03:16:03.769642788Z",
                        "ExitCode": 0,
                        "Output": "Clamd is up\n"
                    },
                    {
                        "Start": "2024-06-21T03:16:33.755130833Z",
                        "End": "2024-06-21T03:16:33.777544184Z",
                        "ExitCode": 0,
                        "Output": "Clamd is up\n"
                    },
                    {
                        "Start": "2024-06-21T03:17:03.767222816Z",
                        "End": "2024-06-21T03:17:03.784973771Z",
                        "ExitCode": 0,
                        "Output": "Clamd is up\n"
                    },
                    {
                        "Start": "2024-06-21T03:17:33.782737752Z",
                        "End": "2024-06-21T03:17:33.799253933Z",
                        "ExitCode": 0,
                        "Output": "Clamd is up\n"
                    },
                    {
                        "Start": "2024-06-21T03:18:03.798895624Z",
                        "End": "2024-06-21T03:18:03.818973114Z",
                        "ExitCode": 0,
                        "Output": "Clamd is up\n"
                    }
                ]
            }

But I'd like to have that message in the console log, so that I can create an alert rule based on that information. Unfortunately, that health check is not run automatically.
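
One possible way to get such a message into the console log is a small loop that pings clamd and prints the result to stdout. The sketch below is only an idea, not something I have running: it assumes clamd listens on TCP port 3310 on localhost, and the function names and the INFO/ERROR line formats are invented for this example.

```python
import socket
import time
from typing import Optional


def clamd_ping(host: str = "127.0.0.1", port: int = 3310,
               timeout: float = 5.0) -> bool:
    """Return True if clamd answers the PING command with PONG.

    Connection errors and timeouts both count as "not up".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"nPING\n")  # 'n' prefix: newline-delimited command
            reply = conn.recv(64)
    except OSError:
        return False
    return reply.strip() == b"PONG"


def health_line(is_up: bool) -> str:
    """Build the console log line; the wording is a made-up convention."""
    return "INFO: Clamd is up" if is_up else "ERROR: Clamd is not responding"


def run_checks(interval: float = 30.0, count: Optional[int] = None) -> None:
    """Print one health line per interval; loop forever when count is None."""
    done = 0
    while count is None or done < count:
        # flush=True so the line reaches the container's console log promptly
        print(health_line(clamd_ping()), flush=True)
        done += 1
        if count is None or done < count:
            time.sleep(interval)
```

Calling run_checks() from a process started next to clamd in the container would print one line every 30 seconds, and a log search alert rule could then match on the ERROR line. Unlike a plain TCP connect, this waits for an actual PONG, so it should also fail when clamd accepts the connection but no longer answers.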

I have set up health probes in Azure Container Apps for the ClamAV containers, and I'm still trying to reproduce the case where the clamd process becomes unresponsive while the ClamAV container keeps running. I hope the health probes will then write the failure to the console log, so that an alert fires as expected.

probes: [
          {
            type: 'Liveness'
            failureThreshold: 3
            initialDelaySeconds: 5
            periodSeconds: 20
            successThreshold: 1
            tcpSocket: {
              port: 3310
            }
            timeoutSeconds: 1
          }
          {
            type: 'Readiness'
            failureThreshold: 3
            initialDelaySeconds: 5
            periodSeconds: 10
            successThreshold: 1
            tcpSocket: {
              port: 3310
            }
            timeoutSeconds: 1
          }
          {
            type: 'Startup'
            failureThreshold: 3
            initialDelaySeconds: 20
            periodSeconds: 10
            successThreshold: 1
            tcpSocket: {
              port: 3310
            }
            timeoutSeconds: 1
          }
        ]
