Commit 53f97a3

Ripwords and claude committed
feat(intake): virus-scan user attachments via ClamAV sidecar

Integrates ClamAV (the only genuinely free, self-hostable AV that doesn't leak attachments to a third party) for intake-time scanning of user-supplied files. Off by default: set INTAKE_USER_FILE_SCAN_ENABLED=true and run the new clamav sidecar in docker-compose to enable.

- new lib: server/lib/clamav.ts wraps the clamscan npm package, lazily initialises against the configured host:port, fails-closed on scan errors (resets the cached client so the next call re-inits)
- env knobs: INTAKE_USER_FILE_SCAN_ENABLED, CLAMAV_HOST, CLAMAV_PORT, CLAMAV_TIMEOUT_MS
- intake/reports.ts: scans each user-file sequentially after the size/mime/ext guards; 422 on signature hit, 503 on scanner outage (fail-closed surfaces the operational issue rather than silently letting unscanned bytes through)
- attachment.get.ts: kind=user-file responses now use Content-Disposition: attachment instead of inline, defense-in-depth so a file that somehow slips past the denylist + scan still cannot render in-browser
- docker-compose.dev.yml: adds clamav/clamav:stable sidecar, persistent volume for the signature DB, healthcheck via clamdscan --ping

Tests: 5 unit tests in clamav.test.ts cover enabled+clean, enabled+infected, disabled bypass, fail-closed throw, and the empty-viruses-array fallback.

Also fixes a pre-existing latent issue in intake-attachments.test.ts: beforeAll now truncates the domain before seeding so re-runs against a non-truncated DB don't collide on the admin email's unique constraint.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

1 parent 4452729 commit 53f97a3

9 files changed

Lines changed: 232 additions & 2 deletions

apps/dashboard/docker/docker-compose.dev.yml

Lines changed: 19 additions & 0 deletions
@@ -24,5 +24,24 @@ services:
       timeout: 5s
       retries: 10

+  # Optional virus scanner for user-supplied attachments. The dashboard only
+  # talks to it when INTAKE_USER_FILE_SCAN_ENABLED=true (off by default), so
+  # operators who don't want this can leave it stopped. First boot pulls the
+  # signature DB (~500 MB); freshclam updates it daily afterwards.
+  clamav:
+    image: clamav/clamav:stable
+    ports:
+      - "127.0.0.1:3310:3310"
+    volumes:
+      - repro_clamav_db:/var/lib/clamav
+    healthcheck:
+      test: ["CMD", "clamdscan", "--ping", "1"]
+      interval: 30s
+      timeout: 10s
+      retries: 5
+      start_period: 120s
+    restart: unless-stopped
+
 volumes:
   repro_data:
+  repro_clamav_db:

apps/dashboard/package.json

Lines changed: 2 additions & 0 deletions
@@ -24,6 +24,7 @@
     "@tailwindcss/vite": "^4.2.1",
     "@vueuse/nuxt": "14.2.1",
     "better-auth": "^1.5.6",
+    "clamscan": "^2.4.0",
     "dompurify": "^3.4.1",
     "drizzle-orm": "^0.45.2",
     "h3": "^1.15.11",
@@ -46,6 +47,7 @@
     "@iconify-json/simple-icons": "^1.2.79",
     "@nuxt/test-utils": "^4.0.2",
     "@types/bun": "^1.3.12",
+    "@types/clamscan": "^2.4.1",
     "@types/dompurify": "^3.2.0",
     "@types/node": "^25.6.0",
     "@types/nodemailer": "^8.0.0",

apps/dashboard/server/api/intake/reports.ts

Lines changed: 29 additions & 0 deletions
@@ -13,6 +13,7 @@ import { enqueueSync } from "../../lib/enqueue-sync"
 import { env } from "../../lib/env"
 import { getAnonKeyLimiter, getIpLimiter, getKeyLimiter } from "../../lib/rate-limit"
 import { getStorage } from "../../lib/storage"
+import { scanBytes } from "../../lib/clamav"
 import { sanitizeFilename } from "../../lib/sanitize-filename"
 import { rollbackPuts } from "../../lib/storage/rollback"

@@ -355,6 +356,34 @@ export default defineEventHandler(async (event) => {
     throw createError({ statusCode: 413, statusMessage: "Attachments exceed total budget" })
   }

+  // Virus-scan each user-file before persisting. Runs sequentially because
+  // ClamAV's clamd is single-threaded per connection and scanning in
+  // parallel against a shared socket can interleave verdicts. With a 5-file
+  // cap and warm signature cache, sequential scanning is < 2s in practice.
+  // Skipped when INTAKE_USER_FILE_SCAN_ENABLED=false (the default).
+  if (env.INTAKE_USER_FILE_SCAN_ENABLED) {
+    for (const { part } of userParts) {
+      let scan: Awaited<ReturnType<typeof scanBytes>>
+      try {
+        scan = await scanBytes(new Uint8Array(part.data))
+      } catch (err) {
+        // Fail-closed: if the scanner is enabled but unreachable, surface a
+        // 503 rather than letting unscanned bytes through.
+        console.error("[intake] virus scanner unavailable", err)
+        throw createError({
+          statusCode: 503,
+          statusMessage: "Attachment scanner unavailable, please retry",
+        })
+      }
+      if (!scan.clean) {
+        throw createError({
+          statusCode: 422,
+          statusMessage: `Attachment rejected by virus scanner: ${part.filename ?? "unnamed"} (${scan.reason ?? "infected"})`,
+        })
+      }
+    }
+  }
+
   const storage = await getStorage()
   const writtenKeys: string[] = []
   try {
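
The fail-closed contract in this hunk can be exercised without a running clamd. The sketch below is a simplified restatement, not the handler itself: `checkParts`, `Part`, `Verdict`, and the injected `scan` function are hypothetical names, and it returns a verdict object where the real handler throws via `createError`.

```typescript
// Hypothetical, simplified restatement of the intake scan loop.
interface ScanResult {
  clean: boolean
  reason?: string
}

interface Part {
  filename?: string
  data: Uint8Array
}

type Verdict =
  | { ok: true }
  | { ok: false; status: 422 | 503; message: string }

// `scan` is injected so the loop can run against a fake scanner.
async function checkParts(
  parts: Part[],
  scan: (bytes: Uint8Array) => Promise<ScanResult>,
): Promise<Verdict> {
  // One scan at a time; stop at the first problem.
  for (const part of parts) {
    let result: ScanResult
    try {
      result = await scan(part.data)
    } catch {
      // Fail closed: a scanner outage rejects the upload.
      return { ok: false, status: 503, message: "Attachment scanner unavailable, please retry" }
    }
    if (!result.clean) {
      const name = part.filename ?? "unnamed"
      const reason = result.reason ?? "infected"
      return { ok: false, status: 422, message: `Attachment rejected by virus scanner: ${name} (${reason})` }
    }
  }
  return { ok: true }
}
```

Injecting the scanner keeps the 422 and 503 paths testable in-process, which the integration suite cannot do against a separately running server.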

apps/dashboard/server/api/projects/[id]/reports/[reportId]/attachment.get.ts

Lines changed: 6 additions & 1 deletion
@@ -102,7 +102,12 @@ export default defineEventHandler(async (event) => {
     // against header injection. sanitizeFilename already strips control
     // bytes at intake time, but never trust two layers down.
     const safeName = row.filename.replace(/[\r\n"]/g, "")
-    setHeader(event, "Content-Disposition", `inline; filename="${safeName}"`)
+    // user-file kinds force `attachment` so a malicious file that somehow
+    // slipped past the mime/ext denylist + virus scan still cannot render
+    // inline in the browser. Other kinds (screenshot/replay/logs) are
+    // already rendered via known-safe content types and stay inline.
+    const disposition = row.kind === "user-file" ? "attachment" : "inline"
+    setHeader(event, "Content-Disposition", `${disposition}; filename="${safeName}"`)
   }
   setResponseStatus(event, 200)
   return Buffer.from(bytes)
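
The header logic in this hunk reduces to a small pure function. A minimal sketch follows, assuming the four attachment kinds named in the comment above; `contentDisposition` and `AttachmentKind` are hypothetical names, not part of the commit.

```typescript
// Hypothetical helper; kind names are taken from the diff comment.
type AttachmentKind = "user-file" | "screenshot" | "replay" | "logs"

function contentDisposition(kind: AttachmentKind, filename: string): string {
  // Strip CR, LF, and double quotes so the name cannot break out of the
  // quoted string or inject an extra header line.
  const safeName = filename.replace(/[\r\n"]/g, "")
  // Only user-supplied files are forced to download.
  const disposition = kind === "user-file" ? "attachment" : "inline"
  return `${disposition}; filename="${safeName}"`
}

console.log(contentDisposition("user-file", 'report".html'))
// prints: attachment; filename="report.html"
console.log(contentDisposition("screenshot", "shot.png"))
// prints: inline; filename="shot.png"
```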

apps/dashboard/server/lib/clamav.test.ts

Lines changed: 62 additions & 0 deletions
@@ -0,0 +1,62 @@
+import { afterEach, describe, expect, test } from "bun:test"
+import { _reloadEnvForTesting } from "./env"
+import { _setClientForTesting, scanBytes } from "./clamav"
+
+function fakeClient(impl: { isInfected: boolean | null; viruses?: string[]; throws?: Error }) {
+  return {
+    scanStream: async () => {
+      if (impl.throws) throw impl.throws
+      return { isInfected: impl.isInfected, viruses: impl.viruses ?? [] }
+    },
+  }
+}
+
+describe("scanBytes", () => {
+  afterEach(() => {
+    _setClientForTesting(null)
+    delete process.env.INTAKE_USER_FILE_SCAN_ENABLED
+    _reloadEnvForTesting()
+  })
+
+  test("returns clean immediately when scanning is disabled", async () => {
+    delete process.env.INTAKE_USER_FILE_SCAN_ENABLED
+    _reloadEnvForTesting()
+    // Inject a client that would mark anything infected — proves we skipped it.
+    _setClientForTesting(fakeClient({ isInfected: true, viruses: ["FAIL"] }))
+    const result = await scanBytes(new Uint8Array([1, 2, 3]))
+    expect(result).toEqual({ clean: true })
+  })
+
+  test("returns clean when scanner reports no infection", async () => {
+    process.env.INTAKE_USER_FILE_SCAN_ENABLED = "true"
+    _reloadEnvForTesting()
+    _setClientForTesting(fakeClient({ isInfected: false }))
+    const result = await scanBytes(new Uint8Array([1, 2, 3]))
+    expect(result).toEqual({ clean: true })
+  })
+
+  test("returns clean=false with the first virus name when scanner finds one", async () => {
+    process.env.INTAKE_USER_FILE_SCAN_ENABLED = "true"
+    _reloadEnvForTesting()
+    _setClientForTesting(
+      fakeClient({ isInfected: true, viruses: ["Eicar-Test-Signature", "Other"] }),
+    )
+    const result = await scanBytes(new Uint8Array([1, 2, 3]))
+    expect(result).toEqual({ clean: false, reason: "Eicar-Test-Signature" })
+  })
+
+  test("falls back to a generic reason when viruses array is empty", async () => {
+    process.env.INTAKE_USER_FILE_SCAN_ENABLED = "true"
+    _reloadEnvForTesting()
+    _setClientForTesting(fakeClient({ isInfected: true, viruses: [] }))
+    const result = await scanBytes(new Uint8Array([1, 2, 3]))
+    expect(result).toEqual({ clean: false, reason: "infected" })
+  })
+
+  test("throws (fail-closed) when scanner errors", async () => {
+    process.env.INTAKE_USER_FILE_SCAN_ENABLED = "true"
+    _reloadEnvForTesting()
+    _setClientForTesting(fakeClient({ isInfected: null, throws: new Error("ECONNREFUSED") }))
+    await expect(scanBytes(new Uint8Array([1, 2, 3]))).rejects.toThrow(/scan failed/)
+  })
+})

apps/dashboard/server/lib/clamav.ts

Lines changed: 85 additions & 0 deletions
@@ -0,0 +1,85 @@
+import { Readable } from "node:stream"
+import NodeClam from "clamscan"
+import { env } from "./env"
+
+interface ScanClient {
+  scanStream: (stream: Readable) => Promise<{ isInfected: boolean | null; viruses: string[] }>
+}
+
+export interface ScanResult {
+  clean: boolean
+  reason?: string
+}
+
+let _client: ScanClient | null = null
+let _initInProgress: Promise<ScanClient> | null = null
+
+async function buildClient(): Promise<ScanClient> {
+  const clam = await new NodeClam().init({
+    clamdscan: {
+      host: env.CLAMAV_HOST,
+      port: env.CLAMAV_PORT,
+      timeout: env.CLAMAV_TIMEOUT_MS,
+      // Sidecar-only — no fallback to a local CLI binary on the dashboard host.
+      localFallback: false,
+      bypassTest: false,
+    },
+    preference: "clamdscan",
+    removeInfected: false,
+    debugMode: false,
+  })
+  return clam as unknown as ScanClient
+}
+
+async function getClient(): Promise<ScanClient> {
+  if (_client) return _client
+  if (_initInProgress) return _initInProgress
+  _initInProgress = buildClient()
+    .then((c) => {
+      _client = c
+      return c
+    })
+    .finally(() => {
+      _initInProgress = null
+    })
+  return _initInProgress
+}
+
+/**
+ * Scan a buffer of bytes against the configured ClamAV sidecar. Returns
+ * { clean: true } when scanning is disabled OR the file passed; throws when
+ * the scanner is enabled but unreachable (fail-closed). Returns
+ * { clean: false, reason } only on a confirmed signature hit.
+ */
+export async function scanBytes(bytes: Uint8Array): Promise<ScanResult> {
+  if (!env.INTAKE_USER_FILE_SCAN_ENABLED) return { clean: true }
+  let client: ScanClient
+  try {
+    client = await getClient()
+  } catch (err) {
+    throw new Error(
+      `[clamav] init failed against ${env.CLAMAV_HOST}:${env.CLAMAV_PORT}: ${(err as Error).message}`,
+      { cause: err },
+    )
+  }
+  const stream = Readable.from(Buffer.from(bytes))
+  try {
+    const { isInfected, viruses } = await client.scanStream(stream)
+    if (isInfected) return { clean: false, reason: viruses?.[0] ?? "infected" }
+    return { clean: true }
+  } catch (err) {
+    // Reset the cached client so the next call re-inits — handles transient
+    // socket drops cleanly when clamd restarts.
+    _client = null
+    throw new Error(`[clamav] scan failed: ${(err as Error).message}`, { cause: err })
+  }
+}
+
+/**
+ * Test seam: lets unit tests inject a fake client without going through the
+ * NodeClam init path (which would require a real clamd socket).
+ */
+export function _setClientForTesting(client: ScanClient | null): void {
+  _client = client
+  _initInProgress = null
+}
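
The caching in `getClient` is a single-flight lazy initialiser: concurrent callers share one in-progress init, and a failed init is never cached, so the next call retries. A minimal generic sketch of that pattern follows; `lazySingleFlight` is a hypothetical helper written for illustration, not something the commit adds.

```typescript
// Hypothetical generic version of the getClient() caching pattern.
function lazySingleFlight<T>(build: () => Promise<T>) {
  let cached: { value: T } | null = null
  let inFlight: Promise<T> | null = null

  return {
    get(): Promise<T> {
      if (cached) return Promise.resolve(cached.value)
      // Concurrent callers share the same pending promise.
      if (inFlight) return inFlight
      inFlight = build()
        .then((value) => {
          cached = { value }
          return value
        })
        .finally(() => {
          // Cleared on success and on failure, so a failed init can be retried.
          inFlight = null
        })
      return inFlight
    },
    // Drop the cached value (e.g. after a scan error) so the next get() rebuilds.
    reset(): void {
      cached = null
    },
  }
}

let builds = 0
const lazyClient = lazySingleFlight(async () => {
  builds += 1
  return { id: builds }
})

const first = lazyClient.get()
const second = lazyClient.get()
console.log(first === second, builds)
// prints: true 1
```

`reset()` plays the role of the `_client = null` assignment in the scan error path above.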

apps/dashboard/server/lib/env.ts

Lines changed: 10 additions & 0 deletions
@@ -81,6 +81,16 @@ const Schema = z.object({
   INTAKE_USER_FILE_MAX_BYTES: intString(10 * 1024 * 1024),
   INTAKE_USER_FILES_TOTAL_MAX_BYTES: intString(25 * 1024 * 1024),
   INTAKE_USER_FILES_MAX_COUNT: intString(5),
+
+  // Virus scanning for user-supplied attachments. When ENABLED is false
+  // (the default) the scan path is skipped entirely so self-hosters who
+  // don't run a ClamAV sidecar aren't impacted. When true, intake fails
+  // closed: a scanner outage rejects uploads rather than silently letting
+  // unscanned files through.
+  INTAKE_USER_FILE_SCAN_ENABLED: boolString.default(false),
+  CLAMAV_HOST: z.string().default("localhost"),
+  CLAMAV_PORT: intString(3310),
+  CLAMAV_TIMEOUT_MS: intString(30_000),
   INTAKE_REQUIRE_DWELL: boolString.default(true),
   INTAKE_MIN_DWELL_MS: intString(1_500),
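
The four knobs and their defaults can be illustrated without zod. In the sketch below, `parseBool`, `parseIntStrict`, and `loadScanConfig` are hypothetical stand-ins for the repo's `boolString` and `intString` helpers, whose definitions are not part of this diff. The accepted boolean spellings in particular are an assumption.

```typescript
// Hypothetical stand-ins for boolString / intString; the real helpers
// are not shown in this commit.
type EnvSource = Record<string, string | undefined>

function parseBool(raw: string | undefined, fallback: boolean): boolean {
  const v = raw?.trim().toLowerCase()
  if (v === undefined || v === "") return fallback
  if (v === "true" || v === "1") return true
  if (v === "false" || v === "0") return false
  throw new Error(`expected a boolean string, got "${raw}"`)
}

function parseIntStrict(raw: string | undefined, fallback: number): number {
  const v = raw?.trim()
  if (v === undefined || v === "") return fallback
  const n = Number(v)
  if (!Number.isInteger(n)) throw new Error(`expected an integer, got "${raw}"`)
  return n
}

// Defaults mirror the schema: scanning off, localhost:3310, 30 s timeout.
function loadScanConfig(source: EnvSource) {
  return {
    scanEnabled: parseBool(source["INTAKE_USER_FILE_SCAN_ENABLED"], false),
    host: source["CLAMAV_HOST"] ?? "localhost",
    port: parseIntStrict(source["CLAMAV_PORT"], 3310),
    timeoutMs: parseIntStrict(source["CLAMAV_TIMEOUT_MS"], 30_000),
  }
}

console.log(loadScanConfig({}))
console.log(loadScanConfig({ INTAKE_USER_FILE_SCAN_ENABLED: "true", CLAMAV_HOST: "clamav" }))
```

With the compose file above, `CLAMAV_HOST` would stay `localhost` when the dashboard runs on the host, since the sidecar publishes port 3310 on 127.0.0.1.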

apps/dashboard/tests/api/intake-attachments.test.ts

Lines changed: 10 additions & 1 deletion
@@ -3,7 +3,7 @@ import { afterEach, beforeAll, describe, expect, setDefaultTimeout, test } from
 import { eq } from "drizzle-orm"
 import { db } from "../../server/db"
 import { reportAttachments } from "../../server/db/schema"
-import { createUser, seedProject, truncateReports } from "../helpers"
+import { createUser, seedProject, truncateDomain, truncateReports } from "../helpers"

 await setup({ server: true, port: 3000, host: "localhost" })
 setDefaultTimeout(15000)
@@ -57,6 +57,9 @@ async function postReportWithFiles(

 describe("POST /api/intake/reports — user attachments", () => {
   beforeAll(async () => {
+    // Hard-reset users/projects so re-runs against a non-truncated DB don't
+    // collide on the admin email's unique constraint.
+    await truncateDomain()
     const admin = await createUser("attch-admin@example.com", "admin")
     await seedProject({
       name: "Attachment Test Project",
@@ -136,6 +139,12 @@ describe("POST /api/intake/reports — user attachments", () => {
     expect(userFile?.storageKey.endsWith("/user/0-etcpasswd")).toBe(true)
   })

+  // Note: virus-scan behavior (clean / infected / fail-closed / disabled) is
+  // covered by the unit-tests in `apps/dashboard/server/lib/clamav.test.ts`.
+  // It cannot be driven from this integration suite because the helper hits
+  // a separately-running `bun run dev` server: env mutations and
+  // _setClientForTesting() calls in the test process don't reach it.
+
   test("intake without attachment[N] parts behaves identically to today (regression guard)", async () => {
     const { res, reportId } = await postReportWithFiles([])
     expect(res.status).toBe(201)

bun.lock

Lines changed: 9 additions & 0 deletions
Generated file; diff not rendered.
