This document outlines the security measures, scanning tools, and best practices implemented in the EvidenceLab AI project.
- Reporting Security Vulnerabilities
- Security Architecture
- Automated Security Scanning
- Dependency Management
- Code Security Measures
- Container Security
- API Security
- Development Security Practices
- OWASP ASVS L2 Compliance
If you discover a security vulnerability in this project, please report it responsibly:
- Do NOT create a public GitHub issue for security vulnerabilities
- Email evidencelab@astrobagel.com directly with details of the vulnerability
- Include steps to reproduce the issue
- Allow reasonable time for the issue to be addressed before public disclosure
We aim to acknowledge security reports within 48 hours and provide a fix timeline within 7 days.
The project implements multiple layers of security:
- Pre-commit Hooks: Catch issues before code enters the repository
- CI/CD Security Scans: Automated checks on every push and PR
- Dependency Monitoring: Automated alerts for vulnerable dependencies
- Container Scanning: Vulnerability checks on Docker images
- Runtime Protections: Input validation, CORS restrictions, rate limiting
- Request Size Limits: Body size enforcement at the middleware layer
- Error Sanitisation: Internal details stripped from production responses
- Structured Logging: JSON-formatted audit trail for SIEM ingestion
Security-focused pre-commit hooks run on every commit:
| Tool | Purpose | Configuration |
|---|---|---|
| Bandit | Python SAST - detects security issues like SQL injection, hardcoded passwords, unsafe deserialization | .pre-commit-config.yaml |
| Hadolint | Dockerfile linting - checks for security misconfigurations | .pre-commit-config.yaml |
| detect-secrets | Prevents accidental credential commits | .secrets.baseline |
| Gitleaks | Enhanced secret detection with comprehensive regex patterns | .pre-commit-config.yaml |
The GitHub Actions workflow includes dedicated security jobs:
| Check | Tool | Description |
|---|---|---|
| Python Dependencies | pip-audit | Scans requirements.txt for known vulnerabilities |
| JavaScript Dependencies | npm audit | Scans package.json for known vulnerabilities |
| Python SAST | Bandit | Static analysis for security issues (JSON report artifact) |
| Dockerfile Linting | Hadolint | Checks all Dockerfiles for security best practices |
| Secret Scanning | Gitleaks | Scans entire repository for exposed secrets |
| Check | Tool | Description |
|---|---|---|
| API Image | Trivy | Scans built Docker image for OS and application vulnerabilities |
| UI Image | Trivy | Scans built Docker image for OS and application vulnerabilities |
ESLint plugins provide JavaScript/TypeScript security analysis:
| Plugin | Purpose |
|---|---|
| eslint-plugin-security | Detects potential security issues (eval, object injection, etc.) |
| eslint-plugin-sonarjs | Code quality rules that catch security anti-patterns |
Dependabot (.github/dependabot.yml) automatically monitors and creates PRs for:
- Python (pip): Weekly scans of `requirements.txt`
- JavaScript (npm): Weekly scans of `ui/frontend/package.json`
- Docker: Weekly scans of base images in all Dockerfiles
- GitHub Actions: Weekly scans of action versions
- Security updates are prioritized and should be merged promptly
- Major version updates require manual review for breaking changes
- ML libraries (torch, transformers) have major version updates ignored to prevent breaking changes
The /file/{file_path} endpoint implements comprehensive path traversal protection:
- Double URL decoding to catch encoded attacks (`%2e%2e`, `%252e%252e`)
- Null byte rejection
- Path canonicalization using `Path.resolve()`
- Directory containment verification using `relative_to()`
- Explicit file extension whitelist
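The checks listed above can be sketched as a single helper. This is a minimal illustration, not the project's actual implementation: the function name `resolve_safe_path` and the contents of `ALLOWED_EXTENSIONS` are assumptions made for the example.

```python
from pathlib import Path
from typing import Optional
from urllib.parse import unquote

# Hypothetical whitelist for illustration; the real list lives in the project.
ALLOWED_EXTENSIONS = {".pdf", ".png", ".jpg", ".txt"}


def resolve_safe_path(base_dir: Path, requested: str) -> Optional[Path]:
    """Return a resolved path inside base_dir, or None if the request is rejected."""
    # Decode twice so both %2e%2e and %252e%252e collapse to "..".
    decoded = unquote(unquote(requested))
    # Reject null bytes before the string reaches any filesystem call.
    if "\x00" in decoded:
        return None
    base = base_dir.resolve()
    # Strip leading slashes so the request is always interpreted relative to base.
    candidate = (base / decoded.lstrip("/")).resolve()
    try:
        # relative_to() raises ValueError when candidate is outside base.
        candidate.relative_to(base)
    except ValueError:
        return None
    if candidate.suffix.lower() not in ALLOWED_EXTENSIONS:
        return None
    return candidate
```

Resolving before the containment check matters: comparing unresolved strings would miss `..` segments and symlinks that point outside the allowed directory.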
API endpoints validate data_source parameters against a whitelist loaded from config.json, preventing:
- Cache pollution attacks
- Unintended database connections
- Resource exhaustion
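A whitelist check of this kind might look like the sketch below. The layout of `config.json` is not shown in this document, so the top-level `"datasources"` key and both function names are assumptions for illustration only.

```python
import json
from typing import Iterable, Set


def load_allowed_sources(config_text: str) -> Set[str]:
    """Collect allowed names; assumes a top-level "datasources" object (hypothetical)."""
    config = json.loads(config_text)
    return set(config.get("datasources", {}))


def validate_data_source(value: str, allowed: Iterable[str]) -> str:
    """Return value unchanged if it is whitelisted, otherwise raise ValueError."""
    if value not in set(allowed):
        raise ValueError(f"Invalid data_source: {value!r}")
    return value
```

Rejecting unknown names before they are used as cache keys or connection identifiers is what keeps arbitrary caller input from creating new cache entries or connections.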
CORS is configured securely:
- Origins: Read from `CORS_ALLOWED_ORIGINS` environment variable; defaults to localhost for development (never `*`)
- Headers: Read from `CORS_ALLOWED_HEADERS` environment variable; defaults to `Content-Type, Authorization, X-API-Key, X-CSRF-Token, Accept, Accept-Language` (never `*`)
- Explicit HTTP method whitelist (`GET, POST, PUT, PATCH, DELETE, OPTIONS`)
- Credentials supported only for allowed origins
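As a rough sketch of the origin handling described above, the helper below parses the environment variable and refuses a wildcard. The comma-separated format, the function name, and the default port are assumptions; only the variable name and the method list come from this document.

```python
from typing import List, Mapping

# Assumed development default; the actual localhost port is not stated here.
DEFAULT_DEV_ORIGINS = ["http://localhost:3000"]
ALLOWED_METHODS = ["GET", "POST", "PUT", "PATCH", "DELETE", "OPTIONS"]


def parse_allowed_origins(env: Mapping[str, str]) -> List[str]:
    """Parse CORS_ALLOWED_ORIGINS (assumed comma-separated) and reject wildcards."""
    raw = env.get("CORS_ALLOWED_ORIGINS", "")
    origins = [item.strip() for item in raw.split(",") if item.strip()]
    if "*" in origins:
        raise ValueError("Wildcard CORS origin is not allowed")
    # Fall back to the development default when nothing is configured.
    return origins or list(DEFAULT_DEV_ORIGINS)
```

The resulting list could then be passed as `allow_origins` to FastAPI's `CORSMiddleware`, together with `allow_methods=ALLOWED_METHODS` and `allow_credentials=True`.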
Request body size is enforced at the middleware layer (ASVS V13.1.3):
- Requests with `Content-Length` exceeding the limit are rejected with HTTP 413
- Default limit: 2 MB; configurable via `MAX_REQUEST_BODY_BYTES` env var
- GET and other bodyless requests are unaffected
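The decision the middleware makes can be illustrated with a small pure function. This is a sketch under assumptions: the function name is hypothetical, 2 MB is taken as 2 × 1024 × 1024 bytes, and the 400 response for a malformed header is a guess, since this document does not say how such values are handled.

```python
from typing import Mapping, Optional

DEFAULT_MAX_BODY_BYTES = 2 * 1024 * 1024  # 2 MB, assuming binary megabytes


def body_size_status(
    headers: Mapping[str, str], limit: int = DEFAULT_MAX_BODY_BYTES
) -> Optional[int]:
    """Return an HTTP status to reject with, or None to let the request through."""
    raw = headers.get("content-length")
    if raw is None:
        # GET and other bodyless requests carry no Content-Length.
        return None
    try:
        length = int(raw)
    except ValueError:
        return 400  # assumption: malformed header treated as a bad request
    if length < 0:
        return 400
    return 413 if length > limit else None
```

The sketch only covers requests that declare a `Content-Length`, which is the case this section describes.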
Production error responses never expose internal details (ASVS V7.1.1, V7.4.1):
- `ValueError` responses return a generic message unless the error matches a known-safe prefix (e.g. `"Invalid data_source:"`)
- Unhandled exceptions return `"Internal server error"` with no stack traces, paths, or internal state
- Full details available when `API_DEBUG=true` (development only)
- Full tracebacks are always logged server-side regardless of debug mode
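One way to express these rules is a function that maps an exception to the message a client may see. The function name, logger name, and the generic `"Invalid request"` wording are assumptions; the safe prefix and the `"Internal server error"` text come from the list above.

```python
import logging

logger = logging.getLogger("api.errors")  # hypothetical logger name

# Only this prefix is named in the text above; the real list may be longer.
SAFE_PREFIXES = ("Invalid data_source:",)


def public_error_message(exc: Exception, debug: bool = False) -> str:
    """Return the message that is safe to send to the client."""
    # The full traceback is always logged server-side, whatever the debug mode.
    logger.error("Request failed", exc_info=exc)
    if debug:
        return f"{type(exc).__name__}: {exc}"
    if isinstance(exc, ValueError):
        message = str(exc)
        if message.startswith(SAFE_PREFIXES):
            return message
        return "Invalid request"  # assumed generic wording
    return "Internal server error"
```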
JSON-structured log output for SIEM integration (ASVS V7.1.3):
- `JSONLogFormatter` outputs `{timestamp, level, logger, message, module, function, line}` with UTC ISO 8601 timestamps
- Security-relevant extra fields (`user_id`, `user_email`, `ip_address`, `event_type`, `request_id`) included when present
- Exception tracebacks included in the `exception` field
- Enabled via `LOG_FORMAT=json` env var; plain text remains the default
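A formatter with the field set described above could be written roughly as follows. This is an illustrative re-implementation built on the standard `logging` module, not the project's own `JSONLogFormatter`; details such as the use of `default=str` are assumptions.

```python
import json
import logging
from datetime import datetime, timezone

EXTRA_FIELDS = ("user_id", "user_email", "ip_address", "event_type", "request_id")


class JSONLogFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            # record.created is a Unix timestamp; convert it to UTC ISO 8601.
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "module": record.module,
            "function": record.funcName,
            "line": record.lineno,
        }
        # Extra fields appear on the record only when passed via `extra=`.
        for field in EXTRA_FIELDS:
            if hasattr(record, field):
                payload[field] = getattr(record, field)
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload, default=str)
```

A caller would attach the security fields with the standard `extra` argument, for example `logger.warning("login failed", extra={"event_type": "login_failure"})`.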
API endpoints are protected by rate limiting (slowapi):
- Search operations: Configurable via `RATE_LIMIT_SEARCH`
- AI operations: Configurable via `RATE_LIMIT_AI`
- Default operations: Configurable via `RATE_LIMIT_DEFAULT`
- API key authentication via `X-API-Key` header
- Timing-safe comparison using `secrets.compare_digest()`
- OpenAPI/Swagger docs disabled in production
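The timing-safe comparison mentioned above might be wrapped like this. The helper name is hypothetical; `secrets.compare_digest()` is the standard-library call named in the list.

```python
import secrets
from typing import Optional


def api_key_is_valid(provided: Optional[str], expected: Optional[str]) -> bool:
    """Compare the X-API-Key value with the configured key in constant time."""
    if not provided or not expected:
        return False
    # Encode to bytes: compare_digest() raises TypeError on non-ASCII str input.
    return secrets.compare_digest(provided.encode("utf-8"), expected.encode("utf-8"))
```

A plain `==` comparison can return as soon as the first differing character is found, which leaks information about how much of the key was correct; `compare_digest()` avoids that.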
Hadolint enforces:
- Avoiding `latest` tags for base images
- Minimizing layer count
- Using specific package versions where practical
- Multi-stage builds to reduce attack surface
Trivy scans Docker images for:
- OS package vulnerabilities
- Application dependency vulnerabilities
- Misconfigurations
- API key required for all endpoints except `/health`, `/auth/*`, file serving
- API key validated using timing-safe comparison
- Auth routes exempt from API key; they are protected by their own rate-limiting and CSRF
- Development mode allows unauthenticated access (no `API_SECRET_KEY` set)
When USER_MODULE is set to on_passive or on_active, fastapi-users provides full user lifecycle management. In on_passive mode, authentication is optional and anonymous users can browse freely. In on_active mode, all access requires login:
| Control | Implementation |
|---|---|
| Token storage | httpOnly cookies only; no localStorage (XSS mitigation) |
| Token lifetime | 1-hour JWTs for access; separate configurable lifetimes for reset (24h) and verify (7d) tokens |
| Cookie flags | httponly, secure, samesite=lax |
| CSRF protection | Double-submit cookie (evidencelab_csrf + X-CSRF-Token header); cookie cleared on logout/account deletion |
| Secret validation | AUTH_SECRET_KEY must be 32+ chars; insecure defaults rejected |
| Input validation | display_name max 255 chars, whitespace-stripped, blank-to-None |
| Password policy | Minimum length + digit + letter (configurable via AUTH_MIN_PASSWORD_LENGTH) |
| Password history | Prevents reuse of last N passwords; configurable via AUTH_PASSWORD_HISTORY_COUNT (default 5, ASVS V2.1.10) |
| Account lockout | Lock after N consecutive failures for M minutes; counters reset on password reset (AUTH_LOCKOUT_THRESHOLD, AUTH_LOCKOUT_DURATION_MINUTES) |
| Timing-attack mitigation | Password hash always computed even for non-existent users |
| Registration control | Email domain whitelist via AUTH_ALLOWED_EMAIL_DOMAINS (registration only; not enforced on password change) |
| Rate limiting | Per-IP sliding window on /auth/* (default 10 req/60s) |
| Permission model | Deny-by-default; unauthenticated users see no datasources |
| Error handling | Permission failures logged and return empty data (no leak) |
| Audit logging | All auth events (login, failure, lockout, register, password reset) logged to audit_log table |
| OAuth | Google and Microsoft SSO with explicit minimal scopes (openid, email, profile) |
| Email verification | Required after registration; token sent via SMTP |
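To illustrate the per-IP sliding window from the table above (default 10 requests per 60 seconds), here is a minimal in-memory sketch. The class name and the single-process, in-memory storage are assumptions; the project's actual limiter may differ.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the trailing window."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 60.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_ip]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Unlike a fixed window that resets on a clock boundary, the sliding window counts only requests in the trailing interval, so a burst straddling a boundary cannot double the effective limit.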
When USER_MODULE_MODE=on_active, the ActiveAuthMiddleware enforces authentication on all data endpoints at the middleware layer. This ensures that no data endpoint can be accessed without valid credentials, even if an individual route handler lacks its own auth dependency.
The middleware intercepts every incoming request and validates credentials before the request reaches any route handler. It checks (in order):
- API key: `X-API-Key` header compared against `API_SECRET_KEY` using `secrets.compare_digest()` (timing-safe)
- Bearer token: `Authorization: Bearer <jwt>` header decoded with HS256 and audience `["fastapi-users:auth"]`
- Session cookie: `evidencelab_auth` httpOnly cookie decoded with the same JWT secret and audience
If none of the above yield a valid credential, the middleware returns 401 {"detail": "Authentication required"} without forwarding the request.
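The check order can be sketched as a function that returns which credential, if any, was accepted. JWT decoding is deliberately left to a caller-supplied `jwt_is_valid` callable, so no JWT library is assumed; the function name, the lowercase header keys, and the return labels are illustrative choices rather than the middleware's actual code.

```python
import secrets
from typing import Callable, Mapping, Optional

# Returns True when a JWT passes signature, expiry, and audience checks.
# The real decoding (HS256, audience "fastapi-users:auth") is left to the caller.
TokenValidator = Callable[[str], bool]


def find_valid_credential(
    headers: Mapping[str, str],
    cookies: Mapping[str, str],
    api_secret_key: Optional[str],
    jwt_is_valid: TokenValidator,
) -> Optional[str]:
    """Try API key, then Bearer token, then session cookie; None means deny."""
    # 1. API key, compared in constant time. Header keys are assumed lowercase.
    api_key = headers.get("x-api-key")
    if api_key and api_secret_key and secrets.compare_digest(
        api_key.encode("utf-8"), api_secret_key.encode("utf-8")
    ):
        return "api_key"
    # 2. Bearer token from the Authorization header.
    scheme, _, token = headers.get("authorization", "").partition(" ")
    if scheme.lower() == "bearer" and token and jwt_is_valid(token):
        return "bearer"
    # 3. Session cookie carrying the same kind of JWT.
    cookie = cookies.get("evidencelab_auth")
    if cookie and jwt_is_valid(cookie):
        return "cookie"
    # Fail closed: the caller responds with 401 "Authentication required".
    return None
```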
The following paths are exempt from middleware authentication because they either have their own auth mechanisms or must be publicly accessible:
| Path prefix | Reason |
|---|---|
| `/health` | Health check endpoint (monitoring/load balancer probes) |
| `/auth/` | Login, register, verify, password reset; has its own auth flow |
| `/config/auth-status` | Frontend feature flag (must be accessible before login) |
| `/users/` | Protected by fastapi-users `current_active_user` dependency |
| `/groups/` | Protected by fastapi-users `current_active_user` dependency |
| `/ratings/` | Protected by fastapi-users `current_active_user` dependency |
| `/activity/` | Protected by fastapi-users `current_active_user` dependency |
| `/docs`, `/redoc`, `/openapi.json` | OpenAPI docs (disabled in production) |
| `OPTIONS` | CORS preflight requests |
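The exemption table translates naturally into a prefix check. The sketch below assumes plain `startswith` matching and a hypothetical function name; the real middleware may match paths more strictly.

```python
EXEMPT_PREFIXES = (
    "/health",
    "/auth/",
    "/config/auth-status",
    "/users/",
    "/groups/",
    "/ratings/",
    "/activity/",
    "/docs",
    "/redoc",
    "/openapi.json",
)


def is_exempt(method: str, path: str) -> bool:
    """Return True when the middleware should skip its own credential check."""
    # CORS preflight requests never carry credentials.
    if method.upper() == "OPTIONS":
        return True
    return path.startswith(EXEMPT_PREFIXES)
```

Note that simple prefix matching also exempts any path that merely begins with one of these strings (for example `/healthz`), so the list should stay short and deliberate.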
| Property | Detail |
|---|---|
| Scope | Only active when USER_MODULE_MODE == "on_active" |
| Layer | Starlette middleware — runs before all route handlers |
| JWT validation | Verifies signature, expiry, and audience claim |
| API key comparison | Timing-safe via secrets.compare_digest() |
| Fail-closed | Denies by default; only permits explicitly validated credentials |
| Logging | Denied requests logged with method, path, and client IP |
| Test coverage | 23 dedicated unit tests covering all auth paths and edge cases |
The middleware provides defense in depth alongside existing controls:
- `/config/datasources`: already denies unauthenticated users (returns empty list); the middleware adds a hard 401 for all other data endpoints
- fastapi-users routes (`/users/`, `/groups/`, etc.): have their own `Depends(current_active_user)` checks; the middleware exempts these to avoid double-validation
- API key auth: the middleware accepts the same `API_SECRET_KEY` used elsewhere, so programmatic API access continues to work
- Frontend login wall: `AuthGate` component blocks UI access; the middleware ensures the same protection at the API layer
- Data source access controlled via whitelist validation and group-based RBAC
- File serving restricted to specific directories and file types
- Superusers bypass datasource filtering; regular users see only granted sources
Application-level security headers middleware provides defence-in-depth:
- `Content-Security-Policy` is configurable via the `CSP_POLICY` env var; it defaults to a strict self-only policy with `frame-ancestors 'none'`, SHA-256 hash-based inline script allowlisting (no `'unsafe-inline'` in `script-src`), and explicit `connect-src` for analytics domains
- `X-Content-Type-Options: nosniff` prevents MIME-sniffing
- `X-Frame-Options: DENY` prevents clickjacking
- `Referrer-Policy: strict-origin-when-cross-origin` limits referrer leakage
- `Permissions-Policy` restricts camera, microphone, geolocation
- `Strict-Transport-Security` sets HSTS with the `preload` directive when HTTPS is configured; a warning is logged when `AUTH_COOKIE_SECURE=false`
- `Cache-Control: no-store` + `Pragma: no-cache` on sensitive endpoints (`/auth/`, `/users/`, `/groups/`, `/ratings/`, `/activity/`) prevents browsers and proxies from caching tokens or personal data (ASVS V8.1.4)
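The header set above could be assembled by a helper like the one below. It is a sketch only: the function name, the placeholder `DEFAULT_CSP` (which omits the script hashes and `connect-src` entries), and the HSTS `max-age` value are assumptions, not the project's real defaults.

```python
from typing import Dict

SENSITIVE_PREFIXES = ("/auth/", "/users/", "/groups/", "/ratings/", "/activity/")

# Placeholder policy; the project's real default is not reproduced here.
DEFAULT_CSP = "default-src 'self'; frame-ancestors 'none'"


def security_headers(path: str, https_enabled: bool, csp: str = DEFAULT_CSP) -> Dict[str, str]:
    """Build the response headers for a given request path."""
    headers = {
        "Content-Security-Policy": csp,
        "X-Content-Type-Options": "nosniff",
        "X-Frame-Options": "DENY",
        "Referrer-Policy": "strict-origin-when-cross-origin",
        "Permissions-Policy": "camera=(), microphone=(), geolocation=()",
    }
    if https_enabled:
        # max-age of one year is an assumed value for this sketch.
        headers["Strict-Transport-Security"] = "max-age=31536000; includeSubDomains; preload"
    if path.startswith(SENSITIVE_PREFIXES):
        # Keep tokens and personal data out of browser and proxy caches.
        headers["Cache-Control"] = "no-store"
        headers["Pragma"] = "no-cache"
    return headers
```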
Production deployments via Caddy additionally include:
- Automatic HTTPS with Let's Encrypt
- API key validation for protected endpoints
- Proper proxy headers (X-Real-IP, X-Forwarded-For, X-Forwarded-Proto)
- Sensitive values stored in `.env` files (gitignored)
- `.env.example` provides templates without real values
- Secrets managed via GitHub Actions secrets for CI/CD
- `AUTH_SECRET_KEY` is required in production; Docker Compose uses `:?` syntax to fail fast if unset, and startup logs a CRITICAL warning for insecure or short values (ASVS V14.1.2)
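A startup check along these lines might look like the following sketch. The 32-character minimum comes from this document; the function name, logger name, and the entries in `INSECURE_DEFAULTS` are hypothetical, since the project's actual list of rejected defaults is not shown here.

```python
import logging
from typing import List, Optional

logger = logging.getLogger("startup")  # hypothetical logger name

MIN_SECRET_LENGTH = 32
# Hypothetical examples of placeholder values that should never reach production.
INSECURE_DEFAULTS = {"changeme", "secret", "change-me-in-production"}


def check_auth_secret(value: Optional[str]) -> List[str]:
    """Return a list of problems with AUTH_SECRET_KEY and log each at CRITICAL."""
    problems: List[str] = []
    if not value:
        problems.append("AUTH_SECRET_KEY is not set")
    else:
        if value.lower() in INSECURE_DEFAULTS:
            problems.append("AUTH_SECRET_KEY uses a known insecure default")
        if len(value) < MIN_SECRET_LENGTH:
            problems.append(f"AUTH_SECRET_KEY is shorter than {MIN_SECRET_LENGTH} characters")
    for problem in problems:
        logger.critical(problem)
    return problems
```

Returning the list of problems, rather than only logging, lets the caller decide whether to refuse to start.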
- `detect-secrets` baseline prevents new secrets from being committed
- Gitleaks provides additional coverage with comprehensive patterns
- Pre-commit hooks catch secrets before they enter git history
- Never hardcode secrets - Use environment variables
- Validate all input - Especially file paths and user-provided data
- Use parameterized queries - Prevent SQL injection
- Avoid dangerous functions - `eval()`, `exec()`, `subprocess` with `shell=True`
- Keep dependencies updated - Review Dependabot PRs promptly
- Follow least privilege - Containers and services should have minimal permissions
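Two of the practices above, parameterized queries and avoiding `shell=True`, are easiest to see in code. The sketch uses the standard-library `sqlite3` and `subprocess` modules purely for illustration; the `users` table and both function names are made up for the example and do not reflect the project's schema.

```python
import sqlite3
import subprocess
import sys
from typing import List, Tuple


def find_user(conn: sqlite3.Connection, email: str) -> List[Tuple[int, str]]:
    """Look up a user with a bound parameter instead of string formatting."""
    # The ? placeholder makes the driver bind the value; it is never spliced into the SQL.
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()


def echo_argument(value: str) -> str:
    """Pass untrusted input as a list element so no shell ever parses it."""
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", value],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

With the bound parameter, an input such as `' OR '1'='1` is treated as a literal email string and matches nothing.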
The project targets OWASP ASVS Level 2 compliance. Current estimated coverage is ~82% across the 14 ASVS verification categories. Key controls mapped to ASVS requirements:
| ASVS Requirement | Control | Status |
|---|---|---|
| V2.1.10 | Password history / reuse prevention | Done |
| V7.1.1, V7.4.1 | Error detail sanitisation | Done |
| V7.1.3 | Structured JSON logging | Done |
| V8.1.4 | Sensitive endpoint cache control | Done |
| V13.1.3 | Request body size limits | Done |
| V14.1.2 | Startup credential validation | Done |
| V14.4.3 | CSP with hash-based inline scripts | Done |
Before submitting a PR, ensure:
- No hardcoded secrets or credentials
- Input validation for user-provided data
- Pre-commit hooks pass (including Bandit, Gitleaks)
- No new security warnings in CI
- Sensitive operations are properly authenticated
- File operations validate paths against allowed directories
| Tool | Purpose | Documentation |
|---|---|---|
| Bandit | Python SAST | https://bandit.readthedocs.io/ |
| Hadolint | Dockerfile linting | https://github.com/hadolint/hadolint |
| detect-secrets | Secret detection | https://github.com/Yelp/detect-secrets |
| Gitleaks | Secret detection | https://github.com/gitleaks/gitleaks |
| pip-audit | Python dependency scanning | https://github.com/pypa/pip-audit |
| npm audit | JS dependency scanning | https://docs.npmjs.com/cli/v8/commands/npm-audit |
| Trivy | Container scanning | https://aquasecurity.github.io/trivy/ |
| eslint-plugin-security | JS security linting | https://github.com/eslint-community/eslint-plugin-security |
- 2026-03-10: ASVS L2 Tier 1 + Tier 2 hardening (33 new tests)
  - Added request body size limit middleware: rejects >2 MB with HTTP 413 (`MAX_REQUEST_BODY_BYTES`, ASVS V13.1.3)
  - Added error detail sanitisation: production responses never expose internals (`API_DEBUG`, ASVS V7.1.1 / V7.4.1)
  - Added startup credential validation: CRITICAL log on insecure `AUTH_SECRET_KEY`; Docker Compose fails fast if unset (ASVS V14.1.2)
  - Hardened CSP with SHA-256 hash for inline GA script, removing the need for `'unsafe-inline'` in `script-src` (ASVS V14.4.3)
  - Added structured JSON logging via `LOG_FORMAT=json` for SIEM ingestion (ASVS V7.1.3)
  - Added password history: prevents reuse of last N passwords (`AUTH_PASSWORD_HISTORY_COUNT`, ASVS V2.1.10)
  - Added Alembic migration `0016_add_password_history` (nullable JSONB column)
  - Added `Cache-Control: no-store` on sensitive endpoint responses (ASVS V8.1.4)
- 2026-03-03: Active authentication enforcement middleware
  - Added `ActiveAuthMiddleware` for `on_active` mode: denies unauthenticated requests to all data endpoints
  - Middleware validates JWT cookies, Bearer tokens, and API keys at the Starlette layer (before route handlers)
  - Fail-closed design: denies by default, only permits explicitly validated credentials
  - Exempt paths for health checks, auth flows, and routes with their own fastapi-users dependencies
  - Frontend `AuthGate` component enforces login wall in the UI for `on_active` mode
  - 23 dedicated unit tests covering all auth paths, exempt routes, and edge cases
- 2026-03-01: Security hardening for enterprise pen testing
  - Added Content-Security-Policy header (env-configurable via `CSP_POLICY`)
  - Changed CORS `allow_headers` from `*` to env-configurable whitelist (`CORS_ALLOWED_HEADERS`)
  - Added `display_name` input validation (max 255 chars, whitespace stripping)
  - Moved email domain whitelist check to registration only (no longer triggers on password change)
  - Added lockout counter reset on successful password reset
  - Separated token lifetimes: reset tokens (24h) and verification tokens (7d) independently configurable
  - Added warning log when no default group is configured for new users
  - Added CSRF cookie clearing on account deletion
  - Added explicit minimal OAuth scopes (`openid, email, profile`) for Google and Microsoft
  - Added HSTS `preload` directive for HSTS preload list eligibility
  - Added warning log when `AUTH_COOKIE_SECURE` is disabled (HTTP-only development)
- 2026-03-01: User authentication & permissions module
  - Added cookie-based JWT auth with httpOnly, secure, samesite=lax flags
  - Added CSRF double-submit cookie middleware (`evidencelab_csrf`)
  - Added security response headers middleware (X-Content-Type-Options, X-Frame-Options, etc.)
  - Added per-IP rate limiting on auth endpoints (sliding window)
  - Added account lockout with timing-attack mitigation
  - Added password complexity validation (length + digit + letter)
  - Added email domain whitelisting for registration
  - Added immutable audit log table for all auth events
  - Added group-based RBAC with deny-by-default datasource permissions
  - Added Google and Microsoft OAuth2 SSO support
  - Exempted `/auth/*` routes from API key requirement
- 2026-02-05: Comprehensive security policy
  - Added Bandit for Python SAST
  - Added Hadolint for Dockerfile linting
  - Added Gitleaks for enhanced secret detection
  - Added eslint-plugin-security for frontend
  - Added pip-audit and npm audit to CI
  - Added Trivy container scanning
  - Configured Dependabot for automated updates
  - Fixed CORS misconfiguration
  - Fixed path traversal vulnerability
  - Added data source validation