Release v0.0.4 #502

Merged: filthyrake merged 28 commits into main from dev on Jan 4, 2026

Conversation

@filthyrake
Owner

Summary

Release v0.0.4 - Major feature release with webhooks, video downloads, and security improvements.

New Features

Security Improvements

Reliability Improvements

Code Quality

Documentation

  • Comprehensive documentation update for all new features
  • Updated API.md, DATABASE.md, CONFIGURATION.md, ARCHITECTURE.md
  • Added troubleshooting for webhooks, rate limiting, CSP, downloads

Test plan

  • All CI tests passing on dev branch
  • Webhook delivery tested
  • Video downloads tested
  • CSP compliance verified
  • Documentation reviewed

🤖 Generated with Claude Code

filthyrake and others added 28 commits January 1, 2026 15:53
Update all documentation to reflect features added between v0.0.2 and v0.0.3:

API.md:
- Add Playlists API (public and admin endpoints)
- Add Chapters API (video timeline navigation)
- Add Sprite Sheets API (thumbnail previews)
- Add Display Configuration endpoint
- Add Featured Videos filter

DATABASE.md:
- Add playlists, playlist_items, chapters, sprite_queue tables
- Add new video columns (is_featured, has_chapters, sprite_sheet_*)
- Update entity relationships and cascade behavior

CONFIGURATION.md:
- Add display settings, circuit breaker, streaming upload settings
- Add sprite sheet and worker version gating settings

ADMIN_UI_GUIDE.md:
- Add Playlists tab documentation

ARCHITECTURE.md:
- Add playlists, chapters, sprite sheets sections
- Add circuit breaker pattern documentation
- Add streaming segment upload modes
- Update database tables list

UPGRADING.md:
- Add complete v0.0.3 upgrade guide
- Document migrations 020-025
- Add new environment variables and API endpoints

README.md:
- Add playlists, chapters, sprite sheets, featured videos features
- Add reliability section (circuit breaker, streaming upload)
- Update storage layout with sprites directory

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…cation (#489)

* Fix video upload failures and increase max file size

- Fix NOT NULL constraint violations in upload_video INSERT
  - Add explicit values for thumbnail_source, streaming_format,
    primary_codec, is_featured, has_chapters
  - Python-level SQLAlchemy defaults are not applied when inserting through the async `databases` library
  - Use STREAMING_FORMAT/STREAMING_CODEC from config for modern defaults
- Increase frontend max upload size from 10GB to 50GB

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix ISO datetime parsing for Python 3.9

Python 3.9's datetime.fromisoformat() doesn't support the "Z" UTC
suffix that JavaScript's toISOString() produces. Replace "Z" with
"+00:00" before parsing to fix video metadata updates failing with
400 Bad Request.
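
A minimal sketch of the workaround described above. The helper name `parse_iso_utc` is hypothetical and not taken from the codebase; only the "Z" to "+00:00" substitution comes from this commit.

```python
from datetime import datetime, timezone


def parse_iso_utc(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting the trailing "Z" UTC suffix.

    datetime.fromisoformat() on Python 3.9 rejects "Z", so it is rewritten
    to the equivalent "+00:00" offset before parsing.
    """
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


if __name__ == "__main__":
    # A timestamp in the shape produced by JavaScript's toISOString().
    parsed = parse_iso_utc("2026-01-04T12:30:00.000Z")
    print(parsed.tzinfo == timezone.utc)
```

Timestamps that already carry an explicit offset pass through unchanged.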

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Add reliability improvements: DLQ monitoring, heartbeat state, job verification

Issue #457 - Dead Letter Queue Monitoring:
- Add GET /api/admin/job-queue/dead-letter endpoint to list DLQ entries
- Add POST /api/admin/job-queue/dead-letter/{id}/reprocess endpoint
- Add DELETE /api/admin/job-queue/dead-letter/{id} endpoint
- Add GET /api/admin/job-queue/stats with DLQ depth warning

Issue #458 - Worker Heartbeat Stale Data Detection:
- Extend HeartbeatResponse with worker_status, current_job_id, last_heartbeat_recorded
- Worker now compares server's view of state with local state
- Logs warnings on consecutive state mismatches (3+)
- Helps detect cases where heartbeat HTTP succeeded but DB write failed

Issue #461 - Work Directory Cleanup Verification:
- Add GET /api/worker/{job_id}/verify-complete endpoint
- Verifies video files exist on disk before cleanup
- Returns all_files_present, qualities_present, missing_files
- Worker calls verification before deleting local work directory
- Preserves work directory if verification fails (prevents data loss)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address code review feedback and fix pre-existing test issues

Code Review Fixes (PR #489):
- Use fetch_one_with_retry instead of raw database.fetch_one
- Use sanitize_error_message for exception details in API responses
- Change deprecated regex= to pattern= in FastAPI Path/Query
- Add message_id validation with proper regex pattern
- Use RETURNING clause to fix race condition in job insert
- Use QUALITY_NAMES config constant instead of hardcoded list

Pre-existing Test Fixes:
- Add pythonpath = . to pytest.ini to fix code_version import
- Fix test assertions for paginated video list responses
- Add missing env vars to .env.example (sprite sheets, streaming)
- Fix migration 021 to drop session_token index before recreating
- Fix migration 002 downgrade to check if index exists

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* test: add security tests for worker authentication edge cases

* chore: fix import order for ruff lint

* refactor: simplify security tests per review feedback

- Remove duplicate tests that already exist in tests/test_worker_api.py
  (verify_worker_key, hash_api_key tests)
- Keep only the valuable _get_request_context() tests for X-Forwarded-For
  handling, trusted proxy logic, and IPv6 support
- Move tests to tests/test_worker_auth.py following project conventions
- Use proper pytest monkeypatch fixtures instead of sys.modules manipulation
- Remove unused tests/_disable_db.py file
- Remove security_tests/ directory (tests belong in tests/)

Based on code review feedback from @filthyrake

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Krish Pathak <pathak040@gmail.com>
Co-authored-by: Damen Knight <damen@knightspeed.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
…iod (#455, #456) (#490)

* Reliability: Fix job completion idempotency and stale check grace period (#455, #456)

Issue #455: Job completion retry can overwrite metadata
- Add completion_token field to CompleteJobRequest for idempotency
- Worker generates unique token (job_id + UUID) on completion
- Server checks Redis for duplicate tokens, returns early if found
- Make metadata updates idempotent (only set if not already set)
- If job already completed, return success without re-processing

Issue #456: Stale job checker grace period resets on API restart
- Store last stale check time in Redis (key: stale_job_checker:last_run)
- On startup, check Redis for recent check before applying grace period
- If another API instance recently checked, skip local grace period
- After each successful check, record timestamp in Redis (5-min TTL)
- Apply same fix to orphan quality directory cleanup

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address code review feedback for reliability fixes

Fixes identified by specialist reviewers:

1. Fix .decode() bug - Redis client uses decode_responses=True, so
   redis.get() returns strings, not bytes. Removed .decode() call.

2. Add input validation to completion_token:
   - max_length=100 to prevent memory abuse
   - Regex pattern validation for format: {job_id}-{uuid4}
   - Normalize to lowercase

3. Scope token Redis key to job_id:
   - Key format: vlog:completion_token:{job_id}:{token}
   - Prevents cross-job token collisions

4. Use atomic SETNX for token storage:
   - SET with NX flag (set-if-not-exists) before transaction
   - Prevents race condition where two requests pass check before either stores
   - Token initially set to "processing", updated to "completed" after success

5. Increase token TTL from 5 to 15 minutes:
   - Covers extended retry scenarios with exponential backoff
   - New constant: COMPLETION_TOKEN_TTL = 900

6. Add Redis key namespace prefix:
   - All keys now prefixed with "vlog:" to prevent collisions
   - New constant: REDIS_KEY_PREFIX = "vlog:"
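
The token handling listed above can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the integer job ID, the exact regex, the helper names, and the `InMemoryRedis` stand-in (used so the sketch runs without a Redis server) are all hypothetical. The `set(..., nx=True, ex=...)` call mirrors the redis-py signature.

```python
import re

REDIS_KEY_PREFIX = "vlog:"
COMPLETION_TOKEN_TTL = 900  # 15 minutes, in seconds
MAX_TOKEN_LENGTH = 100

# Assumed token format: integer job ID, a hyphen, then a lowercase UUID4.
_TOKEN_RE = re.compile(
    r"^\d+-[0-9a-f]{8}-[0-9a-f]{4}-4[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}$"
)


def normalize_token(token: str) -> str:
    """Lowercase the token and reject anything over-long or malformed."""
    token = token.lower()
    if len(token) > MAX_TOKEN_LENGTH or not _TOKEN_RE.match(token):
        raise ValueError("invalid completion token")
    return token


def token_key(job_id: int, token: str) -> str:
    """Build the job-scoped key: vlog:completion_token:{job_id}:{token}."""
    return f"{REDIS_KEY_PREFIX}completion_token:{job_id}:{token}"


def claim_completion(client, job_id: int, token: str) -> bool:
    """Atomically claim the token; True only for the first caller."""
    key = token_key(job_id, normalize_token(token))
    # SET with NX succeeds only if the key does not exist yet, so two
    # concurrent requests cannot both pass the duplicate check.
    return bool(client.set(key, "processing", nx=True, ex=COMPLETION_TOKEN_TTL))


class InMemoryRedis:
    """Illustrative stand-in for a Redis client; it ignores the TTL."""

    def __init__(self):
        self._data = {}

    def set(self, name, value, nx=False, ex=None):
        if nx and name in self._data:
            return None  # redis-py also returns None when NX fails
        self._data[name] = value
        return True
```

A retry carrying the same token for the same job gets `False` and can return early, while the same UUID under a different job ID maps to a different key.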

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix token cleanup on transaction failure (Gafton review feedback)

If the database transaction fails (DatabaseLockedError), the completion
token was left in Redis with status "processing" for 15 minutes. This
blocked any retry attempts with the same token, even though the job
completion never actually succeeded.

Now we delete the token from Redis when the transaction fails, allowing
the worker to retry immediately with the same token.

Changes:
- Move token_key to outer scope for access in exception handler
- Add token cleanup in DatabaseLockedError exception handler
- Best-effort cleanup with logging (don't mask original error)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
)

* Refactor: Replace boolean traps with self-documenting enums (#443)

Replace unclear boolean parameters with descriptive enum values:

- ErrorLogging enum for sanitize_error_message():
  - ErrorLogging.LOG_ORIGINAL / SKIP_LOGGING replaces log_original bool

- PlaylistValidation enum for validate_hls_playlist():
  - PlaylistValidation.CHECK_SEGMENTS / STRUCTURE_ONLY replaces check_segments bool

- JobFailureMode enum for mark_job_failed():
  - JobFailureMode.RETRYABLE / PERMANENT replaces final bool

- DeleteMode enum for delete_video API endpoint:
  - DeleteMode.SOFT / PERMANENT (API keeps permanent= for backwards compatibility)

- KeyRevocation enum for delete_worker API endpoint:
  - KeyRevocation.REVOKE / KEEP (API keeps revoke_keys= for backwards compatibility)

Benefits:
- Call sites are now self-documenting
- IDE autocomplete shows available options
- Easier to add new options in the future
- Type safety prevents passing wrong values

All internal function calls updated to use new enum values.
API endpoints maintain backwards compatibility with boolean parameters.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address code review feedback for boolean-to-enum refactoring

Fixes based on comprehensive review by specialized agents:

Critical fixes:
- Add missed call sites in worker/remote_transcoder.py (lines 513, 820)
- Add explicit type validation with TypeError for invalid types
- Add deprecation warnings when boolean values are passed

Improvements:
- Remove redundant bool→enum→bool conversion in API endpoints
- Move ErrorLogging enum to api/enums.py for consistency
- Update JobFailureMode docstrings to focus on intent not implementation
- Add comprehensive test suite (20 tests) for enum functionality

Test coverage includes:
- Enum values work correctly
- Boolean backwards compatibility with deprecation warnings
- TypeError raised for invalid types (None, int, wrong string)
- Enum string comparison behavior
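
A sketch of the backwards-compatibility behaviour covered by these tests, assuming a string-valued enum and a hypothetical helper named `coerce_failure_mode`. The enum values and the mapping of the old `final=True` flag to `PERMANENT` are assumptions for illustration.

```python
import warnings
from enum import Enum


class JobFailureMode(str, Enum):
    # String values are an assumption; the commit only names the members.
    RETRYABLE = "retryable"
    PERMANENT = "permanent"


def coerce_failure_mode(value) -> JobFailureMode:
    """Accept the enum, tolerate legacy booleans, reject everything else."""
    if isinstance(value, JobFailureMode):
        return value
    if isinstance(value, bool):
        # Legacy callers passed final=True/False; keep them working but warn.
        warnings.warn(
            "passing a bool is deprecated; use JobFailureMode instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return JobFailureMode.PERMANENT if value else JobFailureMode.RETRYABLE
    # None, ints, and plain strings are rejected explicitly.
    raise TypeError(f"expected JobFailureMode, got {type(value).__name__}")
```

Because the enum subclasses `str`, `JobFailureMode.PERMANENT == "permanent"` holds, yet a bare string is still rejected by the type check.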

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* Security: Migrate API key hashing to argon2id (#445)

Replace SHA-256 with argon2id for API key hashing to provide defense-in-depth
against brute-force attacks. This addresses Issue #445.

Changes:
- Add argon2-cffi dependency for memory-hard password hashing
- Add hash_version column to track algorithm (1=SHA-256 legacy, 2=argon2id new)
- Update worker_auth.py with dual-format verification support
  - New keys use argon2id automatically
  - Legacy SHA-256 keys continue to work
  - Unknown versions fail closed for security
- Add authenticate_api_key() shared helper to eliminate code duplication
- Fix admin.py reencode endpoints to use prefix-based lookup
  (required because argon2 hashes are non-deterministic)
- Add comprehensive tests for hash versioning

Security notes:
- argon2id with OWASP-recommended parameters (time_cost=3, memory_cost=64MB)
- Timing-safe comparison preserved for legacy SHA-256 verification
- InvalidHash exceptions caught and logged without leaking info
- Backward compatible - existing workers continue working
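
The dual-format verification can be sketched as a dispatch on `hash_version`. This is a hedged illustration rather than the contents of worker_auth.py: the function name and constants are hypothetical, and the argon2id check is injected as a callable so the sketch runs without argon2-cffi installed. A real verifier would presumably wrap `argon2.PasswordHasher().verify(hash, password)` and turn its exceptions into `False`.

```python
import hashlib
import hmac
from typing import Callable

HASH_VERSION_SHA256 = 1    # legacy
HASH_VERSION_ARGON2ID = 2  # new keys


def verify_api_key(
    api_key: str,
    stored_hash: str,
    hash_version: int,
    argon2_verify: Callable[[str, str], bool],
) -> bool:
    """Verify a key against a stored hash according to its version."""
    if hash_version == HASH_VERSION_SHA256:
        candidate = hashlib.sha256(api_key.encode("utf-8")).hexdigest()
        # Constant-time comparison, as preserved for legacy keys.
        return hmac.compare_digest(candidate, stored_hash)
    if hash_version == HASH_VERSION_ARGON2ID:
        # argon2 hashes embed a random salt, so they must be verified
        # rather than recomputed and compared.
        return argon2_verify(stored_hash, api_key)
    # Unknown versions fail closed.
    return False
```

The salted argon2 branch is also why a lookup by exact hash no longer works and a prefix-based lookup is needed.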

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address reliability review feedback for argon2 migration (#445)

Reliability improvements per Margo's review:
- Fix prefix collision: authenticate_api_key now fetches ALL matching
  prefixes and iterates through candidates (fetch_one -> fetch_all)
- Add API key length validation: keys must be >= 8 chars for prefix extraction
- Add key_prefix to error logs for better debugging context
- Add build dependencies (gcc, libffi-dev) to Dockerfiles for argon2-cffi

Migration improvements:
- Add deployment sequence documentation
- Add post-migration verification queries
- Add explicit downgrade check that raises RuntimeError if argon2 keys exist
- Document that downgrade is destructive

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
#494)

* Feature: Auto-detect video chapters from metadata/transcription (#493)

Add ability to automatically detect and generate chapter markers from
video metadata or transcription analysis.

## New Features

### API Endpoint
- POST /admin/videos/{video_id}/chapters/auto-detect
  - source: 'metadata', 'transcription', or 'both'
  - min_chapter_length: minimum seconds between chapters (10-600)
  - replace_existing: whether to clear existing chapters

### Metadata Chapter Extraction
- Extract chapter markers embedded in video files via ffprobe
- Supports Matroska (MKV), MP4/MOV, and other container formats
- Graceful fallback when no chapters found

### Transcription-Based Generation
- Generate chapters from completed transcription text
- Sentence-based analysis for topic segmentation
- Configurable minimum chapter length
- Removes filler words from generated titles

## Implementation

- New module: api/chapter_detection.py with utility functions
- New schemas: ChapterDetectionSource, AutoDetectChaptersRequest,
  AutoDetectChaptersResponse, DetectedChapter
- Comprehensive test coverage for all detection methods

Closes #493

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address code review feedback for auto-detect chapters (#493)

Security & reliability improvements:
- Add HTML entity escaping for metadata-extracted titles (XSS prevention)
- Add transaction retry logic with execute_with_retry()
- Add SELECT FOR UPDATE for concurrency protection
- Add endpoint-level 60s timeout for detection phase
- Add timeout on process.wait() after kill signal

Performance improvements:
- Convert N+1 inserts to single batch INSERT RETURNING query
- Pre-compile regex patterns at module level
- Add named constants for magic numbers

Code quality fixes:
- Replace print() with logger.error() for proper logging
- Remove unused video_slug parameter from function signature
- Rename internal dataclass to avoid collision with Pydantic schema
- Update tests for renamed class and removed parameter

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* Security: Add optional authentication for metrics endpoint (#436)

- Add metrics.enabled and metrics.auth_required database-backed settings
- Update /metrics endpoint in admin.py with optional auth via X-Admin-Secret header
- Update /metrics endpoint in worker_api.py with optional auth via X-Admin-Secret header
- Add security logging for auth failures on metrics endpoints
- Add tests for metrics settings definitions and environment variable mappings
- Update .env.example with new VLOG_METRICS_ENABLED and VLOG_METRICS_AUTH_REQUIRED settings
- Update test_env_example.py to recognize database-backed settings from settings_service.py

Default behavior is backwards-compatible (metrics enabled, auth not required).
For production, enable auth OR use network-level isolation to protect operational data.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address PR #495 review feedback

- Add rate limiting (60/minute) to metrics endpoints in admin.py and worker_api.py
- Use security_logger consistently in worker_api.py for auth logging
- Add integration tests for metrics endpoint auth flow (403/404/500 cases)
- Move inline import to module level in worker_api.py
- Fix import sorting in migration file (ruff auto-fix)
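
One way the optional metrics auth could be decided is sketched below as a pure function. The mapping of outcomes to status codes (404 when metrics are disabled, 500 when auth is required but no secret is configured, 403 on a missing or wrong secret) is an assumption inferred from the test cases named above, and `metrics_access_status` is a hypothetical name.

```python
import hmac
from typing import Optional


def metrics_access_status(
    enabled: bool,
    auth_required: bool,
    configured_secret: Optional[str],
    provided_secret: Optional[str],
) -> int:
    """Return the HTTP status a /metrics request should receive."""
    if not enabled:
        return 404  # assumed: a disabled endpoint looks like it does not exist
    if not auth_required:
        return 200  # backwards-compatible default
    if not configured_secret:
        return 500  # assumed: auth demanded but the server has no secret
    if provided_secret is None or not hmac.compare_digest(
        provided_secret.encode("utf-8"), configured_secret.encode("utf-8")
    ):
        return 403  # missing or wrong X-Admin-Secret
    return 200
```

Comparing with `hmac.compare_digest` avoids leaking how many leading characters of the secret matched.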

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
… (#496)

* Feature: Add additional Prometheus metrics and Grafana dashboard (#207)

Add 5 new Prometheus metrics for enhanced observability:
- HTTP_REQUESTS_IN_PROGRESS gauge (low-cardinality by API name)
- VIDEOS_WATCH_TIME_SECONDS_TOTAL counter
- WORKER_JOBS_COMPLETED_TOTAL counter (by worker_name)
- WORKER_HEARTBEAT_AGE_SECONDS gauge (by worker_name)
- STORAGE_VIDEOS_BYTES gauge with periodic reconciliation

Implementation highlights:
- Pure ASGI HTTPMetricsMiddleware for 6x better performance
- Endpoint path normalization to prevent cardinality explosion
- Background task updates heartbeat ages every 30s (no DB query on /metrics)
- Storage reconciliation scans filesystem every 6 hours
- Instrument existing but unused HTTP and transcoding metrics

Also includes:
- Grafana dashboard JSON with panels for API, transcoding, workers, storage
- Tests for all new metrics and middleware

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address code review feedback for PR #496 metrics implementation

Critical fixes:
- Add `api` label to HTTP_REQUESTS_TOTAL for Grafana dashboard compatibility
- Fix storage reconciliation with timeout, symlink protection, partial failure handling
- Add database retry logic (fetch_all_with_retry) to background task
- Fix storage metric for overwritten segments (track net change)

High priority fixes:
- Add LRU cache to normalize_endpoint() for 95%+ allocation reduction
- Replace _metrics.clear() with selective label removal to avoid race conditions
- Add worker name label sanitization to prevent label injection
- Add background task health metrics (errors, last_success, duration)

Medium priority improvements:
- Improve normalize_endpoint with UUID and slug pattern detection
- Make reconciliation interval configurable via VLOG_STORAGE_RECONCILIATION_INTERVAL
- Add VLOG_STORAGE_SCAN_TIMEOUT and VLOG_STORAGE_SCAN_MAX_FILES configs
- Add comprehensive tests for new features
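
A rough sketch of cached endpoint normalization, to show why it bounds label cardinality. The placeholder names and regexes are assumptions, and the slug detection mentioned above is left out because its heuristics are not described here.

```python
import re
from functools import lru_cache

_INT_RE = re.compile(r"^\d+$")
_UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


@lru_cache(maxsize=1024)
def normalize_endpoint(path: str) -> str:
    """Collapse variable path segments into fixed placeholders.

    Without this, every distinct ID would create a new metric label value.
    The LRU cache means a repeated path is normalized only once.
    """
    segments = []
    for segment in path.split("/"):
        if _INT_RE.match(segment):
            segments.append("{id}")
        elif _UUID_RE.match(segment):
            segments.append("{uuid}")
        else:
            segments.append(segment)
    return "/".join(segments)
```

With this in place, `/api/videos/1/chapters` and `/api/videos/2/chapters` share the single label `/api/videos/{id}/chapters`.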

New metrics added:
- BACKGROUND_TASK_ERRORS_TOTAL
- BACKGROUND_TASK_LAST_SUCCESS
- BACKGROUND_TASK_DURATION_SECONDS
- STORAGE_RECONCILIATION_STATUS

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Replace asyncio.timeout (Python 3.11+) with asyncio.wait_for
which is available in Python 3.9.
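
For reference, a small sketch of the Python 3.9-compatible pattern. The wrapper name `run_with_timeout` and the choice to return `None` on timeout are illustrative only.

```python
import asyncio


async def run_with_timeout(awaitable, seconds: float):
    """Await something with a deadline, using the 3.9-compatible API.

    On Python 3.11+ this could be written with `async with asyncio.timeout()`.
    """
    try:
        return await asyncio.wait_for(awaitable, timeout=seconds)
    except asyncio.TimeoutError:
        # wait_for cancels the inner task before raising.
        return None


if __name__ == "__main__":
    print(asyncio.run(run_with_timeout(asyncio.sleep(0.01, result="done"), 1.0)))
```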

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update Dockerfile.worker.gpu to use av1 as the preferred hardware
encoder codec instead of h264, matching the production CMAF/AV1
streaming configuration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Feature: Add video download support (#202)

Add configuration and API endpoints for video download functionality.
Downloads are disabled by default for security, requiring explicit
opt-in via environment variables or admin settings.

Configuration options:
- VLOG_DOWNLOADS_ENABLED: Master switch (default: false)
- VLOG_DOWNLOADS_REQUIRE_AUTH: Auth requirement (default: true)
- VLOG_DOWNLOADS_ALLOW_ORIGINAL: Original file downloads (default: false)
- VLOG_DOWNLOADS_ALLOW_TRANSCODED: Transcoded quality downloads (default: true)
- VLOG_DOWNLOADS_RATE_LIMIT_PER_HOUR: Rate limiting (default: 10/hour)
- VLOG_DOWNLOADS_MAX_CONCURRENT: Concurrency limit (default: 2)

Changes:
- Add download configuration to config.py with secure defaults
- Add download settings to settings_service.py for database-backed config
- Add /api/config/downloads endpoint for UI configuration
- Add /api/videos/{slug}/download/original endpoint for original files
- Add download button to watch.html (conditionally shown when enabled)
- Update .env.example with documentation for all new settings

Note: Transcoded quality downloads (MP4 from HLS/DASH segments) are
planned for a future release. Currently only original file downloads
are supported.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address code review feedback for download feature (#202)

Critical fixes from security, reliability, and performance reviews:

Security (Bruce):
- Remove unimplemented require_auth setting to avoid false security
- Add RFC 5987 filename encoding to prevent header injection
- Validate file extension is in allowed list before serving
- Add proper media type detection based on file extension
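
To illustrate the header-injection fix, here is a minimal sketch of an RFC 5987 style `Content-Disposition` value. The function name and the exact fallback sanitization rules are assumptions; the point is that the raw filename never reaches the header unescaped.

```python
from urllib.parse import quote

_SAFE_FALLBACK_CHARS = "._- "


def content_disposition(filename: str) -> str:
    """Build a Content-Disposition value that is safe for any filename."""
    # ASCII fallback for old clients: anything outside a small allowlist
    # (including quotes, CR, and LF) becomes an underscore.
    ascii_name = filename.encode("ascii", "replace").decode("ascii")
    fallback = "".join(
        ch if ch.isalnum() or ch in _SAFE_FALLBACK_CHARS else "_"
        for ch in ascii_name
    )
    # filename* carries the real name, percent-encoded as UTF-8.
    encoded = quote(filename, safe="")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"
```

Percent-encoding with `safe=""` ensures CR, LF, and quote characters in a user-supplied title cannot terminate the header or start a new one.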

Reliability (Margo):
- Add asyncio.Lock to prevent cache thundering herd on expiry
- Implement concurrent download tracking with proper slot management
- Add comprehensive filesystem error handling (OSError, permissions)
- Add file validation (empty files, size limits, readable check)
- Add storage availability dependency check
- Reduce TOCTOU window with final validation before serving
- Change logging from DEBUG to WARNING for operational issues

Performance (Brendan):
- Implement max_concurrent enforcement (was configured but not used)
- Add 100GB file size sanity check
- Document that rate limit requires restart to change

Code Quality:
- Add comprehensive test suite (18 tests covering all scenarios)
- Improve filename sanitization (length limit, space handling)
- Add proper docstrings with raised exceptions documented

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
* Security: Auto-detect Redis for rate limiting storage (#446)

Addresses the security issue where in-memory rate limiting allows
attackers to bypass rate limits in multi-instance deployments by
distributing requests across instances.

Changes:
- Auto-detect Redis: If VLOG_REDIS_URL is configured, rate limiting
  now automatically uses Redis storage instead of in-memory
- Explicit override: VLOG_RATE_LIMIT_STORAGE_URL still takes precedence
  if explicitly set
- Enhanced warnings: Startup warnings now explicitly mention "SECURITY"
  and explain the attack vector
- Updated documentation: CONFIGURATION.md now has security warnings
  and explains auto-detection
- Updated .env.example: Better comments explaining the security issue
  and auto-detection feature

This is a non-breaking change. Existing deployments continue to work:
- Single instance: Defaults to memory:// (with warning)
- Multi-instance with Redis: Automatically uses Redis
- Explicit config: Honors VLOG_RATE_LIMIT_STORAGE_URL
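
The precedence described above can be sketched as a single function. The environment variable names come from this commit, while the function name and the whitespace handling are assumptions.

```python
from typing import Mapping


def resolve_rate_limit_storage(env: Mapping[str, str]) -> str:
    """Pick the rate limit storage URL from environment-style settings."""
    explicit = env.get("VLOG_RATE_LIMIT_STORAGE_URL", "").strip()
    if explicit:
        # An explicit setting always wins, even if it is memory://.
        return explicit
    redis_url = env.get("VLOG_REDIS_URL", "").strip()
    if redis_url:
        # Auto-detect: share counters across instances via Redis.
        return redis_url
    # Single-instance default; unsafe behind a load balancer because each
    # instance would keep its own counters.
    return "memory://"


if __name__ == "__main__":
    import os

    print(resolve_rate_limit_storage(os.environ))
```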

Fixes #446

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address review feedback for rate limit auto-detection

Changes based on specialist code review:

1. Add security warning to worker_api.py (was missing)
   - Matches admin.py and public.py warnings

2. Improve config.py clarity:
   - Better variable names: _explicit_rate_limit_storage, _redis_url_for_rate_limit
   - Avoid redundant os.getenv() calls
   - Add info log when Redis is auto-detected (aids debugging)

3. Add unit tests for auto-detection logic (tests/test_config.py):
   - test_defaults_to_memory_when_no_redis
   - test_auto_detects_redis_from_redis_url
   - test_explicit_storage_takes_precedence_over_redis_url
   - test_explicit_memory_overrides_redis_url
   - test_empty_redis_url_falls_back_to_memory

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
) (#499)

* Refactor: Extract transcoding job state machine into explicit code (#438)

- Add api/job_state.py with TranscodingJobStateMachine class
  - JobState enum: unclaimed, claimed, expired, completed, failed, retrying
  - State predicates: is_unclaimed(), is_claimed(), is_expired(), etc.
  - SQL condition generators: sql_unclaimed(), sql_claimed(), etc.
  - Transition validation: can_claim(), can_complete(), can_fail()
  - JobRow dataclass for type-safe state determination
- Add comprehensive tests (49 test cases) for state machine
- Simplify 109-line comment block in database.py to 6 lines
- Update worker_api.py to reference new state machine module

The implicit state machine derived from nullable field combinations is now
explicit, self-documenting code that reads like requirements.
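
As a rough picture of what deriving state from nullable fields looks like once made explicit, consider the sketch below. It is not the contents of api/job_state.py: the column names `claimed_at` and `claim_expires_at`, the function name `get_state`, the order of checks, and treating a claim as expired at exactly its expiry time are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class JobState(str, Enum):
    UNCLAIMED = "unclaimed"
    CLAIMED = "claimed"
    EXPIRED = "expired"
    COMPLETED = "completed"
    FAILED = "failed"
    RETRYING = "retrying"


@dataclass(frozen=True)
class JobRow:
    claimed_at: Optional[datetime]        # assumed column name
    claim_expires_at: Optional[datetime]  # assumed column name
    completed_at: Optional[datetime]
    last_error: Optional[str]
    attempt_number: int
    max_attempts: int


def get_state(job: JobRow, current_time: datetime) -> JobState:
    """Derive the job state from which nullable fields are set."""
    if job.completed_at is not None:
        return JobState.COMPLETED
    if job.claimed_at is not None:
        expires = job.claim_expires_at
        if expires is not None and expires <= current_time:
            return JobState.EXPIRED
        return JobState.CLAIMED
    if job.last_error is None:
        return JobState.UNCLAIMED
    # Unclaimed with an error: retry while attempts remain.
    if job.attempt_number >= job.max_attempts:
        return JobState.FAILED
    return JobState.RETRYING
```

Each branch names a state that was previously implied only by a combination of NULL and non-NULL columns.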

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address code review feedback for state machine

Security fixes:
- Add SQL identifier validation to prevent injection in table_alias
- Add SQL parameter validation to prevent injection in now_param
- Sanitize error messages to avoid leaking internal state details

Reliability fixes:
- Add timezone normalization in JobRow.from_mapping() for naive datetimes
- Add defensive completed_at checks in is_failed() and is_retrying()
- Validate numeric fields (attempt_number, max_attempts >= 1)
- Improve indeterminate state error logging

Consistency fixes:
- Fix sql_claimable() to include both unclaimed AND retrying jobs
- Update is_unclaimed() to check last_error is None (distinguishes from retrying)
- Rename 'now' parameter to 'current_time' for clarity

Performance:
- Remove unused jobs_table constructor parameter
- Add _normalize_job() helper to reduce code duplication

Documentation:
- Add state transition diagram to module and class docstrings
- Add thread safety and distributed safety notes
- Improve docstrings with worker identity verification notes

Tests (71 total):
- Add SQL injection prevention tests
- Add timezone normalization tests
- Add boundary condition tests (exact expiration time)
- Add indeterminate state test
- Add numeric validation tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
…#500)

* Security: Harden CSP by removing unsafe-inline and unsafe-eval (#325)

- Remove 'unsafe-inline' and 'unsafe-eval' from script-src directives
- Remove 'unsafe-inline' from style-src directives
- Replace Alpine.js standard build with CSP-compatible build
  - Public pages: alpine.csp.min.js from @alpinejs/csp
  - Admin: npm package @alpinejs/csp instead of alpinejs
- Extract inline JavaScript to external files:
  - web/public/static/js/pages/home.js (from index.html)
  - web/public/static/js/pages/category.js (from category.html)
  - web/public/static/js/pages/tag.js (from tag.html)
- Move [x-cloak] CSS to external stylesheets:
  - web/public/static/css/bundle.css
  - web/admin/src/styles/tokens.css
- Add TypeScript declaration for @alpinejs/csp module

The CSP-compatible Alpine.js build avoids using eval() and Function()
constructors, enabling stricter Content Security Policy headers that
provide real XSS protection.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address code review feedback for CSP hardening

- Fix global namespace pollution in page JS files:
  - Wrap all page components in IIFEs with 'use strict'
  - Register components via Alpine.data() instead of global functions
  - Encapsulate constants (MAX_SEARCH_LENGTH, SLUG_PATTERN) in IIFE scope

- Improve script loading consistency:
  - Add defer attribute to all page script tags
  - Remove parentheses from x-data attributes (use string reference)
  - Ensure consistent loading order across all pages

- Strengthen CSP headers:
  - Add object-src 'none' to block plugin-based XSS vectors
  - Add base-uri 'self' to prevent base tag injection attacks
  - Apply to all HTML files (index, category, tag, watch, admin)

- Document vendor file provenance:
  - Add source URL, version, license info to alpine.csp.min.js
  - Include instructions for verifying integrity

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Replace complex Alpine expressions with precomputed properties for
CSP-compliant build compatibility. The Alpine CSP build cannot parse:
- Object class syntax: { 'class': condition }
- Boolean expressions: condition && !other
- Method calls: .getFullYear(), .toString()
- Array literals in templates
- Inline style bindings

Changes:
- Add precomputed _-prefixed properties for all display values
- Replace :style bindings with CSS classes for progress bars
- Add progress bar width classes (5-100% in 5% increments)
- Add skeleton utility classes to avoid inline styles
- Apply watermark styles via JavaScript instead of Alpine :style
- Update $watch() calls to keep precomputed values in sync

Files: home.js, category.js, tag.js, watch.js, index.html,
category.html, tag.html, watch.html, video-card.css, skeleton.css

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add precomputed watermark display values (_showImageWatermark,
  _showTextWatermark, _watermarkPosition, _watermarkImageUrl, _watermarkText)
- Add precomputed _showVideo flag to replace complex boolean expression
- Update watch.html to use precomputed values instead of complex expressions
  with && and || operators that Alpine CSP cannot parse
- Fix player-controls.js initialization order (buildSpeedOptions after DOM refs)
- Use precomputed _categoryHref instead of inline string concatenation

These changes fix "Cannot read properties of undefined (reading 'description')"
and other Alpine CSP runtime errors when watching videos.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Alpine.js CSP build does not support optional chaining (?.) in expressions.
Moved all video display properties to top-level component state:

- Added _videoTitle, _videoCategoryName, _videoCategoryHref, _videoShowCategory
- Added _videoThumbnailUrl, _videoPublishedDate, _videoDuration, _videoResolution
- Added _videoHasQualities, _videoQualities, _videoDownloadHref
- Added _videoHasDescription, _videoDescription
- Added updateVideoDisplayProperties() to populate these when video loads
- Updated all watch.html bindings to use top-level properties instead of video?.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Alpine CSP build cannot parse method calls with variable arguments like
selectCategory(cat). Changed to:
- Store category slug in data-slug attribute
- Use selectCategoryBySlug() method that reads from $el.dataset.slug

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Alpine CSP build cannot parse method calls with arguments. Changed to use
data attributes for all event handlers that need to pass values:

- selectCategory(cat) → selectCategoryBySlug() with data-slug attribute
- setViewMode('grid'/'list') → setViewModeFromData() with data-mode attribute
- toggleWatchLater(video) → toggleWatchLaterById() with data-video-id attribute

Updated all public pages: index.html, category.html, tag.html
Added CSP-compatible methods to: home.js, category.js, tag.js

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Alpine CSP build cannot parse method calls even with empty parentheses.
Changed all @click="method()" to @click="method" across all public pages.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
* Add webhook notifications for events (Issue #203)

Implement a complete webhook notification system for external integrations:

Database:
- Add webhooks table for subscription configuration
- Add webhook_deliveries table for delivery history/retry tracking
- Migration 027_add_webhooks.py with proper indices

API Endpoints:
- GET/POST/PUT/DELETE /api/webhooks - CRUD for webhook subscriptions
- GET /api/webhooks/stats - System-wide delivery statistics
- POST /api/webhooks/{id}/test - Test webhook delivery
- GET /api/webhooks/{id}/deliveries - Delivery history
- POST /api/webhooks/{id}/deliveries/{id}/retry - Retry failed delivery

Webhook Service (api/webhook_service.py):
- HMAC-SHA256 payload signing for security
- Exponential backoff retry (configurable up to 10 attempts)
- Background delivery worker with configurable concurrency
- Rate-limited database updates
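
For readers integrating a receiver, the HMAC-SHA256 signing mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code in api/webhook_service.py: the function names, the hex encoding of the digest, and the example payload shape are all assumptions.

```python
import hashlib
import hmac
import json


def sign_payload(secret: str, body: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of the raw request body."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def verify_signature(secret: str, body: bytes, received: str) -> bool:
    """Recompute the digest and compare it in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, received)


# Hypothetical event payload; the real payload shape is not shown in this PR.
body = json.dumps({"event": "video.ready", "video_id": 42}, sort_keys=True).encode("utf-8")
secret = "x" * 32  # placeholder secret, for illustration only
signature = sign_payload(secret, body)
```

A receiver would call verify_signature() on the exact bytes it received, before parsing the JSON, since re-serializing the payload can change the bytes and invalidate the digest.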

Supported Events:
- video.uploaded, video.ready, video.failed
- video.deleted, video.restored
- transcription.completed
- worker.registered, worker.offline

Settings (database-backed via settings_service.py):
- webhooks.enabled, max_retries, retry_base_delay
- retry_backoff_multiplier, request_timeout
- max_concurrent_deliveries, delivery_batch_size

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Address security, reliability, and performance review feedback for webhooks

Security improvements:
- Add SSRF protection with blocked IP ranges (private networks, cloud metadata)
- Add header injection protection (protected headers list)
- Enforce minimum secret length (32 characters)
- Disable following HTTP redirects so a redirect cannot bypass SSRF checks
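
As a rough sketch of the blocked-IP-range idea above, using only the standard library: the network list, the function names, and the resolve-then-check approach are assumptions for illustration and may differ from what the webhook service does.

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Illustrative block list: private, loopback, and link-local ranges.
# 169.254.0.0/16 covers the common cloud metadata address 169.254.169.254.
BLOCKED_NETWORKS = [
    ipaddress.ip_network(net)
    for net in (
        "10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16",
        "127.0.0.0/8", "169.254.0.0/16",
        "::1/128", "fc00::/7", "fe80::/10",
    )
]


def is_blocked_ip(address: str) -> bool:
    """Return True if the address falls inside any blocked network."""
    ip = ipaddress.ip_address(address.split("%")[0])  # drop any IPv6 scope id
    # Treat IPv4-mapped IPv6 addresses (::ffff:a.b.c.d) as their IPv4 form.
    if ip.version == 6 and ip.ipv4_mapped is not None:
        ip = ip.ipv4_mapped
    return any(ip in net for net in BLOCKED_NETWORKS)


def is_url_allowed(url: str) -> bool:
    """Allow only http(s) URLs whose host resolves solely to public addresses."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        infos = socket.getaddrinfo(parts.hostname, parts.port or 443,
                                   proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return False
    return all(not is_blocked_ip(info[4][0]) for info in infos)
```

A check like this only covers the first request, which is presumably why the commit also stops the HTTP client from following redirects.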

Reliability improvements:
- Add circuit breaker pattern for failing webhooks
- Add crash recovery for in-flight deliveries on startup
- Add worker health monitoring with heartbeat tracking
- Add graceful shutdown with configurable timeout
- Add jitter to exponential backoff to prevent thundering herd
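
Exponential backoff with jitter can be sketched as below. The parameter names echo the retry_base_delay and retry_backoff_multiplier settings listed earlier, but the default values, the max_delay cap, and the choice of "full jitter" (a uniform draw between zero and the backoff ceiling) are assumptions, not the service's actual behavior.

```python
import random


def retry_delay(attempt: int,
                base_delay: float = 1.0,
                multiplier: float = 2.0,
                max_delay: float = 3600.0) -> float:
    """Return a randomized delay in seconds for a zero-based retry attempt.

    The ceiling grows as base_delay * multiplier ** attempt, capped at
    max_delay. Drawing uniformly below the ceiling spreads out retries
    that would otherwise all fire at the same moment.
    """
    ceiling = min(max_delay, base_delay * multiplier ** attempt)
    return random.uniform(0.0, ceiling)


# Example schedule for a webhook limited to 10 attempts.
delays = [retry_delay(attempt) for attempt in range(10)]
```

Without jitter, every delivery that failed during the same outage would retry on the same schedule and hit the recovering endpoint at once.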

Performance improvements:
- Use shared HTTP client with connection pooling
- Batch insert delivery records in single transaction
- Fix N+1 query pattern with bulk operations
- Use atomic SQL updates to prevent race conditions
- Add pagination to webhook list endpoint
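
To illustrate what an atomic SQL update buys over a read-modify-write cycle, here is a small sketch using an in-memory SQLite database. The failure_count column and the record_failure helper are hypothetical; the real schema lives in migration 027_add_webhooks.py and is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE webhooks ("
    "id INTEGER PRIMARY KEY, "
    "failure_count INTEGER NOT NULL DEFAULT 0)"  # hypothetical column
)
conn.execute("INSERT INTO webhooks (id) VALUES (1)")


def record_failure(db: sqlite3.Connection, webhook_id: int) -> None:
    """Increment the counter inside the database in a single statement.

    Because the new value is computed by the UPDATE itself, two concurrent
    workers cannot both read the same old value and overwrite each other.
    """
    with db:  # commits on success, rolls back on error
        db.execute(
            "UPDATE webhooks SET failure_count = failure_count + 1 WHERE id = ?",
            (webhook_id,),
        )


for _ in range(3):
    record_failure(conn, 1)

count = conn.execute(
    "SELECT failure_count FROM webhooks WHERE id = 1"
).fetchone()[0]
```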

New features:
- Add missing webhook triggers for transcription.completed, worker.registered, worker.offline
- Add automatic delivery history cleanup (configurable retention period)
- Add webhook worker status endpoint

Fixes Issue #203 review feedback.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Documentation updates across 7 files (+1084 lines):

API.md:
- Add authentication endpoints (login/logout/check/csrf-token)
- Add webhook endpoints with CRUD, events, and verification examples
- Add video download endpoints

DATABASE.md:
- Fix streaming_format constraint (NOT NULL with default)
- Add webhooks, webhook_deliveries, reencode_queue, deployment_events tables
- Update worker_api_keys with hash_version for argon2id

CONFIGURATION.md:
- Fix sprite sheet variable names (VLOG_SPRITE_SHEET_*)
- Add webhook notification settings section
- Add video download settings section

ARCHITECTURE.md:
- Update system diagram with Settings Service and Webhook Delivery
- Add Settings Service, Webhook Service, Video Downloads sections
- Update shared utilities table with new modules

ADMIN_UI_GUIDE.md:
- Add Webhooks tab to navigation
- Add Video Chapters, Sprite Sheets, Re-encode Queue sections
- Expand Workers tab with management features

TROUBLESHOOTING.md:
- Add Webhook Issues section
- Add Rate Limiting Issues section
- Add CSP Issues section
- Add Video Download Issues section

UPGRADING.md:
- Add v0.2.x upgrade section for webhooks/downloads/argon2id
- Update Breaking Changes Log with new migrations and endpoints

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fixes test_env_example_completeness test failure.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove unused AsyncMock import
- Remove unused _release_download_slot import
- Sort import blocks

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Fix test_log_original_logs_message: specify logger name for caplog
- Fix test_end_analytics_session: use dict access on Record object

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@filthyrake filthyrake merged commit e1520cf into main Jan 4, 2026
6 of 16 checks passed