Feature: Auto-detect video chapters from metadata/transcription (#493) #494

Merged
filthyrake merged 2 commits into dev from feature/493-auto-detect-chapters
Jan 3, 2026

@filthyrake
Owner

Summary

Changes

  • New API Endpoint: POST /admin/videos/{video_id}/chapters/auto-detect
  • New Module: api/chapter_detection.py with ffprobe and transcription utilities
  • New Schemas: ChapterDetectionSource, AutoDetectChaptersRequest, AutoDetectChaptersResponse

Features

1. Import from Video Metadata

  • Extract chapter markers embedded in video files via ffprobe -show_chapters
  • Support common formats (Matroska chapters, MP4 chapters, etc.)
  • Graceful fallback when source file not found or no chapters embedded
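A minimal sketch of how the metadata path could work, assuming ffprobe is invoked with JSON output. The function names `parse_ffprobe_chapters` and `extract_chapters` are hypothetical and are not taken from `api/chapter_detection.py`; the empty-list fallback is one way to read "graceful fallback", not necessarily what the module does.

```python
import json
import subprocess
from typing import Any


def parse_ffprobe_chapters(ffprobe_json: str) -> list[dict[str, Any]]:
    """Convert `ffprobe -show_chapters -print_format json` output into simple dicts."""
    data = json.loads(ffprobe_json or "{}")
    chapters = []
    for index, raw in enumerate(data.get("chapters", []), start=1):
        tags = raw.get("tags") or {}
        chapters.append({
            # Containers may omit the title tag; fall back to a numbered label.
            "title": tags.get("title") or f"Chapter {index}",
            "start": float(raw["start_time"]),
            "end": float(raw["end_time"]),
        })
    return chapters


def extract_chapters(path: str, timeout: float = 30.0) -> list[dict[str, Any]]:
    """Run ffprobe on `path`; return [] when the file, ffprobe, or chapters are missing."""
    cmd = ["ffprobe", "-v", "quiet", "-print_format", "json", "-show_chapters", path]
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout, check=True
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a missing ffprobe binary, a non-zero exit status, and a timeout.
        return []
    try:
        return parse_ffprobe_chapters(result.stdout)
    except (ValueError, KeyError):
        # Malformed JSON or a chapter entry without timing fields.
        return []
```

Keeping the parser separate from the subprocess call makes it possible to unit-test the parsing with canned ffprobe output, which matches the "mocked" tests in the test plan.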

2. Generate from Transcription

  • Analyze transcription text to generate chapter suggestions
  • Sentence-based segmentation with configurable minimum length
  • Auto-generates titles from transcript content (removes filler words)
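The transcription path might look roughly like the sketch below. It assumes the transcript is available as timed segments of `(start_seconds, text)`, which this PR does not confirm; `make_title`, `suggest_chapters`, and the filler-word list are illustrative names and values, not the project's. A chapter is started only at a sentence boundary and only once `min_chapter_length` seconds have passed since the previous chapter.

```python
import re

# Pre-compiled at module level. The filler list is an illustrative assumption.
FILLER_RE = re.compile(
    r"\b(?:um+|uh+|er+|like|you know|so|well|okay|basically|actually)\b,?\s*",
    re.IGNORECASE,
)
SENTENCE_END_RE = re.compile(r"[.!?]\s*$")


def make_title(text: str, max_words: int = 6) -> str:
    """Build a short title from transcript text with filler words removed."""
    cleaned = FILLER_RE.sub("", text)
    words = cleaned.strip(" .,!?").split()
    title = " ".join(words[:max_words])
    return title[:1].upper() + title[1:] if title else "Untitled"


def suggest_chapters(segments, min_chapter_length=60.0):
    """Suggest chapters from (start_seconds, text) segments.

    A new chapter begins only when the previous segment ended a sentence
    and at least `min_chapter_length` seconds have elapsed.
    """
    chapters = []
    last_start = None
    prev_ended_sentence = True
    for start, text in segments:
        far_enough = last_start is None or start - last_start >= min_chapter_length
        if far_enough and prev_ended_sentence:
            chapters.append({"start": start, "title": make_title(text)})
            last_start = start
        prev_ended_sentence = bool(SENTENCE_END_RE.search(text))
    return chapters
```

This is a deliberately simple heuristic; real topic segmentation would likely need more signal than sentence boundaries and elapsed time.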

API Usage

POST /admin/videos/{id}/chapters/auto-detect
{
  "source": "metadata" | "transcription" | "both",
  "min_chapter_length": 60,
  "replace_existing": false
}
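For illustration, the request body above could be validated along these lines. This is a stdlib stand-in, not the project's schema code (the real `AutoDetectChaptersRequest` is not shown on this page). The defaults mirror the example body and are an assumption; the 10-600 second range is the one stated in the first commit message.

```python
from dataclasses import dataclass
from enum import Enum


class ChapterDetectionSource(str, Enum):
    METADATA = "metadata"
    TRANSCRIPTION = "transcription"
    BOTH = "both"


@dataclass(frozen=True)
class AutoDetectChaptersRequest:
    source: ChapterDetectionSource
    min_chapter_length: int = 60      # assumed default, taken from the example body
    replace_existing: bool = False    # assumed default, taken from the example body

    def __post_init__(self):
        # Coerce plain strings such as "metadata"; unknown values raise ValueError.
        object.__setattr__(self, "source", ChapterDetectionSource(self.source))
        if not 10 <= self.min_chapter_length <= 600:
            raise ValueError("min_chapter_length must be between 10 and 600 seconds")
```

Usage: `AutoDetectChaptersRequest(source="both", min_chapter_length=120)` is accepted, while an unknown source or an out-of-range length raises `ValueError`.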

Test plan

  • Unit tests for schema validation
  • Unit tests for ffprobe chapter extraction (mocked)
  • Unit tests for transcription-based chapter generation
  • Unit tests for chapter filtering by minimum length
  • Manual testing with real video files containing chapters
  • Manual testing with transcribed videos

Closes #493

🤖 Generated with Claude Code

filthyrake and others added 2 commits January 3, 2026 12:02
Add ability to automatically detect and generate chapter markers from
video metadata or transcription analysis.

## New Features

### API Endpoint
- POST /admin/videos/{video_id}/chapters/auto-detect
  - source: 'metadata', 'transcription', or 'both'
  - min_chapter_length: minimum seconds between chapters (10-600)
  - replace_existing: whether to clear existing chapters
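
One possible reading of `min_chapter_length` as "minimum seconds between chapters", sketched with a hypothetical helper `filter_by_min_length` (not a name from this PR): a chapter is kept only if it starts at least that many seconds after the previously kept one.

```python
def filter_by_min_length(chapters, min_chapter_length):
    """Drop chapters that start too soon after the previously kept chapter."""
    kept = []
    for chapter in sorted(chapters, key=lambda c: c["start"]):
        if not kept or chapter["start"] - kept[-1]["start"] >= min_chapter_length:
            kept.append(chapter)
    return kept
```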

### Metadata Chapter Extraction
- Extract chapter markers embedded in video files via ffprobe
- Supports Matroska (MKV), MP4/MOV, and other container formats
- Graceful fallback when no chapters found

### Transcription-Based Generation
- Generate chapters from completed transcription text
- Sentence-based analysis for topic segmentation
- Configurable minimum chapter length
- Removes filler words from generated titles

## Implementation

- New module: api/chapter_detection.py with utility functions
- New schemas: ChapterDetectionSource, AutoDetectChaptersRequest,
  AutoDetectChaptersResponse, DetectedChapter
- Comprehensive test coverage for all detection methods

Closes #493

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Security & reliability improvements:
- Add HTML entity escaping for metadata-extracted titles (XSS prevention)
- Add transaction retry logic with execute_with_retry()
- Add SELECT FOR UPDATE for concurrency protection
- Add endpoint-level 60s timeout for detection phase
- Add timeout on process.wait() after kill signal
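
As a rough illustration of the title-escaping item, titles read from container metadata can be passed through the standard library's `html.escape` before being stored or returned. The helper name `sanitize_title` and the length cap are assumptions, not the project's code.

```python
import html

MAX_TITLE_LENGTH = 255  # hypothetical cap; the real limit is not stated in this PR


def sanitize_title(raw: str) -> str:
    """Collapse whitespace, truncate, then HTML-escape an untrusted chapter title."""
    collapsed = " ".join(raw.split())
    # Truncate before escaping so an entity such as &amp; is never cut in half.
    return html.escape(collapsed[:MAX_TITLE_LENGTH], quote=True)
```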

Performance improvements:
- Convert N+1 inserts to single batch INSERT RETURNING query
- Pre-compile regex patterns at module level
- Add named constants for magic numbers

Code quality fixes:
- Replace print() with logger.error() for proper logging
- Remove unused video_slug parameter from function signature
- Rename internal dataclass to avoid collision with Pydantic schema
- Update tests for renamed class and removed parameter

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

filthyrake merged commit 91cf2a6 into dev on Jan 3, 2026
6 checks passed
filthyrake deleted the feature/493-auto-detect-chapters branch on January 3, 2026 20:28