fix(pdf-quality): Phase 3 timeout + dataset dir permissions #236

Merged
mrviduus merged 1 commit into main from fix/pdf-cleanup-poller
May 22, 2026
@mrviduus

Summary

First prod test of feat-0007 Phase 3 surfaced two bugs. Phase 1-2 and the
Phase 3 mechanism itself (chapter → Claude → preservation gate → write back)
worked; these two stopped it short.

Bugs fixed

  • Claude timeout too short: `timeout 300` was hardcoded. Large chapters
    (15-20k words) take 10-15 min for Claude to rewrite, so every big content
    chapter timed out and was skipped (wasting 5 min each). The limit now
    comes from the CLEANUP_TIMEOUT env var, default 900s.
  • data/pdf-cleanup-dataset permission denied: data/ is root-owned, and
    the poller runs as the deploy user, so it can't mkdir there.
    `make fix-permissions` now creates and chowns the dir via the root alpine
    container (same pattern as the other caches). The poller also degrades
    gracefully if the dir is missing: cleanup still runs and only
    pair-logging is skipped.
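
As a rough illustration of the timeout fix, here is a minimal sketch of reading the limit from the environment with a 900s fallback. Only CLEANUP_TIMEOUT and its default come from this PR; `run_cleanup` and whatever command is passed to it are hypothetical stand-ins, and the actual code in quality-poll.sh may be structured differently. It assumes GNU coreutils `timeout`, which exits with status 124 when the limit is hit.

```shell
#!/usr/bin/env bash
# Sketch only: CLEANUP_TIMEOUT is from the PR; run_cleanup is a hypothetical helper.

# Seconds allowed per chapter rewrite; falls back to 900 when unset or empty.
CLEANUP_TIMEOUT="${CLEANUP_TIMEOUT:-900}"

# Run the given command under the time limit and report a timeout on stderr.
# GNU timeout exits with 124 when the command was killed for running too long.
run_cleanup() {
  local status=0
  timeout "$CLEANUP_TIMEOUT" "$@" || status=$?
  if [ "$status" -eq 124 ]; then
    echo "cleanup timed out after ${CLEANUP_TIMEOUT}s, skipping chapter" >&2
  fi
  return "$status"
}
```

Under these assumptions, a deployment that regularly sees very large chapters could raise the limit per run, for example by exporting `CLEANUP_TIMEOUT=1800` before starting the poller.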

Changes

  • infra/scripts/quality-poll.sh: CLEANUP_TIMEOUT env (default 900s);
    graceful dataset-dir handling.
  • Makefile: fix-permissions creates/chowns data/pdf-cleanup-dataset.
  • .env.example: documents CLEANUP_TIMEOUT.
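
The graceful dataset-dir handling could look roughly like the sketch below. The DATASET_DIR variable, the `log_pair` helper, and the file naming are all hypothetical and not taken from quality-poll.sh; the only behavior drawn from this PR is that pair-logging is skipped, without failing the cleanup, when the directory is unavailable.

```shell
#!/usr/bin/env bash
# Sketch only: DATASET_DIR, dataset_dir_usable, log_pair and the file names
# are hypothetical; the real poller may differ.

DATASET_DIR="${DATASET_DIR:-data/pdf-cleanup-dataset}"

# Succeeds if the dataset dir exists (or can be created) and is writable.
dataset_dir_usable() {
  mkdir -p "$DATASET_DIR" 2>/dev/null && [ -w "$DATASET_DIR" ]
}

# Save a before/after pair when possible; otherwise warn and carry on,
# so a missing dir never aborts the cleanup itself.
log_pair() {
  local name="$1" before="$2" after="$3"
  if ! dataset_dir_usable; then
    echo "dataset dir unavailable, skipping pair-logging for $name" >&2
    return 0
  fi
  printf '%s' "$before" > "$DATASET_DIR/$name.before.html"
  printf '%s' "$after" > "$DATASET_DIR/$name.after.html"
}
```

Returning 0 on the skip path is the design choice that matters here: a caller running under `set -e` keeps going, which matches the "cleanup still runs" behavior described above.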

Tests

Rollback plan

Pre-release bug fix to a flag-gated feature: revert the commit if needed.
There is no runtime impact while CONTENT_CLEANUP_ENABLED is off.

🤖 Generated with Claude Code

First prod test of Phase 3 surfaced two bugs:

- Claude timeout was a hardcoded 300s — large chapters (15-20k words)
  need 10-15 min to rewrite, so every big chapter timed out and was
  skipped. Now CLEANUP_TIMEOUT env var, default 900s.
- data/pdf-cleanup-dataset couldn't be created — data/ is root-owned,
  the poller runs as the deploy user. `make fix-permissions` now creates
  + chowns it (via the root alpine container, like the other caches);
  the poller degrades gracefully (skips pair-logging) if it's missing.

Phase 1-2 + the Phase 3 mechanism itself verified working on the test
run (chapter cleaned → gate accepted → HTML written back).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mrviduus mrviduus merged commit ffd89ac into main May 22, 2026
5 checks passed
@mrviduus mrviduus deleted the fix/pdf-cleanup-poller branch May 22, 2026 21:30