refactor(convert): move 6 PDF-export converters off the UI thread #36
Merged
The PDF→DOCX / TXT / PPTX / XLSX / HTML / EPUB conversions still ran synchronously on the main thread, polling QApplication.processEvents() between pages. The UI froze and the cancel button was effectively useless on multi-page documents (PPTX in particular renders each page at 2x DPI, several seconds per page).

Migrate all 6 to BasePage._run_background, mirroring the pattern from the already-migrated _convert_images:

- Pre-flight on the main thread: dependency check (ImportError → proper localized message via QMessageBox), open the doc once for page-count, capture self._pdf_password into a local before the closure.
- do_work re-opens the fitz doc, re-authenticates if needed, runs the per-page extraction loop, emits worker.progress.emit(i, label) per page, and checks worker.is_cancelled() at each iteration so the Cancel button on the progress dialog actually aborts.
- on_done runs on the main thread via the @slot dispatcher and shows the success toast / message box.

Add 3 new i18n keys × 8 languages for the previously-missing dependency-error messages (tool.convert.dep_pptx / dep_xlsx / dep_epub); before this PR those formats fell through to the generic str(e) error.

Verified end-to-end on Ubuntu 26.04 + Py3.14.4 with a smoke test that runs all 6 converters on a 4-page sample and asserts that on_done fires on the main thread and that the output file is produced.

Remaining background-task TODOs: import_pdf.py (8 paths) and page_numbers.py (mid-flow QMessageBox prompt).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
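For reviewers unfamiliar with the pattern, here is a minimal stdlib-only sketch of the three phases described above (pre-flight, do_work, on_done). It is not the project's code: Worker, run_background, convert_txt and the cancel_first flag are hypothetical stand-ins, a queue drained by the calling thread plays the role of the Qt event loop, and the dependency check and password capture are omitted.

```python
import queue
import threading


class Worker:
    """Hypothetical stand-in for the worker object passed to do_work."""

    def __init__(self, post):
        self._post = post                  # hands a callable to the main thread
        self._cancelled = threading.Event()
        self.progress_log = []             # appended to on the main thread only

    def cancel(self):
        self._cancelled.set()

    def is_cancelled(self):
        return self._cancelled.is_set()

    def report(self, index, label):
        # Progress updates are posted, not applied directly by the worker thread.
        self._post(lambda: self.progress_log.append((index, label)))


def run_background(do_work, on_done, cancel_first=False):
    """Run do_work(worker) on a thread; call on_done(result) on the caller's thread."""
    inbox = queue.Queue()
    worker = Worker(inbox.put)
    if cancel_first:
        worker.cancel()                    # simulates Cancel pressed before page 1

    def target():
        result = do_work(worker)
        inbox.put(lambda: on_done(result))
        inbox.put(None)                    # sentinel: no more callbacks

    threading.Thread(target=target, daemon=True).start()
    # Stand-in for the GUI event loop: run posted callbacks on this thread.
    while (callback := inbox.get()) is not None:
        callback()
    return worker


def convert_txt(pages, cancel_first=False):
    # Pre-flight on the calling ("main") thread: capture state into locals.
    total = len(pages)
    outcome = {}

    def do_work(worker):
        lines = []
        for i, page in enumerate(pages):
            if worker.is_cancelled():      # checked at every page boundary
                return None
            lines.append(page.strip())
            worker.report(i, f"{i + 1}/{total}")
        return "\n".join(lines)

    def on_done(result):
        outcome["text"] = result
        outcome["thread"] = threading.current_thread()

    worker = run_background(do_work, on_done, cancel_first)
    outcome["progress"] = worker.progress_log
    return outcome
```

The point of the sketch is the division of labour: the loop body never touches UI state, cancellation is polled once per page, and the completion callback runs on the thread that started the conversion.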
Summary
Migrate the remaining 6 PDF-export converters (_convert_docx / _convert_txt / _convert_pptx / _convert_xlsx / _convert_html / _convert_epub) from synchronous UI-thread loops with QApplication.processEvents() to BasePage._run_background, following the same pattern compress / OCR / convert-images / watermark / N-up already use.

Why
PPTX and DOCX in particular render each page at high DPI (PPTX uses a 2× pixmap matrix and embeds the image per slide); on multi-page documents the UI freezes for tens of seconds and the Cancel button is effectively dead because processEvents only runs between iterations. After this change the dialog is responsive and worker.is_cancelled() is honoured at every page boundary.

Changes
- Pre-flight on the main thread: ImportError → localized QMessageBox, open doc once for page_count, capture self._pdf_password.
- do_work re-opens fitz, authenticates if needed, runs the per-page loop emitting worker.progress.emit(i, "i+1/total…") and checking worker.is_cancelled().
- on_done on main thread shows the success message.
- 3 new i18n keys × 8 languages (tool.convert.dep_pptx / dep_xlsx / dep_epub). Before this PR those three converters fell through to the generic str(e) error on missing optional deps.

Test plan
Smoke test runs all 6 converters on a 4-page sample and asserts that on_done fires on the main thread and the output file is produced. All 6 pass (txt 174B / docx 36KB / pptx 60KB / xlsx 6KB / html 563B / epub 3KB).

Remaining work
After this lands, the only tools still doing heavy work synchronously are import_pdf.py (8 conversion paths, each needs per-converter migration because they shell into different third-party libs) and page_numbers.py (has a mid-flow QMessageBox.question about replacing existing numbers, needs main-thread split before/after the prompt). Tracked in feedback_background_tasks.md (memory note).

🤖 Generated with Claude Code
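One possible shape for the page_numbers.py split mentioned under Remaining work, sketched with the standard library only. Everything here is an assumption for illustration: scan, apply_numbers and add_page_numbers are hypothetical names, pages are plain strings, "already numbered" means the last line is all digits, and the ask callable stands in for QMessageBox.question.

```python
from concurrent.futures import ThreadPoolExecutor


def _ends_with_number(page):
    lines = page.splitlines()
    return bool(lines) and lines[-1].strip().isdigit()


def scan(pages):
    """Background step 1: find pages that already carry a number."""
    return [i for i, page in enumerate(pages) if _ends_with_number(page)]


def apply_numbers(pages, replace_existing):
    """Background step 2: append a number to each page, honouring the user's answer."""
    out = []
    for number, page in enumerate(pages, start=1):
        lines = page.splitlines()
        if _ends_with_number(page):
            if not replace_existing:
                out.append(page)           # leave the existing number alone
                continue
            lines = lines[:-1]             # drop the old number
        out.append("\n".join(lines + [str(number)]))
    return out


def add_page_numbers(pages, ask):
    with ThreadPoolExecutor(max_workers=1) as pool:
        # Heavy work before the prompt runs off the calling thread.
        numbered = pool.submit(scan, pages).result()
        # The prompt itself stays on the calling ("main") thread.
        replace = ask(numbered) if numbered else False
        # Heavy work after the prompt goes back to the background.
        return pool.submit(apply_numbers, pages, replace).result()
```

In a Qt application the two blocking .result() calls would instead be on_done callbacks, so the event loop keeps running between the steps; the sketch only shows where the boundary around the prompt falls.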