Skip to content

fix: 4 critical dataloss/crash bugs + 2 related lifecycle fixes#72

Merged
nelsonduarte merged 4 commits into
mainfrom
fix/critical-dataloss-crashes
Jun 5, 2026
Merged

fix: 4 critical dataloss/crash bugs + 2 related lifecycle fixes#72
nelsonduarte merged 4 commits into
mainfrom
fix/critical-dataloss-crashes

Conversation

@nelsonduarte
Copy link
Copy Markdown
Owner

Summary

Closes 4 critical bugs identified in audit rounds 5 and 7, plus 2 related HIGH lifecycle fixes in the same code paths. All are silent-dataloss or crash risks.

Critical fixes

  • 🔴 CRIT-1 redo stack wipe (editor/tab.py) — _redo() called _add() which cleared the redo stack. After 1 redo, all other redos vanished. Now _add(_from_redo=True) skips the clear.
  • 🔴 CRIT-2 viewer saveIncr crash (viewer/canvas.py) — right-click "Delete comment" wrote to disk with no confirmation, no try/except. Read-only PDFs crashed the Qt event loop. Now prompts with defaultButton=No and surfaces errors via show_error.
  • 🔴 CRIT-3 print pixmap alpha/CMYK (viewer/panel.py) — QImage(pix.samples, …, Format_RGB888) assumed 3 bytes/pixel; crashed on alpha/CMYK PDFs. Plus pix.samples lifetime race. Applies the OCR Round 3 pattern: get_pixmap(alpha=False) + if pix.n != 3: fitz.Pixmap(fitz.csRGB, pix) + QImage(...).copy().
  • 🔴 N7-CRIT-1 same-source overwrite (8 tools) — with open(out_path, "wb") truncated the SOURCE PDF if user picked the same path for input and output. PdfWriter reads streams lazily from PdfReader; the open("wb") destroyed the input before the write. Silent data loss. New BasePage._atomic_pdf_write(writer, dst, sources=[...]) helper rejects same-path and uses tempfile + os.replace for atomicity. Migrated extract, encrypt×2, merge, reorder, rotate, watermark, page_numbers, ocr×2.

Related HIGH fixes (same files)

  • 🟠 B-extra _open_note tuple stale (viewer/canvas.py) — after entry.annots.pop(annot_idx), the cached _open_note=(page, idx) could point to the wrong annot. Now reconciles the tuple after pop.
  • 🟠 B1 viewer load race (viewer/panel.py) — _canvas._doc could briefly point to a closed document between close→reopen. Now _canvas.close_doc() is called before the old doc is closed.

Tests

  • tests/test_editor_undo.py (new) — 4 tests for CRIT-1
  • tests/test_atomic_pdf_write.py (new) — 10 tests for N7-CRIT-1 (both PdfWriter and fitz.Document branches)
  • tests/test_viewer_safety.py (new) — 6 source-level smoke tests for CRIT-2/3 + B-extra + B1
  • Final: 135 passed, 1 failed (pre-existing test_flatpak_manifest_tag_is_current)

i18n

3 new keys × 8 languages (en/pt/es/fr/de/zh/it/nl), all parity intact:

  • msg.confirm
  • viewer.confirm_delete_comment
  • tool.err.same_source_output

Audit context

These are findings from audit rounds 5 and 7. Full audit catalog has 50 bugs across 7 rounds; this PR fixes the 6 critical/data-loss ones. HIGH-severity bugs are in PR-B (fix/audit-highs).

Validation

  • 4 atomic commits (one per bug family)
  • Cross-platform os.replace (Win+POSIX)
  • i18n parity verified (8 langs × 576 keys)
  • No new pyflakes warnings
  • Smoke imports OK (from app.base import BasePage._atomic_pdf_write)

nelsonduarte and others added 4 commits June 5, 2026 17:33
_redo() now passes _from_redo=True to _add(), preventing the
self._redo_stack.clear() that was wiping all redo entries after
the first redo call.

After 3 edits -> undo x3 -> redo: all 3 are now correctly restored
(previously only 1, the rest were silently discarded).

Adds tests/test_editor_undo.py with regression coverage.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…-2 + B-extra)

Right-click "Delete comment" was calling self._doc.saveIncr() directly
on the user's file with no confirmation, no try/except, and no undo.
Read-only PDFs crashed the Qt event loop; users could lose annotations
they didn't intend to delete.

Now prompts via QMessageBox.question (defaultButton=No) and wraps
saveIncr in try/except -> show_error. Also corrects stale _open_note
tuple after annot delete: when a note above the open one is removed,
its index is shifted down so reopening the popup doesn't show the
wrong comment (B-extra).

New i18n keys (8 languages each):
- msg.confirm
- viewer.confirm_delete_comment
- tool.err.same_source_output (used by N7-CRIT-1 in the next commit)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… (CRIT-3 + B1)

Print previously crashed on PDFs with transparency or CMYK colorspace
because QImage(Format_RGB888) assumed 3 bytes/pixel. The pixmap
buffer could also be freed before painter.drawImage finished — on the
next iteration `pix = page.get_pixmap(...)` released the previous
allocation while the painter was still drawing from it (race).

Applies the same fix the OCR tool received in Round 3:
- get_pixmap(alpha=False)
- if pix.n != 3: pix = fitz.Pixmap(fitz.csRGB, pix)
- QImage(...).copy() to decouple from native buffer lifetime

Also fixes viewer.load() race where _canvas._doc pointed to a closed
document briefly (B1). panel.load() now routes the old-doc teardown
through _canvas.close_doc(), which drops _doc and bumps _gen BEFORE
the underlying fitz.Document is actually closed — so a paintEvent or
queued _on_page_ready that runs between the close and the next load
can no longer touch a freed document.

Adds tests/test_viewer_safety.py covering CRIT-2/CRIT-3/B-extra/B1
via source-level guards (these paths require a real Qt event loop
to exercise end-to-end).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
8 tools (extract, encrypt x2, merge, reorder, rotate, watermark,
page_numbers, ocr x2) wrote output with open(path, "wb") + w.write(f).
PdfWriter reads streams lazily from PdfReader; the open("wb")
truncated the input file on disk BEFORE the write completed when the
user picked the same path for input and output — silent dataloss +
corrupt output.

New helpers on BasePage:
- _check_not_same_path(dst, sources): rejects with RuntimeError
  (translated via tool.err.same_source_output) when dst resolves to
  any source path via os.path.realpath.
- _atomic_pdf_write(writer, dst, sources, save_opts): calls the
  check, then writes to a same-directory tempfile and atomically
  renames via os.replace. Accepts pypdf.PdfWriter (write(fh)) and
  fitz.Document (save(path, **opts)) — detected by module name
  (modern PyMuPDF reports module="pymupdf").

All 8 tools migrated. ocr.py also covers the .txt branch via a
parallel tempfile + os.replace path so a mistaken input==output for
text output can't truncate the source PDF either.

Test coverage in tests/test_atomic_pdf_write.py (10 cases): same-
path rejection (exact + realpath), atomic os.replace verification,
tempfile cleanup on writer error, fitz.Document branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@nelsonduarte nelsonduarte added the bug Something isn't working label Jun 5, 2026
finally:
doc.close()
assert dst.exists()
assert len(fitz.open(str(dst))) == 2
Comment thread app/tools/ocr.py
except Exception:
if os.path.exists(_tmp):
try: os.unlink(_tmp)
except OSError: pass
# display.
os.environ.setdefault("QT_QPA_PLATFORM", "offscreen")
from PySide6.QtWidgets import QApplication
_app = QApplication.instance() or QApplication([])
Comment thread tests/test_editor_undo.py
``_add/_undo/_redo`` semantics.
"""

import os
Comment thread tests/test_editor_undo.py
import sys
from pathlib import Path

import pytest
import sys
from pathlib import Path

import pytest
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants