Skip to content

fix: 7 HIGH audit findings (memory, security, updater, print)#73

Closed
nelsonduarte wants to merge 12 commits into
mainfrom
fix/audit-highs
Closed

fix: 7 HIGH audit findings (memory, security, updater, print)#73
nelsonduarte wants to merge 12 commits into
mainfrom
fix/audit-highs

Conversation

@nelsonduarte
Copy link
Copy Markdown
Owner

Summary

Closes 7 HIGH-severity bugs from audit rounds 5, 6, and 7.

Note: PR base is fix/critical-dataloss-crashes (PR-A). After PR-A merges, GitHub will auto-rebase this PR onto main.

Fixes

  • 🟠 R5/A2 _pending unbounded (editor/tab.py) — long sessions grew the edit history indefinitely; temp signature/image files never released. Cap at 500 + FIFO drop with temp-file cleanup. Test guards _pending_list widget sync (regression for off-by-one drift caught in review).
  • 🟠 R5/C1 signature file permissions (i18n.py, editor/dialogs.py) — ~/.pdfapps_signature.png and tmp images saved with default umask (0o644). Now os.chmod(path, 0o600) after save.
  • 🟠 R5/D2 password plaintext in memory (base.py, editor/tab.py, viewer/panel.py) — cached _pdf_password persisted until GC. Heap dumps could expose. Best-effort wipe via wipe_pdf_password() in utils.py (Python str immutability documented as limitation). Wired into close/load paths.
  • 🟠 R5/F3 updater tempfile leak (updater.py) — _apply_update_unix left ~100 MB installer tempfile on failed apply. Wrapped in try/finally.
  • 🟠 R6/B6/B7 UpdateDialog signal duplication (updater.py) — retry reconnected _signals.finished/error/cancelled to a new thread without disconnecting old. Qt delivered finished to all historical threads. Now fresh _Signals instance per download, old deleteLater'd.
  • 🟠 R7/N7-H1 print range/copies/reverse (viewer/panel.py) — QPrintDialog exposes fromPage/toPage/copyCount/pageOrder; loop ignored all four. Now respects user choice. Preserves the CRIT-3 pixmap fix inside the loop.
  • 🟠 R7/N7-H2 CWD-dependent tests (tests/test_pdfapps.py) — 11 tests opened "app/base.py" etc. with relative paths. Migrated to _REPO_ROOT = Path(__file__).resolve().parent.parent.

Tests

  • Extended tests/test_editor_undo.py (+4 tests including widget-sync regression)
  • New tests/test_signature_perms.py (3 tests; functional 0o600 check skipped on Windows)
  • New tests/test_password_lifecycle.py (8 tests)
  • Extended tests/test_updater.py (+6 tests for cleanup + signal lifecycle)
  • Extended tests/test_viewer_safety.py (+4 tests for print range/copies/reverse)
  • Final: 160 passed, 1 failed (Flatpak pre-existing), 1 skipped

Validation

  • CWD-independence verified: cd /tmp && pytest <repo>/tests/ passes
  • 8 atomic commits
  • No i18n changes (this PR is internal-only)
  • No new pyflakes warnings

nelsonduarte and others added 12 commits June 5, 2026 17:33
_redo() now passes _from_redo=True to _add(), preventing the
self._redo_stack.clear() that was wiping all redo entries after
the first redo call.

After 3 edits -> undo x3 -> redo: all 3 are now correctly restored
(previously only 1, the rest were silently discarded).

Adds tests/test_editor_undo.py with regression coverage.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…-2 + B-extra)

Right-click "Delete comment" was calling self._doc.saveIncr() directly
on the user's file with no confirmation, no try/except, and no undo.
Read-only PDFs crashed the Qt event loop; users could lose annotations
they didn't intend to delete.

Now prompts via QMessageBox.question (defaultButton=No) and wraps
saveIncr in try/except -> show_error. Also corrects stale _open_note
tuple after annot delete: when a note above the open one is removed,
its index is shifted down so reopening the popup doesn't show the
wrong comment (B-extra).

New i18n keys (8 languages each):
- msg.confirm
- viewer.confirm_delete_comment
- tool.err.same_source_output (used by N7-CRIT-1 in the next commit)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… (CRIT-3 + B1)

Print previously crashed on PDFs with transparency or CMYK colorspace
because QImage(Format_RGB888) assumed 3 bytes/pixel. The pixmap
buffer could also be freed before painter.drawImage finished — on the
next iteration `pix = page.get_pixmap(...)` released the previous
allocation while the painter was still drawing from it (race).

Applies the same fix the OCR tool received in Round 3:
- get_pixmap(alpha=False)
- if pix.n != 3: pix = fitz.Pixmap(fitz.csRGB, pix)
- QImage(...).copy() to decouple from native buffer lifetime

Also fixes viewer.load() race where _canvas._doc pointed to a closed
document briefly (B1). panel.load() now routes the old-doc teardown
through _canvas.close_doc(), which drops _doc and bumps _gen BEFORE
the underlying fitz.Document is actually closed — so a paintEvent or
queued _on_page_ready that runs between the close and the next load
can no longer touch a freed document.

Adds tests/test_viewer_safety.py covering CRIT-2/CRIT-3/B-extra/B1
via source-level guards (these paths require a real Qt event loop
to exercise end-to-end).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
8 tools (extract, encrypt x2, merge, reorder, rotate, watermark,
page_numbers, ocr x2) wrote output with open(path, "wb") + w.write(f).
PdfWriter reads streams lazily from PdfReader; the open("wb")
truncated the input file on disk BEFORE the write completed when the
user picked the same path for input and output — silent dataloss +
corrupt output.

New helpers on BasePage:
- _check_not_same_path(dst, sources): rejects with RuntimeError
  (translated via tool.err.same_source_output) when dst resolves to
  any source path via os.path.realpath.
- _atomic_pdf_write(writer, dst, sources, save_opts): calls the
  check, then writes to a same-directory tempfile and atomically
  renames via os.replace. Accepts pypdf.PdfWriter (write(fh)) and
  fitz.Document (save(path, **opts)) — detected by module name
  (modern PyMuPDF reports module="pymupdf").

All 8 tools migrated. ocr.py also covers the .txt branch via a
parallel tempfile + os.replace path so a mistaken input==output for
text output can't truncate the source PDF either.

Test coverage in tests/test_atomic_pdf_write.py (10 cases): same-
path rejection (exact + realpath), atomic os.replace verification,
tempfile cleanup on writer error, fitz.Document branch.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Long editing sessions could grow _pending unbounded; edits referencing
temp signature/image files were never released until the tab closed.

Add _MAX_PENDING=500 constant; _add() now trims oldest edits and
unlinks their orphan temp files (only paths inside the system tempdir
are touched, so user-picked source images are left alone). Undo paths
already handle empty stacks correctly. The visible QListWidget shadow
is trimmed in lockstep so labels stay aligned with edit indices.

Addresses R5/A2.
Previously the persistent ~/.pdfapps_signature.png (and the in-flight
/tmp copy produced by _SignatureDialog) were saved with the default
umask (typically 0o644), allowing other users on multi-user POSIX
hosts to read the cached signature.

Apply os.chmod(path, 0o600) after both writes. Windows is unchanged
because NTFS ACLs already inherit owner-only access from the user
profile.

Addresses R5/C1.
Cached _pdf_password persisted in memory until the BasePage / EditorTab
/ ViewerPanel object was GC'd. A heap inspection or memory dump after
the user finished viewing a confidential PDF could surface the password
as plain text.

Add _clear_pdf_password() helper on BasePage (and a mirror on EditorTab,
which does not inherit from BasePage). Wire it into:
  - EditorTab._close_pdf  (already committed alongside the editor fix)
  - PdfViewerPanel.load (when switching files)
  - PdfViewerPanel.closeEvent (teardown)

Limitation documented: Python str immutability prevents a real
zero-scrub of the original bytes; the helper drops the only reachable
reference and uses a ctypes zero-buffer hint as a defensive marker.

Addresses R5/D2.
…ion on retry

Two related lifetime bugs:

* _apply_update_unix wraps the apply body in try/finally so failed
  applies no longer leak the ~100 MB installer tempfile. Pre-fix
  os.remove(downloaded) only ran on the success path, leaving the
  artefact in /tmp until the OS cleaned it up (R5/F3).

* UpdateDialog now creates a fresh _Signals instance per
  _start_download call. Pre-fix the dialog reused a single _Signals
  across retries, reconnecting finished/error/cancelled on top of
  prior thread connections. When the second download completed Qt
  delivered the signal to both the live AND already-deleted historical
  threads, producing warnings or crashes (R6/B6-B7).

The prior _signals instance is deleteLater'd before reassignment so
historical connections are released atomically.
QPrintDialog exposes fromPage/toPage/copyCount/pageOrder but the
print loop ignored all four — always printing every page once in
forward order regardless of the user's choices.

Build the iteration range from printer.fromPage()..toPage()
(treating 0,0 as "all pages" per Qt's contract), repeat the loop
copyCount() times, and reverse the page list when pageOrder() is
LastPageFirst. AttributeError-guard the enum access so PySide6
< 6.4 bindings still print (worst case: forward order, same as
before this fix). Page-break placement adjusted so neither the
first page of the first copy nor the boundary between copies
produces a blank leading page.

Preserves the CRIT-3 pixmap safety fixes (alpha=False / csRGB
fallback / .copy()) which live in the same loop body.

Addresses R7/N7-H1.
Several tests opened "app/base.py" / "installer.py" / etc. with
relative paths, breaking when pytest ran from any cwd other than
the repo root (IDEs, CI matrix configurations, sub-directories,
or /tmp).

Migrate the affected open() calls to absolute paths anchored at
_REPO_ROOT = Path(__file__).resolve().parent.parent so the suite
passes from any working directory.

Addresses R7/N7-H2.
The cap-trim block computed `extra = _pending_list.count() - len(_pending)`
BEFORE the new edit's addItem ran. In steady state at the cap that
diff was 0, so no takeItem ever fired, and the subsequent addItem left
the QListWidget with +1 row vs the underlying _pending list. The next
_undo then removed the wrong label (the visible most-recent row, not
the row matching the popped edit).

Switch the takeItem loop to use len(to_drop), which is the correct
trim count regardless of whether addItem has happened yet.

Adds a regression test that drives the real TabEditar._add against a
real QListWidget and asserts the widget rows stay locked to the
_pending list both after the cap-trim and after a subsequent _undo.

Caught in PR-B adversarial review.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- wipe_pdf_password is the single source of truth in app/utils.py;
  BasePage._clear_pdf_password, TabEditar._clear_pdf_password and
  PdfViewerPanel._clear_pdf_password are now thin wrappers that
  delegate to it. Used to be three near-identical copies — flagged in
  PR-B review as DRY rot.
- Remove duplicate `import contextlib` from updater.py:268
  (already imported at module top).

Existing password-lifecycle tests still pass (they exercise the
_clear_pdf_password method on BasePage, which now routes through the
shared helper).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@nelsonduarte nelsonduarte added the bug Something isn't working label Jun 5, 2026
@nelsonduarte nelsonduarte deleted the branch main June 5, 2026 17:37
@nelsonduarte nelsonduarte reopened this Jun 5, 2026
@nelsonduarte nelsonduarte changed the base branch from fix/critical-dataloss-crashes to main June 5, 2026 17:39
@nelsonduarte
Copy link
Copy Markdown
Owner Author

Superseded by #78 (same content, cherry-picked onto current main after PR #72 squash). Closing to avoid duplicate merge.

@nelsonduarte nelsonduarte deleted the fix/audit-highs branch June 5, 2026 17:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant