v1.2.0 - Vision-aware halftone picker, unified sample contact sheet, preserve_text fix
A maintenance release focused on making the halftone choice smarter, unifying every "look at one page through different settings" feature behind a single contact-sheet entry point, and fixing a legibility regression where saturated-coloured text on white paper was getting whitened away before binarization.
Highlights
-
Vision-aware halftone auto-picker.
recommend_dither()is now a 9-branch decision tree fed by cheap content stats (mean/std luma, edge density, texture score, bimodality, dark/light fractions, connected-region count) extracted from the photo-mask pixels in_extract_photo_features(). Below the legacy gates (text-only →none, low-DPI →blue-noise,--fax-heavy→clustered) the picker now selects within the detail family{clustered, atkinson, edd, floyd, jarvis, green-noise}based on what is actually on the page. The per-page features and the discriminating-feature values that drove the pick are surfaced in the--reportJSON so you can always see WHY a page picked what it picked. Validated onPrestige_Estates_v3.pdf: now picksedd(correct for the billboard text-on-photo cover), instead of always defaulting togreen-noise. -
Unified
--sample N --panels Kcontact sheet. A single entry point —render_contact_sheet— replaces the previous trio of competing flags. Render any page through 1 to ~20 panels, customise the recipe with--sample-include orig,gray,floyd,edd,line, and the strip above the grid shows every setting that produced the panels (input file, page number, resolution,preserve_text,ocr_text,recover_text,text_binarize,dither, auto-pick recommendation, photo_fraction). Default behaviour is still the 4-panel sample. The old--compare-page,--recover-text-preview, and--preview-pageflags continue to work but are marked(legacy)in--help. -
preserve_text legibility fix.
preserve_text_maskwas over-firing on saturated-coloured text on white paper (e.g. the gold "EXTRAORDINARY PROPERTIES" subhead beneath a logo lockup), classifying it as a "decorative accent" and whitening the strokes themselves. The fix adds a chroma-density gate: when a saturated component has no dark text strokes inside (frac < 0.005), it only counts as a decorative accent when the raw pre-morph chroma densely fills the closed component (density > 0.55). Text-shaped strokes have ~25 % chroma density (lots of paper between letters) and now fall through to the binarizer, which renders them as crisp black ink. Thetext_preservedwarning on the Prestige cover dropped from 395 kpx to 2 kpx; the same chips-with-dark-text rescue case (the original use ofpreserve_text) is unaffected.
Documentation refresh
-
Halftone-styles table reframe. The "Noise robustness" column became "Channel character", with positive descriptive labels (long-run / line-tolerant, fine-grain / detail-first, edge-enhancing / type-friendly, scanline-aligned stripes, …) instead of negatively-loaded "worst"/"poor" terms. The previous wording made the best-fidelity error-diffusion screens (
floyd,jarvis,edd) read as wrong picks, which is the opposite of what the auto-picker now chooses for the detail family. Synchronised acrossREADME.md,pdf-fax-optimizer/SKILL.md, andpdf-fax-optimizer/references/fax-optimization.md. -
text_rescue.pngcleanup. Dropped the redundant gray subhead between the title and the first section heading; rewrote thepreserve_textandrecover_textsection headings into parallel, self-contained sentences that describe what each feature actually does. Tightened the title strip to remove the dead air left by the removed subhead. -
halftone_grid.pngregen. The "EXTRAORDINARY PROPERTIES" subhead beneath the Prestige logo now reads correctly in every halftoned panel (it was silently being whitened by the over-firingpreserve_textbefore this release). -
Reference docs.
references/config-schema.mdnow marks--sample/--panels/--sample-include/--no-sample-headeras the supported flags and the older flags as(legacy).requirements.txtnotes the Python 3.10+ requirement up top.
CLI changes
- New:
--panels K(1, 2, 4, 6, 8, 12, 20, max) for--sample. - New:
--sample-include orig,gray,clustered,floyd,line,…for custom recipes. - New:
--no-sample-headerto omit the settings strip above the grid. - Legacy and still working:
--compare-page,--compare-methods,--compare-original,--recover-text-preview,--preview-page. Help text now points to the--sampleequivalents.
JSON report
pages[i].photo_featuresis now present on any page that ran the full MRC pipeline: a dict ofmean_luma,std_luma,dark_fraction,light_fraction,edge_density,texture_score,bimodal_score,photo_area_px,n_regions. This is what the auto-picker reasoned over to pick the dither.
No breaking changes
All v1.1.0 flags continue to work, including the legacy --compare-page / --recover-text-preview / --preview-page flags. The auto-picker change can surface a different default dither for any given page; if you want the v1.1.0 behaviour, pass --dither green-noise explicitly.
Install
Download pdf-fax-optimizer.zip from this release and either:
- Claude.ai — upload via Settings → Capabilities → Skills.
- Claude Code — unzip into
~/.claude/skills/. - Codex — unzip into
~/.codex/skills/.
Runtime requirements: Python 3.10+, qpdf, plus the deps in requirements.txt (PyMuPDF, Pillow, opencv-python, numpy, img2pdf, rapidocr-onnxruntime). LibreOffice headless is optional but recommended for .doc/.docx/.ppt/.pptx/.xls/.xlsx conversion.
Diff summary
2 commits since v1.1.0, 9 files changed (1137 insertions, 303 deletions). The bulk of the change is in pdf-fax-optimizer/scripts/fax_pipeline.py (smart picker + unified sampler + preserve_text fix); the rest is documentation and regenerated demo graphics.