Releases: san64777/acroforge
v0.4.0 - remove() + JSON manifest
Adds remove(pdf, names), completing the engine's CRUD (build / read_fields / fill / remove).
remove()
Delete specific AcroForm fields by the name read_fields reports, so the two compose:
import acroforge as af

specs = af.read_fields(pdf)  # pdf: an existing fillable PDF
junk = [s.name for s in specs if s.type == af.FieldType.SIGNATURE]
clean = af.remove(pdf, junk)  # raises if any name is missing
Deletes by fully-qualified name via pure object surgery (no appearance regeneration). Naming a radio group removes the whole group; removing the last field leaves an empty, re-usable /AcroForm; /XFA is stripped. Validated on 125 real public forms (incl. encrypted and XFA) with zero crashes.
JSON manifest
detect() / read_fields() return pydantic models - serialize and round-trip them with model_dump_json() / model_validate_json() for the detect -> review/edit -> build flow. No new API.
Install: pip install acroforge==0.4.0
v0.3.4 - detection precision
Best-effort detection precision improvement, driven by real-corpus QA.
The geometry detector's underline finder was firing on decorative full-width separator rules, header/footer margin rules, and near-duplicate lines, inflating false positives in detect() / make_fillable(). It now rejects near-full-width and page-margin rules, merges near-coincident duplicate lines, and applies a tolerant cross-source dedup of table cells against underlines.
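As a rough illustration of the filtering described above, here is a minimal Python sketch. It is not acroforge's implementation: the Line type, the function name filter_underlines, and every threshold (max_width_ratio, margin, tol) are assumptions chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    """A horizontal rule in page coordinates (hypothetical stand-in type)."""
    x0: float
    x1: float
    y: float


def filter_underlines(lines, page_width, page_height,
                      max_width_ratio=0.9, margin=36.0, tol=2.0):
    """Drop likely-decorative rules and merge near-duplicates.

    All thresholds are illustrative assumptions, not acroforge's values.
    """
    kept = []
    for ln in sorted(lines, key=lambda l: (l.y, l.x0)):
        width = ln.x1 - ln.x0
        # Near-full-width rules are usually separators, not fill-in blanks.
        if width >= max_width_ratio * page_width:
            continue
        # Rules inside the top/bottom margin are usually header/footer rules.
        if ln.y < margin or ln.y > page_height - margin:
            continue
        # Merge with an already-kept line that nearly coincides with this one.
        for i, other in enumerate(kept):
            if (abs(other.y - ln.y) <= tol
                    and abs(other.x0 - ln.x0) <= tol
                    and abs(other.x1 - ln.x1) <= tol):
                kept[i] = Line(min(other.x0, ln.x0), max(other.x1, ln.x1), other.y)
                break
        else:
            kept.append(ln)
    return kept
```

On a 612 x 792 pt page with these assumed defaults, a 572 pt wide rule is dropped as a separator, rules at y=20 or y=780 are dropped as margin rules, and two lines that differ by under 2 pt collapse into one.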
On a 125-form real corpus: detection precision 0.58 -> 0.61, false positives down 12%, recall held. Detection remains best-effort (a draft manifest to review).
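The three figures above are mutually consistent, which a few lines of arithmetic can confirm. The sketch assumes that "recall held" means the true-positive count is unchanged (an assumption about how the numbers were measured) and normalizes that count to 1.

```python
# Precision = TP / (TP + FP). Normalize TP to 1.0 (assumed unchanged,
# since recall held) and derive FP from the reported starting precision.
tp = 1.0
precision_before = 0.58
fp_before = tp * (1 - precision_before) / precision_before

# Apply the reported 12% drop in false positives.
fp_after = fp_before * (1 - 0.12)
precision_after = tp / (tp + fp_after)

print(round(precision_after, 2))  # 0.61
```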
pip install acroforge==0.3.4
v0.3.3 - guaranteed core survives real-world forms
Hardening release, driven by real-world QA over 125 public PDF forms (government, healthcare, legal, business). Found and fixed 4 guaranteed-core bugs that the synthetic test suite could not reach.
Fixes
- read_fields now returns the fully-qualified field name (the /T chain joined by .), so its output is addressable by fill(). Previously it returned the leaf name, which fill could not find on hierarchical / XFA forms - read-then-fill failed on most real fillable forms. Flat fields are unchanged.
- build no longer crashes on a single-button radio group (reportlab raised "RadioGroup has 1 < 2 RadioBoxes"); a 1-member radio group is built as a checkbox.
- fill and flatten no longer crash with KeyError '/N' on a widget that has an /AP with no /N (a malformed real-world form); both enumerations are guarded.
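To show what a fully-qualified name built from the /T chain looks like, here is a minimal sketch that walks a field's /Parent links and joins the partial names with dots. Plain dicts stand in for PDF field dictionaries, and qualified_name is a hypothetical helper, not part of acroforge.

```python
def qualified_name(field):
    """Join the /T partial names from the root ancestor down to `field`.

    `field` is a plain dict standing in for a PDF field dictionary
    (an assumption for this sketch). Nodes without /T are skipped.
    """
    parts = []
    seen = set()
    node = field
    while node is not None:
        # Guard against a circular /Parent chain in a malformed file.
        if id(node) in seen:
            raise ValueError("circular /Parent chain")
        seen.add(id(node))
        if "/T" in node:
            parts.append(node["/T"])
        node = node.get("/Parent")
    # Names were collected leaf-first, so reverse before joining.
    return ".".join(reversed(parts))


form = {"/T": "form1"}
page = {"/T": "page1", "/Parent": form}
leaf = {"/T": "ssn", "/Parent": page}

print(qualified_name(leaf))         # form1.page1.ssn
print(qualified_name({"/T": "x"}))  # x
```

A field with no parent keeps its leaf name, which matches the note that flat fields are unchanged.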
Result
On the 125-form real corpus: read_fields 125/125, fill 108/108, flatten 108/108, build 108/108 - zero crashes. Three regression tests added.
Install: pip install acroforge==0.3.3
v0.3.2 - robust per-page detection + list-box fix
Patch release: robustness and a list-box fix, both found by real-world testing.
Detection degrades gracefully (the headline fix)
detect() / make_fillable() used to refuse the entire document on the first image-only page. Now an image-only or error-throwing page is skipped with a warning and the rest of the document is processed. ScannedPDFError is raised only when every page is image-only. A mostly-vector form with one scanned page now yields the fields from its vector pages instead of nothing.
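The skip-and-continue behavior can be sketched as a per-page loop. This is an illustration only: detect_pages, is_image_only, and detect_page are hypothetical names, and ScannedPDFError is redefined locally as a stand-in for the error named above.

```python
import warnings


class ScannedPDFError(Exception):
    """Local stand-in for the error named in the release note."""


def detect_pages(pages, is_image_only, detect_page):
    """Collect fields page by page, skipping pages that cannot be processed.

    `pages`, `is_image_only`, and `detect_page` are hypothetical inputs for
    this sketch; they do not mirror acroforge's internal API.
    """
    fields = []
    image_only = 0
    for index, page in enumerate(pages):
        if is_image_only(page):
            image_only += 1
            warnings.warn(f"page {index}: image-only, skipped")
            continue
        try:
            fields.extend(detect_page(page))
        except Exception as exc:
            # One bad page should not sink the whole document.
            warnings.warn(f"page {index}: detection failed ({exc}), skipped")
    # Refuse only when the document contains nothing but scanned pages.
    if pages and image_only == len(pages):
        raise ScannedPDFError("every page is image-only")
    return fields
```

The design point is that the error is decided once, after the loop, from the count of image-only pages, rather than raised on the first one encountered.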
Other robustness fixes
- fill() with a list value on a non-multi-select field now raises a clear error instead of a cryptic dependency crash.
- read_fields() skips a single malformed widget (with a warning) instead of aborting, and guards against a circular /Parent chain that could hang.
- flatten() is a no-op on a PDF with no fields instead of raising.
Also (from 0.3.1, rolled in)
- Building a list box with a rect too short to fit one option row no longer crashes; it raises a clear error.
Install: pip install acroforge==0.3.2
v0.3.0 - CHOICE field type (dropdowns and list boxes)
Adds FieldType.CHOICE, covering the full PDF /Ch family - all four variants are cross-viewer verified (the selected value renders in both pdfium and pdf.js):
- Dropdown (combo box)
- Editable dropdown (accepts free-typed text)
- Single-select list box
- Multi-select list box
read_fields now recovers existing /Ch fields into CHOICE specs (options as plain strings or (export, label) pairs, plus list_box / multi_select / editable flags), round-tripping with build. fill validates values against the field's options.
import acroforge as af
from acroforge import FieldSpec, FieldType
fields = [FieldSpec(type=FieldType.CHOICE, page=0, rect=(200, 620, 360, 640),
                    name="state", options=["CA", "NY", "TX"])]
fillable = af.build(flat_pdf, fields)
filled = af.fill(fillable, {"state": "NY"})
Validated on the IRS W-9.
Install: pip install acroforge==0.3.0
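The option check that fill performs on CHOICE fields can be pictured with a small sketch. It assumes options are either plain strings or (export, label) pairs, as described above, and that a value is matched against the export string; validate_choice, that matching rule, the handling of editable fields, and the error messages are all assumptions for illustration, not acroforge's code.

```python
def validate_choice(value, options, multi_select=False, editable=False):
    """Check a fill value against a choice field's options (illustrative only)."""
    # Options may be plain strings or (export, label) pairs; compare on export.
    exports = {opt if isinstance(opt, str) else opt[0] for opt in options}
    if isinstance(value, (list, tuple)):
        if not multi_select:
            raise ValueError("list value given for a field that is not multi-select")
        values = list(value)
    else:
        values = [value]
    # Assumed: an editable dropdown accepts free text, so skip the option check.
    if not editable:
        unknown = [v for v in values if v not in exports]
        if unknown:
            raise ValueError(f"not among the field's options: {unknown}")
    return values


print(validate_choice("NY", ["CA", "NY", "TX"]))  # ['NY']
print(validate_choice(["a", "b"], [("a", "Alpha"), ("b", "Beta")],
                      multi_select=True))         # ['a', 'b']
```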
acroforge 0.2.0
First-class reading of existing forms.
- NEW: read_fields(pdf) -> list[FieldSpec] ingests the AcroForm fields already in a fillable PDF (type, name, coordinates, checkbox/radio on-states); confidence 1.0. It is the inverse of build, so the two round-trip: build(other, read_fields(template)). Verified on the IRS W-9 (recovers its 23 real fields).
- README documents the new API.
pip install -U acroforge
acroforge 0.1.1
Docs and metadata cleanup.
- README now leads with pip install acroforge (PyPI is live).
- acroforge.__version__ now reads from installed package metadata (was hardcoded).
- Removed em/en dashes across the repo.
No code/behavior changes to the engine or detection.
acroforge 0.1.0
First release.
Turn flat PDFs into real, fillable AcroForms. Permissive (Apache-2.0), deterministic, zero-copyleft.
- Engine: build/fill/flatten - inject named, typed, positioned AcroForm fields (text, checkbox, radio groups, comb, signature), fill, and flatten. Cross-viewer correct (Chrome pdfium + Firefox pdf.js).
- Best-effort detection: detect/make_fillable auto-find fields on flat PDFs (underlines, table cells, glyph + vector checkboxes). No AI, deterministic.
- CLI: acroforge build|fill|flatten|detect|make-fillable.
- Zero-copyleft dependency tree (BSD/MIT), CI-enforced.
pip install acroforge