3a384491_1769098711502_BWL_Zusammenfassung.pdf
Summary
--text-page-separator appears to be accepted by the CLI but is not actually applied to the emitted text output.
I verified this on edgeparse 0.2.5.
Interesting project; let me know if you are looking for potential collaborators!
Minimal reproduction
edgeparse '/path/to/file.pdf' \
--format text \
--output-dir /tmp/edgeparse-issue-repro \
--text-page-separator '[[PAGE %page-number%]]'
Example real command I used:
~/edgeparse '/Users/mjgp2/Library/CloudStorage/GoogleDrive-matthew@gizmo.ai/Shared drives/PDFs/sample-pdf/3a384491_1769098711502_BWL_Zusammenfassung.pdf' \
--format text \
--output-dir /tmp/edgeparse-issue-repro \
--text-page-separator '[[PAGE %page-number%]]'
Actual behavior
- CLI exits successfully
- text output file is written
- output does not contain any [[PAGE N]] markers
I confirmed with:
rg -n '\[\[PAGE ' /tmp/edgeparse-issue-repro/3a384491_1769098711502_BWL_Zusammenfassung.txt
which produced no matches.
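For anyone without ripgrep installed, the same check can be sketched with plain grep. This is only a convenience for verifying the report, not part of edgeparse; the function name count_page_markers is hypothetical, and the pattern assumes the '[[PAGE %page-number%]]' separator from the reproduction expands to a decimal page number.

```shell
#!/bin/sh
# Hypothetical helper (not part of edgeparse): count the lines in a text file
# that contain a "[[PAGE <number>]]" marker, assuming the separator template
# '[[PAGE %page-number%]]' used in the reproduction.
count_page_markers() {
    # grep -c prints the number of matching lines.
    # It exits with status 1 when that number is 0.
    grep -c -E '\[\[PAGE [0-9]+\]\]' "$1"
}
```

Calling count_page_markers on /tmp/edgeparse-issue-repro/3a384491_1769098711502_BWL_Zusammenfassung.txt should print 0 while the bug is present, and a non-zero count once the separator is applied.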
Expected behavior
The emitted .txt output should include the requested page separator, for example:
- [[PAGE 1]]
- [[PAGE 2]]
- etc.
Notes
- --markdown-page-separator appears to work correctly
- this seems specific to --text-page-separator
- I also observed the same behavior when text was requested alongside other formats, e.g. --format markdown-with-html,json,text