Working with split pdfs #26
Comments
Hi @Conaws, see if the test examples work for you: https://github.com/dotemacs/pdfboxing/blob/master/test/pdfboxing/split_test.clj I'll have a look at this a bit later. Could you paste everything you type in at the REPL? Thanks |
Yeah, that's what I went off of. I got splitting to work fine; it's just that I'm not familiar with PDDocuments, so I don't know how to output a new file. |
I've been using split-pdf and extract-text, but I'm curious how to save each page as a new PDF.
Hey @Conaws see this function: and see how it's being used: (merge-pddocuments :docs [pddoc1 pddoc2 pddoc3] :output "output.pdf") Does that help? |
So you're saying I can't create a PDF from a single pddoc, I can only merge? Or could I use merge-pddocuments with a vector of only one pddoc? On Monday, May 16, 2016, Александар Симић notifications@github.com wrote:
|
Hello Conor
You can create a PDF document with merge-pddocuments where only one PDDocument is supplied in a vector. Please have a play with it yourself.
|
brilliant |
I'm trying to read a PDF in, split it, and write the result out with:
(pdf-split/merge-pddocuments
  :docs (pdf-split/split-pdf :input path :start 1 :end 4)
  :output "test.pdf")
But I get this error, full stacktrace below:
|
Looking at your code:
(pdf-split/merge-pddocuments
  :docs (pdf-split/split-pdf :input path :start 1 :end 4)
  :output "test.pdf")
What does the argument to :docs look like? I'm asking because, looking at the discussion above, there is this example:
(merge-pddocuments :docs [pddoc1 pddoc2 pddoc3] :output "output.pdf")
There the argument to :docs is a vector. Is your result of (pdf-split/split-pdf :input path :start 1 :end 4) a vector? Compare that with what the docstring says here: https://github.com/dotemacs/pdfboxing/blob/master/src/pdfboxing/split.clj#L35 See which one applies to you and let me know, because I might need to update the docstring or the code depending on what you find out. Thanks |
It's a vector:
=> (def s (pdf-split/split-pdf :input path :start 1 :end 4))
=> (type s)
clojure.lang.PersistentVector
I'm getting the same error with:
(pdf-split/merge-pddocuments
  :docs (apply list (pdf-split/split-pdf :input path :start 1 :end 4))
  :output "test.pdf")
This is how the list looks:
=> (apply list s)
(#object[org.apache.pdfbox.pdmodel.PDDocument 0x2932d27f "org.apache.pdfbox.pdmodel.PDDocument@2932d27f"] #object[org.apache.pdfbox.pdmodel.PDDocument 0x7ce0fee0 "org.apache.pdfbox.pdmodel.PDDocument@7ce0fee0"] #object[org.apache.pdfbox.pdmodel.PDDocument 0x3aa4ccc1 "org.apache.pdfbox.pdmodel.PDDocument@3aa4ccc1"] #object[org.apache.pdfbox.pdmodel.PDDocument 0x719b3937 "org.apache.pdfbox.pdmodel.PDDocument@719b3937"])
This is a ~20MB PDF with more than 40,000 pages, in case that might be an issue. |
Can you see if the following will work if you supply any other documents?
(pdf-split/merge-pddocuments
  :docs [pddoc1 pddoc2 pddoc3]
  :output "test.pdf")
And can you see if you can merge documents that are split and saved to disk first? Basically, I'm just trying to see where the issue could be. |
Exact same error with another PDF, with both a vector and a list. It looks like it's somehow trying to access the PDF, which is closed. |
OK.
Yea. I won't be able to have a look at this issue this week. Thanks |
Not sure I'll be able to work on it, unfortunately. I'll see what I can do. |
A few questions
Is Java interop necessary for that, or another Clojure library?
For instance, if I want to turn "/sample/pdf-title.pdf" into "sample/pdf-title-pages/1.pdf", "sample/pdf-title-pages/2.pdf", and so on.
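For reference, a minimal sketch of the Java interop route, going straight to PDFBox rather than through pdfboxing. It assumes PDFBox 2.x is on the classpath (where `Splitter` lives in `org.apache.pdfbox.multipdf` and `PDDocument/load` accepts a `File`); `save-pages!` and the `example.split-pages` namespace are hypothetical names, not part of pdfboxing. The source document is kept open until every page has been saved, on the guess that the split pages still refer back to it.

```clojure
(ns example.split-pages
  (:require [clojure.java.io :as io])
  (:import (org.apache.pdfbox.pdmodel PDDocument)
           (org.apache.pdfbox.multipdf Splitter)))

(defn save-pages!
  "Hypothetical helper: writes each page of the PDF at `input` to
  `out-dir`/1.pdf, `out-dir`/2.pdf, ... and returns the output paths.
  Assumes the PDFBox 2.x API."
  [input out-dir]
  ;; Make sure the target directory exists.
  (.mkdirs (io/file out-dir))
  ;; Keep the source document open until all pages are written.
  (with-open [doc (PDDocument/load (io/file input))]
    ;; A default Splitter produces one PDDocument per page.
    (let [pages (.split (Splitter.) doc)]
      ;; `into` with a transducer is eager, so every page is saved
      ;; before `with-open` closes the source document.
      (into []
            (map-indexed
             (fn [i ^PDDocument page]
               (let [out (str (io/file out-dir (str (inc i) ".pdf")))]
                 (try
                   (.save page out)
                   (finally (.close page)))
                 out)))
            pages))))
```

Under those assumptions, `(save-pages! "/sample/pdf-title.pdf" "sample/pdf-title-pages")` would write one file per page into that directory. For a document with tens of thousands of pages, splitting everything at once may use a lot of memory, so treat this as a starting point only.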