pdfcpu split failed via API but succeeds via cmd #87

Closed
leowmjw opened this issue Jun 16, 2019 · 3 comments

@leowmjw
commented Jun 16, 2019

Related to #80. Split works with the CLI but fails via an API call.

Output via cmd:

$ pdfcpu split ./raw/Lisan/JDR12032019.pdf /var/folders/_p/qk0rf40514b4sgy16r5qyxs40000gn/T/pardocs819115744/raw/splitout/Lisan/JDR12032019/pages
splitting ./raw/Lisan/JDR12032019.pdf into /var/folders/_p/qk0rf40514b4sgy16r5qyxs40000gn/T/pardocs819115744/raw/splitout/Lisan/JDR12032019/pages (span=1)...
writing /var/folders/_p/qk0rf40514b4sgy16r5qyxs40000gn/T/pardocs819115744/raw/splitout/Lisan/JDR12032019/pages/JDR12032019_1.pdf ...
writing /var/folders/_p/qk0rf40514b4sgy16r5qyxs40000gn/T/pardocs819115744/raw/splitout/Lisan/JDR12032019/pages/JDR12032019_2.pdf ...
w
..

But when using the API, the split fails.

Am I missing anything? I confirmed that validation is in relaxed mode.

Output via API:

OK; splitting!!
VALIDATION:  relaxed
panic: dict=pagesDict entry=Tabs: unsupported in version 1.4
This file could be PDF/A compliant but pdfcpu only supports versions <= PDF V1.7
 [recovered]
	panic: dict=pagesDict entry=Tabs: unsupported in version 1.4
This file could be PDF/A compliant but pdfcpu only supports versions <= PDF V1.7

Code:

	conf := pdfcpu.NewDefaultConfiguration()
	// Not needed, but set explicitly to be sure.
	conf.ValidationMode = pdfcpu.ValidationRelaxed

	cmd := papi.SplitCommand(sourcePDFPath, destPDFDir, 1, conf)
	o, perr := papi.Process(cmd)
	if perr != nil {
		panic(perr)
	}
	// The call above fails!
@hhrutter

Collaborator

commented Jun 16, 2019

I just tried this using the test TestSplitCommand() in process_test.go, and it works for me.
I am pushing a new release, but that should be unrelated.

@StephanVerbeeck

commented Jun 19, 2019

Same problem here.
It does not work with any of the PDFs that I can find.
The resulting single-page PDFs are only 1 KB and invalid (no PDF viewer can read them).
It looks like the referenced objects are not copied along, or worse.
Extracting images works fine, but extracting pages does not.

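func testPDF02() {
	config := pdfcpu.NewDefaultConfiguration()
	file := "C:\\Account\\indoc\\Rekeninguittreksels_BE70979980972725_2019-04-05_122814.pdf"
	pages := []string{"1-11"}

	{
		// Direct call: this is the one that produces the invalid 1 KB single-page files.
		cmd := api.ExtractPagesCommand(file, "C:\\Build\\Account\\testResults", pages, config)
		api.ExtractPages(cmd) // return values are ignored in this test
	}
	{
		// Direct call: image extraction works as expected.
		cmd := api.ExtractImagesCommand(file, "C:\\Build\\Account\\testResults", pages, config)
		api.ExtractImages(cmd) // return values are ignored in this test
	}
}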

It seems to work when calling api.Process(cmd) instead of api.ExtractPages(cmd).
The only difference between the two ways of calling (direct or indirect) is the following line of code:

cmd.Config.Cmd = cmd.Mode
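
To make that difference concrete, here is a minimal, self-contained sketch of the pattern. It does not use pdfcpu; the types Mode, Config, and Command and the functions extractPages and process are hypothetical stand-ins that only model the single assignment quoted above. In the model, the direct call returns an error when the mode was never copied into the config, whereas the behaviour reported in this thread was invalid output files rather than an error.

```go
package main

import "fmt"

// Mode is a hypothetical stand-in for a command mode.
type Mode int

const (
	ModeUnset Mode = iota
	ModeExtractPages
)

// Config is a hypothetical stand-in for the configuration.
// Cmd records which command the lower layers believe is running.
type Config struct {
	Cmd Mode
}

// Command pairs a mode with its configuration.
type Command struct {
	Mode   Mode
	Config *Config
}

// extractPages models the direct entry point: it uses Config.Cmd as it finds it.
func extractPages(cmd *Command) error {
	if cmd.Config.Cmd != ModeExtractPages {
		return fmt.Errorf("config mode is %d, want %d", cmd.Config.Cmd, ModeExtractPages)
	}
	return nil
}

// process models the indirect entry point: it copies the mode into the
// config first (the line quoted above) and only then dispatches.
func process(cmd *Command) error {
	cmd.Config.Cmd = cmd.Mode
	return extractPages(cmd)
}

func main() {
	direct := &Command{Mode: ModeExtractPages, Config: &Config{}}
	fmt.Println("direct:", extractPages(direct))

	indirect := &Command{Mode: ModeExtractPages, Config: &Config{}}
	fmt.Println("via process:", process(indirect))
}
```

If this model matches what happens in the library, setting cmd.Config.Cmd = cmd.Mode by hand before the direct call would be a plausible workaround, though that is an assumption and not something confirmed in this thread.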
@hhrutter

Collaborator

commented Jun 20, 2019

If you follow the tests in process_test.go, everything works fine, but I agree it makes sense to be able to call api.ExtractPages(cmd) instead of api.Process(cmd).

I prefer to use the idiom

cmd := api.ExtractPagesCommand(...)
_, err := api.Process(cmd)

over

cmd := api.ExtractPagesCommand(...)
_, err := api.ExtractPages(cmd)

because there is less stuttering.

Both ways are going to work in the next release, so you can choose whichever way you prefer to call into pdfcpu.
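
One way to make both call styles equivalent is for each direct entry point to set its own mode before doing any work, so that it no longer depends on the dispatcher having run first. The sketch below illustrates that idea only; it does not use pdfcpu and is not necessarily what the fix in the next release does. All names in it (Mode, Config, Command, writePages, ExtractPages, Process, and the output filename) are hypothetical.

```go
package main

import "fmt"

// Mode is a hypothetical stand-in for a command mode.
type Mode int

const (
	ModeUnset Mode = iota
	ModeExtractPages
)

// Config is a hypothetical configuration that records the active command.
type Config struct{ Cmd Mode }

// Command pairs a mode with its configuration.
type Command struct {
	Mode   Mode
	Config *Config
}

// writePages stands in for the real work and requires Config.Cmd to be set.
// The returned filename is a placeholder.
func writePages(cmd *Command) ([]string, error) {
	if cmd.Config.Cmd != ModeExtractPages {
		return nil, fmt.Errorf("config mode is %d, want %d", cmd.Config.Cmd, ModeExtractPages)
	}
	return []string{"page_1.pdf"}, nil
}

// ExtractPages sets the mode itself, so a direct call behaves like a
// call routed through Process.
func ExtractPages(cmd *Command) ([]string, error) {
	cmd.Mode = ModeExtractPages
	cmd.Config.Cmd = cmd.Mode
	return writePages(cmd)
}

// Process only dispatches on the mode; the entry point does the setup.
func Process(cmd *Command) ([]string, error) {
	switch cmd.Mode {
	case ModeExtractPages:
		return ExtractPages(cmd)
	default:
		return nil, fmt.Errorf("unsupported mode %d", cmd.Mode)
	}
}

func main() {
	direct, err := ExtractPages(&Command{Config: &Config{}})
	fmt.Println(direct, err)

	viaProcess, err := Process(&Command{Mode: ModeExtractPages, Config: &Config{}})
	fmt.Println(viaProcess, err)
}
```

With this shape, the choice between the two idioms above is purely stylistic, because the setup lives in one place.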

@hhrutter hhrutter self-assigned this Jun 20, 2019

@hhrutter hhrutter closed this in f643ce2 Jul 14, 2019
