Panic processing PDF #94

Closed
mtzanidakis opened this issue Jul 18, 2019 · 4 comments

@mtzanidakis
commented Jul 18, 2019

Hello,
thanks a lot for this great tool!

I use the API in an HTTP service for watermarking and/or encrypting PDFs, and I stumbled upon a PDF file that causes a panic during processing. I am using the latest release, v0.2.1.
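
For a service like the one described, a panic inside a library call can be turned into an ordinary error so that the request fails cleanly. The following is a minimal sketch, not part of pdfcpu: `safeCall` and the function passed to it are hypothetical placeholders for whatever processing call the service makes.

```go
package main

import "fmt"

// safeCall runs fn and converts a panic into an ordinary error, so that one
// bad input file results in a failed request instead of an unhandled panic.
// fn is a hypothetical placeholder for the PDF-processing call.
func safeCall(fn func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("pdf processing panicked: %v", r)
		}
	}()
	return fn()
}

func main() {
	// Simulate a nil pointer dereference similar to the one reported.
	var p *struct{ n int }
	err := safeCall(func() error {
		_ = p.n // dereferences a nil pointer and panics
		return nil
	})
	fmt.Println("recovered error:", err)
}
```

This only contains the failure; the underlying bug still has to be fixed in the library.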

The problem is also triggered by the optimize command:

DEBUG: 2019/07/18 11:24:54 UndeleteObject: begin 1055
DEBUG: 2019/07/18 11:24:54 UndeleteObject end: undeleting obj#1055
Fatal: unexpected panic attack: runtime error: invalid memory address or nil pointer dereference

github.com/hhrutter/pdfcpu/pkg/cli.Process.func1
        /home/manolis/downloads/pdfcpu/pkg/cli/process.go:83
runtime.gopanic
        /usr/lib/go/src/runtime/panic.go:522
runtime.panicmem
        /usr/lib/go/src/runtime/panic.go:82
runtime.sigpanic
        /usr/lib/go/src/runtime/signal_unix.go:390
github.com/hhrutter/pdfcpu/pkg/pdfcpu.(*XRefTable).FindTableEntry
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/xreftable.go:275
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writePDFNullObject
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/writeObjects.go:249
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writeNullObject
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/writeObjects.go:562
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writeIndirectObject
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/writeObjects.go:655
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writeDeepObject
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/writeObjects.go:710
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writeDeepDict
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/writeObjects.go:582
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writeIndirectObject
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/writeObjects.go:667
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writeDeepObject
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/writeObjects.go:710
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writeDeepDict
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/writeObjects.go:582
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writeIndirectObject
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/writeObjects.go:667
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writeDeepObject
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/writeObjects.go:710
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writeEntry
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/writeObjects.go:729
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writeRootEntry
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/write.go:205
github.com/hhrutter/pdfcpu/pkg/pdfcpu.writeRootObject
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/write.go:331
github.com/hhrutter/pdfcpu/pkg/pdfcpu.Write
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/write.go:89
github.com/hhrutter/pdfcpu/pkg/api.WriteContext
        /home/manolis/downloads/pdfcpu/pkg/api/api.go:112
github.com/hhrutter/pdfcpu/pkg/api.Optimize
        /home/manolis/downloads/pdfcpu/pkg/api/api.go:240
github.com/hhrutter/pdfcpu/pkg/api.OptimizeFile
        /home/manolis/downloads/pdfcpu/pkg/api/api.go:299
github.com/hhrutter/pdfcpu/pkg/cli.Optimize
        /home/manolis/downloads/pdfcpu/pkg/cli/cli.go:38
github.com/hhrutter/pdfcpu/pkg/cli.Process
        /home/manolis/downloads/pdfcpu/pkg/cli/process.go:90
main.process
        /home/manolis/downloads/pdfcpu/cmd/pdfcpu/main.go:219
main.handleOptimizeCommand
        /home/manolis/downloads/pdfcpu/cmd/pdfcpu/prepare.go:119
main.CommandMap.Handle
        /home/manolis/downloads/pdfcpu/cmd/pdfcpu/registry.go:85
main.main
        /home/manolis/downloads/pdfcpu/cmd/pdfcpu/main.go:175
runtime.main
        /usr/lib/go/src/runtime/proc.go:200
runtime.goexit
        /usr/lib/go/src/runtime/asm_amd64.s:1337

You can find the complete output of pdfcpu optimize -v (14436 lines) at https://pastebin.com/vVE7j6y6 .
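
The trace above ends in a nil pointer dereference reached through `XRefTable.FindTableEntry` while writing a null object. The actual cause is not visible from the trace alone. As a generic illustration only, one common way such a dereference arises in Go is a lookup in a map of pointers that returns nil for a missing key. The types `entry` and `table` below are hypothetical stand-ins, not pdfcpu's code.

```go
package main

import "fmt"

// entry and table are hypothetical stand-ins, not pdfcpu's actual types.
type entry struct {
	Free bool
}

type table struct {
	entries map[int]*entry
}

// find returns the entry for objNr and reports whether a usable entry exists.
// Indexing a map of pointers with a missing key yields nil, so callers must
// check the result before dereferencing it.
func (t *table) find(objNr int) (*entry, bool) {
	if t == nil {
		return nil, false
	}
	e, ok := t.entries[objNr]
	return e, ok && e != nil
}

func main() {
	tbl := &table{entries: map[int]*entry{1: {Free: false}}}

	if e, ok := tbl.find(1055); ok {
		fmt.Println("free:", e.Free)
	} else {
		fmt.Println("object 1055 has no table entry")
	}
}
```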

Running the validate command with -mode strict on it panics with:

Fatal: dict=type1FontDict required entry=FirstChar missing
github.com/hhrutter/pdfcpu/pkg/pdfcpu.Dict.Entry
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/dict.go:101
github.com/hhrutter/pdfcpu/pkg/pdfcpu/validate.validateIntegerEntry
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/validate/objects.go:522
github.com/hhrutter/pdfcpu/pkg/pdfcpu/validate.validateType1FontDict
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/validate/font.go:668
github.com/hhrutter/pdfcpu/pkg/pdfcpu/validate.validateFontDict
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/validate/font.go:952
github.com/hhrutter/pdfcpu/pkg/pdfcpu/validate.validateFontResourceDict
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/validate/font.go:985
github.com/hhrutter/pdfcpu/pkg/pdfcpu/validate.validateResourceDict
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/validate/pages.go:45
github.com/hhrutter/pdfcpu/pkg/pdfcpu/validate.validateAcroFormEntryDR
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/validate/acroForm.go:391
github.com/hhrutter/pdfcpu/pkg/pdfcpu/validate.validateAcroForm
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/validate/acroForm.go:443
github.com/hhrutter/pdfcpu/pkg/pdfcpu/validate.validateRootObject
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/validate/xReftable.go:895
github.com/hhrutter/pdfcpu/pkg/pdfcpu/validate.XRefTable
        /home/manolis/downloads/pdfcpu/pkg/pdfcpu/validate/xReftable.go:33
github.com/hhrutter/pdfcpu/pkg/api.ValidateContext
        /home/manolis/downloads/pdfcpu/pkg/api/api.go:101
github.com/hhrutter/pdfcpu/pkg/api.Validate
        /home/manolis/downloads/pdfcpu/pkg/api/api.go:176
github.com/hhrutter/pdfcpu/pkg/api.ValidateFile
        /home/manolis/downloads/pdfcpu/pkg/api/api.go:214
github.com/hhrutter/pdfcpu/pkg/cli.Validate
        /home/manolis/downloads/pdfcpu/pkg/cli/cli.go:33
github.com/hhrutter/pdfcpu/pkg/cli.Process
        /home/manolis/downloads/pdfcpu/pkg/cli/process.go:90
main.process
        /home/manolis/downloads/pdfcpu/cmd/pdfcpu/main.go:219
main.handleValidateCommand
        /home/manolis/downloads/pdfcpu/cmd/pdfcpu/prepare.go:96
main.CommandMap.Handle
        /home/manolis/downloads/pdfcpu/cmd/pdfcpu/registry.go:85
main.main
        /home/manolis/downloads/pdfcpu/cmd/pdfcpu/main.go:175
runtime.main
        /usr/lib/go/src/runtime/proc.go:200
runtime.goexit
        /usr/lib/go/src/runtime/asm_amd64.s:1337
validation error (try -mode=relaxed)
github.com/hhrutter/pdfcpu/pkg/api.Validate
        /home/manolis/downloads/pdfcpu/pkg/api/api.go:181
github.com/hhrutter/pdfcpu/pkg/api.ValidateFile
        /home/manolis/downloads/pdfcpu/pkg/api/api.go:214
github.com/hhrutter/pdfcpu/pkg/cli.Validate
        /home/manolis/downloads/pdfcpu/pkg/cli/cli.go:33
github.com/hhrutter/pdfcpu/pkg/cli.Process
        /home/manolis/downloads/pdfcpu/pkg/cli/process.go:90
main.process
        /home/manolis/downloads/pdfcpu/cmd/pdfcpu/main.go:219
main.handleValidateCommand
        /home/manolis/downloads/pdfcpu/cmd/pdfcpu/prepare.go:96
main.CommandMap.Handle
        /home/manolis/downloads/pdfcpu/cmd/pdfcpu/registry.go:85
main.main
        /home/manolis/downloads/pdfcpu/cmd/pdfcpu/main.go:175
runtime.main
        /usr/lib/go/src/runtime/proc.go:200
runtime.goexit
        /usr/lib/go/src/runtime/asm_amd64.s:1337

Unfortunately, the PDF contains copyrighted material and I'm not allowed to share it. However, we managed to remove all content with Acrobat DC and save it as an empty PDF that triggers the same panic on strict validation. The empty PDF passes optimize. I've attached this empty PDF in case it helps.

Thanks in advance

empty_broken.pdf

@hhrutter

Collaborator

commented Jul 19, 2019

First of all, when you run pdfcpu validate -v -mode strict inFile.pdf you should not get a stack trace, just the error on stdout.

Validation in strict mode may report errors that validation in relaxed mode does not.
The specific error you are getting is not necessarily related to the error you are getting when writing the optimized file.
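
The difference between the two modes can be sketched as follows. This is an illustration under assumptions, not pdfcpu's implementation: `mode`, `dict` and `validateIntEntry` are hypothetical names, and whether relaxed mode tolerates a missing FirstChar in exactly this way is assumed for the example.

```go
package main

import "fmt"

// mode and dict are hypothetical stand-ins, not pdfcpu's actual types.
type mode int

const (
	relaxed mode = iota
	strict
)

type dict map[string]interface{}

// validateIntEntry checks that key holds an integer.
// A missing entry is an error in strict mode and tolerated in relaxed mode
// (an assumption made for this sketch).
func validateIntEntry(d dict, dictName, key string, m mode) error {
	v, found := d[key]
	if !found {
		if m == strict {
			return fmt.Errorf("dict=%s required entry=%s missing", dictName, key)
		}
		return nil
	}
	if _, ok := v.(int); !ok {
		return fmt.Errorf("dict=%s entry=%s: expected integer, got %T", dictName, key, v)
	}
	return nil
}

func main() {
	font := dict{"Subtype": "Type1", "LastChar": 255} // FirstChar is absent

	fmt.Println(validateIntEntry(font, "type1FontDict", "FirstChar", relaxed))
	fmt.Println(validateIntEntry(font, "type1FontDict", "FirstChar", strict))
}
```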

empty_broken.pdf does not help because it validates and optimizes without error.

What would help to analyse this, though, is the output of pdfcpu validate -vv inFile.pdf, which produces the most verbose output, or, even better, a sample file that raises the error on optimization.

Thanks for using pdfcpu! 💚

@mtzanidakis

Author

commented Jul 19, 2019

Thanks a lot for answering. Can I send you the original pdf via email?

@hhrutter

Collaborator

commented Jul 19, 2019

@hhrutter hhrutter closed this in 18994fd Jul 21, 2019

@mtzanidakis

Author

commented Jul 22, 2019

You're the best! Thanks a lot for the quick fix.
