Clarify error optional content #28

cw6365 · 2015-09-03T12:04:18Z

I get the following runtime error: "Optional Content PDF files aren't supported and their pages cannot be safely extracted". This is a one-off error that doesn't normally happen, so it looks to be specific to this PDF.

I've had a look at the code, but can you quickly clarify the issue here for me, please? Thanks.

boazsegev · 2015-09-03T20:56:32Z

Some PDF files are designed to display different content according to the media on which they are displayed.

For example, a PDF file might display one way when viewed on screen and it might replace some images or text when printed on paper... Just like some websites use the CSS media query.

These PDF files require very specific instructions inside their structure. Ignoring these instructions might produce corrupted files or files that look very different from what you wanted them to look like.

Because of the issues related to this structure, these files are not supported for now and cannot be merged with other PDF files.
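
As a rough illustration of what triggers the error: in the PDF specification, optional content is declared through an /OCProperties entry in the document catalog. The sketch below is only a heuristic and is not how CombinePDF itself detects these files. It scans the raw bytes for that key, so it will miss files whose catalog sits inside a compressed object stream. The helper name `optional_content?` is made up for this example.

```ruby
# Heuristic sketch: report whether raw PDF data mentions the /OCProperties
# catalog key, which the PDF specification uses to declare optional content.
# `optional_content?` is a hypothetical helper, not part of CombinePDF.
def optional_content?(pdf_data)
  # Compare binary copies so arbitrary (non-UTF-8) bytes are handled safely.
  pdf_data.b.include?('/OCProperties'.b)
end

# Usage (assumes the file exists):
#   optional_content?(File.binread('file.pdf'))
```

A check like this could be used to skip or flag such files before attempting a merge, though a positive match only means the key appears somewhere in the file.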

andyrue · 2015-12-21T21:43:24Z

What's the best way to rescue from this error? I tried wrapping my CombinePDF.load command in a begin/rescue block, but it hangs when run.

boazsegev · 2015-12-21T22:25:16Z

@andyrue, could you post the code you tried?

The following should work (I think, I don't have a file to test with):

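require 'combine_pdf'

filename = 'file.pdf'
begin
  pdf = CombinePDF.load(filename)
rescue StandardError => e
  # Loading failed (for example, an Optional Content PDF); report and move on.
  puts "Couldn't load #{filename}: #{e.message}"
end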

andyrue · 2015-12-21T22:34:16Z

Seems there was some ruby process stuck. I killed it with activity monitor and the begin rescue works now. Thanks for your quick response!

joelw · 2016-06-24T06:11:20Z

Would you support a PR to add an option to ignore this error? For my use case I'd be happier with potentially weird output than no output at all, and when I've tested commenting out the raise my test PDFs look just terrific!

boazsegev · 2016-06-24T06:17:55Z

Hi Joel,

Thanks for asking.

If the PR allowed this as a non-default option, yeah sure.

I think it's better for the default to fail than to risk quiet failures (missing data), but I definitely understand that different implementations might prefer a different behavior.

P.S.

If you're adding an option flag to the existing API (i.e. CombinePDF.load file_name, unsafe: true), it might be better to use a hash than a simple argument, allowing for future features and alterations.
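
To make the hash suggestion concrete, here is a minimal sketch of the pattern. Everything in it is hypothetical: `load_pdf_sketch` and `DEFAULT_LOAD_OPTIONS` are invented names, the `unsafe` key is just the placeholder used above, and the method returns a plain hash instead of parsing anything. It is not the gem's implementation.

```ruby
# Hypothetical loader signature, for illustration only; this is not
# CombinePDF's actual implementation.
DEFAULT_LOAD_OPTIONS = { unsafe: false }.freeze

def load_pdf_sketch(file_name, options = {})
  # Merge caller-supplied options over the defaults, so new keys can be
  # added later without changing the method's arity.
  opts = DEFAULT_LOAD_OPTIONS.merge(options)
  { file: file_name, unsafe: opts[:unsafe] }
end

load_pdf_sketch('file.pdf')                # strict default
load_pdf_sketch('file.pdf', unsafe: true)  # caller opts in explicitly
```

With a bare boolean argument, every later feature would need another positional parameter. With a hash, existing callers keep working when new keys are introduced.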

joelw · 2016-06-24T07:31:10Z

Thanks for your quick response (and for the gem!) I've put something together and will create a PR :)

chrisconcepcion · 2021-08-10T23:58:31Z

Ran into this issue and found an odd solution using the libreconv gem. Essentially you convert the pdf to a pdf with libreoffice and this will make the pdf compatible with combinepdf. Hope this helps someone, took me hours to figure this out.

Libreconv.convert(document_path, document_path)

mackermedia · 2021-11-03T18:30:39Z

@chrisconcepcion - I'm struggling to get Libreconv.convert to work from a .pdf to a .pdf. I'm getting an error saying the source file could not be loaded. However, it works if I test with a .txt file to a .pdf.
Do you have any other tips for getting around this issue?

caioagiani · 2023-01-09T22:22:48Z

Try this:

CombinePDF.load("your_file.pdf", unsafe: true, allow_optional_content: true)
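
If you would rather keep the strict behavior by default and only relax it when needed, one possible approach is to retry on failure. The sketch below is an assumption-laden illustration: `load_with_fallback` is a hypothetical helper, and the loader is passed in as a callable so the example runs without the gem. With combine_pdf installed, `CombinePDF.method(:load)` should fit that role, and `allow_optional_content` is the option shown above.

```ruby
# Sketch: try a strict load first and retry leniently only if it fails.
# `loader` is any callable taking (path, options); `load_with_fallback`
# is a hypothetical name, not part of CombinePDF.
def load_with_fallback(path, loader)
  loader.call(path, {})
rescue StandardError => e
  # Note: this retries on any error, not only the optional-content one.
  warn "Strict load failed for #{path}: #{e.message}; retrying with optional content allowed"
  loader.call(path, { allow_optional_content: true })
end

# Usage (assumes the combine_pdf gem is installed):
#   require 'combine_pdf'
#   pdf = load_with_fallback('your_file.pdf', CombinePDF.method(:load))
```

The warning makes it visible which files went through the lenient path, so their output can be checked by eye.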

cw6365 changed the title ~~Calrify error optional content~~ Clarify error optional content Sep 3, 2015

boazsegev added the question label Sep 6, 2015

boazsegev closed this as completed Sep 6, 2015

boazsegev mentioned this issue Jan 12, 2016

Check for corrupt pdf file #40

Closed

joelw mentioned this issue Jun 24, 2016

Add a parameter for ignoring PDFs containing optional content blocks … #67

Merged

rickreyhsig mentioned this issue Apr 30, 2024

Rescue CombinePDF::ParsingError on 8879 signature pages codeforamerica/vita-min#4521

Merged
