invalid or incomplete deflate data #88

FinPl · 2020-04-10T15:17:12Z

Hello,
I am encountering an error while trying to extract text from the first page of this document:

https://www.diffusion.transports.gouv.qc.ca/ords/pes/APEX_PES.P_PESB_DSI_AFFCH_RIG?P_VC_NUM_DOSSR=00007

I am rather new to PDF parsing; as I understand it, there may be a problem with the compression used. Other, similar documents work perfectly fine.

This one works:

https://www.diffusion.transports.gouv.qc.ca/ords/pes/APEX_PES.P_PESB_DSI_AFFCH_RIG?P_VC_NUM_DOSSR=00003

Can you help me solve that issue?

sambitdash · 2020-04-13T08:37:22Z

The compressed data of the page's content stream is corrupt, so extraction stops partway through the stream. Accepting partially corrupt data lets the file pass through, but some of the extracted data may be wrong because of the bad Flate-compressed data.

sambitdash · 2020-04-13T10:20:44Z

Fix in: 208d064

There can never be a perfect solution when the data is corrupt; whatever data can be recovered is recovered.
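
To illustrate what "whatever data can be recovered" can mean for a Flate stream, here is a minimal Python sketch. It is not the code from the commit above; the function name `inflate_best_effort` and the `chunk_size` parameter are hypothetical, chosen only for this example. It feeds the compressed bytes to the standard `zlib` module in small chunks and keeps everything decoded before the decoder reports an error or the input ends early.

```python
import zlib


def inflate_best_effort(data: bytes, chunk_size: int = 64) -> tuple[bytes, bool]:
    """Inflate a zlib (Flate) stream, keeping what was decoded before a failure.

    Returns (recovered_bytes, complete). `complete` is True only if the
    decoder reached the end of the stream without raising an error.
    Hypothetical helper for illustration; not part of any PDF library.
    """
    decomp = zlib.decompressobj()
    recovered = bytearray()
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        try:
            # Feeding small chunks means an error only costs us the output
            # of the chunk in which the error is detected.
            recovered += decomp.decompress(chunk)
        except zlib.error:
            # Corrupt data: stop and return what was decoded so far.
            return bytes(recovered), False
    # No error was raised, but a truncated stream never reaches its end marker.
    return bytes(recovered), decomp.eof
```

A caller would treat the result as partial whenever the second value is false. Note that bytes decoded shortly before an error is detected are not guaranteed to be correct, which is why recovered text from such a stream should be used with caution.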

sambitdash closed this as completed Apr 13, 2020
