PDF Extract no longer extracts all images under JDK14 #2031

JamzTheMan · 2020-06-18T16:07:50Z

Describe the bug
PDF Extractor can no longer write out certain images, such as JPEG 2000.

To Reproduce
Steps to reproduce the behavior:

Attempt to extract images from various PDFs. (I still need to find an example PDF that doesn't contain copyrighted material.) For my tests I tried various Paizo PDFs and Page_13.pdf from @Phergus, but I'm hesitant to post it as that is probably also copyrighted material.

Expected behavior
All images in the PDF should extract, as they do using MapTool 1.7.x or TokenTool 2.1.

MapTool Info

Version: DEVELOP branch
Install: New, Upgrade [previous version], or JAR [Java Version]

Desktop (please complete the following information):

OS: ALL
Version: ALL

Additional context
I've tried catching the failed ImageIO write of a JPG and writing the image as a PNG instead, which fixes some issues, but several images are still not being written out. Further debugging needs to be done.
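
For anyone picking this up, here is a minimal sketch of the kind of JPG-to-PNG fallback described above. It is not the actual MapTool extraction code: the class name `ImageWriteFallback`, the method `writeWithFallback`, and its parameters are hypothetical and only illustrate the idea. It relies on `ImageIO.write` returning `false` when no registered writer accepts the image, and on it throwing `IOException` when encoding fails.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import javax.imageio.ImageIO;

/** Hypothetical helper for illustration; not MapTool's actual extraction code. */
class ImageWriteFallback {

  /**
   * Tries to write the image as a JPG; if that fails, writes a PNG instead.
   * Returns the path of the file that was actually written.
   */
  static Path writeWithFallback(BufferedImage image, Path dir, String baseName)
      throws IOException {
    Path jpg = dir.resolve(baseName + ".jpg");
    try {
      // ImageIO.write returns false when no registered writer accepts the image.
      if (ImageIO.write(image, "jpg", jpg.toFile())) {
        return jpg;
      }
    } catch (IOException e) {
      // Encoding failed part-way; fall through to the PNG attempt below.
    }
    Files.deleteIfExists(jpg); // remove any partial JPG output

    Path png = dir.resolve(baseName + ".png");
    if (!ImageIO.write(image, "png", png.toFile())) {
      Files.deleteIfExists(png);
      throw new IOException("No ImageIO writer could encode " + baseName);
    }
    return png;
  }
}
```

A fallback like this only helps when the image was decoded successfully and it is the JPG encoding that fails. Images that never come out of the PDF decoder would still be missing, which may be part of why some images are still not written.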

We can compare further with the TokenTool DEVELOP branch, as that now also uses JDK 14 and virtually the same code, except that images are stored in memory (because only one page's worth of images is shown at a time, versus extracting the whole PDF).

Phergus · 2021-04-16T19:55:54Z

Next release will be using Java 16. With the updated ImageIO plugins from #2495 I am seeing the correct results with PDF import on the ones I've tried. Still need to test against the Paizo files.
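
When testing against a given build, it may help to confirm which image formats the installed ImageIO plugins can actually read. The sketch below uses only standard `javax.imageio` calls. The class name `ImageIoFormats` is hypothetical, and the exact format name a JPEG 2000 plugin registers (for example `jpeg2000`) is an assumption that depends on the plugin in use.

```java
import java.util.Arrays;
import java.util.TreeSet;
import javax.imageio.ImageIO;

/** Hypothetical diagnostic helper; not part of MapTool. */
class ImageIoFormats {

  /** True if at least one registered ImageIO reader claims this format name. */
  static boolean hasReader(String formatName) {
    return ImageIO.getImageReadersByFormatName(formatName).hasNext();
  }

  /** All reader format names currently registered, sorted and case-insensitive. */
  static TreeSet<String> readerFormats() {
    ImageIO.scanForPlugins(); // pick up plugins found on the classpath
    TreeSet<String> names = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
    names.addAll(Arrays.asList(ImageIO.getReaderFormatNames()));
    return names;
  }
}
```

Printing `ImageIoFormats.readerFormats()` and calling `ImageIoFormats.hasReader("jpeg2000")` (format name assumed) under both the old and new runtimes would show whether a missing reader explains the difference.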

Phergus · 2021-04-23T19:05:05Z

The Paizo PDFs are still an issue and TT 2.2 extracts more images from other PDFs.

Zahariel · 2022-10-23T04:04:42Z

Has this been addressed at all? It still doesn't seem to work very well with Paizo's files; it manages to extract "some" images but it's not at all reliable, even with One File Per Chapter files that aren't 600 pages long. (I'm ok with it not working well with a 600 page PDF!)

Phergus · 2022-10-25T17:40:56Z

Nothing so far. Someone is going to have to get up to speed on the PDF extraction code and do some serious debugging.

JamzTheMan added the bug label Jun 18, 2020

JamzTheMan added this to To do in MapTool 1.8.0 via automation Jun 18, 2020

JamzTheMan added the medium Medium priority bug/enhancement label Jun 18, 2020

Phergus added this to To do in MapTool 1.9.0 via automation Feb 7, 2021

Phergus removed this from To do in MapTool 1.8.0 Feb 7, 2021

Phergus moved this from To do to In progress in MapTool 1.9.0 Apr 21, 2021

Phergus removed this from In progress in MapTool 1.9.0 Jun 7, 2021

JamzTheMan commented Jun 18, 2020 • edited by Phergus

Phergus commented Apr 16, 2021

Phergus commented Apr 23, 2021

Zahariel commented Oct 23, 2022

Phergus commented Oct 25, 2022

JamzTheMan commented Jun 18, 2020 •

edited by Phergus