TimeOut / stuck in loop (?) - 1.24.1 PDF-hul for attached file #646

asciim0 · 2020-09-10T13:55:37Z

Using both the UI and the CLI (on Windows, tested on different machines), validation of the attached file never seems to finish.

897714407.pdf

rosetta-development · 2020-10-14T13:28:27Z

We encounter the same issue with our PDF file.
The thread is stuck at:
edu.harvard.hul.ois.jhove.Property.getByName(Property.java:159)

karenhanson · 2021-03-25T02:25:12Z

I think I have this same issue - so wanted to share what I found while trying to troubleshoot in case it adds useful context. Looks like #645 may be the same thing too. In my case the following lines in the PDF are causing an infinite loop (using v1.24):

464 0 obj
<</Dest(þÿ s e c S 0 0 4)/Next 463 0 R/Parent 399 0 R/Prev 465 0 R/Title(þÿ P a r t A\t & \t .)>>
endobj
465 0 obj
<</Dest(þÿ s e c S 0 0 3)/Next 464 0 R/Parent 399 0 R/Prev 466 0 R/Title(þÿ P a r t B)>>
endobj

In the Tokenizer, the backslash in the title seems to send it into Literal.readBackslashSequence(), where it fails to see the >> that should end the entry. It then reads the following row as part of the same object, which sets the Next value back to 464 and causes an infinite loop in which it keeps reloading the same garbled object.
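For context, here is a minimal sketch of how backslash escapes in a PDF literal string can be consumed without running past the closing parenthesis. It is not JHOVE's Literal.readBackslashSequence() and not the change made in PR #652. The class and method names (LiteralStringSketch, readLiteral, position) are hypothetical, and it works on a plain Java String rather than JHOVE's Tokenizer. It assumes the escape rules as I understand them from the PDF specification: \n, \r, \t, \b, \f, \(, \), \\, up to three octal digits, and backslash plus end-of-line as a line continuation.

```java
// Hypothetical stand-alone reader; not part of JHOVE.
final class LiteralStringSketch {
    private final String src;
    private int pos;

    // 'start' must point just after the opening '(' of the literal string.
    LiteralStringSketch(String src, int start) {
        this.src = src;
        this.pos = start;
    }

    // Index of the first character after the closing ')'.
    int position() {
        return pos;
    }

    String readLiteral() {
        StringBuilder out = new StringBuilder();
        int depth = 1; // unescaped parentheses may nest inside a literal
        while (pos < src.length()) {
            char c = src.charAt(pos++);
            if (c == '\\') {
                readEscape(out);
            } else if (c == '(') {
                depth++;
                out.append(c);
            } else if (c == ')') {
                if (--depth == 0) {
                    return out.toString();
                }
                out.append(c);
            } else {
                out.append(c);
            }
        }
        throw new IllegalArgumentException("unterminated literal string");
    }

    // Consumes exactly the characters that belong to one escape sequence.
    private void readEscape(StringBuilder out) {
        if (pos >= src.length()) {
            throw new IllegalArgumentException("dangling backslash");
        }
        char c = src.charAt(pos++);
        switch (c) {
            case 'n': out.append('\n'); break;
            case 'r': out.append('\r'); break;
            case 't': out.append('\t'); break;
            case 'b': out.append('\b'); break;
            case 'f': out.append('\f'); break;
            case '\r': // line continuation; also swallow a following LF
                if (pos < src.length() && src.charAt(pos) == '\n') {
                    pos++;
                }
                break;
            case '\n': // line continuation
                break;
            default:
                if (c >= '0' && c <= '7') {
                    int value = c - '0';
                    // At most two further octal digits belong to the escape.
                    for (int i = 0; i < 2 && pos < src.length(); i++) {
                        char d = src.charAt(pos);
                        if (d < '0' || d > '7') {
                            break;
                        }
                        value = value * 8 + (d - '0');
                        pos++;
                    }
                    out.append((char) (value & 0xFF));
                } else {
                    // Covers \( \) \\ and unknown escapes (backslash dropped).
                    out.append(c);
                }
        }
    }

    public static void main(String[] args) {
        String entry = "(Part A\\t & \\t .)>>";
        LiteralStringSketch reader = new LiteralStringSketch(entry, 1);
        reader.readLiteral();
        // What remains after the string should be the dictionary terminator.
        System.out.println(entry.substring(reader.position()));
    }
}
```

The point of the sketch is that each escape consumes a bounded number of characters, so the reader always returns to the main loop in time to see the closing parenthesis and leave the >> for the dictionary parser.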

I note that the PDF attached to this issue and the one attached to #645 both have backslashes in the Title property of a dictionary entry. I tested the fix in PR #652 and it works for my issue too.
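Separately from the tokenizer fix, the loop itself can be guarded against. The following is a rough sketch of walking an outline chain via Next links while remembering which objects have already been visited. It is a generic cycle guard, not JHOVE's code, and the names (OutlineWalkSketch, walk, nextOf) are hypothetical. The outline is modeled as a simple map from object number to the object number its Next entry points at.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of an outline chain; not part of JHOVE.
final class OutlineWalkSketch {

    // Follows Next links starting at 'first' and returns the object numbers
    // in visiting order. Throws if a link leads back to a visited item.
    static List<Integer> walk(int first, Map<Integer, Integer> nextOf) {
        Set<Integer> seen = new HashSet<>();
        List<Integer> order = new ArrayList<>();
        Integer current = first;
        while (current != null) {
            if (!seen.add(current)) {
                throw new IllegalStateException("outline loop at object " + current);
            }
            order.add(current);
            current = nextOf.get(current); // null when there is no Next entry
        }
        return order;
    }

    public static void main(String[] args) {
        // The garbled case from this comment: object 464 ends up with Next = 464.
        Map<Integer, Integer> garbled = new HashMap<>();
        garbled.put(464, 464);
        try {
            walk(464, garbled);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a guard like this, a malformed or mis-parsed outline produces an error instead of a hang, whatever the underlying cause of the bad Next value.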


MaximPlusov mentioned this issue Nov 20, 2020

Fix issues with backslash sequence #652

Merged

carlwilson added the bug (a product defect that needs fixing) and P2 (medium priority issues to be scheduled in a future release) labels Mar 4, 2021

carlwilson assigned MartinSpeller Mar 5, 2021

karenhanson mentioned this issue Jan 26, 2022

Safely exit infinite loops on AProfile.outlinesOK / checkItemOutline #704

Merged

carlwilson added a commit that referenced this issue Apr 7, 2022

FIX: Regression tests for PR 652

ef9b82b

Patch integration tests and added regression test files for #652: - patched the result of pdf-hul-76-372051162.pdf; and - added regression tests for #645 and #646.

carlwilson mentioned this issue Apr 7, 2022

FIX: issues with backslash sequence #719

Merged

carlwilson closed this as completed in #719 Apr 7, 2022
