If a document has a StructTreeRoot, it's even more likely that we're dealing with a bona fide OCR + tagged document, but the user can still request --force-ocr. There may be tricky cases, though, such as a mixed-content document that combines digital output with scanned pages. This also interacts with --pages.
First, we need to warn if this object is present (and perhaps only if it covers all requested --pages?), since its presence is a stronger signal against running OCR.
Currently, using --force-ocr leaves behind an invalid StructTreeRoot full of pointers to deleted objects. When doing --force-ocr, we should delete all objects in the tree that belong to each processed page.
For --skip-text we ought to leave StructTreeRoot intact. Except in contrived cases, the StructTreeRoot will not reference any pages with known text.
For --redo-ocr we also need to discard all objects, because our new objects may not match the old ones.
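The per-mode handling proposed above could be summarized in a small decision function. This is only a sketch of the proposal, not existing OCRmyPDF code: the names `OcrMode`, `StructTreeAction`, and `plan_struct_tree` are hypothetical, and warning whenever a StructTreeRoot is present (regardless of mode or --pages) is an assumption, since the question of --pages coverage is still open.

```python
from enum import Enum


class OcrMode(Enum):
    """The three modes discussed in this issue (hypothetical enum)."""
    FORCE_OCR = "--force-ocr"
    SKIP_TEXT = "--skip-text"
    REDO_OCR = "--redo-ocr"


class StructTreeAction(Enum):
    """What to do with StructTreeRoot entries (hypothetical enum)."""
    KEEP = "keep"
    DROP_FOR_PROCESSED_PAGES = "drop entries for processed pages"


def plan_struct_tree(mode: OcrMode, has_struct_tree_root: bool) -> tuple[StructTreeAction, bool]:
    """Return (action, should_warn) for a document under the proposed policy.

    Assumption: we warn whenever a StructTreeRoot exists; whether the warning
    should depend on --pages coverage is left undecided here.
    """
    if not has_struct_tree_root:
        # Nothing to clean up and nothing to warn about.
        return StructTreeAction.KEEP, False

    if mode is OcrMode.SKIP_TEXT:
        # Pages with text are skipped, so the tree is left intact.
        return StructTreeAction.KEEP, True

    # --force-ocr and --redo-ocr both replace page content, so structure
    # elements pointing at the old objects would become dangling.
    return StructTreeAction.DROP_FOR_PROCESSED_PAGES, True


if __name__ == "__main__":
    for mode in OcrMode:
        action, warn = plan_struct_tree(mode, has_struct_tree_root=True)
        print(f"{mode.value}: {action.value} (warn={warn})")
```

The actual removal of structure elements for a page is deliberately left out, since how to do that safely is the open part of this issue.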