Extract Text from Annotations and corresponding highlighted Text #1671

Answered by JorjMcKie
M-M99 asked this question in Q&A

There is one thing you must keep in mind:
Annotations are not part of the page's contents. Imagine them like dust on a nice painting on the wall. The items shown in the painting are not aware of any dust that may cover them. And like dust, annotations can be wiped out without changing the page itself.
You get the idea.
Accordingly, an annotation may cover just anything: text, drawings, images ... or nothing.
Every annotation has an associated rectangle, annot.rect, which you can use to find out what is underneath it. For example, this returns any text the annotation covers: text = page.get_text(clip=annot.rect).
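To show how this fits into a loop over a page, here is a minimal sketch. The helper names text_under_annotations and report are made up for this illustration and are not part of PyMuPDF. The sketch assumes the page object offers annots() and get_text(), as PyMuPDF pages do. Because the clip is a plain rectangle, the result may include text that merely sits next to the marked words.

```python
def text_under_annotations(page):
    """Return (annotation type name, covered text) pairs for one page.

    Hypothetical helper: it only relies on page.annots() and
    page.get_text(..., clip=...), which PyMuPDF pages provide.
    """
    results = []
    for annot in page.annots():
        # annot.type is a pair such as (8, 'Highlight'); index 1 is the name.
        kind = annot.type[1]
        # Clip text extraction to the annotation's rectangle.
        text = page.get_text("text", clip=annot.rect).strip()
        results.append((kind, text))
    return results


def report(path):
    """Print the covered text for every annotation in the PDF at `path`."""
    import fitz  # PyMuPDF; imported here so the helper above has no hard dependency

    doc = fitz.open(path)
    for page in doc:
        for kind, text in text_under_annotations(page):
            print(page.number, kind, repr(text))
```

Calling report("some.pdf") (the file name is a placeholder) prints one line per annotation. An annotation that covers no text yields an empty string, in line with the point above that an annotation may cover nothing at all.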

But of course, highlight annotations (like their friends: underlines, strike-throug…

Answer selected by JorjMcKie

This discussion was converted from issue #1670 on April 11, 2022 10:42.