Add keyboard annotation command #29

Open
wants to merge 2 commits into
base: master

dalanicolai
Contributor

This PR adds a keyboard annotation command for adding markup annotations using only the keyboard.
Its usage is described in its docstring. No problem if you would rather not merge this, but I guess
some people would like it.

@dalanicolai dalanicolai force-pushed the add-keyboard-annotation-command branch from a2aa2f5 to 66d7f9f Compare June 15, 2021 04:17
@vedang
Owner

vedang commented Jun 20, 2021

Hi @dalanicolai,

I went through the code you have attached. It cleverly sets up
highlighting a region using the already implemented search
functionality. I'd love to have this in pdf-tools.

However, I want to see this functionality implemented on the back of
the default set-mark-command (C-SPC) to select regions.

I imagine the workflow would be exactly as it is in any normal Emacs
buffer:

  • Search for a word (C-s, already works in pdf-tools)
  • Start marking a region (C-SPC, does not work in pdf-tools). Mark
    the desired region with C-n, C-f, C-p, C-b commands.
  • Once we have an active region, use existing annotation keybindings
    to create the necessary annotation.

I will leave this PR open for folks who find this patch useful, and
for anyone who wants to try implementing the workflow above.

@dalanicolai
Contributor Author

@vedang Your idea sounds nice, but I guess it will be quite cumbersome to implement with only the current set of query functions. As far as I know, the current set only lets you either obtain all text regions on a page as full-text blocks, or obtain a single region by searching for a regexp/string match.
However, poppler almost certainly provides functionality to return text regions structured by characters/words/lines etc. So if you would like to implement it the way you propose, I would suggest extending the query options in the epdfinfo server. While doing so, you could also extend the annotation options, because poppler also provides arrow and free-text annotations.

I have implemented such annotation functionality for pdf-tools in pymupdf-mode. Pymupdf is another option for retrieving text regions structured by characters/words/lines. pymupdf-mode 'communicates' with pymupdf via an interactive REPL, which makes the mode really slow (it was just an experiment, and of course I preferred an interactive implementation). However, in the meantime I have discovered emacs-epc, and I expect that implementing pymupdf-mode on top of epc would be much faster. I would argue that it does not really matter whether you obtain the text-region info via pymupdf or via the epdfinfo server. Of course the epdfinfo server will be slightly faster and 'native', but pymupdf would be more 'hackable' and probably more than fast enough (mupdf itself is considerably faster than poppler btw, so maybe even using it via Python would be faster than using poppler directly).
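
To make "text regions structured by words/lines" concrete, here is a minimal sketch, not the code used in pymupdf-mode. It assumes PyMuPDF's `page.get_text("words")` tuple layout `(x0, y0, x1, y1, text, block_no, line_no, word_no)`. The names `SAMPLE_WORDS`, `load_words`, `group_into_lines` and `line_bbox` are hypothetical, and the sample coordinates are made up so that the grouping logic runs without a PDF or PyMuPDF installed.

```python
from collections import defaultdict

# Made-up sample data in the assumed PyMuPDF "words" layout:
# (x0, y0, x1, y1, text, block_no, line_no, word_no)
SAMPLE_WORDS = [
    (72.0, 100.0, 110.0, 112.0, "Keyboard", 0, 0, 0),
    (114.0, 100.0, 170.0, 112.0, "annotation", 0, 0, 1),
    (72.0, 116.0, 130.0, 128.0, "command", 0, 1, 0),
]


def load_words(path, page_number):
    """Read word boxes from one page of a PDF; requires PyMuPDF."""
    import fitz  # PyMuPDF, imported lazily so the rest runs without it

    with fitz.open(path) as doc:
        return doc[page_number].get_text("words")


def group_into_lines(words):
    """Group word tuples by (block_no, line_no), keeping word order."""
    lines = defaultdict(list)
    for word in sorted(words, key=lambda w: (w[5], w[6], w[7])):
        lines[(word[5], word[6])].append(word)
    return dict(lines)


def line_bbox(line_words):
    """Smallest rectangle (x0, y0, x1, y1) enclosing all words of a line."""
    return (
        min(w[0] for w in line_words),
        min(w[1] for w in line_words),
        max(w[2] for w in line_words),
        max(w[3] for w in line_words),
    )


if __name__ == "__main__":
    for key, line_words in group_into_lines(SAMPLE_WORDS).items():
        print(key, " ".join(w[4] for w in line_words), line_bbox(line_words))
```

With a real file, `group_into_lines(load_words("some.pdf", 0))` would give per-line word lists that an Emacs front end could walk with line- or word-wise motion commands; how the result is sent back to Emacs (REPL, epc, or otherwise) is left open here.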

Now that I have written this, I realize that you can also use mutool to extract 'structured text' from a PDF, but it only returns XML structured by character (e.g. mutool draw -F stext filepath pagenumber).

Btw, just thinking with you...

@dalanicolai
Contributor Author

Although I like the idea of what you are suggesting, after thinking a little more I would say it is more cumbersome than my current implementation. I already have a command to highlight a single word by typing it (or part of it). As far as I understand, you intend to set the start position by searching for a word, which is what I do now as well. But then you would like to expand the region with those keys, while I simply ask for a second pattern to place the end mark of the region. So I think that in practice the current implementation is simpler and faster. Instead of using the existing keys to finally create the annotation, here the annotation is created automatically: there is a customizable default annotation, and you can prefix the command with a universal argument to select another annotation style. Did you try out the current implementation?
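
The two-pattern idea described above (one pattern for the start of the region, a second for its end) can be sketched in a few lines. This is an illustrative model in Python, not the Emacs Lisp in this PR. It assumes a word list in reading order with made-up coordinates, assumes the search for the end pattern starts at the start word, and uses the hypothetical names `WORDS`, `find_word`, `select_region` and `highlight_rects`.

```python
# Hypothetical words in reading order: (x0, y0, x1, y1, text).
WORDS = [
    (72.0, 100.0, 110.0, 112.0, "Search"),
    (114.0, 100.0, 130.0, 112.0, "for"),
    (134.0, 100.0, 140.0, 112.0, "a"),
    (144.0, 100.0, 170.0, 112.0, "word"),
    (72.0, 116.0, 95.0, 128.0, "then"),
    (99.0, 116.0, 125.0, 128.0, "mark"),
    (129.0, 116.0, 140.0, 128.0, "a"),
    (144.0, 116.0, 180.0, 128.0, "region"),
]


def find_word(words, pattern, start=0):
    """Index of the first word at or after `start` containing `pattern`."""
    for i in range(start, len(words)):
        if pattern.lower() in words[i][4].lower():
            return i
    return None


def select_region(words, start_pattern, end_pattern):
    """Words from the first start match through the next end match."""
    first = find_word(words, start_pattern)
    if first is None:
        return []
    # Assumption: the end pattern is searched from the start word onward.
    last = find_word(words, end_pattern, first)
    if last is None:
        return []
    return words[first:last + 1]


def highlight_rects(selection, tolerance=1.0):
    """Merge consecutive words on the same line into one rectangle each."""
    rects = []
    for x0, y0, x1, y1, _ in selection:
        if rects and abs(rects[-1][1] - y0) <= tolerance:
            px0, py0, px1, py1 = rects[-1]
            rects[-1] = (min(px0, x0), min(py0, y0), max(px1, x1), max(py1, y1))
        else:
            rects.append((x0, y0, x1, y1))
    return rects


if __name__ == "__main__":
    region = select_region(WORDS, "wor", "mark")
    print([w[4] for w in region])
    print(highlight_rects(region))
```

Typing part of a word is enough for either pattern, which matches the "by typing it (or part of it)" behaviour mentioned above. The per-line rectangles are the kind of input a markup annotation needs when the region spans more than one line.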

Btw, if the PDF were rendered using librsvg, then pdf-avy-highlight would also be fast, which might then be the most convenient implementation.

@vedang vedang added the new feature implementation This is a substantial code change and / or implements significant new functionality label Dec 31, 2021
@vedang vedang added this to the v1.1.0 milestone Jan 6, 2022
@aikrahguzar aikrahguzar mentioned this pull request Jul 12, 2023
2 participants