Incremental extraction of highlights #94

Closed
nanjigen opened this issue Feb 11, 2020 · 3 comments

@nanjigen

I am trying to implement an 'incremental reading' system using org-noter, org-brain and org-drill (as well as Anki).

Currently I read a given PDF, highlighting the portions of the text I want extracted. I then extract these highlights using org-noter-create-skeleton and add a :drill: tag to the resulting tree. I review these items with org-drill, slowly whittling them down and eventually exporting them to Anki.

This process works well for the first extraction with org-noter-create-skeleton. However, each subsequent extraction creates a new skeleton containing all previous highlights, and I need to dig through the tree to find the new ones.

I wonder if it's possible to extract only the highlights that aren't already in the org file and append them to the end of the tree? To throw a spanner in the works, I'm using @fuxialexander's org-pdftools and their pdf-notes-booster branch, which gives precise locations, as @weirdNox of course already knows.

@UndeadKernel

UndeadKernel commented Apr 3, 2020

I was just looking into something like this: I want to add an annotation to an opened PDF and then have a way to import that annotation into org noter.

I was thinking of implementing a new function, org-noter-sync-annots, which would only export annotations that are not already present in the headline under point. A simple way to do this would be to modify org-noter-create-skeleton so that it also stores, in each annotation's property drawer, the id returned by the function pdf-info-getannots. This id is supposed to be unique among all annotations in the same PDF.

This way, for each annotation, we can check whether an item with a property holding its id is already present. When the id is not present, we can then export the new annotation (or somehow insert it in sorted order). This also has the advantage of not requiring any changes to pdf-tools.
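To make the id check concrete, here is a minimal, language-neutral sketch in Python rather than Emacs Lisp. It is not org-noter code: the dictionary fields (`id`, `page`, `top`, `text`) and the function names are assumptions for illustration, not the actual shape returned by pdf-info-getannots. `known_ids` stands in for the ids already stored in property drawers.

```python
from typing import Dict, Iterable, List, Set


def new_annotations(annots: Iterable[Dict], known_ids: Set[str]) -> List[Dict]:
    """Return annotations whose id is not recorded yet, in reading order."""
    fresh = [a for a in annots if a["id"] not in known_ids]
    # Sort by page, then by vertical position on the page (hypothetical "top" field).
    return sorted(fresh, key=lambda a: (a["page"], a["top"]))


def sync_annotations(annots: Iterable[Dict], known_ids: Set[str]) -> List[Dict]:
    """Export only unseen annotations and remember their ids for the next run."""
    added = new_annotations(annots, known_ids)
    known_ids.update(a["id"] for a in added)
    return added
```

Running the sync a second time over the same annotations returns an empty list, which is the incremental behaviour requested in the original post.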

I can probably program this, but I was wondering whether this is something that you, @weirdNox, would be interested in having. If so, I can work on a PR.

@weirdNox
Owner

Hello there! Sorry for "ghosting" this project; for me, org-noter has reached a point where it is very stable and has all the features I need (I use it every day!), and I don't really have much free time anyway...

With that said, @UndeadKernel if you feel that you can do it, go ahead and suggest a pull request! The basic functionality of dumb syncing is easy enough, like you said.
However, you mentioned that you would also use org-noter-create-skeleton, which goes annotations->org, besides doing the export org->annotations. If you really want to sync both ways, there may be some problems, as you could run into data loss due to overwrites somewhere (i.e., if both the org heading and the annotation change before syncing again). Maybe you can use the diffing utilities Emacs already has built in.
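The overwrite risk can be illustrated with a small sketch, again in Python and not tied to any org-noter or pdf-tools API. It assumes that a digest of the text is stored at the last successful sync (a hypothetical design choice), so that a change on each side can be detected independently; `difflib` stands in for the Emacs diffing utilities.

```python
import difflib
import hashlib
from typing import List


def digest(text: str) -> str:
    """Fingerprint of a note's text, stored at the last successful sync."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sync_action(base_digest: str, org_text: str, annot_text: str) -> str:
    """Decide the sync direction, or report a conflict instead of overwriting."""
    org_changed = digest(org_text) != base_digest
    annot_changed = digest(annot_text) != base_digest
    if org_changed and annot_changed:
        # Both sides were edited since the last sync.
        return "in-sync" if org_text == annot_text else "conflict"
    if org_changed:
        return "org->annotation"
    if annot_changed:
        return "annotation->org"
    return "in-sync"


def conflict_diff(org_text: str, annot_text: str) -> List[str]:
    """Unified diff to show the user when both sides changed."""
    return list(difflib.unified_diff(
        org_text.splitlines(), annot_text.splitlines(),
        fromfile="org", tofile="annotation", lineterm=""))
```

Only the "conflict" case needs user input; the other three outcomes can be applied automatically without losing data.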

Also, I believe that this is what issue #27 is about, so it would be (at least) 2 issues with a single pull request! :D

@Ypot

Ypot commented Sep 11, 2020

> I was just looking into something like this: I want to add an annotation to an opened PDF and then have a way to import that annotation into org noter.
>
> I was thinking of implementing a new function, org-noter-sync-annots, which would only export annotations that are not already present in the headline under point. A simple way to do this would be to modify org-noter-create-skeleton so that it also stores, in each annotation's property drawer, the id returned by the function pdf-info-getannots. This id is supposed to be unique among all annotations in the same PDF.
>
> This way, for each annotation, we can check whether an item with a property holding its id is already present. When the id is not present, we can then export the new annotation (or somehow insert it in sorted order). This also has the advantage of not requiring any changes to pdf-tools.
>
> I can probably program this, but I was wondering whether this is something that you, @weirdNox, would be interested in having. If so, I can work on a PR.

Would it be possible to have annotations/highlights grouped by page? It's a bit annoying to have each highlight on a page under a different heading. Or maybe group them by quarters of the page, for those who want to use precise note locations.
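A rough sketch of such grouping, in Python and independent of org-noter, might look like the following. It assumes each annotation carries a `page` number and a relative vertical position `top` between 0 and 1 (both hypothetical field names), and it reads "quarters of the page" as four horizontal bands from top to bottom.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_key(annot: Dict, bands: int = 1) -> Tuple[int, int]:
    """Map an annotation to (page, band); bands=1 groups by whole page."""
    # Clamp so that top == 1.0 falls into the last band rather than a new one.
    band = min(int(annot["top"] * bands), bands - 1)
    return (annot["page"], band)


def group_annotations(annots: Iterable[Dict],
                      bands: int = 1) -> Dict[Tuple[int, int], List[Dict]]:
    """Collect annotations under one key per page (or per page band)."""
    groups = defaultdict(list)
    for a in sorted(annots, key=lambda a: (a["page"], a["top"])):
        groups[group_key(a, bands)].append(a)
    return dict(groups)
```

With `bands=1` every highlight on a page ends up under a single key, which would correspond to one heading per page; `bands=4` gives the quarter-page variant.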

@nanjigen nanjigen closed this as completed Feb 6, 2023