Incremental extraction of highlights #94

nanjigen · 2020-02-11T07:59:36Z

I am trying to implement an 'incremental reading' system using org-noter, org-brain and org-drill (as well as Anki).

Currently I read a given PDF, highlighting portions of the text I want extracted. I then extract these highlights using org-noter-create-skeleton and add a :drill: tag to the subsequent tree. I org-drill these items, slowly whittling them down and eventually exporting them to Anki.

This process works well initially, when that first extraction occurs with org-noter-create-skeleton. However, subsequent extractions create new skeletons containing all previous highlights, so I need to dig through the tree to find the new ones.

I wonder if it's possible to extract only highlights that aren't already in the org file and append them to the end of the tree? To throw a spanner in the works, I'm using @fuxialexander's org-pdftools and their pdf-notes-booster branch, which gives precise locations, as @weirdNox of course already knows.

UndeadKernel · 2020-04-03T16:52:53Z

I was just looking into something like this: I want to add an annotation to an opened PDF and then have a way to import that annotation into org noter.

I was thinking of implementing a new function, org-noter-sync-annots, which would only export annotations that are not yet present under the headline at point. A simple way to do this would be to modify org-noter-create-skeleton so that it also stores, in each annotation's property drawer, the id returned by the function pdf-info-getannots. This id is supposed to be unique among all annotations in the same PDF.

This way, for each annotation, we can check whether a heading with a property holding its id is already present. When the id is not present, we can then export the new annotation (or somehow insert it in sorted order). This also has the advantage of not requiring any changes to pdf-tools.
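The id check described above could be sketched roughly as follows. This is only an illustration under assumptions: `my/new-annots` and `my/known-annot-ids` are hypothetical helpers, the property name `NOTER_ANNOT_ID` is made up for this example, and the annotations are assumed to be alists with an `id` key, as returned by pdf-info-getannots.

```elisp
;; Sketch only: none of these names are part of org-noter or pdf-tools.
(require 'cl-lib)
(require 'org)

(defun my/known-annot-ids ()
  "Collect annotation ids already stored in the subtree at point.
Reads the hypothetical NOTER_ANNOT_ID property of every heading."
  (delq nil (org-map-entries
             (lambda () (org-entry-get nil "NOTER_ANNOT_ID"))
             nil 'tree)))

(defun my/new-annots (annots known-ids)
  "Return the members of ANNOTS whose `id' is not in KNOWN-IDS.
ANNOTS is a list of alists with an `id' key (assumed shape of the
`pdf-info-getannots' result).  KNOWN-IDS is a list of id strings."
  (cl-remove-if (lambda (annot)
                  ;; Ids may be symbols; compare them as strings.
                  (member (format "%s" (alist-get 'id annot)) known-ids))
                annots))
```

With the PDF's annotations in hand, something like `(my/new-annots annots (my/known-annot-ids))` would then yield only the annotations that still need a heading.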

I can probably program this but I was wondering if this is something that you @weirdNox would be interested in having. If yes, I can work on a PR.

weirdNox · 2020-04-16T21:38:07Z

Hello there! Sorry for "ghosting" this project; org-noter reached a point for me that is very stable and has all the features I need (I use it every day!), and I don't really have much free time anyway...

With that said, @UndeadKernel if you feel that you can do it, go ahead and suggest a pull request! The basic functionality of dumb syncing is easy enough, like you said.
However, you mentioned that you would also use org-noter-create-skeleton, which goes annotations->org, besides doing the export org->annotations. If you really want to sync both ways, there may be some problems, as you could run into data loss due to overwrites somewhere (i.e. if both the org heading and the annotation change before syncing again). Maybe you can use the diffing utilities Emacs already has built in.
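One possible way to detect that overwrite case, sketched under assumptions: store a hash of the text at the time of the last sync (for example in a heading property) and compare both sides against it on the next sync. `my/sync-state` is a hypothetical helper, not an org-noter function, and where the hash is kept is left open.

```elisp
;; Sketch only: a hypothetical helper for classifying a two-way sync.
(defun my/sync-state (synced-hash org-text annot-text)
  "Classify what changed since the last sync.
SYNCED-HASH is the SHA-1 of the text at the last sync.  ORG-TEXT and
ANNOT-TEXT are the current heading and annotation contents.
Return `in-sync', `org-changed', `annot-changed' or `conflict'."
  (let ((org-changed
         (not (string= synced-hash (secure-hash 'sha1 org-text))))
        (annot-changed
         (not (string= synced-hash (secure-hash 'sha1 annot-text)))))
    (cond ((string= org-text annot-text) 'in-sync) ; both sides agree
          ((and org-changed annot-changed) 'conflict)
          (org-changed 'org-changed)
          (annot-changed 'annot-changed)
          (t 'in-sync))))
```

Only the `conflict` case needs user attention (for instance by showing a diff); the other cases can be copied in one direction without losing anything.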

Also, I believe that this is what issue #27 is about, so it would be (at least) 2 issues with a single pull request! :D

Ypot · 2020-09-11T03:13:31Z

Would it be possible to have annotations/highlights grouped by page? It's a bit annoying to have each highlight of a page under a different heading. Or maybe group them by quarter of the page, for those who want to use precise note locations.
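Grouping by page could look roughly like this before the headings are created. It is a sketch, not existing org-noter behaviour: `my/group-annots-by-page` is a hypothetical helper, and each annotation is assumed to be an alist with an integer `page` key.

```elisp
;; Sketch only: a hypothetical helper, not part of org-noter.
(defun my/group-annots-by-page (annots)
  "Group ANNOTS by page.
ANNOTS is a list of alists with an integer `page' key (assumed shape).
Return an alist of (PAGE . ANNOTS-ON-PAGE), sorted by page, with the
annotations of each page kept in their original order."
  (let (groups)
    (dolist (annot annots)
      ;; `alist-get' is a generalized place, so `push' creates the
      ;; page entry on first use and prepends to it afterwards.
      (push annot (alist-get (alist-get 'page annot) groups)))
    (sort (mapcar (lambda (group)
                    (cons (car group) (nreverse (cdr group))))
                  groups)
          (lambda (a b) (< (car a) (car b))))))
```

Each resulting group could then become a single heading per page, with its highlights listed in the body.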

nanjigen mentioned this issue Mar 11, 2020

Incremental Reading of PDFs l3kn/org-fc#15

Closed

nanjigen mentioned this issue Apr 3, 2020

Incremental reading of PDF's fuxialexander/org-pdftools#22

Open

nanjigen closed this as completed Feb 6, 2023
