chore
chrisgrieser committed Jun 26, 2023
1 parent 676b3ef commit 9067af3
Showing 8 changed files with 734 additions and 462 deletions.
1 change: 0 additions & 1 deletion .eslintrc.yml

This file was deleted.

4 changes: 4 additions & 0 deletions Makefile
@@ -0,0 +1,4 @@
.PHONY: release
release:
zsh ./release.sh

29 changes: 29 additions & 0 deletions cheatsheet-pdf-annotation-extractor.md
@@ -0,0 +1,29 @@
# PDF Annotation Extractor
Use the hotkey to trigger the Annotation Extraction on the PDF file currently selected in Finder.

__Annotation Types extracted__
- Highlight ➡️ bullet point, quoting text and prepending the comment
- Underline ➡️ output to [Drafts.app](https://getdrafts.com/); they are not included in the annotations.
- Free Comment ➡️ blockquote of the comment text
- Strikethrough ➡️ Markdown strikethrough
- Rectangle ➡️ image

## Automatic Page Number Identification
Instead of the PDF page numbers, this workflow retrieves information about the *real* page numbers from the BibTeX library and inserts them. If there is no page data in the BibTeX entry (e.g., monographs), you are prompted to enter the page number manually.
- In that case, enter the __real page number__ of your __first PDF page__.
- If there is content before the actual text (e.g., a foreword or Table of Contents), the real page number `1` often occurs later in the PDF. In that case, you must enter a __negative page number__, reflecting the real page number the first PDF page would have. *Example: Your PDF is a book with a foreword that uses Roman numerals; real page number 1 is PDF page number 12. If you continued the numbering backwards, the first PDF page would have page number `-10`, so you enter `-10` when prompted for a page number.*
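
The offset arithmetic described above can be sketched as follows. This is only an illustration of the calculation, not the workflow's actual code; the function name `real_page_number` and its parameters are assumptions made for this example.

```python
def real_page_number(pdf_page: int, first_page_number: int) -> int:
    """Map a 1-based PDF page index to the printed ("real") page number.

    first_page_number is the value entered at the prompt: the real page
    number that the first PDF page has, or would have if the numbering
    were continued backwards (in which case it is negative).
    Both names are hypothetical and not taken from the workflow.
    """
    if pdf_page < 1:
        raise ValueError("pdf_page is 1-based")
    return first_page_number + (pdf_page - 1)


# Example from the text: real page 1 is PDF page 12,
# so the first PDF page is entered as -10.
print(real_page_number(12, -10))  # 1
```

With this counting, the backwards numbering passes through `0` (PDF page 11 in the example), which is what makes PDF page 1 come out as `-10`.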

## Annotation Codes
Insert these special codes at the __beginning__ of an annotation to invoke special actions on that annotation. Annotation Codes do not apply to Strikethroughs. (You can run the Alfred command `acodes` to display a cheat sheet showing all the following information.)

- `+`: Merge this highlight/underline with the previous highlight/underline. Works for annotations on the same page (= skipping text in between) and for annotations across two pages.
- `? foo` __(free comments)__: Turns "foo" into a [Question Callout](https://help.obsidian.md/How+to/Use+callouts) (`> [!QUESTION]`) and moves it up. (Callouts are Obsidian-specific syntax.)
- `##`: Turns highlighted/underlined text into a __heading__ that is added at that location. The number of `#` determines the heading level. If the annotation is a free comment, the text following the `#` is used as heading instead (Space after `#` required).
- `=`: Adds highlighted/underlined text as __tags__ to the YAML-frontmatter (mostly used for Obsidian as output). If the annotation is a free comment, uses the text after the `=`. In both cases, the annotation is removed afterwards.
- `_` __(highlights only)__: Removes the `_` and creates a copy of the annotation, but with the type `underline`. This annotation code avoids having to highlight *and* underline the same text segment to have it in both places.
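
As a rough illustration of how the codes above could be recognized, here is a minimal sketch. It is not the workflow's implementation: the function `parse_annotation_code`, the `ParsedCode` type, and the action names are all hypothetical, and the sketch ignores the annotation type (for example, that `?` applies to free comments only and `_` to highlights only).

```python
import re
from typing import NamedTuple


class ParsedCode(NamedTuple):
    action: str  # hypothetical action name, not taken from the workflow
    level: int   # heading level for "heading", otherwise 0
    rest: str    # comment text that follows the code


def parse_annotation_code(comment: str) -> ParsedCode:
    """Classify the code at the beginning of an annotation comment."""
    text = comment.lstrip()
    if text.startswith("+"):
        return ParsedCode("merge", 0, text[1:].strip())
    if text.startswith("?"):
        return ParsedCode("question", 0, text[1:].strip())
    # One or more "#", then either nothing or whitespace plus heading text
    # (the space after the "#" characters is required).
    heading = re.match(r"(#+)(?:\s+(.*))?$", text, flags=re.DOTALL)
    if heading:
        level = len(heading.group(1))
        return ParsedCode("heading", level, (heading.group(2) or "").strip())
    if text.startswith("="):
        return ParsedCode("tags", 0, text[1:].strip())
    if text.startswith("_"):
        return ParsedCode("copy_as_underline", 0, text[1:].strip())
    return ParsedCode("none", 0, text)


print(parse_annotation_code("## Methods"))
```

Counting the `#` characters gives the heading level directly, and a comment such as `#nospace` falls through to `none` because no space follows the `#`.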

## Extracting Images
- The respective image is saved in the `attachments` subfolder of the output folder and named `{citekey}_image{n}.png`.
- The image is embedded in the Markdown file with the `![[ ]]` syntax, e.g., `![[filename.png|foobar]]`.
- Any `rectangle` type annotation in the PDF is extracted as an image.
- If the rectangle annotation has any comment, it is used as the alt-text for the image. (Note that some PDF readers like PDF Expert do not allow you to add a comment to rectangular annotations.)
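
The naming and embedding rules above can be sketched like this. The helper names and the citekey `Grieser2023` are made up for illustration, and the form used when there is no comment (`![[filename.png]]`) is an assumption, since the cheatsheet only shows the variant with alt-text.

```python
def image_filename(citekey: str, n: int) -> str:
    """Build the image name following the `{citekey}_image{n}.png` pattern."""
    return f"{citekey}_image{n}.png"


def image_embed(filename: str, alt_text: str = "") -> str:
    """Build the `![[ ]]` embed, using the comment (if any) as alt-text."""
    if alt_text:
        return f"![[{filename}|{alt_text}]]"
    # Assumption: without a comment, the embed has no "|alt" part.
    return f"![[{filename}]]"


name = image_filename("Grieser2023", 1)  # hypothetical citekey
print(image_embed(name, "foobar"))
```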
227 changes: 209 additions & 18 deletions info.plist
@@ -4,13 +4,41 @@
<dict>
<key>bundleid</key>
<string>de.chris-grieser.pdf-annotation-extraction</string>
<key>category</key>
<string>⭐️</string>
<key>connections</key>
<dict>
<key>5B84D786-4228-4895-B183-DF571B003C75</key>
<key>65783839-9CCB-48D0-A740-1F7BF96926D1</key>
<array>
<dict>
<key>destinationuid</key>
<string>5EE42C29-6B2C-42C9-BF53-12C26B7B22D8</string>
<string>0C315F47-D751-4D59-8DFA-2CFC7BF581A2</string>
<key>modifiers</key>
<integer>0</integer>
<key>modifiersubtext</key>
<string></string>
<key>vitoclose</key>
<false/>
</dict>
</array>
<key>77F6FCC3-FA68-477B-BBB0-40F8C4911955</key>
<array>
<dict>
<key>destinationuid</key>
<string>65783839-9CCB-48D0-A740-1F7BF96926D1</string>
<key>modifiers</key>
<integer>0</integer>
<key>modifiersubtext</key>
<string></string>
<key>vitoclose</key>
<false/>
</dict>
</array>
<key>CD293D43-2356-49DB-AD4F-DBA5FDE026C5</key>
<array>
<dict>
<key>destinationuid</key>
<string>3069BB02-5E48-40CF-9DD8-337C5FA9F054</string>
<key>modifiers</key>
<integer>0</integer>
<key>modifiersubtext</key>
@@ -19,8 +47,6 @@
<false/>
</dict>
</array>
<key>5EE42C29-6B2C-42C9-BF53-12C26B7B22D8</key>
<array/>
</dict>
<key>createdby</key>
<string>Chris Grieser</string>
@@ -40,9 +66,9 @@
<key>argument</key>
<integer>0</integer>
<key>focusedappvariable</key>
<true/>
<false/>
<key>focusedappvariablename</key>
<string>focusedapp</string>
<string></string>
<key>hotkey</key>
<integer>0</integer>
<key>hotmod</key>
@@ -52,10 +78,11 @@
<key>leftcursor</key>
<false/>
<key>modsmode</key>
<integer>2</integer>
<integer>0</integer>
<key>relatedApps</key>
<array>
<string>com.apple.finder</string>
<string>com.readdle.PDFExpert-Mac</string>
<string>net.highlightsapp.universal</string>
</array>
<key>relatedAppsMode</key>
@@ -64,7 +91,7 @@
<key>type</key>
<string>alfred.workflow.trigger.hotkey</string>
<key>uid</key>
<string>5B84D786-4228-4895-B183-DF571B003C75</string>
<string>77F6FCC3-FA68-477B-BBB0-40F8C4911955</string>
<key>version</key>
<integer>2</integer>
</dict>
@@ -80,17 +107,82 @@
<key>scriptargtype</key>
<integer>1</integer>
<key>scriptfile</key>
<string>scripts/run-extraction.sh</string>
<string>./scripts/run-extraction.sh</string>
<key>type</key>
<integer>8</integer>
</dict>
<key>type</key>
<string>alfred.workflow.action.script</string>
<key>uid</key>
<string>5EE42C29-6B2C-42C9-BF53-12C26B7B22D8</string>
<string>65783839-9CCB-48D0-A740-1F7BF96926D1</string>
<key>version</key>
<integer>2</integer>
</dict>
<dict>
<key>config</key>
<dict>
<key>lastpathcomponent</key>
<false/>
<key>onlyshowifquerypopulated</key>
<true/>
<key>removeextension</key>
<false/>
<key>text</key>
<string></string>
<key>title</key>
<string>{query}</string>
</dict>
<key>type</key>
<string>alfred.workflow.output.notification</string>
<key>uid</key>
<string>0C315F47-D751-4D59-8DFA-2CFC7BF581A2</string>
<key>version</key>
<integer>1</integer>
</dict>
<dict>
<key>config</key>
<dict>
<key>concurrently</key>
<false/>
<key>escaping</key>
<integer>102</integer>
<key>script</key>
<string>qlmanage -p "./cheatsheet-pdf-annotation-extractor.md"</string>
<key>scriptargtype</key>
<integer>1</integer>
<key>scriptfile</key>
<string></string>
<key>type</key>
<integer>5</integer>
</dict>
<key>type</key>
<string>alfred.workflow.action.script</string>
<key>uid</key>
<string>3069BB02-5E48-40CF-9DD8-337C5FA9F054</string>
<key>version</key>
<integer>2</integer>
</dict>
<dict>
<key>config</key>
<dict>
<key>argumenttype</key>
<integer>2</integer>
<key>keyword</key>
<string>acodes</string>
<key>subtext</key>
<string></string>
<key>text</key>
<string>Cheatsheet for PDF Annotation Extractor</string>
<key>withspace</key>
<false/>
</dict>
<key>type</key>
<string>alfred.workflow.input.keyword</string>
<key>uid</key>
<string>CD293D43-2356-49DB-AD4F-DBA5FDE026C5</string>
<key>version</key>
<integer>1</integer>
</dict>
</array>
<key>readme</key>
<string>## Alfred Gallery[This workflow is now included in the Alfred gallery](http://alfred.app/workflows/chrisgrieser/pdf-annotation-extractor-alfred/) and auto-updates will there be managed by Alfred and not by this workflow anymore.---
@@ -122,31 +214,130 @@ Extract Annotations as Markdown, insert Pandoc Citations with correct page numbe
[Ko-Fi](https://ko-fi.com/pseudometa)</string>
<key>uidata</key>
<dict>
<key>5B84D786-4228-4895-B183-DF571B003C75</key>
<key>0C315F47-D751-4D59-8DFA-2CFC7BF581A2</key>
<dict>
<key>colorindex</key>
<integer>7</integer>
<integer>9</integer>
<key>note</key>
<string>DOUBLE CLICK THIS to define Hotkey</string>
<string>reports error messages</string>
<key>xpos</key>
<real>30</real>
<real>350</real>
<key>ypos</key>
<real>35</real>
<real>75</real>
</dict>
<key>5EE42C29-6B2C-42C9-BF53-12C26B7B22D8</key>
<key>3069BB02-5E48-40CF-9DD8-337C5FA9F054</key>
<dict>
<key>colorindex</key>
<integer>2</integer>
<integer>9</integer>
<key>xpos</key>
<real>195</real>
<key>ypos</key>
<real>220</real>
</dict>
<key>65783839-9CCB-48D0-A740-1F7BF96926D1</key>
<dict>
<key>colorindex</key>
<integer>9</integer>
<key>note</key>
<string>run extraction</string>
<key>xpos</key>
<real>215</real>
<real>195</real>
<key>ypos</key>
<real>75</real>
</dict>
<key>77F6FCC3-FA68-477B-BBB0-40F8C4911955</key>
<dict>
<key>colorindex</key>
<integer>9</integer>
<key>note</key>
<string>run PDF annotation extractor</string>
<key>xpos</key>
<real>35</real>
<key>ypos</key>
<real>75</real>
</dict>
<key>CD293D43-2356-49DB-AD4F-DBA5FDE026C5</key>
<dict>
<key>colorindex</key>
<integer>9</integer>
<key>note</key>
<string>cheatsheet for PDF Annotation Extractor</string>
<key>xpos</key>
<real>35</real>
<key>ypos</key>
<real>220</real>
</dict>
</dict>
<key>userconfigurationconfig</key>
<array>
<dict>
<key>config</key>
<dict>
<key>default</key>
<string></string>
<key>filtermode</key>
<integer>2</integer>
<key>placeholder</key>
<string></string>
<key>required</key>
<true/>
</dict>
<key>description</key>
<string>The .bib file containing your library.</string>
<key>label</key>
<string>BibTeX Library Path</string>
<key>type</key>
<string>filepicker</string>
<key>variable</key>
<string>bibtex_library_path</string>
</dict>
<dict>
<key>config</key>
<dict>
<key>default</key>
<string>~/Documents</string>
<key>filtermode</key>
<integer>1</integer>
<key>placeholder</key>
<string></string>
<key>required</key>
<true/>
</dict>
<key>description</key>
<string>If folder is inside an Obsidian vault, will open the file in Obsidian after extraction.</string>
<key>label</key>
<string>Output Path</string>
<key>type</key>
<string>filepicker</string>
<key>variable</key>
<string>output_path</string>
</dict>
<dict>
<key>config</key>
<dict>
<key>default</key>
<string>pdfannots2json</string>
<key>pairs</key>
<array>
<array>
<string>pdfannots2json</string>
<string>pdfannots2json</string>
</array>
<array>
<string>pdfannots</string>
<string>pdfannots</string>
</array>
</array>
</dict>
<key>description</key>
<string>Advanced users only. Normally, this should stay "pdfannots2json". (`pdfannots` requires the respective pip package.)</string>
<key>label</key>
<string>Extraction Engine</string>
<key>type</key>
<string>popupbutton</string>
<key>variable</key>
<string>extraction_engine</string>
</dict>
<dict>
<key>config</key>
<dict>
File renamed without changes.
21 changes: 15 additions & 6 deletions scripts/get-pdf-path.applescript
100644 → 100755
@@ -2,8 +2,7 @@

tell application "System Events" to set frontApp to (name of first process where it is frontmost)

# PDF Expert
# opens Finder and then lets the Finder part do its thing
# PDF Expert: # opens Finder, so the subsequent block can do its work
if (frontApp is "PDF Expert") then
tell application "System Events"
tell process "PDF Expert"
@@ -15,12 +14,22 @@ if (frontApp is "PDF Expert") then
delay 0.5
end if

# Highlights
# HACK to get filepath, requires "pdf_folder_highlights_app" being set
# Finder
if (frontApp is "Finder" or frontApp is "PDF Expert") then
tell application "Finder" to set sel to selection
if ((count sel) > 1) then
set firstItem to item 1 of sel
set current_file to POSIX path of (firstItem as text)
else
set current_file to POSIX path of (sel as text)
end if
end if

# Highlights # HACK to get filepath
if (frontApp is "Highlights") then

# resolved PDF Folder
set pdfFolder to (system attribute "pdf_folder_highlights_app")
set pdfFolder to (system attribute "pdf_folder")
set AppleScript's text item delimiters to "~/"
set theTextItems to every text item of pdfFolder
set AppleScript's text item delimiters to (POSIX path of (path to home folder as string))
@@ -38,7 +47,7 @@ if (frontApp is "Highlights") then
set AppleScript's text item delimiters to ""
set filename to text item 1 of frontWindow

set current_file to do shell script ("find '" & pdfFolder & "' -type f -name '" & filename & "'")
set current_file to do shell script ("find " & (quoted form of pdfFolder) & " -type f -name " & (quoted form of filename))
end if

current_file # direct return