Extract text formatting information #24

Closed
Tracked by #19
alexkolson opened this issue Apr 4, 2023 · 2 comments

Comments

@alexkolson
Contributor

Useful for #19.

@lhagan

lhagan commented Apr 7, 2023

Do you know if Notes stores hyperlinks in the same way that it does other text formatting? The biggest issue I have dealing with Notes content is that most export methods (including apple-notes-liberator) don't preserve links.

For example, if you add some text to a note, e.g. "Wikipedia", and convert it into a link by selecting the text and pressing cmd-k then setting the destination to "https://www.wikipedia.org/", this works as a normal, clickable hyperlink within Notes. But once this note has been exported, you're left with just the word "Wikipedia" with no link/URL.

The only way I've found to preserve links is to select all the text in the note, copy it to the clipboard, and then paste it into another app that supports rich text. Not ideal for multiple notes, although the process could be scripted.

To sum up, I'm wondering whether "preserve links" is a separate feature request, or if it'd come "free" with text formatting. I spent some time browsing the database and it wasn't obvious to me where Apple is storing this information.

@alexkolson
Contributor Author

Hi there!

I've added rudimentary support for extracting links in v1.1.0. Each extracted note now has a links property, which contains a list of the links extracted from the note, in order. See Output Format for more information. This is probably only minimally useful, if at all, because the links are taken out of the context of where they sit in the note. It's a start, though, and it should be enough to build on in order to support links when implementing something like #19.

apple_cloud_notes_parser extracts and preserves links in various output formats already, so you might try using that and see if it suits you. It is a really awesome project without which I wouldn't have been able to get even the slightest handle on Notes structure.

More detailed information

Yeah, Notes does store hyperlinks in a similar way to other text formatting. It keeps a list of what it calls attribute runs in the note protobuf. I like to think of these as formatting information objects. You can go through the attribute runs, see what kind each one is, and then map it back to its location in the actual note via its length property. Each attribute run has a length property, and the lengths of all attribute runs in the list add up to the length of the note text itself. So an attribute run starts at the sum of the lengths of all previous attribute runs in the list and covers the next length characters of the note text.

Example
// Pseudocode
// myNoteProto is a NoteStoreProto
note = myNoteProto.document.note
noteText = note.noteText
formattingInformationObjects = note.attributeRunList
formattingObjectNoteTextIndex = 0
for each formattingInfo in formattingInformationObjects
    // the current formattingInfo starts at formattingObjectNoteTextIndex in the note text and continues for formattingInfo.length
    // we could, for example, replace the note text between formattingObjectNoteTextIndex and
    // formattingObjectNoteTextIndex + formattingInfo.length with whatever representation of formattingInfo suits us best

    // then we advance formattingObjectNoteTextIndex in preparation for the next formattingInfo
    formattingObjectNoteTextIndex += formattingInfo.length
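
To make the running-sum idea concrete, here is a minimal Python sketch of the same loop that writes linked runs back into the text as Markdown links. It is only an illustration, not how apple-notes-liberator does it: the AttributeRun class, its link field, and the helper names locate_runs and render_links are hypothetical stand-ins for the real protobuf messages. It also assumes that run lengths are counted in the same units as Python string indices, which may not hold for every note.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional, Tuple


@dataclass
class AttributeRun:
    """Hypothetical stand-in for one attribute run from the note protobuf."""
    length: int
    link: Optional[str] = None  # assumed field: the URL if this run is a hyperlink


def locate_runs(runs: List[AttributeRun]) -> Iterator[Tuple[int, int, AttributeRun]]:
    """Yield (start, end, run); start is the sum of all previous run lengths."""
    start = 0
    for run in runs:
        end = start + run.length
        yield start, end, run
        start = end


def render_links(note_text: str, runs: List[AttributeRun]) -> str:
    """Rebuild the note text, writing linked runs as [text](url)."""
    pieces = []
    for start, end, run in locate_runs(runs):
        segment = note_text[start:end]
        pieces.append(f"[{segment}]({run.link})" if run.link else segment)
    return "".join(pieces)


# Made-up note: three runs whose lengths add up to the length of the text.
note_text = "See Wikipedia for details"
runs = [
    AttributeRun(length=4),
    AttributeRun(length=9, link="https://www.wikipedia.org/"),
    AttributeRun(length=12),
]
print(render_links(note_text, runs))
```

Because each run only carries a length, the pieces have to be rebuilt in list order; skipping a run would shift every offset after it.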

Right now I'm not replacing any actual note text, but extracting the link text follows the same procedure. You can see it here:

I hope this helps a little bit!
