UTF Encoding Fix for Annotation Comments #21
Hello,
I encountered an issue when using Cyrillic characters in annotations: during export, the text was transformed into an unreadable set of characters. My knowledge of PDF is not sufficient to confidently pinpoint the exact cause of the problem. However, some code experiments helped me find a simple solution that appears to resolve the issue.
I will illustrate the problem with a specially created PDF file with annotations: Example.pdf. For maximum clarity, I will also provide a screenshot.
The screenshot shows a PDF with 2 lines of text and 2 annotations in which Latin characters are combined with Cyrillic.
The export of this document looks as follows:
As you can see, the "comment" fields contain unreadable characters (they look like Unicode text decoded with the wrong encoding).
The updated version produces results with correctly encoded characters:
This resolves the issue with unreadable characters in the comments.
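For illustration only, here is a minimal Python sketch of one common cause of this symptom. It is an assumption, not a confirmed diagnosis and not the code changed in this pull request: PDF text strings that contain non-Latin characters are often stored as UTF-16BE with a leading byte order mark (bytes `FE FF`), and reading those bytes as a single-byte encoding yields garbage. The helper name `decode_pdf_text_string` is hypothetical, and Latin-1 is used only as a rough stand-in for PDFDocEncoding.

```python
import codecs


def decode_pdf_text_string(raw: bytes) -> str:
    """Hypothetical helper: decode the raw bytes of a PDF text string."""
    # A leading FE FF byte order mark signals UTF-16BE text.
    if raw.startswith(codecs.BOM_UTF16_BE):
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # Assumption: Latin-1 approximates PDFDocEncoding for the fallback case.
    return raw.decode("latin-1")


# A comment mixing Latin and Cyrillic, as in the example annotations.
raw = codecs.BOM_UTF16_BE + "Test Тест".encode("utf-16-be")

garbled = raw.decode("latin-1")       # naive single-byte decoding
fixed = decode_pdf_text_string(raw)   # BOM-aware decoding

print(repr(garbled))
print(fixed)
```

Checking for the byte order mark before choosing a decoder keeps plain Latin comments working while handling Cyrillic and other non-Latin text.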
P.S. This may not be crucial, but I'd like to mention that I'm using this project through your Obsidian-Zotero-Integrator. I should also note that I use Okular for annotation.