Include full page text, with inline highlighting #25

Open
robertandrews opened this issue Oct 15, 2022 · 5 comments
Labels
wontfix This will not be worked on

Comments

@robertandrews

robertandrews commented Oct 15, 2022

I believe Raindrop's API has the capability to return full text of source pages. This is for paying customers, which I am not currently.

This would allow an option to return full page text into Obsidian...

And that might allow an optional alternative method of presenting highlights...

Rather than return only highlights, in sequence, you could wrap the corresponding text portions from the full text in the Markdown highlight syntax (==).
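As a minimal sketch of that idea, assuming the full page text is already available as a plain string and each highlight appears in it verbatim (`wrap_highlights` is a hypothetical helper, not part of the plugin):

```python
def wrap_highlights(full_text: str, highlights: list[str]) -> str:
    """Wrap the first exact occurrence of each highlight in ==...==."""
    result = full_text
    for snippet in highlights:
        index = result.find(snippet)
        if index == -1:
            # Highlight not found verbatim; leave the text untouched.
            continue
        end = index + len(snippet)
        result = result[:index] + "==" + snippet + "==" + result[end:]
    return result


if __name__ == "__main__":
    text = "I think the devil is real. He is everywhere."
    print(wrap_highlights(text, ["the devil is real"]))
```

Exact matching is the simplest possible approach; overlapping highlights and text that differs slightly from the stored highlight are not handled here.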

IMG_0495

This would allow the user to a) see full page text, b) see highlights in their full context and c) align the Obsidian-side experience of raindrops with the experience inside Raindrop itself.

I don't know how Raindrop stores or returns the full page text. Is it HTML? If so, it may need converting to Markdown.

This may need to be limited to raindrops of type "article".

@the-c0d3r

This would be awesome to have. Sometimes an article is full of information you want to keep, and highlighting everything is obviously not a good idea. On top of that, images are a problem to save as highlights.

If only we could use raindrop (and/or this plugin) to keep the whole article offline.

@kaiiiz
Owner

kaiiiz commented Oct 31, 2022

> I believe Raindrop's API has the capability to return full text of source pages.

No, it's currently not available. The Raindrop API provides an endpoint for a permanent copy, but it only redirects to the saved page and does not return the full text of the source page.

There are some challenges to implementing this feature in this plugin:

  1. Raindrop doesn't provide a full-text API. This means the plugin would need to implement its own web crawler. The simplest pipeline is request -> parsing -> readability conversion -> conversion to Markdown.
  2. Raindrop doesn't provide location information for each highlight. The plugin would need an efficient algorithm to find the right place to insert the == symbols. This is not trivial, because the readability conversion may alter the web content, so the highlighted text from the Raindrop API may no longer match the converted content.
  3. Syncing a new highlight requires overwriting the whole Markdown content, meaning that any highlight added from Obsidian would be overwritten. (BTW, two-way syncing is nearly impossible given how freely the content can be edited.)
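
As a rough illustration of the matching problem in point 2, the sketch below locates a highlight in converted page text while ignoring differences in whitespace (line breaks, repeated spaces, non-breaking spaces). It assumes whitespace is the only difference; `find_highlight_span` is a hypothetical helper, and real pages can diverge in other ways (dropped elements, changed punctuation) that this does not cover.

```python
import re
from typing import Optional, Tuple


def find_highlight_span(page_text: str, highlight: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the highlight in page_text, ignoring whitespace differences."""
    words = highlight.split()
    if not words:
        return None
    # Match the same words separated by any run of whitespace,
    # including newlines and non-breaking spaces.
    pattern = r"\s+".join(re.escape(word) for word in words)
    match = re.search(pattern, page_text)
    return match.span() if match else None


if __name__ == "__main__":
    page = "Here he is in\xa0Forbes:\n  the devil   is real"
    print(find_highlight_span(page, "the devil is real"))
```

The returned offsets point into the converted text, so the == markers can be inserted there without altering the text itself.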

I think the most stable implementation is to "archive" the full web page to the Obsidian vault (maybe through obsidian web clipper) and directly highlight and link the content in Obsidian.

I'm not planning to implement this feature at this moment unless Raindrop provides a more stable solution.

kaiiiz added the wontfix label Oct 31, 2022
@robertandrews
Author

> No, it's currently not available. Raindrop API provides an endpoint to a permanent copy, but it only redirects to the saved page and not return the full text of source pages.

Hmm... so the "permanent copy" is not stored in the Raindrop database, it is a copy of the HTML page stored on some server?

Either way, I see why more work would need to be done to pull the article.

It's a shame, since the UI views for raindrops include both "Web" and "Preview", the latter of which is the stripped-down readability/plaintext version of the page with highlights showing. So Raindrop is doing that at some point, but it isn't being surfaced.

Thanks for looking.

@robertandrews
Author

robertandrews commented Oct 31, 2022

I've just taken out a Raindrop Pro subscription to test.

Re: the plaintext article version I talked about. I've done some poking around...

For me, the https://api.raindrop.io/rest/v1/raindrop/{id}/cache endpoint returns status 200 with an empty body, so I can't see what's behind it, but I presume it's the first of these two URLs.

Screenshot 2022-10-31 at 16 41 55

Source of the Preview mode page reveals...

  • a) a link to the original article URL
  • b) three included files...
  1. https://preview.systems/article/app.css?v1.0.50
  2. https://preview.systems/article/safari.js?v1.0.50
  3. https://preview.systems/article/app.js?v1.0.50

Number 2 is Readability.
Number 3 seems to include both DOMPurify, which cleans up HTML, and all the JavaScript for using it. There is stuff in here about mark positioning.

Looks like Raindrop is already doing the work and using those to render a stripped-down version.

Back inside the app, the rendered page wraps highlighted text in <mark>. Each <mark> carries the highlight's own ID (the same one returned in the "highlights" node of the response from the "raindrop" endpoint) as a data attribute...

<p>I think <mark data-rdhid="635422f95002697a01898a7e" title="Great writing here.">the devil is real and he wants you to be more productive<svg class="rdhni" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 10 10"> <path d="M8 0a2 2 0 0 1 2 2v8L6 8H2a2 2 0 0 1-2-2V2C0 .9.9 0 2 0h6ZM2 3a1 1 0 1 0 0 2 1 1 0 0 0 0-2Zm3 0a1 1 0 1 0 0 2 1 1 0 0 0 0-2Zm3 0a1 1 0 1 0 0 2 1 1 0 0 0 0-2Z"></path> </svg></mark>. He’s everywhere, spreading wickedness disguised as wisdom. Here he is in&nbsp;<em><a href="https://www.forbes.com/sites/ilyapozin/2013/08/14/9-habits-of-productive-people/?sh=4bb8f2b22d3f">Forbes</a>:</em></p>

There, title="Great writing here." denotes an annotation I added to this highlight.
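
As a sketch of how markup like this could be turned into Markdown highlights, the snippet below uses Python's standard `html.parser` to replace each `<mark>` with `==...==`, skip the embedded `<svg>` icon, and collect the `data-rdhid` and `title` values. It assumes markup shaped like the example above; `MarkToMarkdown` and `marks_to_markdown` are hypothetical names, and other tags such as `<a>` and `<em>` are simply reduced to their text.

```python
from html.parser import HTMLParser


class MarkToMarkdown(HTMLParser):
    """Convert <mark> elements to ==...== and record each highlight's ID and note."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []        # pieces of the output text
        self.highlights = []   # (id, note, highlighted text) tuples
        self._mark = None      # state for the <mark> currently open, if any
        self._svg_depth = 0    # > 0 while inside an <svg> icon

    def handle_starttag(self, tag, attrs):
        if tag == "svg":
            self._svg_depth += 1
        elif tag == "mark" and self._mark is None:
            attr = dict(attrs)
            self._mark = {"id": attr.get("data-rdhid"), "note": attr.get("title"), "text": []}
            self.parts.append("==")

    def handle_endtag(self, tag):
        if tag == "svg" and self._svg_depth:
            self._svg_depth -= 1
        elif tag == "mark" and self._mark is not None:
            self.parts.append("==")
            mark = self._mark
            self.highlights.append((mark["id"], mark["note"], "".join(mark["text"])))
            self._mark = None

    def handle_data(self, data):
        if self._svg_depth:
            return  # ignore anything inside the annotation icon
        self.parts.append(data)
        if self._mark is not None:
            self._mark["text"].append(data)


def marks_to_markdown(html: str):
    """Return (markdown_text, highlights) for an HTML fragment."""
    parser = MarkToMarkdown()
    parser.feed(html)
    parser.close()
    return "".join(parser.parts), parser.highlights
```

The collected IDs could then be matched against the "highlights" node from the "raindrop" endpoint, for example to attach the annotation as a footnote.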

Whilst the readability-generated URL is infinitely more uniform and scrapable, it does not seem to include the highlights.

However, it may be useful to note that, in these files, you can see a lot of the logic for how Raindrop uses Readability etc. to do the stripping-back and matching.

@iroQuai

iroQuai commented May 21, 2023

The omnivore obsidian plugin offers this! Thinking of switching from raindrop to omnivore for several reasons, but it seems it needs to mature a bit more first

4 participants