
[Feature Request] Local Scraper (Use browser auth) #172

Open
brucealdridge opened this issue May 19, 2024 · 6 comments
Labels
feature request New feature or request

Comments

@brucealdridge

There are a number of sites that I visit that I would like to bookmark that require authentication. A good example of this is news sites with content behind paywalls.

Using a scraper on a server won't work and will instead just save a login screen.

Can the browser extensions pass a copy of the page via the API and save that?

@kamtschatka
Collaborator

Yes, taking screenshots is possible with Chrome extensions.
One issue is that rescraping the page would not work: the server would just hit the login screen again, so there would need to be some kind of "prevent rescraping" flag.
Another option would be to use your locally installed Chrome instance for scraping by running a worker locally. I am not sure how user-friendly that would be^^.

@MohamedBassem
Collaborator

This makes a lot of sense. The extension itself can capture the page content so that hoarder doesn't need to crawl it. This is a reasonable feature request, will add it to our todo list :)

@MohamedBassem MohamedBassem added the feature request New feature or request label May 22, 2024
@kureta

kureta commented May 27, 2024

This would be a great feature. Also, TubeArchivist has a browser extension that syncs your YouTube cookies with the TubeArchivist server. An extension that automatically shares all your cookies, lets you choose which cookies to share, or sends the cookies of the current page to hoarder before it starts scraping might be an option.

@javydekoning

A solution similar to Evernote Web Clipper would be awesome.

Select some text/images -> right click -> hoard.

https://chromewebstore.google.com/detail/evernote-web-clipper/pioclpoplcdbaefihamjohnefbikjilc?hl=en

@huyz

huyz commented Oct 7, 2024

See also https://github.com/webclipper/web-clipper

@NotChristianGarcia

^ web-clipper does exist and works, but I didn't like the flow much.

SingleFile is another project worth checking. It outputs a single .html file (or an archive), which I find easier to manage, and it is quick to run. It has an "Upload to a REST form API" option in its settings that sets an upload destination, which hoarder could accept if it doesn't want to re-implement scraping.

Projects
Status: Backlog