
[Feature Request] Local Scraper (Use browser auth) #172

Open
brucealdridge opened this issue May 19, 2024 · 6 comments
Labels
feature request New feature or request

Comments

@brucealdridge

There are a number of sites that I visit that I would like to bookmark that require authentication. A good example of this is news sites with content behind paywalls.

Using a scraper on a server won't work and will instead just save a login screen.

Can the browser extensions pass a copy of the page via the API and save that?

@kamtschatka
Collaborator

Yes, taking screenshots is possible with Chrome extensions.
One issue is that rescraping the page would not work: the server would just hit the login screen again, so there would need to be some kind of "prevent rescraping" flag.
Another option would be to use your locally installed Chrome instance for scraping by running a worker locally. I am not sure how user-friendly that would be^^.

@MohamedBassem
Collaborator

This makes a lot of sense. The extension itself can capture the page content so that hoarder doesn't need to crawl it. This is a reasonable feature request, will add it to our todo list :)

@MohamedBassem MohamedBassem added the feature request New feature or request label May 22, 2024
@kureta

kureta commented May 27, 2024

This would be a great feature. Also, TubeArchivist has a browser extension that syncs your YouTube cookies with the TubeArchivist server. An extension that automatically shares all your cookies, lets you choose which cookies to share, or sends the cookies of the current page to hoarder before it starts scraping might be an option.

@javydekoning

A solution similar to Evernote Web Clipper would be awesome.

Select some text/images -> right click -> hoard.

https://chromewebstore.google.com/detail/evernote-web-clipper/pioclpoplcdbaefihamjohnefbikjilc?hl=en

@huyz

huyz commented Oct 7, 2024

See also https://github.com/webclipper/web-clipper

@NotChristianGarcia

^ web-clipper does exist and works, but I didn't like the flow much.

SingleFile is another project worth checking. It outputs a single .html file (or an archive), which I find easier to manage, and it is quick to run. It has an "Upload to a REST form API" option in its settings that sets an upload destination, which hoarder could accept if it doesn't want to re-implement scraping.

Projects
Status: Backlog