Scraping improvements #45

Closed
4 of 5 tasks
Rafiot opened this issue May 22, 2019 · 7 comments

Comments

@Rafiot
Member

Rafiot commented May 22, 2019

  • Proxy support
  • Pass a pre-generated cookie
  • Initial referrer
  • Locale of the browser
  • Login creds <= how to pass them properly in the webpage will be challenging (solved by passing a valid cookie)
@Rafiot Rafiot changed the title from "Proxy support ?" to "Scraping improvements" May 23, 2019
@quinnnorton
Collaborator

also ability to pass login creds

@Rafiot
Member Author

Rafiot commented May 23, 2019

(added in the list)

The first step will be to pass a cookie. Passing the credentials in a webpage without user interaction will be challenging.

@quinnnorton
Collaborator

we should think about how we can push this back on the lookyloo user wrt internal or often used sites... can they do the config so it's not a general case?

@Rafiot
Member Author

Rafiot commented Jan 23, 2020

This commit (f1d83d2) allows loading cookies exported via https://addons.mozilla.org/en-US/firefox/addon/cookie-quick-manager/

To use it, dump a file named cookies.json, in JSON format (the plugin's default), in the root directory of lookyloo. The file is loaded automatically, every cookie is converted to the HAR cookie format (as required by Splash), and the cookies are sent along with the initial query when scraping a website.
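
For illustration, the conversion described above could look roughly like the sketch below. It is not the code from the commit. The export key names ("Host raw", "Name raw", "Content raw", "Path raw", "HTTP only raw", "Send for") are assumptions about what Cookie Quick Manager writes, and the 10-day expiry is a placeholder rather than the exported value. The output keys follow the HAR 1.2 cookie object.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.parse import urlparse


def to_har_cookie(cookie: dict) -> dict:
    """Convert one exported cookie entry into a HAR-style cookie dict."""
    # Assumption: "Host raw" is a URL-like string such as "https://example.com/".
    domain = urlparse(cookie["Host raw"]).netloc.split(":", 1)[0]
    # Assumption: ignore the exported expiry and use a placeholder 10 days ahead.
    expires = datetime.now(timezone.utc) + timedelta(days=10)
    return {
        "name": cookie["Name raw"],
        "value": cookie["Content raw"],
        "domain": domain,
        "path": cookie["Path raw"],
        "httpOnly": cookie["HTTP only raw"] == "true",
        "secure": cookie["Send for"] == "Encrypted connections only",
        "expires": expires.strftime("%Y-%m-%dT%H:%M:%SZ"),
    }


def load_cookies(path: str = "cookies.json") -> list:
    """Read the exported JSON file and convert every entry."""
    with open(path, encoding="utf-8") as f:
        return [to_har_cookie(entry) for entry in json.load(f)]
```

The resulting list could then be passed along with the initial request, provided the scraper accepts cookies in that shape.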

@Rafiot
Member Author

Rafiot commented Nov 19, 2020

For proxy support, see example 5 here: https://splash.readthedocs.io/en/stable/scripting-ref.html#splash-on-request
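
A rough sketch of how that approach might be driven from Python: a Lua script registers a splash:on_request callback that calls request:set_proxy, and the script is sent to Splash's /execute endpoint as lua_source. The helper names, the proxy_host/proxy_port argument names, and the Splash address http://127.0.0.1:8050 are assumptions for illustration, not lookyloo's actual implementation.

```python
import json
import urllib.request

# Lua script modelled on the linked Splash example; the proxy settings are
# read from splash.args, i.e. from the extra fields in the request payload.
LUA_PROXY_SCRIPT = """
function main(splash)
    splash:on_request(function(request)
        request:set_proxy{
            host = splash.args.proxy_host,
            port = splash.args.proxy_port,
        }
    end)
    assert(splash:go(splash.args.url))
    return splash:har()
end
"""


def build_execute_payload(url: str, proxy_host: str, proxy_port: int) -> dict:
    """Build the JSON payload for Splash's /execute endpoint (hypothetical helper)."""
    return {
        "lua_source": LUA_PROXY_SCRIPT,
        "url": url,
        "proxy_host": proxy_host,
        "proxy_port": proxy_port,
    }


def build_execute_request(payload: dict,
                          splash_url: str = "http://127.0.0.1:8050") -> urllib.request.Request:
    """Wrap the payload in a POST request; the caller decides when to send it."""
    return urllib.request.Request(
        splash_url.rstrip("/") + "/execute",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Sending the request with urllib.request.urlopen against a running Splash instance should return the HAR of the page as fetched through the proxy.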

@stale

stale bot commented Mar 19, 2021

Close call! This issue has been marked as stale because it has not had any recent activity. It will be closed if no further activity occurs. Add a comment or push a commit to keep this issue alive and kicking. Thank you for your contribution; it is appreciated.

@stale stale bot added the stale label Mar 19, 2021
@Rafiot
Member Author

Rafiot commented Jul 22, 2022

Done.

@Rafiot Rafiot closed this as completed Jul 22, 2022