HTML Parameter #42

Closed
j3vr0n opened this issue Dec 8, 2020 · 2 comments

Comments

j3vr0n commented Dec 8, 2020

I read a previous post that mentioned capability for the HTML parameter, in which I could render a JS application using another tool (BS or Selenium) and pass in the HTML data for AutoScraper to parse. Does anyone have steps or documentation on how to use this parameter?

@go-delicious

> I read a previous post that mentioned capability for the HTML parameter, in which I could render a JS application using another tool (BS or Selenium) and pass in the HTML data for AutoScraper to parse. Does anyone have steps or documentation on how to use this parameter?

The first example in the README says you can pass the HTML instead of the URL:
https://github.com/alirezamika/autoscraper#getting-exact-result

Just load the page with Selenium (or a similar tool), grab the rendered HTML, and pass it in through the html parameter instead of the URL.
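
A minimal sketch of that approach, in case it helps. It assumes AutoScraper's `build()` accepts an `html=` keyword (as the README describes) and that Selenium with a Chrome driver is installed. The URL, the sample text in `wanted_list`, and the helper names `render_page`, `build_from_html`, and `main` are hypothetical and only there for illustration.

```python
def render_page(url):
    """Return the HTML of `url` after its JavaScript has run, using Selenium."""
    # Imported lazily so build_from_html() can be used without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumption: Chrome and a matching driver are available
    try:
        driver.get(url)
        return driver.page_source  # the rendered DOM serialized as an HTML string
    finally:
        driver.quit()


def build_from_html(scraper, html, wanted_list):
    """Learn scraping rules from already-rendered HTML instead of a URL."""
    # Assumption: the scraper's build() takes html= and wanted_list= keywords.
    return scraper.build(html=html, wanted_list=wanted_list)


def main():
    from autoscraper import AutoScraper

    url = "https://example.com/js-app"  # hypothetical page rendered by JavaScript
    wanted_list = ["Text that appears on the rendered page"]  # hypothetical sample

    html = render_page(url)
    scraper = AutoScraper()
    print(build_from_html(scraper, html, wanted_list))
```

Calling `main()` runs the whole flow. Any tool that returns the rendered page as a string should work in place of `render_page`, since only the HTML string is handed to the scraper.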

j3vr0n commented Dec 9, 2020

Awesome, thanks. I actively use a third-party scraping tool and am looking to embed this Python code in my jobs, so that they have better "self-healing" measures when a website's HTML changes.
