Implementing Ajax functionality #97
Comments
I have used PhantomJS+CasperJS, but I haven't implemented it in crawler4j yet. I think adding this extension would be a good choice. Should we create a new PhantomJS+CasperJS project for Java?
Link extraction is done here (with the help of a parser class, which can be found in the source as well), line 316ff: https://github.com/yasserg/crawler4j/blob/master/src/main/java/edu/uci/ics/crawler4j/crawler/WebCrawler.java Fetching is done here:
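To make the flow above concrete, here is a minimal sketch of a `WebCrawler` subclass showing where fetched content and already-extracted links surface in crawler4j's API. The class name `AjaxAwareCrawler` and the filter regex are illustrative assumptions; the `shouldVisit`/`visit` overrides and `HtmlParseData` accessors follow crawler4j's documented API, but method signatures vary slightly between crawler4j versions.

```java
import java.util.Set;
import java.util.regex.Pattern;

import edu.uci.ics.crawler4j.crawler.Page;
import edu.uci.ics.crawler4j.crawler.WebCrawler;
import edu.uci.ics.crawler4j.parser.HtmlParseData;
import edu.uci.ics.crawler4j.url.WebURL;

// Hypothetical subclass: shows where fetched content arrives and where a
// JS-rendering step (Selenium/PhantomJS/HtmlUnit) could be hooked in.
public class AjaxAwareCrawler extends WebCrawler {

    // Illustrative filter: skip obvious non-HTML resources.
    private static final Pattern BINARY =
            Pattern.compile(".*\\.(css|js|gif|jpe?g|png|pdf|zip)$");

    @Override
    public boolean shouldVisit(Page referringPage, WebURL url) {
        return !BINARY.matcher(url.getURL().toLowerCase()).matches();
    }

    @Override
    public void visit(Page page) {
        // By the time visit() is called, WebCrawler has already fetched the
        // page and the parser has extracted outgoing links (the line 316ff
        // region mentioned above).
        if (page.getParseData() instanceof HtmlParseData) {
            HtmlParseData htmlData = (HtmlParseData) page.getParseData();
            String html = htmlData.getHtml();              // raw HTML, no JS executed
            Set<WebURL> links = htmlData.getOutgoingUrls(); // links found by the parser
            System.out.println("Fetched " + page.getWebURL().getURL()
                    + " with " + links.size() + " outgoing links");
            // An AJAX-aware crawler would re-render this URL in a JS-capable
            // engine here, then re-run link extraction on the rendered DOM.
        }
    }
}
```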
Recently, many sites load their pages with client-side JavaScript. To crawl such pages we cannot simply wget or curl them; we need to render them. Can crawler4j do that?
PhantomJS doesn't support NTLM authentication yet 👎: ariya/phantomjs#11037
To accomplish this, you can use HtmlUnit for AJAX processing.
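For reference, a minimal sketch of rendering an AJAX-heavy page with HtmlUnit before handing the DOM to a parser. The URL and timeout are placeholders; the `WebClient` options and `waitForBackgroundJavaScript` call are standard HtmlUnit API, though available `BrowserVersion` constants differ between HtmlUnit releases.

```java
import com.gargoylesoftware.htmlunit.BrowserVersion;
import com.gargoylesoftware.htmlunit.WebClient;
import com.gargoylesoftware.htmlunit.html.HtmlPage;

public class HtmlUnitRenderExample {
    public static void main(String[] args) throws Exception {
        try (WebClient client = new WebClient(BrowserVersion.FIREFOX)) {
            client.getOptions().setJavaScriptEnabled(true);
            // Many real-world pages have script errors; don't abort on them.
            client.getOptions().setThrowExceptionOnScriptError(false);

            HtmlPage page = client.getPage("https://example.com"); // placeholder URL
            // Give pending AJAX calls up to 10s to finish before reading the DOM.
            client.waitForBackgroundJavaScript(10_000);

            String renderedHtml = page.asXml(); // DOM *after* script execution
            System.out.println(renderedHtml.length());
        }
    }
}
```

The rendered HTML could then be fed into crawler4j's link extraction instead of the raw fetched bytes.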
Hi there,
I would like to implement AJAX functionality via Selenium+PhantomJS, but I really need a starting point: where is the actual content fetched? How does crawler4j extract links?
If you help me with that, I will implement the AJAX feature during the next few days. You can also contact me at flurz123@gmail.com.
Thanks in advance
Best
Fabian
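As a starting point for the Selenium+PhantomJS route, here is a hedged sketch of fetching a fully rendered page through the GhostDriver (phantomjsdriver) bindings. The binary path is an assumption about the local install; `PhantomJSDriver`, `PhantomJSDriverService`, and `getPageSource()` come from the phantomjsdriver and selenium-java artifacts.

```java
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.phantomjs.PhantomJSDriver;
import org.openqa.selenium.phantomjs.PhantomJSDriverService;
import org.openqa.selenium.remote.DesiredCapabilities;

public class PhantomFetchExample {
    public static void main(String[] args) {
        DesiredCapabilities caps = new DesiredCapabilities();
        // Assumed location of the phantomjs binary; adjust for your system.
        caps.setCapability(
                PhantomJSDriverService.PHANTOMJS_EXECUTABLE_PATH_PROPERTY,
                "/usr/local/bin/phantomjs");

        WebDriver driver = new PhantomJSDriver(caps);
        try {
            driver.get("https://example.com");            // placeholder URL
            String renderedHtml = driver.getPageSource(); // DOM after JS ran
            System.out.println(renderedHtml.length());
        } finally {
            driver.quit(); // always shut down the headless browser process
        }
    }
}
```

This fetch step would replace (or wrap) crawler4j's plain HTTP fetcher for pages that need JavaScript rendering, with link extraction then running on `renderedHtml`.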