New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Next page navigation #439
Comments
Did you use a Javascript action for the next page link? |
You need to configure a splash instance. You will also need to add a setting to your projects SPLASH_URL = 'http://127.0.0.1:8050' You also need to make sure that the |
I receive error like this while running the spider with pageaction Traceback (most recent call last): |
hi |
my problem is same with you. |
hi AlexTan-b-z |
hi agramesh |
Hi AlexTan-b-z If i run the spider with above configured settings file, the pageaction middleware just triggered only. I didn't check with AUTO_PAGINATION=1. |
The slybot/settings.py file. |
Hi agramesh Hava you find something? |
+1 |
Hi! |
You need to set the SPLASH_URL in splash to point to your splash instance |
Is the new portia scrape next page data?
I build spider in portia with nextpage recorded option which is run in docker, but when i deployed it in scrapyd there is no data scraped i got output like
2016-03-29 15:57:55 [scrapy] INFO: Spider opened
2016-03-29 15:57:55 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-03-29 15:57:55 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2016-03-29 15:58:32 [scrapy] DEBUG: Retrying <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (failed 1 times): 504 Gateway Time-out
2016-03-29 15:58:55 [scrapy] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2016-03-29 15:59:07 [scrapy] DEBUG: Retrying <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (failed 2 times): 504 Gateway Time-out
2016-03-29 15:59:38 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 15:59:39 [scrapy] DEBUG: Filtered offsite request to 'www.successfactors.com': <GET http://www.successfactors.com/>
2016-03-29 15:59:51 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 15:59:51 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 15:59:52 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 15:59:52 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 15:59:52 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 15:59:52 [scrapy] DEBUG: Filtered offsite request to 'www.cpchem.com': <GET http://www.cpchem.com/>
2016-03-29 15:59:52 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 15:59:55 [scrapy] INFO: Crawled 7 pages (at 7 pages/min), scraped 0 items (at 0 items/min)
2016-03-29 15:59:57 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 15:59:58 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 15:59:59 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 16:00:00 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 16:00:00 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 16:00:03 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 16:00:04 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
2016-03-29 16:00:04 [scrapy] DEBUG: Crawled (200) <POST http://localhost:8050/render.html?job_id=dfe9afdef59811e5b9ad000c29120335> (referer: None)
there scrapyd and splash are running in local, is there any idea for this issue?
thanks in advance
The text was updated successfully, but these errors were encountered: