Status code is 400 even though the page download succeeds #870

Closed
huangjianfeng1 opened this issue Apr 23, 2019 · 1 comment

Comments

@huangjianfeng1

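While crawling, a request occasionally comes back with HTTP 400. It is intermittent and does not happen on every run, so I added 400 to the accepted status codes:
Set<Integer> acceptStatCodeSet = new HashSet<>();
acceptStatCodeSet.add(200);
acceptStatCodeSet.add(400);
getSite().setAcceptStatCode(acceptStatCodeSet);
Then, in process, I check for 400 and, as a test, add the same request back as a target:
if (page.getStatusCode() == 400) {
    // Re-queue the request that produced the 400 response.
    Request request = page.getRequest();
    page.addTargetRequest(request);
    return;
}
However, the request is never crawled again. Does anyone know how to solve this?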

@sutra sutra added the question label Jul 8, 2020
@sutra
Collaborator

sutra commented Jul 8, 2020

I think you should override the Downloader and treat HTTP 400 as a page download failure.
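To illustrate the idea, here is a minimal, library-independent sketch. The types `FetchResult`, `SimpleDownloader`, `FailOnStatusDownloader`, and `Retry` are hypothetical stand-ins invented for this example; they are not WebMagic classes. The wrapper reports selected status codes (here 400) as failed downloads, so a retry loop that reacts to failures fetches the page again instead of handing the 400 response to the processor. In WebMagic the equivalent would presumably be a subclass of the downloader that marks the returned page as not downloaded successfully, with retries enabled on the site; the exact calls depend on the version and are not shown here.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a download result; NOT WebMagic's Page class.
final class FetchResult {
    final int statusCode;
    final String body;
    final boolean downloadSuccess;

    FetchResult(int statusCode, String body, boolean downloadSuccess) {
        this.statusCode = statusCode;
        this.body = body;
        this.downloadSuccess = downloadSuccess;
    }
}

// Hypothetical stand-in for a downloader; NOT WebMagic's Downloader interface.
interface SimpleDownloader {
    FetchResult download(String url);
}

// Wraps another downloader and reports the given status codes as failed downloads.
final class FailOnStatusDownloader implements SimpleDownloader {
    private final SimpleDownloader delegate;
    private final Set<Integer> failureCodes;

    FailOnStatusDownloader(SimpleDownloader delegate, Integer... failureCodes) {
        this.delegate = delegate;
        this.failureCodes = new HashSet<>(Arrays.asList(failureCodes));
    }

    @Override
    public FetchResult download(String url) {
        FetchResult result = delegate.download(url);
        if (result.downloadSuccess && failureCodes.contains(result.statusCode)) {
            // Same response, but flagged as a failure so the caller retries it.
            return new FetchResult(result.statusCode, result.body, false);
        }
        return result;
    }
}

// Simple retry loop standing in for the crawler's own retry mechanism.
final class Retry {
    static FetchResult fetchWithRetry(SimpleDownloader downloader, String url, int maxRetries) {
        FetchResult result = downloader.download(url);
        for (int i = 0; i < maxRetries && !result.downloadSuccess; i++) {
            result = downloader.download(url);
        }
        return result;
    }
}
```

With this arrangement the processing code no longer needs to accept 400 or re-add the request itself, which also sidesteps any duplicate-URL filtering that may drop a request that was already queued.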

@sutra sutra closed this as completed Jul 19, 2020
@sutra sutra added this to the WebMagic-0.7.4 milestone Oct 26, 2020