The crawl fails every time it reaches this day, so I can only get less than a year of data #22

Closed
JunkeyLau opened this issue May 21, 2018 · 7 comments

@JunkeyLau

image
Could you tell me why the crawl throws an error every time it reaches this post?

@dataabc
Owner

dataabc commented May 22, 2018

If it's convenient, could you provide the Weibo id you are crawling so I can debug it?

@JunkeyLau
Author

JunkeyLau commented May 24, 2018 via email

@dataabc
Owner

dataabc commented May 24, 2018

test
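I don't get any error, and this is the code I used. Because the account has a lot of posts, I only crawled pages 293 to 300, and they were all crawled correctly.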

@JunkeyLau
Author

Oh, thank you. In that case, can I crawl more data by increasing the delay?

@dataabc
Copy link
Owner

dataabc commented May 24, 2018

Yes, you can. See #8 for reference.
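
For readers without #8 at hand, here is a minimal sketch of what "increasing the delay" could look like. It is not the project's actual code: `crawl_pages`, `fetch_page`, and the delay bounds are hypothetical names and values chosen for illustration, and the approach discussed in #8 may differ.

```python
import random
import time


def crawl_pages(page_numbers, fetch_page, min_delay=1.0, max_delay=3.0,
                sleep=time.sleep):
    """Fetch each page in order, pausing a random interval between requests.

    fetch_page is a hypothetical callable that downloads and parses one page.
    sleep is injectable so the pacing can be checked without really waiting.
    """
    results = []
    for index, page in enumerate(page_numbers):
        if index > 0:
            # Wait before every request except the first one.
            sleep(random.uniform(min_delay, max_delay))
        results.append(fetch_page(page))
    return results
```

Called as `crawl_pages(range(293, 301), fetch_page, min_delay=5, max_delay=10)`, this would fetch pages 293 to 300 with a pause of 5 to 10 seconds between consecutive requests. Whether a longer pause is enough to avoid being blocked is not something this thread confirms.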

@zyxbcde

zyxbcde commented May 29, 2018

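In many cases the author has disabled comments on the post, so there is one fewer button underneath it. That throws off the XPath and causes the error.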
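
If a missing button really is the cause, one way to make the parsing less fragile is to look up each counter by its label instead of by its position. The sketch below is only an illustration under assumptions: it supposes the footer text of a post looks like `赞[12] 转发[3] 评论[5]`, and `parse_footer` and `FOOTER_LABELS` are hypothetical names, not part of the project.

```python
import re

# Assumed footer labels: like, repost, comment.
FOOTER_LABELS = {"likes": "赞", "reposts": "转发", "comments": "评论"}


def parse_footer(footer_text):
    """Return the counters found in a post footer, keyed by name.

    A counter whose label is absent (for example, comments are disabled)
    is reported as None instead of shifting the other values.
    """
    counts = {}
    for key, label in FOOTER_LABELS.items():
        match = re.search(re.escape(label) + r"\[(\d+)\]", footer_text)
        counts[key] = int(match.group(1)) if match else None
    return counts
```

With this approach a post without a comment button yields `comments` as `None`, while the like and repost counts are still read from their own labels.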

@dataabc
Owner

dataabc commented May 30, 2018

@zyxbcde
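If possible, could you provide the id of the Weibo where the error occurs, so I can debug it? I checked several posts that appeared to have comments disabled, and all of them were crawled correctly.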

@dataabc dataabc closed this as completed Nov 25, 2018