Lots of "start to get None" #28
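Delete the data folder under the cola directory and try again. |
After deleting it, the same thing still happens. Error output: get 1898353550 url: http://weibo.com/1898353550/follow
get 3211200050 url: http://weibo.com/3211200050/follow # no errors up to this point, but these two IDs seem to keep recurring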
Error when fetch url: http://weibo.com/1898353550/follow
Error when get bundle: 1898353550
'NoneType' object has no attribute 'find'
Traceback (most recent call last):
File "/home/jiajun/Code/cola/cola/worker/loader.py", line 229, in _execute_bundle
**options).parse()
File "/home/jiajun/Code/cola/contrib/weibo/parsers.py", line 559, in parse
return self._error(url, e)
File "/home/jiajun/Code/cola/contrib/weibo/parsers.py", line 91, in _error
raise e
AttributeError: 'NoneType' object has no attribute 'find'
Finish 1898353550
start to get None
Error when fetch url: http://weibo.com/3211200050/follow
Error when get bundle: 3211200050
'NoneType' object has no attribute 'find'
Traceback (most recent call last):
File "/home/jiajun/Code/cola/cola/worker/loader.py", line 229, in _execute_bundle
**options).parse()
File "/home/jiajun/Code/cola/contrib/weibo/parsers.py", line 559, in parse
return self._error(url, e)
File "/home/jiajun/Code/cola/contrib/weibo/parsers.py", line 91, in _error
raise e
AttributeError: 'NoneType' object has no attribute 'find'
Finish 3211200050
start to get None
start to get None
start to get None
start to get None
start to get None
start to get None
start to get None
Finish visiting pages count: 32 # I stopped it here
Finish visiting pages count: 32 |
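You only visited the users' follow pages. Did it fetch any weibo posts? |
I followed what the wiki says, |
So there are no other settings, then. |
Yes, it works. self.browser.set_handle_gzip(True)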
.......
----------------------------------------------------------------------
Ran 7 tests in 10.017s
OK |
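I tested it and there does seem to be a problem. I'm not sure whether the pages of these two users have changed; I'll look into what is causing it today. |
OK, thank you ;-) |
Besides, if you are only using the single-machine version, it is best to use the develop branch. That branch has been tested on a single machine for a long time and is far better than master in both efficiency and stability. Once the current development work is finished I will merge it into master. |
This problem was caused by the new version of Weibo. It has been fixed on both the master and develop branches. As I said before, it is best to use the develop branch. |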
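For readers hitting the same traceback, the sketch below illustrates one way a page layout change can produce `'NoneType' object has no attribute 'find'`, and how a guard turns it into a handled case. It is not cola's parser code: the function names, the `follow_list` class, and the sample markup are all hypothetical, and it uses the standard library's `xml.etree.ElementTree` purely so the example is self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: the "old" page has the container the parser expects,
# the "new" page has been restructured and no longer contains it.
OLD_LAYOUT = '<html><body><div class="follow_list"><a>user_a</a><a>user_b</a></div></body></html>'
NEW_LAYOUT = '<html><body><section class="members"><a>user_a</a></section></body></html>'


def first_follow_unsafe(page):
    """Assumes the container always exists; breaks when the layout changes."""
    root = ET.fromstring(page)
    container = root.find(".//div[@class='follow_list']")  # None on the new layout
    return container.find("a").text  # AttributeError when container is None


def first_follow_safe(page):
    """Returns None instead of raising when the expected container is missing."""
    root = ET.fromstring(page)
    container = root.find(".//div[@class='follow_list']")
    if container is None:
        return None  # layout changed; the caller can log and skip this page
    link = container.find("a")
    return None if link is None else link.text


if __name__ == "__main__":
    print(first_follow_safe(OLD_LAYOUT))
    print(first_follow_safe(NEW_LAYOUT))
    try:
        first_follow_unsafe(NEW_LAYOUT)
    except AttributeError as exc:
        print("unsafe parser failed:", exc)
```

A guard like this only makes the failure explicit. The selectors themselves still have to be updated to match the new page, which is what the fix on the master and develop branches is said to address.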
OK ;-) |
Querying the database shows that no data was crawled.