Runs fine on Windows but fails on both macOS and Linux. I have searched online for ages and am still baffled; can anyone explain what is going on? #15
Comments
I am hitting this problem on Windows too. Does anyone know how to fix it?
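I tried again on the Windows machine where it used to run fine, and the same error now appears there as well. Could the proxy sites have changed their page structure, so that the scraped content no longer matches what the code expects?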
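If so, that seems quite likely. The same author's other proxy pool project, https://github.com/Germey/ProxyPool, does work; its crawler code for the sites is partly the same and partly different.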
Yes, I tested it myself and it works.
After some testing, I found the cause: the redis library installed by pip is too new. Switching to redis==2.10.6 makes it work normally.
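That is indeed the case. I found that the zadd and zincrby functions changed in the latest version of the redis Python package. Changing those calls in db.py to self.db.zadd(REDIS_KEY, {proxy: score}); self.db.zadd(REDIS_KEY, {proxy: MAX_SCORE}); self.db.zincrby(REDIS_KEY, -1, proxy) makes the program run successfully.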
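For anyone who wants the same code to run against either version, the change above can be sketched as two small helpers that choose the calling convention by redis-py major version. This is only an illustration, not the project's code: `add_proxy`, `decrease_proxy`, and the `redis_major` parameter are hypothetical names, and `db` is assumed to be a `redis.StrictRedis`-style client like the one db.py uses.

```python
def add_proxy(db, key, proxy, score, redis_major):
    """Store `proxy` in the sorted set `key` with the given score.

    `redis_major` is the major version of the installed redis-py
    package (a hypothetical parameter used only in this sketch).
    """
    if redis_major >= 3:
        # redis-py 3.x and later: zadd(name, mapping)
        return db.zadd(key, {proxy: score})
    # redis-py 2.x StrictRedis: zadd(name, score1, member1, ...)
    return db.zadd(key, score, proxy)


def decrease_proxy(db, key, proxy, redis_major):
    """Lower the score of `proxy` by 1."""
    if redis_major >= 3:
        # redis-py 3.x and later: zincrby(name, amount, value)
        return db.zincrby(key, -1, proxy)
    # redis-py 2.x: zincrby(name, value, amount)
    return db.zincrby(key, proxy, -1)
```

With the real library, `redis_major` could presumably be read from `redis.VERSION[0]`. Pinning redis==2.10.6 as suggested above avoids the branch entirely.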
Haha, a bit of tinkering always pays off.
Proxy pool started running
WARNING: Do not use the development server in a production environment.
Use a production WSGI server instead.
Starting to crawl proxies
Getter started
Crawling http://www.66ip.cn/1.html
Fetching http://www.66ip.cn/1.html
Fetched successfully http://www.66ip.cn/1.html 521
Crawling http://www.66ip.cn/2.html
Fetching http://www.66ip.cn/2.html
Fetched successfully http://www.66ip.cn/2.html 521
Crawling http://www.66ip.cn/3.html
Fetching http://www.66ip.cn/3.html
Fetched successfully http://www.66ip.cn/3.html 521
Crawling http://www.66ip.cn/4.html
Fetching http://www.66ip.cn/4.html
Fetched successfully http://www.66ip.cn/4.html 521
Crawling http://www.proxy360.cn/Region/China
Fetching http://www.proxy360.cn/Region/China
Fetched successfully http://www.proxy360.cn/Region/China 400
Fetching http://www.goubanjia.com/free/gngn/index.shtml
Fetched successfully http://www.goubanjia.com/free/gngn/index.shtml 404
Fetching http://www.ip3366.net/?stype=1&page=1
Fetched successfully http://www.ip3366.net/?stype=1&page=1 200
Successfully got proxy 112.87.254.81:8118
Successfully got proxy 103.115.180.96:42556
Successfully got proxy 103.218.25.52:53281
Successfully got proxy 80.211.55.179:3128
Successfully got proxy 137.59.162.178:52497
Successfully got proxy 165.90.209.141:31975
Successfully got proxy 80.211.84.179:3128
Successfully got proxy 103.108.96.159:46258
Successfully got proxy 103.106.101.12:45100
Successfully got proxy 112.84.85.164:8118
Fetching http://www.ip3366.net/?stype=1&page=2
Fetched successfully http://www.ip3366.net/?stype=1&page=2 200
Successfully got proxy 183.172.131.4:8118
Successfully got proxy 112.67.35.134:8118
Successfully got proxy 59.110.48.236:3128
Successfully got proxy 111.224.137.25:80
Successfully got proxy 138.121.31.108:53281
Successfully got proxy 103.225.228.101:58732
Successfully got proxy 222.181.10.102:8118
Successfully got proxy 111.224.34.224:80
Successfully got proxy 103.81.15.113:57803
Successfully got proxy 101.27.22.144:61234
Fetching http://www.ip3366.net/?stype=1&page=3
Fetched successfully http://www.ip3366.net/?stype=1&page=3 200
Successfully got proxy 119.179.133.233:8060
Successfully got proxy 119.179.143.43:8060
Successfully got proxy 119.179.143.43:8060
Successfully got proxy 106.58.248.101:80
Successfully got proxy 119.254.94.71:52811
Successfully got proxy 119.179.130.179:8060
Successfully got proxy 27.208.85.141:8060
Successfully got proxy 123.207.233.182:808
Successfully got proxy 112.66.70.180:8060
Successfully got proxy 170.82.21.168:53281
Process Process-2:
Traceback (most recent call last):
  File "/usr/local/var/pyenv/versions/3.7.1/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "/usr/local/var/pyenv/versions/3.7.1/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/hao/Documents/Coding/ProxyPool/proxypool/scheduler.py", line 28, in schedule_getter
    getter.run()
  File "/Users/hao/Documents/Coding/ProxyPool/proxypool/getter.py", line 30, in run
    self.redis.add(proxy)
  File "/Users/hao/Documents/Coding/ProxyPool/proxypool/db.py", line 30, in add
    return self.db.zadd(REDIS_KEY, score, proxy)
  File "/usr/local/var/pyenv/versions/3.7.1/lib/python3.7/site-packages/redis/client.py", line 2263, in zadd
    for pair in iteritems(mapping):
  File "/usr/local/var/pyenv/versions/3.7.1/lib/python3.7/site-packages/redis/_compat.py", line 123, in iteritems
    return iter(x.items())
AttributeError: 'int' object has no attribute 'items'
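The last frames of the traceback show why the error mentions `items`: the installed redis-py treats the second argument of zadd as a mapping and iterates over its `.items()`, while db.py still passes the score (an int) in that position. The stand-in below reproduces this without a Redis server. `zadd_mapping_style` is a hypothetical function that only imitates the argument handling, not the library's actual implementation, and the score value 10 is made up for illustration.

```python
def zadd_mapping_style(name, mapping, nx=False):
    """Imitate a zadd that expects {member: score} as its second argument."""
    pieces = []
    for member, score in mapping.items():
        pieces.extend([score, member])
    return pieces


# New-style call: the mapping is iterated without problems.
print(zadd_mapping_style("proxies", {"112.87.254.81:8118": 10}))

# Old-style call, as in db.py line 30: the int score lands where the
# mapping is expected, so calling .items() on it fails.
try:
    zadd_mapping_style("proxies", 10, "112.87.254.81:8118")
except AttributeError as exc:
    print("AttributeError:", exc)
```

Either fix from the comments removes the mismatch: pinning the older package keeps the positional form valid, and passing `{proxy: score}` matches the newer signature.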