Can the GUI version download at most 2000 images? #9

Closed
whaozl opened this issue Sep 6, 2017 · 7 comments
Comments

@whaozl

whaozl commented Sep 6, 2017

No description provided.

@sczhengyabin
Collaborator

@whaozl Yes. For a single keyword, even Baidu, which returns the most images, yields no more than 2000, so the upper limit was set to 2000.

@whaozl
Author

whaozl commented Sep 6, 2017

Google can return more than 2000, though, hehe. For example, if I want to crawl images for "cat", Google returns a lot.

@sczhengyabin
Collaborator

@whaozl Are you sure you counted them by hand? As I recall, although you can keep scrolling, you only end up with a few hundred images.

@whaozl
Author

whaozl commented Sep 7, 2017

@sczhengyabin
Collaborator

image

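@whaozl
Hello. This usually happens when the proxy connection is not fast enough. A fixed time after opening the page, the code checks whether the "Show more" button has appeared; if it has not, the code stops loading further results. To deal with this, either switch to a faster proxy or modify the code to increase the delay before the button check.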

@whaozl
Author

whaozl commented Sep 7, 2017

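@sczhengyabin Why do you get so many? For a lot of my keywords I only get 99. I'm downloading directly from an overseas server, and it's very fast. Can the delay for the button check you mentioned be configured? Where exactly is it?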

@sczhengyabin
Collaborator

@whaozl
In the crawlay.py file: the two sleep calls in google_image_url_from_webpage.
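
As a rough illustration of an alternative to simply lengthening fixed sleeps, the wait could be written as a polling loop that keeps checking for the button until a deadline passes. This is only a sketch and not the project's actual code: `wait_until` and `find_show_more_button` are hypothetical names, and the real lookup inside google_image_url_from_webpage may work differently.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.5):
    """Poll predicate() until it returns a truthy value or `timeout` seconds pass.

    Returns the truthy value, or None if the timeout expires first.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = predicate()
        if result:
            return result
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical stand-in for the crawler's "Show more" button lookup:
    # here the button only "appears" on the third check.
    attempts = {"count": 0}

    def find_show_more_button():
        attempts["count"] += 1
        return "button" if attempts["count"] >= 3 else None

    print(wait_until(find_show_more_button, timeout=5.0, interval=0.1))
```

With this shape, a slow connection only costs extra waiting up to `timeout`, while a fast one proceeds as soon as the button is found.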
