After calling run(), how can I tell when the crawl has finished? #825

Open

zczhzy opened this issue Aug 24, 2018 · 4 comments

zczhzy commented Aug 24, 2018

Spider.create(new TestProcessor())
        .addUrl("http://xxx.xxx")
        .thread(5)
        .run();
I want to do some follow-up work once the pages have been crawled. How can I tell that the crawl has finished?


gcqst commented Aug 29, 2018

Same question here.

wqqwqqwqq commented Oct 15, 2018 (edited)

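Two approaches I have found:
1. Make the crawl synchronous instead of asynchronous: call the run method instead of the start method.
2. Get Spider.java from the source code, search for "Spider {} closed! {} pages downloaded", and add your own method call right below that line.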
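The difference between the synchronous and asynchronous call in the first approach can be sketched with plain Java threads. This is only an illustration: `FakeCrawl` and `SyncVsAsyncDemo` are hypothetical stand-ins and not part of WebMagic; the assumption, taken from the comment above, is that run() does the work on the calling thread while start() hands it to another thread.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a crawl job. It is NOT WebMagic's Spider.
class FakeCrawl implements Runnable {
    private final List<String> events;

    FakeCrawl(List<String> events) {
        this.events = events;
    }

    @Override
    public void run() {
        try {
            Thread.sleep(50); // pretend to download pages
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        events.add("crawl finished");
    }
}

class SyncVsAsyncDemo {
    // Synchronous call: run() returns only once the work is done,
    // so the next statement is already the "after the crawl" step.
    static List<String> synchronous() {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        new FakeCrawl(events).run();
        events.add("next step");
        return events;
    }

    // Asynchronous call: the work runs on another thread, so the caller
    // has to wait (here with join) before doing the next step.
    static List<String> asynchronous() {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        Thread worker = new Thread(new FakeCrawl(events));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        events.add("next step");
        return events;
    }
}
```

If the same reasoning holds for Spider, then in the snippet from the original question any statement placed after `.run()` would execute only after the crawl has ended.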

whitefly commented Mar 6, 2021

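I have been building on this framework recently and have the same need.
My current solution is to subclass Spider, override run() in the subclass, and call a hook function right after super.run().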
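A minimal sketch of the subclass-and-hook idea described above. `BaseCrawler` is a hypothetical stand-in for the framework's Spider so that the example is self-contained; `HookedCrawler` and the `onFinished` callback are also names made up for illustration. With the real class, the subclass would extend Spider and pass a PageProcessor to its constructor instead.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the framework's Spider class.
class BaseCrawler implements Runnable {
    private final List<String> log = new ArrayList<>();

    @Override
    public void run() {
        log.add("crawling"); // the real class would crawl here and block until done
    }

    List<String> getLog() {
        return log;
    }
}

// Subclass that runs a caller-supplied hook once the crawl has returned.
class HookedCrawler extends BaseCrawler {
    private final Runnable onFinished;

    HookedCrawler(Runnable onFinished) {
        this.onFinished = onFinished;
    }

    @Override
    public void run() {
        super.run();            // returns when the crawl is over
        getLog().add("finished");
        onFinished.run();       // the hook: do the follow-up work here
    }
}
```

One design question is whether the hook should also fire when super.run() throws; wrapping the call in try/finally would cover that case, at the cost of running the hook after a failed crawl as well.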

lomoye commented Jul 20, 2021

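// Poll once per second until the spider reports that it has stopped.
while (spiderWorker.getStatus() != Spider.Status.Stopped) {
    try {
        Thread.sleep(1000);
        log.info("spiderWorker running sleep");
    } catch (InterruptedException e) {
        log.info("spiderWorker interruptedException", e);
        // Restore the interrupt flag and stop waiting instead of spinning.
        Thread.currentThread().interrupt();
        break;
    }
}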
This is how I do it: an endless loop outside the spider that keeps checking its status.
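The polling loop above waits forever if the spider never reaches the Stopped state. One possible variation, sketched here under assumptions, adds a timeout. `StatusPoller` and `awaitTrue` are hypothetical names and not part of WebMagic; the condition is passed in as a `BooleanSupplier` so that the helper does not depend on the framework.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: polls a condition until it holds or a timeout expires.
class StatusPoller {
    // Returns true if the condition became true, false on timeout or interrupt.
    static boolean awaitTrue(BooleanSupplier condition, long pollMillis, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // gave up waiting
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt visible to callers
                return false;
            }
        }
        return true;
    }
}
```

With the names from the comment above, a call could look like `StatusPoller.awaitTrue(() -> spiderWorker.getStatus() == Spider.Status.Stopped, 1000, 60_000)`, where the 60-second limit is an arbitrary example value.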
