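Several spiders run in a single project. While building the Host, the Host source code automatically injects multiple HostService instances in addition to the current target Spider. What is the reasoning behind this, and is there a more elegant way to start spiders? These spiders already run in one process as background services, so could the common services be shared, with only the spider-specific ones created separately, similar to a plugin model?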
Sorry, I should have said that it registers multiple HostService services. What I checked was the generic Host source code.
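The components communicate through a message queue, so agents can be deployed on different machines and shared via that queue. The progress of all tasks also has to be monitored centrally, which is why there is an AgentCenter. To fit this model (interface), the single-machine spider communicates through an in-memory message queue as well, so it has to start multiple hostservice instances. Sharing these components inside one process is certainly possible, but the code and design would then split into two modes. Right now the single-machine and distributed versions are consistent, which keeps things simpler.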
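To make the trade-off concrete, here is a minimal sketch of that design in Python. It is not the project's code: the `MessageQueue` interface, the topic names, and every method are hypothetical, and only the roles (spider, agent, AgentCenter, in-memory queue) come from the explanation above. Each component depends only on the queue interface, so the single-machine setup differs from a distributed one only in which queue implementation is passed in.

```python
from collections import defaultdict
from typing import Callable, Protocol


class MessageQueue(Protocol):
    """Hypothetical transport interface that every component depends on."""

    def publish(self, topic: str, message: dict) -> None: ...

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None: ...


class InMemoryMessageQueue:
    """Single-process transport: hands each message straight to its subscribers."""

    def __init__(self) -> None:
        self._handlers = defaultdict(list)

    def publish(self, topic: str, message: dict) -> None:
        for handler in list(self._handlers[topic]):
            handler(message)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._handlers[topic].append(handler)


class Agent:
    """Worker service: registers itself, then answers requests from the queue."""

    def __init__(self, queue: MessageQueue, agent_id: str) -> None:
        self.queue = queue
        self.agent_id = agent_id

    def start(self) -> None:
        self.queue.subscribe("requests", self._on_request)
        self.queue.publish("heartbeats", {"agent": self.agent_id})

    def _on_request(self, message: dict) -> None:
        # Placeholder for real work; a real agent would fetch the URL here.
        reply = {"url": message["url"], "agent": self.agent_id}
        self.queue.publish("responses", reply)


class AgentCenter:
    """Central monitor: tracks known agents and overall progress."""

    def __init__(self, queue: MessageQueue) -> None:
        self.queue = queue
        self.agents = set()
        self.completed = 0

    def start(self) -> None:
        self.queue.subscribe("heartbeats", lambda m: self.agents.add(m["agent"]))
        self.queue.subscribe("responses", self._on_response)

    def _on_response(self, message: dict) -> None:
        self.completed += 1


class Spider:
    """Publishes requests and collects the responses it gets back."""

    def __init__(self, queue: MessageQueue, urls: list) -> None:
        self.queue = queue
        self.urls = urls
        self.pages = []

    def start(self) -> None:
        self.queue.subscribe("responses", lambda m: self.pages.append(m["url"]))
        for url in self.urls:
            self.queue.publish("requests", {"url": url})


def run_single_machine(urls: list):
    """Start every service in one process on the in-memory queue."""
    queue = InMemoryMessageQueue()
    center = AgentCenter(queue)
    agent = Agent(queue, "agent-1")
    spider = Spider(queue, urls)
    for service in (center, agent, spider):
        service.start()
    return center, spider
```

In this sketch, moving to a distributed deployment would mean passing a network-backed queue with the same two methods and leaving `Agent`, `AgentCenter`, and `Spider` unchanged. Sharing the components directly inside one process would remove the queue hop, at the cost of a second wiring path to maintain.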
Understood, thanks for the explanation. I'll customize it heavily to fit my actual needs. Thanks again.