
bleachyin/crawlerkeeper

CrawlerKeeper

### A crawler monitor written in Python

Description

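  • CrawlerKeeper is a monitoring framework for Python crawlers, built on ZooKeeper and Thrift.

  • Under the hood, CrawlerKeeper communicates over Thrift RPC interfaces. A crawler first registers a node through ZooKeeper. Once the server has picked up that node and responded, the crawler client reads the Thrift server information (IP address and port) from the server's own ZooKeeper registration node and creates a matching Thrift client. Each crawler client also starts its own Thrift server, and CrawlerCenter creates a corresponding Thrift client for every registered crawler client, so communication runs in both directions.

  • For the visualization system, see CrawlerCenter.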
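The registration handshake described above can be sketched in plain Python. This is only an illustration under stated assumptions, not CrawlerKeeper's actual code: an in-memory `FakeRegistry` stands in for ZooKeeper, an `Endpoint` value stands in for the address a Thrift client would connect to, and the node paths, the `ip:port` payload format, and every function name here are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict

# Hypothetical node paths; the real layout used by CrawlerKeeper is not documented here.
SERVER_NODE = "/crawlerkeeper/server"
CRAWLERS_ROOT = "/crawlerkeeper/crawlers"


@dataclass(frozen=True)
class Endpoint:
    """IP address and port of a Thrift server, as stored in a registry node."""
    host: str
    port: int

    def encode(self) -> bytes:
        # Assumed payload format: "ip:port" as UTF-8 bytes.
        return f"{self.host}:{self.port}".encode("utf-8")

    @classmethod
    def decode(cls, data: bytes) -> "Endpoint":
        host, _, port = data.decode("utf-8").rpartition(":")
        return cls(host, int(port))


class FakeRegistry:
    """In-memory stand-in for ZooKeeper: maps node paths to byte payloads."""

    def __init__(self) -> None:
        self._nodes: Dict[str, bytes] = {}

    def create(self, path: str, data: bytes) -> None:
        if path in self._nodes:
            raise KeyError(f"node already exists: {path}")
        self._nodes[path] = data

    def get(self, path: str) -> bytes:
        return self._nodes[path]

    def children(self, parent: str) -> list:
        # Return the names of direct children of `parent`, sorted.
        prefix = parent.rstrip("/") + "/"
        names = (p[len(prefix):] for p in self._nodes if p.startswith(prefix))
        return sorted(n for n in names if "/" not in n)


def register_crawler(registry: FakeRegistry, name: str, own: Endpoint) -> Endpoint:
    """Crawler side: publish our own Thrift server address, then read the
    server's address so a Thrift client can be built for it."""
    registry.create(f"{CRAWLERS_ROOT}/{name}", own.encode())
    return Endpoint.decode(registry.get(SERVER_NODE))


def discover_crawlers(registry: FakeRegistry) -> Dict[str, Endpoint]:
    """Server side: one endpoint per registered crawler, so a Thrift client
    can be built for each of them."""
    return {
        name: Endpoint.decode(registry.get(f"{CRAWLERS_ROOT}/{name}"))
        for name in registry.children(CRAWLERS_ROOT)
    }


if __name__ == "__main__":
    registry = FakeRegistry()
    registry.create(SERVER_NODE, Endpoint("10.0.0.1", 9090).encode())
    server = register_crawler(registry, "crawler-1", Endpoint("10.0.0.2", 9091))
    print("crawler connects to:", server)
    print("server connects to:", discover_crawlers(registry))
```

In a real deployment the registry calls would go to a ZooKeeper client library and each `Endpoint` would be handed to a Thrift transport. The point of the sketch is only the direction of the two lookups: the crawler resolves the server, and the server resolves each crawler.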

Installation

root@root:~# tar -xzvf crawlerkeeper.tar.gz
root@root:~# cd crawlerkeeper
root@root:~# sudo python setup.py install

Document

### Author & Bug Report

dongchenxi@xiaomi.com

