littleDing/LDCrawler

This is a collection of crawler demos built on different languages or frameworks.
They commonly consist of these modules:
	dispatcher	: job & queue logic
	fetcher		: fetch logic; given URLs, returns the HTML contents
	analyzer	: analyzes HTML to extract information & next URLs
	pagebase	: page content database
	linkbase	: link summary database
Different ports may vary a little from each other.
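
As a rough illustration of how these modules could fit together, here is a minimal single-process sketch in Python. It is not taken from any of the ports. The names `crawl`, `analyze`, `LinkParser`, `fake_fetch` and `FAKE_SITE` are hypothetical, the fetcher is an in-memory stand-in instead of real HTTP, and pagebase and linkbase are plain dicts instead of databases.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def analyze(url, html):
    """analyzer: extract next URLs from a page (other extraction omitted)."""
    parser = LinkParser()
    parser.feed(html)
    # Resolve relative links against the URL of the page they came from.
    return [urljoin(url, link) for link in parser.links]


def crawl(seed, fetch, max_pages=100):
    """dispatcher: a breadth-first job queue driving fetcher and analyzer."""
    pagebase = {}  # url -> html content
    linkbase = {}  # url -> list of outgoing urls
    queue = deque([seed])
    seen = {seed}
    while queue and len(pagebase) < max_pages:
        url = queue.popleft()
        html = fetch(url)  # fetcher: given a URL, return HTML or None
        if html is None:
            continue
        pagebase[url] = html
        linkbase[url] = analyze(url, html)
        for link in linkbase[url]:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return pagebase, linkbase


# Assumption: a tiny fake site replaces real network access.
FAKE_SITE = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a>',
    "http://example.test/b": '<a href="/">home</a> <a href="/missing">x</a>',
}


def fake_fetch(url):
    return FAKE_SITE.get(url)


pagebase, linkbase = crawl("http://example.test/", fake_fetch)
```

Passing the fetcher in as a function keeps the dispatcher independent of how pages are retrieved, which loosely mirrors the module split listed above.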

current editions :
	LDCrawler-Bash	: mainly written in bash scripts, mixed with a little PHP & C++
	LDCrawler-Nodejs-P2P : built with Node.js, a P2P version

on the road map :
	LDCrawler-Storm : built with Storm
	LDCrawler-Nodejs-MasterWorker : built with Node.js, a master-worker version
