wengang285/ants-go

ants-go

open source, restful, distributed crawler engine

design of ants-go

ants

I wrote a crawler engine named ants in Python, based on Scrapy. But a dynamic language can get chaotic, so I started rewriting it in a compiled language.

scrapy

I designed the crawler framework by imitating Scrapy: the downloader, the scraper, and the way users write custom spiders, but in a compiled language.
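To make the division of labor concrete, here is a minimal sketch of how a spider, a downloader, and a driving loop might fit together in Go. Every name in it (`Request`, `Response`, `Spider`, `Downloader`, `Crawl`, `pageSpider`) is hypothetical and chosen for illustration; it is not the actual ants-go API, and the downloader is a fake that never touches the network.

```go
package main

import "fmt"

// Request and Response are simplified, hypothetical types.
type Request struct{ URL string }

type Response struct {
	URL  string
	Body string
}

// Spider is the part a user would write: it supplies start requests and
// turns each response into scraped items plus follow-up requests.
type Spider interface {
	Start() []Request
	Parse(resp Response) (items []string, next []Request)
}

// Downloader fetches one request. A real one would use net/http.
type Downloader func(Request) Response

// Crawl drives the loop: take a request from the queue, download it,
// hand the response to the spider, and enqueue any follow-up requests.
// Each URL is visited at most once.
func Crawl(s Spider, download Downloader) []string {
	var items []string
	seen := make(map[string]bool)
	queue := s.Start()
	for len(queue) > 0 {
		req := queue[0]
		queue = queue[1:]
		if seen[req.URL] {
			continue
		}
		seen[req.URL] = true
		found, next := s.Parse(download(req))
		items = append(items, found...)
		queue = append(queue, next...)
	}
	return items
}

// pageSpider is a toy spider that follows one link from its start page.
type pageSpider struct{}

func (pageSpider) Start() []Request { return []Request{{URL: "page/1"}} }

func (pageSpider) Parse(resp Response) ([]string, []Request) {
	items := []string{resp.URL + ":" + resp.Body}
	if resp.URL == "page/1" {
		// The second link points back to the start page and is skipped by Crawl.
		return items, []Request{{URL: "page/2"}, {URL: "page/1"}}
	}
	return items, nil
}

func main() {
	fake := func(r Request) Response { return Response{URL: r.URL, Body: "ok"} }
	fmt.Println(Crawl(pageSpider{}, fake))
}
```

Keeping the downloader behind a function type means the spider logic can be exercised without network access, which is one reason a Scrapy-style split is convenient in a compiled language.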

elasticsearch

I designed the distributed architecture by imitating Elasticsearch. It inspired me to build an engine for distributed crawling.

requirement

go get github.com/PuerkitoBio/goquery
go get github.com/go-sql-driver/mysql

install

go get github.com/wcong/ants-go
go install github.com/wcong/ants-go

run

cd bin
./ants-go

cluster on one computer

To test a cluster on one computer, run several nodes on different ports, each in its own terminal.

The first node uses the default ports, TCP 8300 and HTTP 8200:

cd bin
./ants-go

The other node sets its TCP and HTTP ports explicitly:

cd bin
./ants-go -tcp 9300 -http 9200
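For readers curious how such port flags are typically handled in Go, the following is a minimal sketch using the standard `flag` package. The flag names `-tcp` and `-http` and the defaults 8300 and 8200 come from the commands above; the `NodeConfig` struct, the `parseNodeConfig` function, and the flag descriptions are assumptions made for illustration and may not match how ants-go parses its flags.

```go
package main

import (
	"flag"
	"fmt"
)

// NodeConfig holds the two listening ports of a node.
// The struct and its field names are illustrative, not taken from ants-go.
type NodeConfig struct {
	TCPPort  int
	HTTPPort int
}

// parseNodeConfig reads -tcp and -http from args and falls back to the
// defaults 8300 and 8200 when a flag is absent.
func parseNodeConfig(args []string) (NodeConfig, error) {
	var cfg NodeConfig
	fs := flag.NewFlagSet("ants-go", flag.ContinueOnError)
	fs.IntVar(&cfg.TCPPort, "tcp", 8300, "tcp port")
	fs.IntVar(&cfg.HTTPPort, "http", 8200, "http port")
	if err := fs.Parse(args); err != nil {
		return NodeConfig{}, err
	}
	return cfg, nil
}

func main() {
	// First node: no flags, so the defaults apply.
	first, _ := parseNodeConfig(nil)
	// Second node: both ports overridden, as in the command above.
	second, _ := parseNodeConfig([]string{"-tcp", "9300", "-http", "9200"})
	fmt.Println(first, second)
}
```

Giving each node its own pair of ports is what lets several instances run side by side on one machine without a bind conflict.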

flags

There are several flags you can set; check the help message for the full list:

./ants-go -h
./ants-go -help

Customize spider

  1. go to spiders
  2. write your spider, following the example in deap_loop_spider.go, or see the spider page
  3. add your spider to spiderMap, following the example in LoadAllSpiders in load_all_spider.go
  4. install again
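The registration step above can be sketched roughly as follows. Only the names `spiderMap` and `LoadAllSpiders` come from this README; the `Spider` struct, its fields, the `MakeExampleSpider` constructor, and the return type are hypothetical stand-ins, so check load_all_spider.go for the real signatures before copying anything.

```go
package main

import "fmt"

// Spider is a hypothetical stand-in for the project's spider type.
type Spider struct {
	Name      string
	StartURLs []string
}

// MakeExampleSpider is a hypothetical constructor for a custom spider.
func MakeExampleSpider() *Spider {
	return &Spider{
		Name:      "example_spider",
		StartURLs: []string{"http://example.com/"},
	}
}

// LoadAllSpiders builds spiderMap, keyed by spider name. A new spider
// becomes visible to the engine only after it is registered here and
// the binary is installed again.
func LoadAllSpiders() map[string]*Spider {
	spiderMap := make(map[string]*Spider)
	register := func(s *Spider) { spiderMap[s.Name] = s }

	register(MakeExampleSpider())
	// Register your own spider constructors here.

	return spiderMap
}

func main() {
	for name, s := range LoadAllSpiders() {
		fmt.Println(name, s.StartURLs)
	}
}
```

Because the map is populated at compile time rather than discovered dynamically, adding a spider always requires the final "install again" step.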
