Xspider

A distributed web crawler system with support for template-based spider development.

  • Some features are modeled on Pyspider and Scrapy, including a WebUI with a script editor, task monitor, project manager, and result viewer.
  • Uses IP nodes for distributed node management and Celery as the distributed task queue.
  • Supports per-node task scheduling, including configuration of each spider's crawl frequency, network request settings, and other parameters.
  • Supports configuring project priority and the number of task retries.
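
The interaction between project priority and task retry counts can be illustrated with a small in-memory sketch. This is not Xspider's implementation (Xspider delegates queuing to Celery); it is a toy model using only the standard library, and the names `TaskQueue`, `max_retries`, `push`, `pop`, and `retry` are hypothetical, chosen for illustration only.

```python
import heapq
import itertools


class TaskQueue(object):
    """Toy in-memory queue: higher priority first, bounded retries.

    Hypothetical illustration only; Xspider itself uses Celery.
    """

    def __init__(self, max_retries=3):
        self.max_retries = max_retries
        self._heap = []
        # Monotonic counter breaks ties so equal priorities stay FIFO.
        self._counter = itertools.count()

    def push(self, url, priority=0, retries=0):
        # heapq is a min-heap, so negate priority to pop the highest first.
        heapq.heappush(self._heap, (-priority, next(self._counter), url, retries))

    def pop(self):
        neg_priority, _, url, retries = heapq.heappop(self._heap)
        return url, -neg_priority, retries

    def retry(self, url, priority, retries):
        """Re-queue a failed task; return False once retries are exhausted."""
        if retries >= self.max_retries:
            return False
        self.push(url, priority, retries + 1)
        return True

    def __len__(self):
        return len(self._heap)
```

With this sketch, a task pushed with `priority=5` is popped before one pushed with `priority=1`, and a failed task is re-queued through `retry` until its retry count reaches `max_retries`.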

Web dashboard

[Screenshot: dashboard index]

script editor

[Screenshot: script editor]

task log

[Screenshot: task log]

data

[Screenshot: data]

Installation

  1. Install Python 2.7
$ brew install python
  2. Install MongoDB and Redis

  3. Clone the Xspider code
$ git clone https://github.com/zym1115718204/xspider.git
  4. Install the required packages
$ pip install -r requirements.txt
  5. Run Xspider
$ cd xspider/xspider
$ ./run all
  6. Visit http://localhost:2017
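
If the dashboard does not load, it can help to check that the backing services are listening. The script below is a minimal sketch, not part of Xspider: it assumes MongoDB and Redis run on localhost at their default ports (27017 and 6379) and that the dashboard uses port 2017 as stated above. The helper names `port_open` and `check_services` are hypothetical.

```python
import socket

# Assumed defaults: MongoDB 27017, Redis 6379; 2017 is the dashboard port
# given in the installation steps. Adjust if your setup differs.
SERVICES = {
    "MongoDB": ("localhost", 27017),
    "Redis": ("localhost", 6379),
    "Xspider dashboard": ("localhost", 2017),
}


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except (socket.error, socket.timeout):
        return False
    sock.close()
    return True


def check_services(services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, (host, port) in services.items()}


if __name__ == "__main__":
    for name, ok in sorted(check_services().items()):
        print("%-18s %s" % (name, "reachable" if ok else "NOT reachable"))
```

A port that accepts connections only shows that something is listening there; it does not confirm that the service is configured the way Xspider expects.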

Todo

  • Node management

License

Licensed under the Apache License, Version 2.0
