haipz/JandanPicture

Jandan Picture

This is a Scrapy spider for Jandan (jandan.net).

Stay Simple, Stay Naive.

Warning

For study purposes only. Please don't consume too much of Jandan's network traffic.

Features

  • Sends each request with a User-Agent chosen at random from a User-Agent list
  • Updates the HTTP proxy IP list using multiple processes and checks the status of each IP automatically
  • Parses the original picture URL and downloads popular pictures into the ooxx directory
  • Saves all items to data.dat
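
The first two features can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the names RandomUserAgentMiddleware, USER_AGENTS, is_proxy_alive and filter_alive are hypothetical, the User-Agent strings are sample values, and the liveness check assumes that a plain TCP connect to host:port is a good enough test.

```python
import random
import socket
from multiprocessing import Pool

# Hypothetical sample list; the project's real User-Agent list is not shown in this README.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Firefox/31.0",
]


class RandomUserAgentMiddleware(object):
    """Downloader middleware that sets a random User-Agent on each request."""

    def __init__(self, user_agents=None):
        self.user_agents = list(user_agents or USER_AGENTS)

    def process_request(self, request, spider):
        request.headers["User-Agent"] = random.choice(self.user_agents)
        return None  # None tells Scrapy to keep processing the request


def is_proxy_alive(proxy, timeout=3.0):
    """Return True if a TCP connection to 'host:port' succeeds within timeout."""
    try:
        host, port = proxy.rsplit(":", 1)
        sock = socket.create_connection((host, int(port)), timeout=timeout)
    except (socket.error, ValueError, OverflowError):
        # Unreachable, timed out, or malformed address: treat as dead.
        return False
    sock.close()
    return True


def filter_alive(proxies, processes=4):
    """Check proxies in parallel worker processes and keep the reachable ones."""
    pool = Pool(processes)
    try:
        results = pool.map(is_proxy_alive, proxies)
    finally:
        pool.close()
        pool.join()
    return [proxy for proxy, ok in zip(proxies, results) if ok]
```

A middleware like this would be enabled through Scrapy's DOWNLOADER_MIDDLEWARES setting in settings.py; the module path to use there depends on the project layout. On Windows, call filter_alive from under an if __name__ == "__main__": guard, since multiprocessing re-imports the main module in each worker.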

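Saving items to data.dat could look roughly like the item pipeline below. This is only a sketch under stated assumptions: the on-disk format of data.dat is not described in this README, so one JSON object per line is assumed here, and the class name DataFilePipeline is hypothetical.

```python
import json


class DataFilePipeline(object):
    """Item pipeline that appends every scraped item to a data file."""

    def __init__(self, path="data.dat"):
        self.path = path
        self.file = None

    def open_spider(self, spider):
        # Called by Scrapy when the spider starts.
        self.file = open(self.path, "a")

    def close_spider(self, spider):
        # Called by Scrapy when the spider finishes.
        if self.file is not None:
            self.file.close()
            self.file = None

    def process_item(self, item, spider):
        # Assumed format: one JSON object per line.
        self.file.write(json.dumps(dict(item)) + "\n")
        return item  # pass the item on to any later pipeline
```

A pipeline like this would be enabled through Scrapy's ITEM_PIPELINES setting in settings.py.
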
Requirements

  • Python 2.7
  • Scrapy
  • Multiprocessing
  • Proxy by mapleray

Run

  • Windows: Double-click run.bat
  • Linux or OS X: Run the command scrapy crawl JandanPicture

Author

Haipz @haipz.com

About

A Scrapy Jandan.net Spider
