leoyyang/Project-Shixin

A crawler based on the Scrapy framework

This spider was modified by jpshuimu to collect the dishonesty information published at "http://shixin.court.gov.cn/".

The website uses JavaScript to hide the real URL, but the logic behind it is quite simple.

I use a selector to pick the id out of the start_url response, then process it to yield a Request for the real page.
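The id-to-request step could look roughly like the sketch below. It uses only the standard library and is not the spider's actual code: the markup (`<a class="view" id="...">`), the `DETAIL_URL` pattern, and the helper names are all assumptions made up for illustration. Inside a Scrapy spider the extraction would normally go through `response.xpath(...)` or `response.css(...)`, and each URL would be wrapped in a `scrapy.Request(url, callback=...)`.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Hypothetical detail endpoint; the real URL pattern is not given in this README.
DETAIL_URL = "http://shixin.court.gov.cn/detail"


class IdCollector(HTMLParser):
    """Collect the id attribute of every <a class="view"> element (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.ids = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "a" and attr_map.get("class") == "view" and attr_map.get("id"):
            self.ids.append(attr_map["id"])


def extract_ids(html):
    """Return the record ids found on a listing page, in document order."""
    parser = IdCollector()
    parser.feed(html)
    return parser.ids


def detail_urls(html):
    """Yield one detail-page URL per id found on the listing page."""
    for record_id in extract_ids(html):
        yield DETAIL_URL + "?" + urlencode({"id": record_id})
```

In the spider's `parse` callback, each URL produced this way would be yielded as a new request instead of being returned as a string.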

I built a csvPipeLine item pipeline to write the results to a CSV file.
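A CSV item pipeline can be as small as the sketch below. The class name `CsvPipeline`, the output path, and the column names are assumptions for illustration; the fields the project actually scrapes are not listed here. The three methods (`open_spider`, `process_item`, `close_spider`) are the hooks Scrapy calls on an item pipeline.

```python
import csv


class CsvPipeline:
    """Item pipeline that writes every scraped item as one CSV row."""

    # Hypothetical column names; the real item fields are not listed in this README.
    fields = ["id", "name", "case_code"]

    def __init__(self, path="result.csv"):
        self.path = path

    def open_spider(self, spider):
        # newline="" lets the csv module control line endings itself.
        self.file = open(self.path, "w", newline="", encoding="utf-8")
        self.writer = csv.DictWriter(
            self.file, fieldnames=self.fields, extrasaction="ignore"
        )
        self.writer.writeheader()

    def process_item(self, item, spider):
        # Keys outside `fields` are ignored; missing keys become empty cells.
        self.writer.writerow(dict(item))
        return item  # pass the item on to any later pipeline

    def close_spider(self, spider):
        self.file.close()
```

To enable a pipeline like this, it would be registered in `settings.py` under `ITEM_PIPELINES`, for example `{"myproject.pipelines.CsvPipeline": 300}`, where the module path is again hypothetical.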

To speed up downloads and reduce the risk of being banned, I use random user agents and disable cookies.
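One way to set this up is a small downloader middleware plus two settings, sketched below. The class name, the user-agent strings, and the `myproject.middlewares` module path are assumptions for illustration, not the project's actual values. `COOKIES_ENABLED` and `DOWNLOADER_MIDDLEWARES` are standard Scrapy settings.

```python
import random


class RandomUserAgentMiddleware:
    """Downloader middleware that picks a User-Agent at random for each request."""

    # Example strings only; the list used by the project is not shown in this README.
    user_agents = [
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
        "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
        "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
    ]

    def process_request(self, request, spider):
        request.headers["User-Agent"] = random.choice(self.user_agents)
        return None  # let Scrapy continue processing this request


# settings.py (the module path "myproject.middlewares" is hypothetical)
COOKIES_ENABLED = False  # do not store or send cookies
DOWNLOADER_MIDDLEWARES = {
    # Disable Scrapy's built-in User-Agent middleware; the custom one replaces it.
    "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
    "myproject.middlewares.RandomUserAgentMiddleware": 400,
}
```

Skipping cookie handling removes some per-request work, and rotating the User-Agent header makes consecutive requests look less uniform to the server.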
