Practice: using Python and Scrapy to parse a website

hungyu/crawl-book

Purpose

Use Python and Scrapy to crawl articles.

Install steps

  • Install Python 3 or later.
	You can use pyenv to install and switch between Python versions.
  • Install Scrapy:
	pip install scrapy
  • Install the Simplified/Traditional Chinese converter:
	pip install opencc-python-reimplemented

Usage

	pyenv exec scrapy runspider scraper.py -a starturl=https://tw.uukanshu.com/b/125477/ -a output=book.txt

Note

  • Currently, only articles from tw.uukanshu.com are supported.
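
Because only one site is supported, it can help to check the start URL before crawling. The following is a minimal sketch of such a check using only the standard library. The names `SUPPORTED_HOST` and `is_supported` are hypothetical and are not part of the repository.

```python
from urllib.parse import urlparse

# The only host the scraper is documented to handle.
SUPPORTED_HOST = "tw.uukanshu.com"


def is_supported(url: str) -> bool:
    """Return True if url is an http(s) URL on the supported host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and parsed.hostname == SUPPORTED_HOST
```

For example, `is_supported("https://tw.uukanshu.com/b/125477/")` returns `True`, while a URL on any other host returns `False`.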
