txya900619/news-crawler

news-crawler

A news crawler for Life-Long Learning LM

Supported websites

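| Media name | Website | Spider name |
| --- | --- | --- |
| 自由時報 (Liberty Times) | news.ltn.com.tw | ltn |
| 中央社 (Central News Agency) | www.cna.com.tw | cna |
| 中國時報 (China Times) | www.chinatimes.com | ct |
| 三立新聞 (SET News) | www.setn.com | setn |
| 華視新聞 (CTS News) | news.cts.com.tw | cts |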

How to use

scrapy crawl <spider_name>
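
As a rough illustration of the command above, the following Python sketch builds the `scrapy crawl` invocation for one of the spider names listed in this README. The helper name `build_crawl_command` and the optional output path are assumptions made for this example and are not part of the repository; `-o` is Scrapy's standard feed-export option.

```python
import shlex

# Spider names and sites copied from the table of supported websites.
SPIDERS = {
    "ltn": "news.ltn.com.tw",
    "cna": "www.cna.com.tw",
    "ct": "www.chinatimes.com",
    "setn": "www.setn.com",
    "cts": "news.cts.com.tw",
}


def build_crawl_command(spider_name, output_path=None):
    """Return the argument list for `scrapy crawl <spider_name>`.

    Hypothetical helper: it only validates the name and assembles the
    command; it does not run Scrapy itself.
    """
    if spider_name not in SPIDERS:
        raise ValueError(
            f"unknown spider {spider_name!r}; expected one of {sorted(SPIDERS)}"
        )
    command = ["scrapy", "crawl", spider_name]
    if output_path is not None:
        # -o appends scraped items to a feed file (for example a .json file).
        command += ["-o", output_path]
    return command


if __name__ == "__main__":
    # Print the command for the 自由時報 (Liberty Times) spider.
    print(shlex.join(build_crawl_command("ltn")))
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the project root, which should be equivalent to typing the command by hand.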

  • If you don't want to save data to the database, remove NewsCrawlerPGStoragePipeline from settings.py.
  • You can change the PostgreSQL settings with environment variables; see pipelines.py for details.
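
The sketch below shows one common way a pipeline can read PostgreSQL connection settings from environment variables with fallback defaults. The variable names (`POSTGRES_HOST`, `POSTGRES_PORT`, and so on), the default values, and the function `load_pg_settings` are all hypothetical and chosen for illustration only; the names this project actually reads are defined in pipelines.py.

```python
import os

# Hypothetical variable names and defaults, for illustration only.
# Check pipelines.py for the names this project really uses.
DEFAULTS = {
    "POSTGRES_HOST": "localhost",
    "POSTGRES_PORT": "5432",
    "POSTGRES_DB": "news",
    "POSTGRES_USER": "postgres",
    "POSTGRES_PASSWORD": "",
}


def load_pg_settings(environ=None):
    """Build a dict of connection settings from environment variables.

    Any variable that is not set falls back to the value in DEFAULTS.
    """
    environ = os.environ if environ is None else environ
    values = {name: environ.get(name, default) for name, default in DEFAULTS.items()}
    return {
        "host": values["POSTGRES_HOST"],
        "port": int(values["POSTGRES_PORT"]),  # environment values are strings
        "dbname": values["POSTGRES_DB"],
        "user": values["POSTGRES_USER"],
        "password": values["POSTGRES_PASSWORD"],
    }


if __name__ == "__main__":
    # Avoid printing the password when inspecting the settings.
    settings = load_pg_settings()
    print({key: value for key, value in settings.items() if key != "password"})
```

With this pattern, a setting is overridden by exporting the corresponding variable in the shell before running `scrapy crawl <spider_name>`.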
