kayak4665664/Article-crawler


Article-crawler

A crawler that fetches articles from specific websites based on your keywords.

This program mainly uses requests and bs4 to fetch and parse articles. The keywords and websites are saved in files. Known limitations: images are not downloaded, and irrelevant page content is not filtered out.
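
The fetch-and-parse approach described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names `fetch_html`, `extract_article`, and `matches_keywords` are hypothetical, and taking the body from every `<p>` tag is an assumption chosen to mirror the "no filtering" limitation.

```python
import requests
from bs4 import BeautifulSoup


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download a page and return its HTML text (hypothetical helper)."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.text


def extract_article(html: str) -> tuple[str, str]:
    """Return (title, body) parsed from an HTML page.

    Assumption: the body is the text of all <p> tags, so unrelated text
    such as captions or footers is kept as well.
    """
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.get_text(strip=True) if soup.title else ""
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    body = "\n".join(text for text in paragraphs if text)
    return title, body


def matches_keywords(text: str, keywords: list[str]) -> bool:
    """True if any keyword occurs in the text, ignoring case."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)
```

A caller would pass the result of `fetch_html(url)` to `extract_article` and keep the article only when `matches_keywords` returns `True` for its title or body.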

Time: 2020 Spring Semester

For example, fetched articles are saved as text files such as:

  • The Washington Post_Biden’s Jan. 6 speech sets the stage for a battle for truth and democracy.txt
  • The Washington Post_Biden’s vaccine rules for workers arrive at Supreme Court.txt
  • The Washington Post_Former president Trump embraces a kindred spirit in Hungary.txt
  • The Washington Post_Inside Biden’s decision to forcefully denounce Trump.txt
  • The Washington Post_Six former Biden health advisers call for ‘new’ covid strategy.txt
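
The file names above appear to follow a `<site name>_<article title>.txt` pattern. A name like that could be built as in the sketch below; `article_filename` is a hypothetical helper, and replacing characters that are invalid on common file systems is an added assumption, not something the original program is known to do.

```python
import re


def article_filename(site: str, title: str) -> str:
    """Build '<site>_<title>.txt' (hypothetical naming helper)."""
    # Assumption: replace characters that are invalid on common file systems.
    cleaned = re.sub(r'[\\/:*?"<>|]', " ", title)
    # Collapse any run of whitespace into a single space.
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    return f"{site}_{cleaned}.txt"
```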
