
PaperCrawler

Tutorials for implementing paper crawling from Web of Science (WoS)

This repository is an example of collecting papers from the Web of Science (WoS) search engine.

Requirements

  • BeautifulSoup

  • Selenium

  • Common libraries such as pandas and numpy (plus the standard-library time module)

Usage

  1. 'Downloader' requires a URL pointing to the page where your search results appear.
  2. 'Crawler' extracts a set of HTML tags from each URL. Because HTML is a semi-structured data format, parsing is necessary.
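The parsing step in item 2 can be sketched with BeautifulSoup as below. The tag names and classes are illustrative only, not the actual WoS markup, and the HTML string stands in for a page 'Downloader' would fetch:

```python
from bs4 import BeautifulSoup

# Toy HTML standing in for a downloaded search-result page.
# Real WoS markup differs; inspect the page to find the right selectors.
html = """
<html><body>
  <div class="record">
    <h2 class="title">An Example Paper Title</h2>
    <span class="doi">10.1000/example.1</span>
  </div>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")

# Pull out the fields of interest by tag name and class.
title = soup.find("h2", class_="title").get_text(strip=True)
doi = soup.find("span", class_="doi").get_text(strip=True)
```

The same `find`/`find_all` pattern extends to any other field once you know its tag and class in the page source.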

FYI

A DOI can be used as the URL (e.g., "https://doi.org/{root_url}"). test_papers.xlsx includes DOI information and can be used as input to 'Crawler'.
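Building crawlable URLs from the DOI column might look like the sketch below. In the repo the frame would come from `pd.read_excel("test_papers.xlsx")`; the "DOI" column name and the sample values are assumptions for illustration:

```python
import pandas as pd

# Stand-in for pd.read_excel("test_papers.xlsx"); the "DOI" column
# name is a hypothetical header, not confirmed by the repo.
papers = pd.DataFrame({"DOI": ["10.1000/example.1", "10.1000/example.2"]})

def doi_to_url(doi: str) -> str:
    """Resolve a bare DOI to a URL, as the FYI note suggests."""
    return f"https://doi.org/{doi}"

# One resolvable URL per paper, ready to feed to 'Crawler'.
urls = papers["DOI"].map(doi_to_url).tolist()
```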
