A small web-scraping framework

Uses Selenium WebDriver (https://selenium-python.readthedocs.io/api.html) with PhantomJS to render HTML generated by JavaScript.
Multithreading: implement a subclass MyClass(Scraper) of scrapetools.Scraper and run it with scrapetools.ScrapePool(MyClass, ...).
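
The subclass-plus-pool pattern can be sketched roughly as follows. This README does not show the actual interface of scrapetools.Scraper or scrapetools.ScrapePool, so the two classes below are hypothetical stand-ins; the method names scrape and run and the n_workers parameter are assumptions made for illustration only. The sketch keeps one scraper instance per worker thread, which is a common choice when each scraper owns its own browser session.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class Scraper:
    """Hypothetical stand-in for scrapetools.Scraper (real interface not shown here)."""

    def scrape(self, item):
        raise NotImplementedError


class ScrapePool:
    """Hypothetical stand-in for scrapetools.ScrapePool."""

    def __init__(self, scraper_cls, n_workers=4):
        self.scraper_cls = scraper_cls
        self.n_workers = n_workers
        self._local = threading.local()

    def _scrape_one(self, item):
        # Lazily create one scraper per worker thread and reuse it for later items.
        if not hasattr(self._local, "scraper"):
            self._local.scraper = self.scraper_cls()
        return self._local.scraper.scrape(item)

    def run(self, items):
        # map() returns results in the same order as the input items.
        with ThreadPoolExecutor(max_workers=self.n_workers) as pool:
            return list(pool.map(self._scrape_one, items))


class MyClass(Scraper):
    def scrape(self, item):
        # A real scraper would load and parse a page here; this one fakes a record.
        return {"id": item, "title": f"Standard {item}"}


results = ScrapePool(MyClass, n_workers=2).run([1, 2, 3])
print(results)
```

In the real framework, the scrape step would presumably drive a WebDriver instance instead of building a dictionary directly.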

Scraper UIs are implemented in IPython Notebooks.

Available scrapers:

AFNOR: Collect metadata on French standards from AFNOR's website (requires a username and password)
WOS: Collect authors and e-mail addresses for publication topics from Web of Science
ISO: Collect ISO standards
DIN: Collect information on all DIN committees ("Normungsausschuesse"), their sub- and mirror-committees (at ISO, CEN, ...)
