Deprecated: replaced by individual repos whose names start with scrape-,
available at https://github.com/xtream1101?tab=repositories
Python web scraper to archive various sites
Developed using Python 3.4
Dependencies:
- BeautifulSoup4
- pdfkit
- requests
- wkhtmltopdf (must be installed separately; pdfkit requires it)
Config: Rename config.ini.sample to config.ini and edit it to suit your needs
Running:
python main.py configFile
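As a rough illustration of how a script invoked this way could pick up its config file, here is a minimal sketch using the standard-library configparser module. It is not the project's actual main.py: the function names load_config and main are hypothetical, and the layout of config.ini (its section and option names) is not documented here, so the sketch simply prints whatever sections it finds.

```python
import configparser


def load_config(path):
    """Read an INI file and return {section: {option: value}}.

    Hypothetical helper; the real config.ini layout is an assumption.
    """
    parser = configparser.ConfigParser()
    # ConfigParser.read() silently skips files it cannot open and returns
    # the list of files it did read, so an empty result means "not found".
    if not parser.read(path):
        raise FileNotFoundError("Config file not found: " + path)
    return {section: dict(parser[section]) for section in parser.sections()}


def main(argv):
    """Entry point sketch: expects argv to be [script_name, config_path]."""
    if len(argv) != 2:
        print("Usage: python main.py configFile")
        return 1
    config = load_config(argv[1])
    for section in sorted(config):
        print(section, config[section])
    return 0
```

A script would typically end by calling `main(sys.argv)` and passing the result to `sys.exit`. Note that configparser lowercases option names by default.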
Supported sites:
- Wallhaven
- Hubble Images
- IT eBooks
- Tuebl
- IconFinder
- Find Icons (Currently does not work)
- xkcd
- xkcd's what if
- How Stuff Works
- Questionable Content