Simple spider for scrapy in order to download the pdfs from replacementdocs.com
replacementdocs.com is a great resource for downloading manuals for consoles, unfortunately the page has been broken for a while(april 2018) with no fix in the pipeline
As Im very worried that the site may go down at any given moment due to their issues, I rather have a local copy of all the manuals in case it goes down (plus Im a bit of a data hoarder)
- Do not redownload the same pdf
- Save the pdfs to a different folder based on their source system
pipenv install
pipenv run scrapy crawl rpd
Take into account that replacementdocs provides a free service so dont go download the whole thing for no good reason as its over 9000 manuals and around 25Gb total. If you need the whole pack I can provide a torrent for it as to alleviate the bandwith that would be incurred by multiple users downloading all the manuals.