This project has been superseded by my new dockerized Ebook API.
A program that scrapes links and other data for your favorite tech ebooks from allitebooks.com and stores them in a MongoDB instance.
Note: One of this project's dependencies, scrapy-mongodb, does not support Python 3 yet. Therefore, this project only supports Python 2.
- Clone this repository.
- Create a new Python Virtual Environment.
- Install the dependencies: pip install -r requirements.txt
- Create a fresh MongoDB instance, either on your local machine or as a free 500 MB remote instance from MLab.
- Replace the following in settings.py with your own credentials:
MONGODB_URI = ''
MONGODB_DATABASE = ''
MONGODB_COLLECTION = ''
- Other MongoDB settings are documented in the scrapy-mongodb repo.
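If you would rather not commit credentials to settings.py, one option is to read the three values from environment variables. The following is only a minimal sketch, not part of this project: the helper name load_mongodb_settings and the fallback database and collection names ('ebooks', 'books') are hypothetical, and the fallback URI assumes a MongoDB instance on the default local port.

```python
import os


def load_mongodb_settings(environ=None):
    """Read the three MongoDB settings from environment variables.

    Hypothetical helper: falls back to a local MongoDB instance and to
    made-up database/collection names when a variable is not set.
    """
    env = os.environ if environ is None else environ
    return {
        'MONGODB_URI': env.get('MONGODB_URI', 'mongodb://localhost:27017'),
        'MONGODB_DATABASE': env.get('MONGODB_DATABASE', 'ebooks'),  # assumed name
        'MONGODB_COLLECTION': env.get('MONGODB_COLLECTION', 'books'),  # assumed name
    }


# In settings.py, expose the values under the names listed above.
_mongodb = load_mongodb_settings()
MONGODB_URI = _mongodb['MONGODB_URI']
MONGODB_DATABASE = _mongodb['MONGODB_DATABASE']
MONGODB_COLLECTION = _mongodb['MONGODB_COLLECTION']
```

With this in place, you would export MONGODB_URI (and optionally the other two variables) in your shell before running the spider. The code uses only the standard library and runs on Python 2.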
Anyone, regardless of skill level, is encouraged to give feedback and submit pull requests.
A special thanks to the following:
- The Scrapinghub team, who maintain the Scrapy project;
- Sebastian Dahlgren, maintainer of scrapy-mongodb.