Code Documentation still in progress
This repository is one part of a housing aggregator web service for students. Its purpose is to scrape rental listings from multiple public listing websites and store them in a MongoDB collection. It also contains functionality to review listings, cull expired or old ones, and migrate the database schema when fields are added or removed.
The Scrape submodule is where the centralized scraper lives. Running it scrapes listings from all currently supported listing providers, using the queries stored in the MongoDB database.
To install dependencies:
pip install -r requirements.txt
To run the scraper:
python3 -m app.Scrape
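As a rough illustration of the flow described above, the sketch below dispatches stored queries to per-provider scrapers. It is not the repository's actual code: the query shape (`provider`, `params`), the `run_scrape` function, and `fake_scraper` are all hypothetical names chosen for the example, and the queries are plain in-memory dicts rather than documents read from MongoDB.

```python
from typing import Callable, Dict, Iterable, List

# Hypothetical shapes: a query is a dict with "provider" and "params";
# a scraper is any callable that turns params into a list of listing dicts.
Scraper = Callable[[dict], List[dict]]


def run_scrape(queries: Iterable[dict], scrapers: Dict[str, Scraper]) -> List[dict]:
    """Dispatch each query to the scraper registered for its provider."""
    listings: List[dict] = []
    for query in queries:
        scraper = scrapers.get(query["provider"])
        if scraper is None:
            # Skip queries whose provider has no registered scraper.
            continue
        for listing in scraper(query.get("params", {})):
            # Tag each listing with the provider it came from.
            listings.append({**listing, "provider": query["provider"]})
    return listings


def fake_scraper(params: dict) -> List[dict]:
    """Stand-in for a real provider scraper; performs no network access."""
    return [{"title": f"Room in {params.get('city', 'unknown')}", "price": 500}]


if __name__ == "__main__":
    queries = [
        {"provider": "example", "params": {"city": "Toronto"}},
        {"provider": "unsupported", "params": {}},
    ]
    print(run_scrape(queries, {"example": fake_scraper}))
```

Keeping the provider scrapers in a registry keyed by name means a new provider can be supported by adding one entry, without touching the dispatch loop.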
The Cull submodule is where the Culler lives. The Culler evaluates every listing currently stored in the database to determine whether it is still available to rent, and removes the ones that have expired.
To install dependencies:
pip install -r requirements.txt
To run the culler:
python3 -m app.Cull
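The culling pass described above can be sketched as a partition of listings by an availability check. This is only an illustration under assumptions: the `cull` and `newer_than` functions and the `scraped_at` field are hypothetical, the listings are in-memory dicts rather than MongoDB documents, and the age cutoff is a stand-in for whatever availability test the Culler really performs.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Iterable, List, Tuple


def cull(
    listings: Iterable[dict], is_available: Callable[[dict], bool]
) -> Tuple[List[dict], List[dict]]:
    """Split listings into (kept, removed) using the availability check."""
    kept: List[dict] = []
    removed: List[dict] = []
    for listing in listings:
        (kept if is_available(listing) else removed).append(listing)
    return kept, removed


def newer_than(max_age: timedelta, now: datetime) -> Callable[[dict], bool]:
    """Build a check that treats listings older than max_age as expired.

    Assumes each listing has a hypothetical "scraped_at" datetime field.
    """
    def check(listing: dict) -> bool:
        return now - listing["scraped_at"] <= max_age
    return check


if __name__ == "__main__":
    now = datetime(2024, 1, 31, tzinfo=timezone.utc)
    listings = [
        {"id": 1, "scraped_at": datetime(2024, 1, 30, tzinfo=timezone.utc)},
        {"id": 2, "scraped_at": datetime(2023, 12, 1, tzinfo=timezone.utc)},
    ]
    kept, removed = cull(listings, newer_than(timedelta(days=30), now))
    print("kept:", [l["id"] for l in kept])
    print("removed:", [l["id"] for l in removed])
```

Passing the availability check in as a function keeps the loop independent of how expiry is decided, so an age cutoff could be swapped for a check against the original listing page.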