A Scrapy app to crawl company reviews from Indeed.
- Python 3.9
- create virtualenv
- activate virtualenv
- update pip
- install deps (use pip install -r requirements-dev.txt for development)
- set the indeed_company environment variable to the company to crawl
- run Scrapy to crawl the company reviews and save them as JSON
Windows (PowerShell):
python -m venv .venv
.\.venv\Scripts\activate
python -m pip install -U pip
pip install -r requirements-dev.txt
$Env:indeed_company="City-of-Calgary"
scrapy crawl review -O data/reviews_$Env:indeed_company.json
See the crawl.ps1 PowerShell script for a batching example.
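The contents of crawl.ps1 are not shown in this README, so the following is only a rough, cross-platform Python sketch of what batching could look like: it sets indeed_company for each company and runs the same scrapy command once per company. The names COMPANIES, build_crawl, and crawl_all are hypothetical, and the second company slug is a placeholder rather than a real entry.

```python
import os
import subprocess

# "City-of-Calgary" comes from this README; "Example-Company" is a
# hypothetical placeholder slug used only for illustration.
COMPANIES = ["City-of-Calgary", "Example-Company"]


def build_crawl(company, base_env=None):
    """Return the (command, environment) pair for crawling one company."""
    # Copy the environment so the caller's own environment is not modified.
    env = dict(os.environ if base_env is None else base_env)
    env["indeed_company"] = company  # same variable the README commands set
    command = [
        "scrapy", "crawl", "review",
        "-O", f"data/reviews_{company}.json",  # -O overwrites an existing file
    ]
    return command, env


def crawl_all(companies):
    """Run the crawls one after another, stopping at the first failure."""
    for company in companies:
        command, env = build_crawl(company)
        # check=True raises CalledProcessError if scrapy exits with an error.
        subprocess.run(command, env=env, check=True)
```

Calling crawl_all(COMPANIES) from the project root, with the virtualenv active so that scrapy is on the PATH, would write one JSON file per company under data/.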
Linux/macOS (bash):
python -m venv .venv
source .venv/bin/activate
python -m pip install -U pip
pip install -r requirements-dev.txt
export indeed_company="City-of-Calgary"
scrapy crawl review -O data/reviews_$indeed_company.json
See the demo app at https://indeed-municipality-reviews.streamlit.app/ for some crawled company reviews
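To inspect a crawl result locally, the file written by the -O option with a .json extension is a single JSON array of scraped items, so it can be read with the standard library. This is a minimal sketch: the helper names load_reviews and summarize are hypothetical, and because the item fields are not documented here, the code only reports whichever field names it finds.

```python
import json
from pathlib import Path


def load_reviews(path):
    """Load the list of scraped items from a Scrapy JSON feed file."""
    with Path(path).open(encoding="utf-8") as handle:
        items = json.load(handle)
    if not isinstance(items, list):
        raise ValueError(f"expected a JSON array of items in {path}")
    return items


def summarize(items):
    """Return the number of items and the sorted field names seen in them."""
    fields = sorted({key for item in items for key in item})
    return len(items), fields
```

For example, summarize(load_reviews("data/reviews_City-of-Calgary.json")) would return the review count together with the field names the spider emitted.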