A production-ready web scraping utility, built to monitor polling data hosted by the Economist data team.
Artifacts from the latest build and from the latest daily run can be downloaded in the Actions tab.
The build pipeline also runs as a cron job at 17:30 every day, so these artifacts reflect the most recent poll results.
$ python3.8 -m venv .venv
$ source .venv/bin/activate
$ pip install -r requirements_dev.txt
$ # For information on pollscraper arguments:
$ pollscraper --help
$ # To scrape polls and calculate trends:
$ pollscraper --url https://cdn-dev.economistdatateam.com/jobs/pds/code-test/index.html --results_dir data --quiet
Full testing and linting suite:
$ tox
$ make servedocs
$ bumpversion --current-version <current_version> minor # possible: major / minor / patch
$ git push
$ git push --tags
- Free software: MIT license
- Documentation: https://pollscraper.readthedocs.io.
- Separation of concerns: split CI and CD into separate pipelines
- Add separate badges for each new pipeline
- Parameterize the HTTP requests via Click
- Tidy up documentation, remove stale references such as PyPI
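
As a rough illustration of the "Parameterize the HTTP requests via Click" item, here is a minimal sketch of how request settings could be exposed as command-line options. Only `--url` comes from the usage shown earlier. The `--timeout` and `--retries` options, their defaults, and the `http_settings` helper are hypothetical and are not part of pollscraper today.

```python
import click


def http_settings(timeout, retries):
    """Validate and collect request settings in one place (hypothetical helper)."""
    if timeout <= 0:
        raise ValueError("timeout must be positive")
    if retries < 0:
        raise ValueError("retries must not be negative")
    return {"timeout": timeout, "retries": retries}


@click.command()
@click.option("--url", required=True, help="Page to scrape.")
@click.option("--timeout", type=float, default=10.0, show_default=True,
              help="Hypothetical option: seconds to wait for a response.")
@click.option("--retries", type=int, default=3, show_default=True,
              help="Hypothetical option: retry attempts after a failure.")
def main(url, timeout, retries):
    """Print the settings a scraper could hand to its HTTP client."""
    settings = http_settings(timeout, retries)
    # A real command would perform the request here; this sketch only echoes.
    click.echo(
        f"url={url} timeout={settings['timeout']} retries={settings['retries']}"
    )
```

Keeping validation in a plain function means it can be unit-tested without invoking the command line, while Click handles type conversion and `--help` text.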
This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.