The polling page lists polls from a presidential election in a hypothetical country. This scraper pulls the polls off the polling page, converts them to a CSV, and computes a poll average from them. It is designed to be robust to a range of data irregularities.

AEJaspan/PollScraper

PollScraper

A production-ready web scraping utility, built to monitor polling data hosted by the Economist data team.

Artifacts from the latest build, and from the latest daily run, can be downloaded in the Actions tab.

The build pipeline also runs as a cron job at 17:30 daily, so the daily artifacts reflect the most recent poll results.

Setup

$ python3.8 -m venv .venv
$ source .venv/bin/activate
$ pip install -r requirements_dev.txt

Run Pipeline

$ # For information on pollscraper arguments:
$ pollscraper --help
$ # To scrape polls and calculate trends:
$ pollscraper --url https://cdn-dev.economistdatateam.com/jobs/pds/code-test/index.html --results_dir data --quiet
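
As a rough illustration of what "scrape polls and calculate trends" can involve, here is a minimal, standard-library-only sketch. It is not the package's actual implementation. The table layout, the column names (`date`, `pollster`), the candidate names (`Alice`, `Bob`), the irregular cell values, and the 7-day trailing window are all assumptions made for the example, and the helpers `TableParser`, `to_csv`, `parse_share`, and `rolling_average` are hypothetical.

```python
import csv
import io
from datetime import date, timedelta
from html.parser import HTMLParser
from statistics import mean


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def to_csv(rows):
    """Serialise a list of rows to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def parse_share(raw):
    """Return a vote share in [0, 1], or None if the cell is unusable."""
    cleaned = raw.strip().rstrip("%*").strip()
    try:
        value = float(cleaned) / 100.0
    except ValueError:
        return None
    return value if 0.0 <= value <= 1.0 else None


def rolling_average(records, candidate, window_days=7):
    """Map each valid poll date to the mean share over the trailing window."""
    points = []
    for record in records:
        share = parse_share(record.get(candidate) or "")
        try:
            day = date.fromisoformat((record.get("date") or "").strip())
        except ValueError:
            continue  # skip rows whose date cannot be parsed
        if share is not None:
            points.append((day, share))
    points.sort()
    averages = {}
    for day, _ in points:
        start = day - timedelta(days=window_days - 1)
        averages[day] = mean(s for d, s in points if start <= d <= day)
    return averages


# Hypothetical page content, including some irregular cells.
SAMPLE_HTML = """
<table>
  <tr><th>date</th><th>pollster</th><th>Alice</th><th>Bob</th></tr>
  <tr><td>2024-01-01</td><td>PollCo</td><td>45.0%</td><td>40.0%</td></tr>
  <tr><td>2024-01-03</td><td>Surveys Inc</td><td>47.0%*</td><td>**</td></tr>
  <tr><td>2024-01-04</td><td>PollCo</td><td>n/a</td><td>42.0%</td></tr>
  <tr><td>not-a-date</td><td>PollCo</td><td>50.0%</td><td>41.0%</td></tr>
</table>
"""

parser = TableParser()
parser.feed(SAMPLE_HTML)
header, *body = parser.rows
csv_text = to_csv(parser.rows)
records = [dict(zip(header, row)) for row in body]
alice = rolling_average(records, "Alice")
bob = rolling_average(records, "Bob")
print(csv_text)
print(alice)
print(bob)
```

Cells that cannot be read as a percentage, and rows with an unparseable date, are dropped rather than guessed at, which is one simple way to stay robust to irregular data. A real run would fetch the page over HTTP first; the sketch skips that step so it can run offline.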

Testing

Full testing and linting suite:

$ tox

Building documentation

$ make servedocs

Deployment

$ bumpversion --current-version <current_version> minor # possible: major / minor / patch
$ git push
$ git push --tags

TODO

  • Separation of Concerns - separate CI and CD pipelines
  • Add separate badges for each new pipeline
  • Parameterize the HTTP requests via Click
  • Tidy up documentation and remove stale references such as PyPI

Credits

This package was created with Cookiecutter and the audreyr/cookiecutter-pypackage project template.
