PDF scraper for Karnataka electoral rolls
README.md
ceo-kar-roll-scraper.py
er-test.pdf

Karnataka electoral roll scraper

This script is intended to scrape the electoral roll PDFs published at http://ceokarnataka.kar.nic.in and generate a plain-text electoral roll from them.

The script is at a very early draft stage. It contains skeleton code that picks up each entry from the roll of a single polling station and parses several fields out of it. This works for most entries; the known gaps, such as names that span multiple lines, are listed under Todo.
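
To make the "pick up each entry and parse out fields" idea concrete, here is a minimal sketch. It is not the code in `ceo-kar-roll-scraper.py`. The text layout it parses (a serial number and voter ID on one line, then `Key: value` lines) is an assumption made up for illustration, and the names `parse_entries`, `ENTRY_START`, `FIELD`, `AGE_SEX`, and `SAMPLE` are hypothetical. The real extracted text may look quite different.

```python
import re

# ASSUMED layout: "<serial> <voter id>" starts an entry, followed by
# "Key: value" lines. The actual text extracted from the PDFs may differ.
ENTRY_START = re.compile(r"^(?P<serial>\d+)\s+(?P<epic>[A-Z]{3}\d{7})$")
AGE_SEX = re.compile(r"^Age:\s*(?P<age>\d+)\s+Sex:\s*(?P<sex>\w+)$")
FIELD = re.compile(r"^(?P<key>[^:]+):\s*(?P<value>.*)$")

# Made-up sample data, not taken from any real roll.
SAMPLE = """\
1 ABC1234567
Name: Asha Rao
Father's Name: Ramesh Rao
House No: 12/A
Age: 34 Sex: Female
2 XYZ7654321
Name: Kiran Kumar
Father's Name: Suresh Kumar
House No: 14
Age: 41 Sex: Male
"""


def parse_entries(lines):
    """Group lines into entries and parse each entry into a dict."""
    entries = []
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        start = ENTRY_START.match(line)
        if start:
            # A new entry begins; later lines are attached to it.
            current = {
                "serial": int(start.group("serial")),
                "epic": start.group("epic"),
            }
            entries.append(current)
            continue
        if current is None:
            continue  # text before the first entry, e.g. page headers
        age_sex = AGE_SEX.match(line)
        if age_sex:
            # Checked before FIELD because this line holds two fields.
            current["age"] = int(age_sex.group("age"))
            current["sex"] = age_sex.group("sex")
            continue
        field = FIELD.match(line)
        if field:
            key = field.group("key").strip().lower()
            current[key] = field.group("value").strip()
    return entries


if __name__ == "__main__":
    for entry in parse_entries(SAMPLE.splitlines()):
        print(entry)
```

Lines that match none of the patterns are silently dropped here, which is exactly where a name that wraps onto a second line would get lost.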

Todo

Lots! :)

  • Deal with names that span multiple lines
  • Actually dump output in a meaningful format (CSV? JSON?)
  • Scrape out polling station information
  • Fix up to run over a dump of the PDFs for all polling stations and build up the electoral roll for Karnataka as a whole
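
A rough sketch of how some of these items might be approached, using only the standard library. None of this exists in the script yet. The helper names (`fold_continuations`, `write_csv`, `iter_pdfs`), the column names, and the rule for spotting a continuation line (no colon, does not start with a digit) are all assumptions for illustration.

```python
import csv
from pathlib import Path


def fold_continuations(lines):
    """Join wrapped lines onto the previous line.

    ASSUMPTION: a continuation line contains no ':' and does not start
    with a digit (which would mark a new entry in the assumed layout).
    """
    folded = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if folded and ":" not in line and not line[0].isdigit():
            folded[-1] = folded[-1] + " " + line
        else:
            folded.append(line)
    return folded


def write_csv(entries, out, fieldnames=("serial", "name", "age", "sex")):
    """Write parsed entries (dicts) as CSV to an open text file object.

    Keys not listed in fieldnames are ignored; missing keys are left blank.
    """
    writer = csv.DictWriter(out, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    for entry in entries:
        writer.writerow(entry)


def iter_pdfs(root):
    """Return every .pdf file under root, recursively, in sorted order."""
    return sorted(Path(root).rglob("*.pdf"))
```

CSV is used here only because it is the simplest option; `json.dump(entries, out)` would work equally well for the same list of dicts. When writing CSV to a real file, open it with `newline=""` as the `csv` module documentation recommends.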

Dependencies

  • Python
  • pdfminer
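
For orientation, this is roughly how text can be pulled out of a PDF with pdfminer. The sketch assumes the maintained `pdfminer.six` fork (installable with `pip install pdfminer.six`), which provides `pdfminer.high_level.extract_text`; older pdfminer releases may not have this function, and the script itself may use a different part of the library. The helper names `extract_roll_text` and `nonblank_lines` are hypothetical.

```python
def extract_roll_text(pdf_path):
    """Return all text in the PDF at pdf_path as a single string."""
    # ASSUMPTION: pdfminer.six is installed. Imported lazily so that the
    # pure-Python helper below can be used without it.
    from pdfminer.high_level import extract_text

    return extract_text(pdf_path)


def nonblank_lines(text):
    """Split extracted text into stripped, non-empty lines."""
    return [line.strip() for line in text.splitlines() if line.strip()]
```

With the test file from this repository, usage would look like `lines = nonblank_lines(extract_roll_text("er-test.pdf"))`.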