Crawl KPU track record data
.gitignore
LICENSE
README.md
dapil_id.jpg
kpurawler.py


kpurawler

Crawl candidate track record data from KPU (Komisi Pemilihan Umum, Indonesia's General Elections Commission). Features:

  1. Automatically downloads the PDF for every candidate in your Dapil (Daerah Pemilihan, electoral district)
  2. Skips zero-byte PDF files
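As an illustration of the zero-byte skip, here is a minimal sketch of one way it could work. The helper name `save_pdf` and its signature are hypothetical and are not taken from kpurawler.py; the actual spider may handle this differently.

```python
from pathlib import Path
from typing import Optional


def save_pdf(body: bytes, target: Path) -> Optional[Path]:
    """Write `body` to `target` unless it is empty.

    Hypothetical helper: returns the path written, or None when the
    download was skipped because it contained zero bytes.
    """
    if len(body) == 0:
        # Zero-byte download: nothing useful to keep, so skip it.
        return None
    # Create the output directory on first use.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(body)
    return target
```

Returning `None` for skipped files lets the caller count or log them without leaving empty files on disk.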

Ideas? Please submit issues.

Known issues

  1. Does not check whether a downloaded PDF is corrupt
  2. DPD is not supported yet, because its page layout is completely different
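Until the first issue is addressed, a rough check can be run on the downloaded files by hand. The sketch below is not part of kpurawler.py; `looks_like_pdf` is a hypothetical helper that only tests for the `%PDF-` header and the `%%EOF` trailer marker, so it catches HTML error pages and truncated downloads but does not prove that a file is fully valid.

```python
from pathlib import Path


def looks_like_pdf(path: Path, tail_bytes: int = 1024) -> bool:
    """Heuristic check: PDF header at the start, %%EOF marker near the end.

    Hypothetical helper; `tail_bytes` is how much of the file's end is
    searched for the trailer marker.
    """
    data = path.read_bytes()
    if not data.startswith(b"%PDF-"):
        # Not a PDF at all, e.g. an HTML error page saved as .pdf.
        return False
    # A truncated download usually lacks the end-of-file marker.
    return b"%%EOF" in data[-tail_bytes:]
```

For example, `[p for p in Path(".").glob("*.pdf") if not looks_like_pdf(p)]` lists the suspicious files in the current directory.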

How to use

Note: This is still an MVP, so some manual work is needed.

  1. Clone this repo
  2. Change the dapil_id value in kpurawler.py to match your own Dapil, see How to find dapil id
  3. Run scrapy runspider kpurawler.py

How to find dapil id

The number circled in red in the URL in the screenshot is what we are looking for.

Here's one way that I find works:

  1. Go to KPU official site
  2. Click Jenis Pemilihan on the header
  3. Click Pileg 2019 on the dropdown
  4. Click Daerah Pemilihan on the header, you'll see a map of Indonesia
  5. Choose the options, e.g. DPRD Kabko, Jawa Barat, Cianjur; another pop-up appears on the right side
  6. Click your Dapil, and you should be redirected to the page illustrated above
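Once you are on that page, the id can also be pulled out of a copied URL programmatically. The sketch below assumes the id is the last all-digit path segment of the URL. That layout is an assumption, and both the `extract_dapil_id` helper and the example.org URL are hypothetical, so check the result against the number in your browser's address bar.

```python
from typing import Optional
from urllib.parse import urlparse


def extract_dapil_id(url: str) -> Optional[int]:
    """Return the last all-digit path segment of `url`, or None.

    Assumption: the dapil id appears as a numeric path segment.
    """
    segments = [s for s in urlparse(url).path.split("/") if s]
    # Walk from the end, since the id is assumed to come last.
    for segment in reversed(segments):
        if segment.isdecimal():
            return int(segment)
    return None


# Hypothetical URL, used only to show the call.
print(extract_dapil_id("https://example.org/pileg2019/dapil/12345"))
```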

Prerequisites

  1. Scrapy must be installed. See How to install Scrapy