This project is an effort to make the data from open.georgia.gov more accessible.
The Salaries and Travel Reimbursements section is fully scraped, and all of the data from 2008 to 2012 has been obtained.
I have not yet looked into what else can be scraped, but it looks like something similar can be done with Other Expenditure Information.
I'm currently building a web-based API and a human-facing interface.
scrape/
-
A web scraping library which pulls data from open.georgia.gov and stores it in JSON format.
database/
-
Defines a PostgreSQL database schema and jOOQ-based Java code for working with it.
json-to-sql/
-
Converts the content of the `json` branch to the content of the `sql` branch.
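
As a rough illustration of the kind of conversion described above, here is a minimal sketch that turns one already-parsed record into a PostgreSQL `INSERT` statement. It is not the project's actual converter: the class name `JsonToSqlSketch`, the table name `salary`, the column names, and the sample values are all hypothetical, and the real schema lives in `database/src/main/sql/schema.sql`. The sketch assumes the JSON has already been parsed into a map and that PostgreSQL's default `standard_conforming_strings` setting is on, so doubling single quotes is enough to escape a string literal.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class JsonToSqlSketch {

    // Quote a string as a PostgreSQL literal by doubling embedded single quotes.
    public static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Build a single INSERT statement from a column -> value map.
    // Numbers are emitted bare, nulls as NULL, everything else as a quoted string.
    // Table and column names are trusted here; they are not escaped.
    public static String toInsert(String table, Map<String, Object> row) {
        StringJoiner columns = new StringJoiner(", ");
        StringJoiner values = new StringJoiner(", ");
        for (Map.Entry<String, Object> entry : row.entrySet()) {
            columns.add(entry.getKey());
            Object value = entry.getValue();
            if (value == null) {
                values.add("NULL");
            } else if (value instanceof Number) {
                values.add(value.toString());
            } else {
                values.add(quote(value.toString()));
            }
        }
        return "INSERT INTO " + table + " (" + columns + ") VALUES (" + values + ");";
    }

    public static void main(String[] args) {
        // Made-up record for illustration only; not taken from the scraped data.
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("name", "O'NEAL,PAT");
        row.put("fiscal_year", 2012);
        row.put("salary", new BigDecimal("41250.00"));
        System.out.println(toInsert("salary", row));
    }
}
```

Generating literal SQL text suits a branch of standalone scripts; code that talks to a live database would more typically bind values through prepared statements or jOOQ instead of building strings.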
web/
-
Website using the Play framework.
The `json` branch contains the rawest dump of the data from the scraping process.
The `sql` branch contains SQL scripts (in PostgreSQL dialect) suitable for filling a database with the schema defined by `database/src/main/sql/schema.sql` in the `master` branch.