chris-martin/georgia

This project is an effort to make the data from open.georgia.gov more accessible.

Progress

The Salaries and Travel Reimbursements section is fully scraped, and all of its data from 2008 to 2012 has been obtained.

I have not yet looked into what else can be scraped, but it looks like something similar can be done with the Other Expenditure Information section.

I'm currently building a web-based API and human interface.

Repository contents

master

scrape/ - A web scraping library that pulls data from open.georgia.gov and stores it in JSON format.

database/ - Defines a PostgreSQL database schema and jOOQ-based Java code for working with it.

json-to-sql/ - Converts the content of the json branch to the content of the sql branch.

web/ - A website built with the Play framework.

json

The json branch contains the raw, unprocessed dump of the data from the scraping process.

sql

The sql branch contains SQL scripts (in PostgreSQL dialect) for populating a database whose schema is defined by database/src/main/sql/schema.sql in the master branch.
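
To give a rough idea of what the JSON-to-SQL step involves, here is a minimal Python sketch that turns one scraped record into a PostgreSQL INSERT statement. It is not the code in json-to-sql/, and the table name, field names, and sample values are all hypothetical; the real column names are the ones in schema.sql.

```python
import json


def sql_literal(value):
    """Render a decoded JSON value as a PostgreSQL literal."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    # Strings: wrap in single quotes and double any embedded single quote.
    return "'" + str(value).replace("'", "''") + "'"


def record_to_insert(table, record):
    """Build one INSERT statement from a flat JSON object."""
    columns = sorted(record)  # sorted so the output is deterministic
    column_list = ", ".join(columns)
    value_list = ", ".join(sql_literal(record[c]) for c in columns)
    return f"INSERT INTO {table} ({column_list}) VALUES ({value_list});"


# Hypothetical record; the field names are assumptions, not the real schema.
raw = """
{"name": "O'BRIEN, PAT", "fiscal_year": 2012, "salary": 41250.5, "travel": null}
"""

statement = record_to_insert("salary", json.loads(raw))
print(statement)
```

Generating plain SQL scripts keeps the sql branch loadable with nothing more than psql. The sketch trusts its table and column names, so it is only appropriate for input you control.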
