This is the code & project documentation for the percs website.
Percs was initially created to provide a basic, searchable index of the NSW pecuniary interest documents. These files are uploaded as large PDFs containing scanned images of ministers' submissions. Nearly 100 submissions are dumped into big, unsearchable, opaque files.
Chris Nilsson ran these files through OCR software, indexed the recognised text, and wrapped this percs site around the result.
The aim is to digitise & index previous and ongoing years as needed.
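To give a rough idea of what "indexing the found text" can mean, here is a minimal sketch of a word-to-page index over OCR output. It is not the code percs actually uses; the function names (`build_index`, `search`), the page ids, and the sample text are all made up for illustration.

```python
import re
from collections import defaultdict


def build_index(pages):
    """Map each lowercased word to the set of page ids that contain it."""
    index = defaultdict(set)
    for page_id, text in pages.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(page_id)
    return index


def search(index, query):
    """Return the sorted page ids that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return []
    # .get() avoids creating empty entries for words that were never indexed.
    matches = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(matches)


# Hypothetical OCR output, keyed by an invented "submission/page" id.
pages = {
    "submission-01/p1": "Shares held in Example Mining Pty Ltd",
    "submission-02/p3": "No shares held. Rental property in Example Town.",
}

index = build_index(pages)
print(search(index, "rental property"))
```

A real index would also need to cope with OCR mistakes, for example through fuzzy matching, which this sketch leaves out.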
Why? These folks are in charge of a good chunk of our money. Having their pecuniary interest declarations more accessible can only help keep things transparent and fair.
Of course, the process isn't perfect. Many documents are handwritten, and difficult enough for humans to read, let alone OCR software.
But it's a start.
You can get the source code from: https://github.com/otherchirps/percs
Usage instructions, etc., can be found here: http://percs.readthedocs.org/
The site itself and its custom libraries are available at no charge, and are licensed under the Mozilla Public License 2.0. The third-party libraries (which actually do the cool stuff) have their own licenses.
Site-specific suggestions and problems should go to the issue tracker.
For any other queries, you can reach me via email.