@pdfliberation

PDF Liberation

A commons for the work of liberating data from PDF files

Python 4 0

python-hocrgeo

Python tool for converting hOCR files to geographic file formats

Updated Aug 14, 2014

Shell 29 5

knowledge

A place to collect and share knowledge about liberating data from PDFs

Updated Aug 7, 2014

USAID-DEC

forked from dbarlett/USAID-DEC

Data from the United States Agency for International Development (USAID) Development Experience Clearinghouse (DEC).

Updated Apr 7, 2014

python-popplergeo

package to convert pdftotext bbox xhtml output to geojson

Updated Feb 23, 2014

Python 22 10

whatwordwhere

forked from jsfenfen/whatwordwhere

Tooling to extract data from scanned paper forms OCR-ed by Tesseract using the hOCR standard.

Updated Feb 23, 2014

Python 3 4

OCRToolkit

forked from opensecrets/OCRToolkit

Tools for working with Optical Character Recognition output

Updated Feb 17, 2014

Java 3 0

amnestydata

Amnesty International Torture data

Updated Feb 9, 2014

Python 6 1

Jersey-City-Budget-PDF-Liberation

This project will liberate data from PDF files found at http://www.cityofjerseycity.com/pub-info.aspx?id=2430 and create .csv and .json files to be uploaded to https://data.openjerseycity.org/dataset/jersey-city-2013-budget-adopted-spending

Updated Jan 25, 2014

pdfliberation.github.io

Homepage for this organization

Updated Jan 24, 2014

NYCEDCprosedatascraper

This uses regular expressions (in PHP, but any language would work) to get data from the NYC EDC newsletters

Updated Jan 22, 2014

Python 2 2

financial_disclosure_scraping

(DC team) experimenting with available options for extracting info from PFDs

Updated Jan 20, 2014

Java 0 3

housedisc

forked from pkaeding/housedisc

Updated Jan 20, 2014

Python 1 2

pdf-hacks-2014

forked from palamago/pdf-hacks-2014

PDF Liberation Hackathon - http://pdfliberation.wordpress.com/

Updated Jan 20, 2014

pdfHarvester

forked from hansthompson/pdfHarvester

Updated Jan 20, 2014

pdf-hackathon

Resources related to PDF Liberation hackathon

Updated Jan 19, 2014

JavaScript 5 3

pdf_table_extraction

experimenting with pdf2text and python pdf-table-extract

Updated Jan 19, 2014

crime-stats-utah

forked from todrobbins/crime-stats-utah

Crime Statistics for the State of Utah

Updated Jan 19, 2014

pdf-liberation-examples

forked from mroswell/pdf-liberation-examples

Displaying various PDF liberation tools at the PDF Liberation Hackathon

Updated Jan 18, 2014

assembly

A forum of sorts, where we gather to discuss Issues.

Updated Jan 17, 2014