- Bay Area
- Joined on
- whatwordwhere 39 Tooling to extract data from scanned paper forms OCR'ed by Tesseract using the hOCR standard.
- plpython_textmatch 6 Add some fuzzy string-matching operations to PostgreSQL
- paper_fec 4 Parse the OCR'ed paper FEC filings (as well as the electronic ones)
- easy-stats-113 3 Data from the Census Bureau's "easy stats" site, the first available on the 113th Congress.
- pdf_bbox_utils 3 Helpers to create .csv files of word-level bounding boxes from text-based PDFs, or from hOCR output.
Repositories contributed to
- sunlightlabs/read_FEC 16 Turn raw electronic FEC filings into meaningful data
- jeffbarrera/iowa-caucus 0 GOP Iowa Caucus prediction challenge for Poli Sci 355B
- california-civic-data-coalition/django-calaccess-raw-data 35 A Django app to download, extract and load campaign finance and lobbying activity data from the California Secretary of State's CAL-ACCESS database
- jsvine/pdfplumber 10 Plumb a PDF for detailed information about each char, rectangle, line, et cetera.
- newsdev/nyt-pyfec 15 A Python library for downloading, parsing and cleaning Federal Election Commission filings.
Contributions in the last year: 474 total (Feb 13, 2015 – Feb 13, 2016)
Longest streak: 6 days (August 23 – August 28)
Current streak: 0 days. Last contributed
- Pushed 1 commit to california-civic-data-coalition/django-calaccess-raw-data Feb 11
- Pushed 2 commits to jsfenfen/pdf_word_extract Feb 8
- Pushed 4 commits to jsfenfen/pdf_bbox_utils Feb 8