Intuitive and configurable search interface for document archives.
Universal backend for indexing, storing, and querying documents.
Ansible roles for deployment. In development, expect problems.
NSA documents in machine-readable form
Scripts for managing scrapers
OCR server for hosted archiving service
Test data for Transparency Toolkit development
Upload application for documents in archiving service.
Manages communications over UDP between different parts of the pipeline
Methods for encrypting and verifying documents. Utility gem for document processing pipeline.
Web crawling and document processing through a usable interface.
Backend for processing document suggestions from LookingGlass
Raw data and scripts for Surveillance Research Archive
OCRs documents and extracts metadata
Runs a block of code on every file in a directory
Main repository for Transparency Toolkit
API for calling crawlers
Collects listings for jobs that require security clearance.
Incremental crawler result reporting for Transparency Toolkit
Dataspec for cleared job listings
A crawler for Twitter
A collection of branding, interfaces, and other visual resources!
LookingGlass dataspec for tweets
Crawls public LinkedIn profiles
Scrapes all pages of any site you specify for keywords.
Resume data and scripts for managing it
A crawler for converting email files on disk to JSON
Dataspec for emails