Scala library that shells out to Tesseract to make PDFs searchable
Open-source platform for visualizing large document sets
Scan a folder of document files of all types and extract the text into a CSV suitable for Overview
An Overview plugin: display an editable word cloud
Run Overview on your own system
Everything needed to create Overview docker images on Docker Hub
Tools for deploying Overview to Amazon Web Services
Splits text into tokens in as many languages as possible
File management and metadata extraction for the Open Syllabus Project
Overview plugin that filters words, looking for entities
Run regular expressions on a document set
C++ ternary search tree, for Node Buffers
An Overview plugin: search for many things at once
Deploys Overview plugins to AWS
Browse the directory structure of a document collection
Tells Elastic Load Balancer we're online
C++ unordered_set, for Node Buffers
Fast, correct C++ bloom filter for Node
Data structure to create, combine and count Arrays of String tokens
A basic search interface for the OSP citation data