Serapis is an acquisition, mining, and modeling pipeline that takes undefined words from Wordnik's dictionary, finds occurences of them across the web, and determines a given sentence in which the word occurs offers an in-context definition for that word.
For example, this sentence is a free-range definition, or "FRD", for "cheeseor":
The term “cheeseors” describes flighted globules of intergalactic cheese, known to be the scourge of the asteroid belt.
It uses a standard message format throughout an Amazon Lambda pipeline.
High-level details on the feature development and production can be found in the slides from Clare's presentation on building this system.
The system requires a model to score sentences for FRDness (scores: binary classification, classification confidence) for which the data is not included here.
Setup and Troubleshooting
Please read the Wiki for help with setting up your code base. Use the pipes at your own risk.