SORTA initial release (feature/ontologyservice)
Background
SORTA is a tool, built on MOLGENIS, that semi-automatically matches input terms with standard codes such as ontology terms or local terminologies. For each input term, SORTA provides a list of the most relevant candidate codes, ranked by lexical similarity expressed as a percentage. Users can then pick the correct matches from the suggested list.
Technical design
SORTA combines Elasticsearch with an N-gram string matching algorithm in order to achieve high performance and accuracy.
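To illustrate the N-gram part of the design, here is a minimal Python sketch of a percentage similarity score over character bigrams. It assumes a Dice coefficient and n = 2; the function names (bigrams, similarity, rank_candidates) are hypothetical and the scoring actually used in SORTA may differ.

```python
from collections import Counter


def bigrams(text):
    """Return a multiset of character bigrams for a normalized string."""
    # Lowercase and collapse whitespace so formatting differences do not affect the score.
    normalized = " ".join(text.lower().split())
    return Counter(normalized[i:i + 2] for i in range(len(normalized) - 1))


def similarity(a, b):
    """Dice coefficient over bigram multisets, expressed as a percentage (0-100)."""
    grams_a, grams_b = bigrams(a), bigrams(b)
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        # Neither string is long enough to produce a bigram.
        return 0.0
    # Counter intersection keeps the minimum count of each shared bigram.
    shared = sum((grams_a & grams_b).values())
    return 100.0 * 2 * shared / total


def rank_candidates(term, candidates):
    """Return (candidate, score) pairs sorted from most to least similar."""
    scored = [(candidate, similarity(term, candidate)) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical candidate labels, assumed to come from an earlier retrieval step.
    labels = ["Hearing impairment", "Visual impairment", "Asthma"]
    for label, score in rank_candidates("hearing impairment", labels):
        print(f"{score:6.2f}%  {label}")
```

In this sketch the candidate list is hard-coded. In the design described above, Elasticsearch would presumably retrieve the candidates first, and the N-gram score would then be used to rank them for the user.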
Major features
- Support importing standard codes in OWL and Excel EMX formats
- Support semantic search (Elasticsearch + N-gram) for the input terms