SORTA initial release (feature/ontologyservice)

@ChaoPang ChaoPang released this 01 Apr 13:19

Background

SORTA is a tool, built on MOLGENIS, that semi-automatically matches input terms to standard codes such as ontology terms or local terminologies. For each input term, SORTA provides a list of the most relevant candidate codes, ranked by lexical similarity expressed as a percentage. Users can then pick the correct matches from the suggested list.

Technical design

SORTA is built on ElasticSearch in combination with an N-gram string matching algorithm in order to achieve high performance and accuracy.
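
To illustrate what an N-gram similarity score in percent can look like, here is a minimal Python sketch. It is not SORTA's actual implementation: the use of bigrams, the Dice coefficient, and the function names `ngrams`, `similarity` and `rank_candidates` are all assumptions made for this example, and the ElasticSearch retrieval step that would first narrow down the candidates is left out.

```python
from collections import Counter


def ngrams(text, n=2):
    """Split a term into character n-grams (hypothetical helper)."""
    # Lowercase and collapse whitespace so formatting differences do not matter.
    normalized = " ".join(text.lower().split())
    # Pad with boundary markers so the first and last characters count too.
    padded = f"^{normalized}$"
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def similarity(a, b, n=2):
    """Dice coefficient over n-gram multisets, as a percentage (assumed metric)."""
    grams_a, grams_b = Counter(ngrams(a, n)), Counter(ngrams(b, n))
    total = sum(grams_a.values()) + sum(grams_b.values())
    if total == 0:
        return 0.0
    # Counter & Counter keeps the minimum count of each shared n-gram.
    overlap = sum((grams_a & grams_b).values())
    return 100.0 * 2 * overlap / total


def rank_candidates(term, candidates, n=2):
    """Return (candidate, score) pairs sorted from most to least similar."""
    scored = [(candidate, similarity(term, candidate, n)) for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    codes = ["Hearing impairment", "Heart disease", "Visual impairment"]
    for code, score in rank_candidates("hearing impairment", codes):
        print(f"{score:6.2f}%  {code}")
```

In a setup like the one described above, a search index would typically return a short list of candidate codes first, and a string score of this kind would then order that list for the user to review.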

Major features

  • Support importing standard codes in OWL format and Excel EMX format
  • Support semantic search (Elasticsearch + N-gram) for the input terms