
Release_HESML_V2R1

alicialara released this 19 Sep 21:17

This new release of HESML (http://hesml.lsi.uned.es) is a special version focused on sentence similarity methods in the biomedical domain. The main novelties introduced by HESML V2R1 are as follows:

(1) a new package for the evaluation of sentence similarity methods;
(2) implementations of most of the sentence similarity methods in the biomedical domain;
(3) a new package for sentence pre-processing, together with a set of sentence pre-processing configurations;
(4) the integration of the three main biomedical NER tools: MetaMap, MetaMapLite and cTAKES;
(5) a parser based on the averaging Simple Word-Embedding-based Models (SWEM) introduced by Shen et al., for efficiently loading and evaluating FastText-based and other word embedding models;
(6) the integration of Python wrappers for the evaluation of BERT, Universal Sentence Encoder (USE) and Flair models;
(7) a new string-based sentence similarity method called LiBlock, based on the aggregation of the Li et al. similarity and Block distance measures, as well as eight new variants of the ontology-based methods proposed by Sogancioglu et al., and a new pre-trained word embedding model based on FastText and trained on the full text of the articles in the PMC-BioC corpus.
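
To illustrate the idea behind the averaging SWEM approach mentioned above, the following is a minimal Python sketch, not HESML code. It assumes that a sentence vector is the element-wise mean of the vectors of its in-vocabulary tokens and that two sentences are compared with cosine similarity. The function names (`swem_average`, `cosine_similarity`) and the toy three-dimensional embeddings are hypothetical and exist only for this example; a real setup would load vectors from a pre-trained model such as a FastText file.

```python
import math


def swem_average(tokens, embeddings):
    """Return the element-wise mean of the vectors of known tokens.

    Tokens missing from `embeddings` are skipped. Returns None when no
    token of the sentence has a vector.
    """
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Toy embeddings, invented for this example only (assumption).
toy_embeddings = {
    "protein": [1.0, 0.0, 1.0],
    "gene": [0.8, 0.2, 0.9],
    "binds": [0.0, 1.0, 0.0],
    "regulates": [0.1, 0.9, 0.1],
}

if __name__ == "__main__":
    s1 = swem_average(["protein", "binds"], toy_embeddings)
    s2 = swem_average(["gene", "regulates"], toy_embeddings)
    print(cosine_similarity(s1, s2))
```

Averaging ignores word order, which keeps the method cheap to evaluate over large embedding files; how out-of-vocabulary tokens are handled (skipped here) is a design choice of this sketch.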