zbmed-semtec/protein-function-embeddings-thesis

Comparative analysis of protein function text-based embeddings and its potential for prediction tasks

This thesis explores how information about protein function can be captured in embeddings, and how those embeddings can be used to improve protein function annotations. The underlying hypothesis is that any pair of proteins with high sequence similarity will also share a similar biological function, and that this similarity will be reflected in the corresponding protein embeddings. The comparison and evaluation are carried out using two text-driven embedding approaches: Word2doc2Vec and Hybrid-Word2doc2Vec.
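
To make the hypothesis concrete, the sketch below shows one way the similarity between two protein embeddings could be measured. It assumes cosine similarity as the metric, which this description does not state, and it uses made-up three-dimensional vectors and hypothetical protein names (`protein_A`, `protein_B`, `protein_C`) rather than real embeddings from the thesis.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine similarity between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings, invented for illustration only.
# Assumption: protein_A and protein_B stand in for a pair with high
# sequence similarity; protein_C stands in for an unrelated protein.
embeddings = {
    "protein_A": [0.9, 0.1, 0.3],
    "protein_B": [0.8, 0.2, 0.4],
    "protein_C": [-0.2, 0.9, -0.5],
}

sim_ab = cosine_similarity(embeddings["protein_A"], embeddings["protein_B"])
sim_ac = cosine_similarity(embeddings["protein_A"], embeddings["protein_C"])

print(f"A vs B: {sim_ab:.3f}")
print(f"A vs C: {sim_ac:.3f}")
```

Under the hypothesis, a pair with high sequence similarity (here A and B) would be expected to score higher than an unrelated pair (A and C). The toy vectors were chosen to show that pattern and say nothing about the actual results.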