AI assembly of biological wordnets




biosemble is a Python natural language processing (NLP) program for assembling biological wordnets from structured and unstructured biological text. Structured text includes resources such as biologically relevant dictionaries and encyclopedias, while unstructured text includes resources such as biologically relevant textbooks.

How good is it?

biosemble can autonomously identify leukemia as a blood cancer, and CD38 as a glycoprotein on the cell surface that is relevant to leukemia:


Not too bad!


Structured biological text

biosemble uses part-of-speech (POS) tagging to assemble similar words across a wide array of biologically relevant dictionaries and encyclopedias.
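As a rough illustration of the idea, the sketch below tags each dictionary definition, keeps the nouns, and merges them per headword across several dictionaries. It is not biosemble's actual implementation: `TOY_LEXICON`, `pos_tag`, `nouns`, and `assemble` are hypothetical names, and the tiny lookup table only stands in for a trained tagger (for example NLTK's `nltk.pos_tag`).

```python
import re
from collections import defaultdict

# Toy lexicon standing in for a trained POS tagger (an assumption of this
# sketch). Tags follow the Penn Treebank style.
TOY_LEXICON = {
    "a": "DT", "the": "DT", "of": "IN", "on": "IN", "in": "IN",
    "blood": "NN", "cancer": "NN", "cell": "NN", "cells": "NNS",
    "surface": "NN", "glycoprotein": "NN", "bone": "NN", "marrow": "NN",
}


def pos_tag(text):
    """Return (token, tag) pairs; tokens missing from the lexicon get 'UNK'."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [(token, TOY_LEXICON.get(token, "UNK")) for token in tokens]


def nouns(text):
    """Return the set of tokens tagged as nouns (tags starting with 'NN')."""
    return {token for token, tag in pos_tag(text) if tag.startswith("NN")}


def assemble(dictionaries):
    """Merge the nouns found in each headword's definitions across dictionaries."""
    related = defaultdict(set)
    for dictionary in dictionaries:
        for headword, definition in dictionary.items():
            key = headword.lower()
            # Exclude the headword itself so it is not linked to itself.
            related[key] |= nouns(definition) - {key}
    return dict(related)
```

Calling `assemble` with two small dictionaries, each a mapping from headword to definition, returns one entry per headword whose value is the set of nouns collected from every definition of that headword.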

Unstructured biological text

biosemble uses Word2Vec, a group of related neural-network models that produce word embeddings. You can pass custom arguments suited to your input data to get the most precise results.
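The sketch below shows one way custom Word2Vec arguments could be passed through to training. It assumes the gensim library (version 4.0 or later) as the Word2Vec implementation, which this README does not confirm. `DEFAULT_PARAMS`, `tokenize`, `merge_params`, and `train_embeddings` are hypothetical names for illustration, not biosemble's interface, and the default values are placeholders.

```python
import re

# Hypothetical defaults for this sketch; not biosemble's actual settings.
DEFAULT_PARAMS = {
    "vector_size": 100,  # dimensionality of each word vector
    "window": 5,         # context words considered on each side
    "min_count": 1,      # keep rare terms; small corpora have few repeats
    "sg": 1,             # 1 = skip-gram, 0 = CBOW
    "workers": 1,        # a single worker thread
    "seed": 42,          # seed for the random number generator
}


def tokenize(text):
    """Split text into sentences, then into lowercase word tokens."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    tokenized = [
        re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())
        for sentence in sentences
    ]
    return [tokens for tokens in tokenized if tokens]


def merge_params(**overrides):
    """Return the default parameters with any caller overrides applied."""
    unknown = set(overrides) - set(DEFAULT_PARAMS)
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    return {**DEFAULT_PARAMS, **overrides}


def train_embeddings(text, **overrides):
    """Train a Word2Vec model on raw text (requires gensim >= 4.0)."""
    # Imported lazily because gensim is a third-party dependency.
    from gensim.models import Word2Vec

    return Word2Vec(sentences=tokenize(text), **merge_params(**overrides))
```

With gensim installed, `train_embeddings(text, window=3, vector_size=50)` would train on `text` with a narrower context window and smaller vectors, and the resulting `model.wv.most_similar("leukemia")` would list the nearest terms in the embedding space. A useful model needs a far larger corpus than a few sentences.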


Coming soon!