ekg/embeddna

DNA embeddings from scratch

This small experiment learns embeddings for subsequences drawn from many genomes and compares them to one another using a k-nearest-neighbor vector search library.
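The overall flow described above (cut genomes into subsequences, turn each into a vector, look up nearest neighbors) can be sketched in plain Python. This is only an illustration under loud assumptions: normalized k-mer counts stand in for the learned embeddings, a brute-force cosine search stands in for the nearest-neighbor library, and every name here (windows, kmer_vector, nearest, the toy genome string) is hypothetical rather than taken from embed.py.

```python
import math
from collections import Counter
from itertools import product


def windows(seq, size, step):
    """Split a sequence into overlapping subsequences of a fixed size."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, step)]


def kmer_vector(subseq, k=2):
    """Stand-in embedding (assumption): unit-length k-mer counts over ACGT."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(subseq[i:i + k] for i in range(len(subseq) - k + 1))
    vec = [float(counts[km]) for km in kmers]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def nearest(query, vectors, n=3):
    """Brute-force nearest neighbors by cosine similarity.

    Vectors are unit length, so the dot product is the cosine similarity.
    Ties are broken in favor of the lower index.
    """
    scored = [(-sum(a * b for a, b in zip(query, v)), i)
              for i, v in enumerate(vectors)]
    return [i for _, i in sorted(scored)[:n]]


# Toy sequence, made up for illustration only.
genome = "ACGTACGTTTGACCAGTACGTACGAAACCCGGGTTT"
subseqs = windows(genome, size=12, step=4)
vectors = [kmer_vector(s) for s in subseqs]

# Indices of the subsequences most similar to the first one (itself included).
neighbors = nearest(vectors[0], vectors, n=3)
print(neighbors)
```

An approximate index trades a little accuracy for much faster lookups than this brute-force loop, which matters once the number of subsequences grows beyond a toy example.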

Run it with python embed.py.

It requires CUDA, PyTorch, Biopython, and Annoy, and needs a GPU to run at a reasonable speed.

Set up the environment with conda env create --file=environment.yml, then activate it with conda activate embeddna.
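Once the environment is active, a quick check that the listed Python packages are importable can save a failed run. The snippet below is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical, and it only checks that the modules can be found, not that CUDA or a GPU is available.

```python
import importlib.util

# Display name -> import name for the packages listed above.
REQUIRED = {"PyTorch": "torch", "Biopython": "Bio", "Annoy": "annoy"}


def missing_packages(required=REQUIRED):
    """Return the display names of packages whose modules cannot be found."""
    return [name for name, module in required.items()
            if importlib.util.find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Missing packages: " + ", ".join(missing))
else:
    print("All required packages are importable.")
```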

About

test of DNA embeddings using a transformer
