Generate simulated DNA sequences for regulatory genomics problems

kchu25/SimDNA.jl

SimDNA

A synthetic DNA motif dataset can be generated in a simple way: draw each background sequence i.i.d., define a profile that represents the motif as a product multinomial (i.e., a PWM), and plant one realization of that profile at a randomly chosen position in each background sequence. But motif discovery on such data is far too easy. What if we instead simulate the motif as a mixture of profiles, where the profiles may share identical sub-patterns (i.e., overlaps)? Moreover, what if a motif has a blocked structure, with variable spacing between each pair of adjacent blocks (i.e., gaps)? Taking this further, the ground-truth motif can be a mixture of block-structured profiles. This package generates such patterns.
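To make the generative process described above concrete, here is a minimal, language-agnostic sketch written in Python. It is not SimDNA.jl's API: every name in it (`sample_background`, `sample_from_pwm`, `sample_gapped_motif`, `simulate`, `one_hot`) is hypothetical. It also assumes a uniform i.i.d. background and spacers drawn from that same background, neither of which the description above specifies.

```python
import random

ALPHABET = "ACGT"

def sample_background(length, rng):
    """Draw an i.i.d. background sequence (assumed uniform over A, C, G, T)."""
    return "".join(rng.choice(ALPHABET) for _ in range(length))

def sample_from_pwm(pwm, rng):
    """Draw one realization of a product multinomial: one independent draw per column."""
    return "".join(rng.choices(ALPHABET, weights=col, k=1)[0] for col in pwm)

def sample_gapped_motif(blocks, gap_ranges, rng):
    """Join block realizations with spacers whose lengths are drawn from (lo, hi) inclusive."""
    parts = [sample_from_pwm(blocks[0], rng)]
    for block, (lo, hi) in zip(blocks[1:], gap_ranges):
        parts.append(sample_background(rng.randint(lo, hi), rng))  # variable gap
        parts.append(sample_from_pwm(block, rng))
    return "".join(parts)

def simulate(n, length, mixture, weights, rng):
    """Plant one motif per sequence; the motif is drawn from a mixture of blocked profiles.

    mixture: list of (blocks, gap_ranges) components; weights: mixture weights.
    Returns (sequence, component index, start position, planted motif) tuples.
    """
    data = []
    for _ in range(n):
        component = rng.choices(range(len(mixture)), weights=weights, k=1)[0]
        blocks, gap_ranges = mixture[component]
        motif = sample_gapped_motif(blocks, gap_ranges, rng)
        start = rng.randrange(length - len(motif) + 1)  # random planting position
        background = sample_background(length, rng)
        seq = background[:start] + motif + background[start + len(motif):]
        data.append((seq, component, start, motif))
    return data

def one_hot(word):
    """Degenerate PWM that always emits `word` (keeps the toy example easy to check)."""
    return [[1.0 if base == ch else 0.0 for base in ALPHABET] for ch in word]

# Two mixture components that share their first block (an "overlap")
# and differ in the second block; the gap between blocks is 1 to 3 bases.
shared = one_hot("TATA")
mixture = [
    ([shared, one_hot("GC")], [(1, 3)]),
    ([shared, one_hot("AT")], [(1, 3)]),
]
dataset = simulate(5, 40, mixture, [0.5, 0.5], random.Random(0))
```

Each entry of `dataset` records the sequence together with its ground truth (mixture component, start position, planted motif), which is what a motif-discovery method would later be scored against. Replacing the one-hot columns with non-degenerate probabilities gives noisy motif instances.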

Basic examples

Coming soon

Undetectable patterns

Coming soon
