mationai/seq-genomics

Genome Sequencing with Seq

Solutions (spoilers!) for Coursera Bioinformatics / Stepik Genome Sequencing course with Seq

Motivation

I've always wanted to learn Bioinformatics / Genome Sequencing. Seq-lang's debut last week is the perfect catalyst for starting.

Seq seems like the perfect tool since it was created specifically for the job. The downside is that it is still young, so many features and "batteries" you expect from Python, the language it is based on, are missing. These are some of the things I missed while working on the problem set:

  • functools
  • better parse errors
  • splat / unpack
  • defaultdict

That being said, its speed, static typing, pipeline operator |>, and similarity to Python have made it easy to overlook its early drawbacks.

Setup and Running

  1. Follow the Seq installation instructions.

  2. Set up your editor to treat .seq files as Python.

  3. Run via seqc FILE.seq ARG, e.g.:

seqc stepik/1-1.seq 1   # run exercise 1 in stepik/1-1.seq
seqc stepik/1-1.seq 2-4 # run exercises 2-4 in stepik/1-1.seq
seqc stepik/1-1.seq     # run all exercises in stepik/1-1.seq
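
The three invocations above suggest that ARG is either a single exercise number, a range, or absent. As a rough illustration only, here is how such an argument could be parsed in plain Python. The function name `parse_selection` and the assumption that `2-4` is an inclusive range are mine; this is not taken from the repository's code.

```python
def parse_selection(arg, available):
    """Return the exercise numbers selected by ARG (hypothetical helper).

    arg: None (run everything), "3" (one exercise), or "2-4" (assumed inclusive range).
    available: all exercise numbers defined in the file.
    """
    if arg is None:
        # No argument: run every exercise in the file.
        return sorted(available)
    if "-" in arg:
        start, end = (int(part) for part in arg.split("-", 1))
        wanted = range(start, end + 1)
    else:
        wanted = [int(arg)]
    # Silently skip numbers that the file does not define.
    return [n for n in wanted if n in available]
```

Calling `parse_selection("2-4", [1, 2, 3, 4, 5])` would select exercises 2, 3 and 4 under these assumptions.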

Seq Language Notes

Iteration

Not sure if

for kmr in dna.kmers[Kmer[4]](1):

is faster than

for kmr in dna.split(k, 1):

or not, but since Kmer[n] requires n to be a compile-time constant, it's not too useful here.
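
For readers unfamiliar with the idea, both loops slide a fixed-length window over the sequence. A minimal plain-Python sketch of that sliding window follows; the generator name `kmers` is hypothetical, and it is not claimed to match the exact edge-case behaviour of Seq's `kmers` or `split`.

```python
def kmers(dna, k, step=1):
    """Yield every length-k substring of dna, advancing by step each time."""
    # The last valid start index is len(dna) - k, so short inputs yield nothing.
    for start in range(0, len(dna) - k + 1, step):
        yield dna[start:start + k]
```

Because `k` is an ordinary runtime argument here, this form covers the case where the k-mer length is only known when the program runs.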

Hamming Distance

abs(k1 - k2) # vs. len([1 for i in range(len(k1)) if k1[i] != k2[i]])
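
The longer form in the comment counts the positions at which two equal-length sequences differ. A plain-Python sketch of the same count, written with `zip` instead of indexing, could look like this; the function name `hamming` and the length check are my additions.

```python
def hamming(k1, k2):
    """Count the positions where two equal-length sequences differ."""
    if len(k1) != len(k2):
        raise ValueError("Hamming distance needs sequences of equal length")
    return sum(1 for a, b in zip(k1, k2) if a != b)
```

For example, `hamming("ACGT", "ACCA")` compares G with C and T with A as the only mismatches.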

Missing Features

reverse=True in .sort() or sorted() - Not supported, so I reverse the sorted result with [::-1]

lambda - Not supported, so you need to define a named function and pass it instead
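
Both workarounds can be shown together in plain Python. This is only a sketch: the helper names `by_count` and `sort_descending` are hypothetical, and I have not checked which `sorted` arguments Seq itself accepts.

```python
def by_count(pair):
    """Named key function, standing in for lambda pair: pair[1]."""
    return pair[1]


def sort_descending(items, key):
    """Sort ascending, then reverse the result with a slice."""
    return sorted(items, key=key)[::-1]


counts = [("ACG", 2), ("CGT", 5), ("GTA", 3)]
ranked = sort_descending(counts, by_count)
```

One difference worth knowing: in Python, reversing with `[::-1]` also reverses the relative order of items with equal keys, whereas `reverse=True` keeps tied items in their original order.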
