## BioKotlin Basic Tutorial
Modeled on the Tutorialspoint Biopython sequence tutorial:
https://www.tutorialspoint.com/biopython/biopython_sequence.htm

A sequence is a series of letters used to represent an organism's protein, DNA, or RNA. In BioKotlin, sequences are created with `Seq` and the related types in the `biokotlin.seq` package.

Let's create a simple sequence in BioKotlin as shown below:

In [None]:
// Pull in the BioKotlin library from the repository.
// Older approach, kept for reference: depend on a locally built jar instead.
// If that jar does not exist, build it from the command line: ./gradlew shadowjar
//@file:DependsOn("../biokotlin-0.03-all.jar")
@file:Repository("https://jcenter.bintray.com/")
@file:DependsOn("org.biokotlin:biokotlin:0.03")

In [None]:
import biokotlin.seq.*

In [None]:
val seq = Seq("GCAGAT")

In [None]:
seq.complement()
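
To make the complement operation concrete, here is a minimal plain-Kotlin sketch that does not use BioKotlin. The helper name `complementDna` is hypothetical, and it illustrates only the base-pairing rule, not how the library implements `complement()`.

```kotlin
// Hypothetical helper (not part of BioKotlin): complement an unambiguous DNA
// string by swapping A<->T and C<->G, keeping the original order.
fun complementDna(dna: String): String =
    dna.map { base ->
        when (base) {
            'A' -> 'T'
            'T' -> 'A'
            'C' -> 'G'
            'G' -> 'C'
            else -> throw IllegalArgumentException("Unexpected base: $base")
        }
    }.joinToString("")
```

For the sequence used above, `complementDna("GCAGAT")` evaluates to `"CGTCTA"`.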

In [None]:
// This line will print both the codon table ID (default = 1) and the translated sequence
seq.translate()
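
As a rough illustration of what translation with codon table 1 (the standard genetic code) involves, the sketch below translates a DNA string codon by codon in plain Kotlin. The names `codonBases`, `codonAminoAcids`, and `translateDna` are hypothetical, and the library's own handling of ambiguous bases and partial codons may differ.

```kotlin
// Standard genetic code (NCBI translation table 1). Amino acids are listed
// for all 64 codons in T, C, A, G order at each codon position.
val codonBases = "TCAG"
val codonAminoAcids =
    "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

// Hypothetical helper (not part of BioKotlin): translate DNA three bases at a
// time. Trailing bases that do not fill a complete codon are ignored.
fun translateDna(dna: String): String =
    dna.uppercase().chunked(3)
        .filter { it.length == 3 }
        .map { codon ->
            val idx = codon.map { codonBases.indexOf(it) }
            require(idx.all { it >= 0 }) { "Unexpected base in codon $codon" }
            codonAminoAcids[16 * idx[0] + 4 * idx[1] + idx[2]]
        }
        .joinToString("")
```

Under this table, `translateDna("GCAGAT")` reads the codons GCA and GAT and returns `"AD"` (alanine, aspartate).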

In [None]:
seq.transcribe()
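
Transcribing a coding-strand DNA sequence amounts to replacing thymine (T) with uracil (U). A minimal plain-Kotlin sketch follows; the helper name `transcribeDna` is hypothetical and is not the BioKotlin implementation.

```kotlin
// Hypothetical helper (not part of BioKotlin): transcribe coding-strand DNA
// to RNA by replacing every T with U.
fun transcribeDna(dna: String): String =
    dna.uppercase().replace('T', 'U')
```

For example, `transcribeDna("GCAGAT")` evaluates to `"GCAGAU"`.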

In [None]:
// Define a protein sequence
val proseq = ProteinSeq("GCAGAT")

In [None]:
// Count the number of glycine (G) residues in this sequence
val gCount = proseq.count(AminoAcid.G)

In [None]:
// Put the above together in a code-snippet
import biokotlin.seq.*
val seq = Seq("GCAGAT")
seq.complement()
seq.translate()
seq.transcribe()
val proseq = ProteinSeq("GCAGAT")
val gCount = proseq.count(AminoAcid.G)

In [None]:
// Print the 3-letter name of AminoAcid.G and the number of times
// glycine appears in the protein sequence defined above
println("${AminoAcid.G.name3letter} count is $gCount")

In [None]:
val proseq2 = ProteinSeq("RMFGVE")
proseq2 + proseq

In [None]:
// Example:  Transcribe DNA to RNA
val dna = NucSeq("GCTA")  //inferred DNA
val rna = NucSeq("GCUA")  //inferred RNA
val rnaSpecified = NucSeq("GCACCCCC", NUC.RNA)

println(dna.transcribe() == rna)  // should print "true"
println(dna) // should be GCTA
println(dna.repr()) // should be NucSeqByte('GCTA',[A,C,G,T])

In [None]:
val bigSeq = seq * 1000000
bigSeq.count(Seq("TGC"))
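
To see why a motif such as TGC can be counted in a repeated sequence even though it is absent from a single copy of GCAGAT, the following plain-Kotlin sketch counts non-overlapping occurrences of a pattern in a string. The helper `countOccurrences` is hypothetical; whether BioKotlin's `count` treats overlapping matches the same way is not confirmed here, although TGC cannot overlap itself, so the distinction does not matter for this motif.

```kotlin
// Hypothetical helper (not part of BioKotlin): count non-overlapping
// occurrences of `pattern` in `text`, scanning left to right.
fun countOccurrences(text: String, pattern: String): Int {
    require(pattern.isNotEmpty()) { "pattern must not be empty" }
    var count = 0
    var from = text.indexOf(pattern)
    while (from >= 0) {
        count++
        // Resume the search just past the match that was found.
        from = text.indexOf(pattern, from + pattern.length)
    }
    return count
}
```

In the repeated sequence, TGC appears only across the junction between consecutive copies (...GA**T|GC**AG...), so `countOccurrences("GCAGAT".repeat(1000), "TGC")` evaluates to 999, one per junction.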

In [None]:
// Count 6-base sites that equal their own reverse complement
val startTime = System.currentTimeMillis()
val palSeq = Seq("GATATCC") * 10_000_000
var totalCount = 0
var count = 0
// The last 6-base window starts at index size - 6
for (i in 0..(palSeq.size() - 6)) {
    totalCount++
    val site = palSeq[i..(i + 5)]
    // println("$site rev ${site.reverse_complement()}")
    if (site == site.reverse_complement()) count++
}
println("count=$count, totalCount=$totalCount")
println("TotalTime ${System.currentTimeMillis()-startTime}ms")
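
The sliding-window scan above can be checked on a small input with a plain-Kotlin sketch that does not depend on BioKotlin. The helpers `reverseComplement` and `countPalindromicSites` are hypothetical names introduced for illustration only.

```kotlin
// Hypothetical helper (not part of BioKotlin): reverse the sequence, then
// complement each base.
fun reverseComplement(dna: String): String =
    dna.reversed().map { base ->
        when (base) {
            'A' -> 'T'
            'T' -> 'A'
            'C' -> 'G'
            'G' -> 'C'
            else -> throw IllegalArgumentException("Unexpected base: $base")
        }
    }.joinToString("")

// Hypothetical helper: count windows of length k that are identical to their
// own reverse complement (palindromic sites such as GATATC).
fun countPalindromicSites(dna: String, k: Int = 6): Int =
    (0..dna.length - k).count { i ->
        val site = dna.substring(i, i + k)
        site == reverseComplement(site)
    }
```

Each copy of GATATCC contains exactly one palindromic 6-mer (GATATC), so `countPalindromicSites("GATATCC".repeat(3))` evaluates to 3.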

Perform DNA translation

In [None]:
val startTime = System.currentTimeMillis()
val palSeq = Seq("GATATCC") * 100_000_000
val pro = palSeq.translate()
println(pro[0..3]) // Takes a substring; the end index is currently exclusive here, so only 3 residues print (known ProteinSeq bug: the range should be inclusive)
println("TotalTime ${System.currentTimeMillis()-startTime}ms")