## Some examples of bioinformatics work

* Organizing the data produced by genome projects
* Discovering disease-associated genes and determining their frequency in the population
* Investigating the genetic basis underlying diseases
* Finding which gene and organism a DNA sequence belongs to
* Finding which gene and organism a protein sequence belongs to
* Large-scale gene expression analyses
* Predicting 3D structure from a protein sequence
* Investigating drug interactions
* Discovering new drugs (computer-aided drug design)
* Comparative genomics/proteomics
* Molecular evolution, molecular phylogenetics
* Protein identification by mass spectrometry
* Biological interpretation: annotation and pathway analyses
* Modeling metastatic cancer cells
* Modeling the life cycles and distributions of organisms (ecological modeling)
* Meta-analyses
* Next-generation sequencing data analyses
* Microarray analyses

The Biopython website (http://www.biopython.org) provides Python-based bioinformatics modules and code. The library includes a variety of classes you can use, supports working with different file types (BLAST, ClustalW, FASTA, GenBank, ...), and gives access to various online resources. Its documentation is also quite good, and you can find more detailed information there.

In [None]:
# Install it from the command line like this
!pip install biopython

In [None]:
# Let's look at the example file used in the Biopython documentation (DNA sequences from orchids)
# First, download the file
from Bio import SeqIO
import requests

def get_file(url, filename):
    """Download the file at the given URL and save it under the given file name."""
    res = requests.get(url)
    if res.status_code != 200:
        raise Exception("Could not get file")

    # open the file for writing
    with open(filename, 'w') as fh:
        fh.write(res.text)


def process_file(filename, file_type):
    """Parse the file and report the id, sequence and length of each record."""
    for seq_record in SeqIO.parse(filename, file_type):
        print(seq_record.id)
        print(repr(seq_record.seq))
        print(len(seq_record))


fasta_url = 'https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.fasta'
filename = "ls_orchid.fasta"
file_type = "fasta"
get_file(fasta_url, filename)
process_file(filename, file_type)

genbank_url = "https://raw.githubusercontent.com/biopython/biopython/master/Doc/examples/ls_orchid.gbk"
filename = "ls_orchid.gbk"
file_type = "genbank"
get_file(genbank_url, filename)
process_file(filename, file_type)

In [None]:
from Bio import SeqIO
for seq_record in list(SeqIO.parse("../data/ls_orchid.fasta", "fasta"))[:5]:  # put the path to where you saved the file here
    print(seq_record.id)
    print(repr(seq_record.seq))
    print(len(seq_record))

In [None]:
from Bio import SeqIO
for seq_record in list(SeqIO.parse("../data/ls_orchid.gbk", "genbank"))[:5]:
    print(seq_record.id)
    print(repr(seq_record.seq))
    print(len(seq_record))

The .gbk extension denotes the GenBank format. As you can see here, **seq_record.id** is a shorter string than the one produced for the FASTA file.
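
The difference comes from how the two formats store identifiers. A FASTA record has only a free-text header line, and the id is the first whitespace-delimited word of that header, whereas a GenBank record keeps the accession in a field of its own. The sketch below mimics the FASTA side in plain Python. The header string is an illustrative example in the style of the orchid file, and `split_fasta_header` is a hypothetical helper, not part of Biopython.

```python
def split_fasta_header(header_line):
    """Split a FASTA header into (id, description).

    Hypothetical helper: the id is taken to be the first
    whitespace-delimited word after the leading '>'.
    """
    text = header_line.lstrip(">").strip()
    record_id, _, description = text.partition(" ")
    return record_id, description


# Illustrative header in the style of ls_orchid.fasta (assumed, not read from the file)
header = ">gi|2765658|emb|Z78533.1|CIZ78533 C.irapeanum 5.8S rRNA gene and ITS1 and ITS2 DNA"
record_id, description = split_fasta_header(header)
print(record_id)                # the long, pipe-separated FASTA id
print(record_id.split("|")[3])  # the accession embedded inside it
```

The accession extracted on the last line is the kind of short string that appears as the id of a GenBank record.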

In [None]:
# DNA and RNA sequences are the central objects of bioinformatics. Today we will use Biopython's Seq object a lot.
from Bio.Seq import Seq  # Seq is one of the classes in the Biopython library
my_seq = Seq("GAGGTGGCTCGTGCGAAGTCGTCG")
for index, letter in enumerate(my_seq):  # recall the enumerate function
    print("%i %s" % (index, letter))
print('my sequence length is:', len(my_seq))

In [None]:
# The operations below are methods of Seq; we could not call them on a plain Python string.
my_seq.complement()

In [None]:
my_seq.reverse_complement()

In [None]:
# Just as with strings, we can index into (and slice) a Seq object.

print(my_seq[0])   # first letter
print(my_seq[4])   # fifth letter (index 4)
print(my_seq[-1])  # last letter

In [None]:
# The Seq object also has a .count() method. As with strings, it counts non-overlapping matches.

print("AAAA".count("AA"))
print(Seq("AAAA").count("AA"))
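
Both calls above print 2, because `.count()` skips past each match before searching again. When overlapping occurrences matter (for example, when counting short motifs), a small loop can do the job. This is a minimal sketch in plain Python; `count_overlapping` is a hypothetical helper name. Recent Biopython versions also offer `Seq.count_overlap` for the same purpose.

```python
def count_overlapping(sequence, pattern):
    """Count occurrences of pattern in sequence, allowing overlaps."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    count = 0
    start = sequence.find(pattern)
    while start != -1:
        count += 1
        # step forward by one position so overlapping matches are found too
        start = sequence.find(pattern, start + 1)
    return count


print("AAAA".count("AA"))               # non-overlapping count
print(count_overlapping("AAAA", "AA"))  # overlapping count
```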

In [None]:
DNA_dizi1 = Seq('GATCGACtgatgctCAAGCTGCCTATATAGGATCGAAAATCGC')
print(len(DNA_dizi1))
#DNA_dizi1 = DNA_dizi1.lower()
print(DNA_dizi1.count("G"))  # counts uppercase G only
print(100 * float(DNA_dizi1.count("G") + DNA_dizi1.count("C")) / len(DNA_dizi1))  # GC percentage; the lowercase g/c in this sequence are missed
#print(100 * float(DNA_dizi1.count("g") + DNA_dizi1.count("c")) / len(DNA_dizi1))  # the same calculation after lowercasing

## Let's calculate the GC percentage

* In molecular biology and genetics, the GC percentage (GC content) is the percentage of guanine and cytosine bases in a DNA or RNA sequence.
* A GC base pair has three hydrogen bonds, while an AT pair has only two; as a result, the GC pair is more stable. This is why GC content and melting temperature are taken into account when designing polymerase chain reaction (PCR) experiments.
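
To make the link between GC content and melting temperature concrete, here is a minimal sketch in plain Python. It computes the GC percentage case-insensitively and estimates a melting temperature with the Wallace rule, Tm = 2·(A+T) + 4·(G+C), a rough approximation meant only for short oligonucleotides. The function names and the 16-base primer are assumptions made for illustration.

```python
def gc_percent(sequence):
    """Percentage of G and C bases, ignoring letter case."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer):
    """Rough melting temperature in degrees Celsius (Wallace rule)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    # each GC pair contributes twice as much as an AT pair
    return 2 * at + 4 * gc


primer = "ATGCATGCATGCATGC"  # hypothetical 16-base primer
print(gc_percent(primer))
print(wallace_tm(primer))
```

Because each G or C adds 4 to the estimate and each A or T adds only 2, a primer with higher GC content gets a higher estimated melting temperature at the same length.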

In [None]:
# We can also use Biopython for GC content. It is quite reliable because it also counts lowercase bases.
# One thing to watch for in our own code is standardizing the data before analyzing it;
# in the example above, calling .upper() (or .lower()) before computing the GC percentage would have made sense.
# Note: Bio.SeqUtils.GC was deprecated in Biopython 1.80 and removed in later releases;
# gc_fraction replaces it and returns a fraction between 0 and 1 rather than a percentage.

from Bio.SeqUtils import gc_fraction

my_seq = Seq('GATCGACtgatgctCAAGCTGCCTATATAGGATCGAAAATCGC')
100 * gc_fraction(my_seq)

# gc_fraction also takes degenerate bases into account (for example, S means G or C).

In [None]:
# Let's slice the sequence
my_seq = Seq("GATCGATGGGCCTATATAGGATCGAAAATCGC")
my_seq[4:12]

In [None]:
print(my_seq[0::3])
print(my_seq[1::3])
print(my_seq[2::3])
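
The three strided slices above pick out the first, second and third codon positions, assuming the reading frame starts at index 0. A complementary view groups the same sequence into codons. The sketch below works on plain strings; `split_codons` is a hypothetical helper and it drops any incomplete trailing codon.

```python
def split_codons(sequence, frame=0):
    """Return the complete codons (triplets) starting at offset `frame`."""
    seq = str(sequence)
    # stop early enough that every slice has exactly three letters
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


dna = "GATCGATGGGCCTATATAGGATCGAAAATCGC"
codons = split_codons(dna)
print(codons)

# The first letters of the codons agree with the stride-3 slice from index 0
first_positions = "".join(codon[0] for codon in codons)
print(first_positions == dna[0:len(codons) * 3:3])
```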

In [None]:
# Reverse the sequence
my_seq[::-1]

In [None]:
# We can also convert the sequence to a plain string
str(my_seq)

In [None]:
# Convert the sequence to FASTA format
fasta_format_string = ">Name\n%s\n" % my_seq
print(fasta_format_string)
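
The one-line template above is fine for short sequences. FASTA files conventionally wrap long sequences over several lines, so a slightly more general formatter might look like the sketch below. `to_fasta` and its `width` parameter are hypothetical names chosen for illustration, and the width of 10 in the usage line is only there to make the wrapping visible.

```python
def to_fasta(name, sequence, width=60):
    """Format a name and a sequence as FASTA text with wrapped sequence lines."""
    seq = str(sequence)
    # cut the sequence into chunks of at most `width` characters
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">%s\n%s\n" % (name, "\n".join(lines))


print(to_fasta("Name", "GATCGATGGGCCTATATAGGATCGAAAATCGC", width=10))
```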

In [None]:
# So how can we concatenate sequences?
# Method 1: a plain loop with +=
from Bio.Seq import Seq
list_of_seqs = [Seq("ACGT"), Seq("AACC"), Seq("GGTT")]
concatenated = Seq("")
for s in list_of_seqs:
    concatenated += s
concatenated

In [None]:
# So how can we concatenate sequences?
# Method 2: the join method of Seq (here with a spacer of five N bases)
from Bio.Seq import Seq
contigs = [Seq("ATG"), Seq("ATCCCAGATGATG"), Seq("TTGCA")]
spacer = Seq("N"*5)
spacer.join(contigs)

In [None]:
# Using Biopython's built-in methods, we can get the complement or the reverse complement
from Bio.Seq import Seq
my_seq = Seq("GATCGATGGGCCTATATAGGATCGAAAATCGC")
print(my_seq)
print(my_seq.complement())
print(my_seq.reverse_complement())

In [None]:
# As mentioned, this simple slice gives the reversed sequence (reversed only, not complemented).
my_seq[::-1]

In [None]:
# What if we wrote our own function instead of using the built-in methods?

# Reverse complement problem

def rev_complement(sequence):
    '''Take a DNA sequence and return its reverse complement.'''
    rev_comp = ''
    sequence = sequence.lower()  # careful: normalize the case before comparing
    for letter in sequence:
        if letter == 'a':
            rev_comp += 't'
        elif letter == 'c':
            rev_comp += 'g'
        elif letter == 't':
            rev_comp += 'a'
        elif letter == 'g':
            rev_comp += 'c'
        else:
            # anything other than a/c/g/t (for example N) is not handled here
            raise ValueError("unexpected base: %r" % letter)
    return rev_comp[::-1].upper()

sequence = 'TTCTAGGGGTTACATGTAGAGCGGGTTAACTCCAATGATAGCCATTACCCCCCACTGTGCAAGTTGGCTACCCTCGACGAACCGCTTGCCGGGCTCTACCCCGATCGTCATCCGTTTGTCCCTGATATGGTTTGCTATCGAATAGATTACATTAATCATCAATTTCACGCGTAGGATCTATAATTCTGGAAAGCTGCAGCTCTTTGAACCGAGGAAGAACGCCAATTATGATAGAGCATTCAGCTACAGTGACATGGCCCGGACACAAAGCTCGACAGCACTACTGTAAGAAGTGGTATCGATCGCTCCTACCGAAGCGATGACTCAATTTCTATCACAGATCAACCTCGCTAGATGCTCTAAGGTGTTTCACTACACGGAATACAATGCGTCAACACTATCGCAAACGTAAGTGCTGTGTGTGTGCACAGACTTGCCATCTCAGAGTCTGCAGAGCTATCAGTTGAGTTCCGAGTAACTACGACTGTATAGATACCGCTGTCACAAGAACCTCGAGCTAGATGTTTCAAGGAGGTTACGTTAGCCAAGCTTAAGGTTCGCCATCGTCAGAGTTTGAAGTATTCCCTGACGCATGCATCCCGTAGACTCTTAGCTACACGCCGCTGTCCTTTACGTCGCGCCCGCGGAATATGACCAGAGACACGCGCTGCGCGTGTAGGGCATGACGTAATTGCGCGCTTCCGTCGTATCTAAGACATCTTACGCGATGAAAGGGCGTCGTGGTGAGAATCGTTCTCATACAGCCGGTTCGTCACGCGATACGAGACTCTACTCTCTGAATGTTCGTTCTTGAAACATAAGGCTTTCATGAGACGGGACAAGTTACTTTTGTTGAGAACCACATCCACGATACGACTCCCCACTTTATGCCTTGGAACTAGATGTAAATGGGACAACGGCGCGACCCTCGGTATAGTAGGGATTGTACAAGTGGGTTGGAAAATTATTGTGCAATTTTTAGTCGCCGCCCACGAGGCGTGTGCTGGCGATAAATACCCCCAAGCGAGAGTTGTCTGCTGCACGGTCAAGGCCGTTGTGGTACAGAGGTACCATCCGGGTCCGCCACGTACCTCGACGAGGAGTGATTGGACACGACTCTGGAAATTGCTACAGTAGTTAACTGAATATCGCAAACCATTGTTCACAGGGACCGCGCAATGTTCTTCTTCTCTCCGGCTCCGAGACTAGGGAATTAACAGTGTGGAGACCTTGGGGTGAGGTAGGATGTGTAGCCTCTGGTATACGAAAACAACGCATCCGACACCAATATGTTATAGTGTCATATCCGGGGGGAGCCGGAGAAGTGACCGTTATGCCACAATGTCCCCGAGATCTCGTAGGTCTTTTGAGGAGTTACGCCTAAGCGTGTAACGAGACCTGCGCTTTACGAATATTGGTTGCGGAGGGGAGTCTTTCGATCGTGCGTTGCTGACGTAGAATGAGGACGGCCAACAAATCCGATCGATCAGGTTTCTCTGTATATGACAGGCGGGCAGTTGATAATAGGTTCTACTGTAGAAGAATCTCATGTCCTGTGTCTGTACCCTCTCGTCATAGCGCCCTGCCCCTTTCACTACCACCCGCCGGGGGGTACAAGGGGTTCGTAAACAGCAACTCTTCATAGTCATTAAGCGATGTTATGAGACAAGTCAAGTGGGACATTAAGCTGGACAGATGTTGCGACGCGACCCACGTCACACGGAGTGTGGACCTGAGCTAGCGCCAGACGATGCTCGCACGGCCAACGTGTGGGGCCAGATCCGCCAATTGTAAAACAACCGGCGCACCACCGGTAACCGTTAATCCGGCAGCGTTAAATCTCGCAATTTGTTGCAAGCACAAATAGAGTTATCCGCACTCTATTTTGTTTGAGAGGACGCGACCGGAGGCATTTAAGAAGCGTGTTACAGGGGCACGTACTACAATATTTGGATAGGCTAAAAAGTACACTTGCAGGGGTATCCACT
ATCCCTTCCGACGTAGATCTGCACGGGAATATCTGCAGCTGGCGAGAGGAAACGTACCCGCTTGAAGCGGATCGGTAGTGTTTGGAGAGATAGTTGGGTTATGGACTCATGAAGACGTTAAAAAAGGCCAACTTAGCCAGACTTGGGCTCGTGAGGCCGATGGAAAGTCTAGCTGGTTACAATCAGTGAGATTTGTCTGCCGCTCTGGACTTCGCCGTCGATGCCCCGTGACTTGCCAGTCAGCCCGGTCCCCGCTTAATCCACATCAGCGCATTCCTCACCTCCGGCCCCCTGCGATCGGACCGAGCCCTTCTCGAGATTCATTATGCTAAGTAGCCGACCTCAATCACGTAATAACGTTATGGGTATCGACGGGGCACCGACTGCCAGAGACGTCGCTAAGATGGGTTTTCGGGTGTAAAACAATCGTCATTATACAGCCAGGATAGGACATTCCGTGCGTTAATGTATAAAATTGGGCTGCGCCGGGCGCCCGTGGTGAGTTGCCTACGAGCCAACTATGAGTATACAGTATCCACTGTCCACACGCCGTAAGCATCGGTAAGGACAACGAATGGTACTTCGATCAGTAGCCAGGATCCGATGGCGCAAATGGCTTGTGTACCCTGCGTTTATTGGACTGTCTATTATGCGATACCACCCATCAAGTGACCCCTAGTGATAGGTGGACGCGATGGCTGCCCTGCAACCGCTCATACGGCATGACTAAGCGTGTGGATGGTCATCTCCATTCGATCACGTGTTACCTCTGGCATACCCATCGATTTTCGAATTATTCTCTGCAACAAATGATGGGAGCAGAAAACACTGACCAGACCCGCGTACAGACCGTATCCACTGCATTTTTTGCGCTGGGCATATGACACCTTTGCCAAAAGGGTTTGCATTCAGGTGAAAGAGAGCAAAGCACGTACGGCGCGGCGTGGGAATTATCGGATTCCCTTGGTACCCTACGTCACGAGACAGTCCCAGACCCTTATTTGTCGTAACCGAAAAGAAAGGGCAGGAAACCTAGATAAAATGGGCAAAAGTAGTCTGATGAGTCCGTATTGTTAGTTTACGAACTCCGGACAACTATCCAGAGTATAGACAACTTCCTATTCCAGGCTATGAATTTAGCCCAAGCACCTTTCCCAGTTTTTAATTTGGAGGGGTCCCACTTTTAGGTACATACAGTTGATCGGTCCTTTCACATATTGTCGCGTGCCTGACCACGTTAAGGCCCGATCGGAATTAGTAGCTATTGCCAAAAGAAATGAGATGTACATAGCCCTGCGGGGAACGATCCTACACCTGTCACGGGGTAATTTTTTGAGCCGTGCCTTCCTAGGGTGAGGTAGTCGTGCTCCGGCTCGAAGAATATTCTATTACCTTCTGTCGGGTGAGAGTAATGGTACGCGAGCGACGCCACTACACCGGATTAGGGGTTAGGCGAGCTATGTCGGGCCGGTGAGACTTCAGCATGCATTCAAATGGGCCTATAGTAAGCTTGTGAGCCACTACGGCTGTCCCAAGATCTTAGTTCCCGCGAGTCTGGACTCCGCCTATATAGTAGACTTCTAGTGTAGTTGAGCTTGACCCAGATCACGTAATACCGATAACTAGCGGGTCCGGTGGTTTCTTTGCGAAACATTACAGACCGTACGAATAGTTCATTATGTGAGCTGAGCCACCCGACGACCCCGAAGAGCCGTAGAGTTGACCTGCGGTGCCATTAGAGTGCCCTGGCCCGGAGCGTAGCGAACGGAGGCTGCTATGCCAAATCCCTAAGTCCTACAAAGCTCTCGGAACACATGGTCCCCACGTAGTCAGAAAGCACCTACCAAACATTTAGGCCGCCCGTGTGGGAGAAGCCAACACCCGGGTACCAGCATTACTCGTGATCTAGTACTGAAAATACTAAGTTCACTCCACTCTAGCTCCCGCTTCTGGGTATTAGAAAGCAGAACTCGACGGGTATAAATCAATTCCTAATACGC
TGTGACTGATAATACGTGTCCTGGGGGCGGCTATGCCTAGTTAGGTGTGAATTCGACGTAGTCGGACACACGCAGTAGTCATAAGGCAAGGTATTCCAACAAATAGTCCCGGGGGATCATACTAAAATGGTATTGACATTCGGCTCGTGGCCTAGAACTGAAACAAGCCTTAGATCGTTATGGGTTAGAGCCTAGTGAGGAGAAGCAGGCATTCCGACTCTCAAGGTCCCCAGCAACGATACGTTACGTGTAACGAGAGATACTGTGCATTATATACCTTCGCGCGGACTTGATAGAAAGGAGACGCGTGAGCAATGTATCTCACTGATCAGGCGAAGTCTACCACTGAGCCCTGAAACGCACGGTATAAGCGAGACAAGGCCGTATCGGGGAAACTTCTGGTAAACCACCGGGCCCTAGCGTGTAGATTGTAAACCCTATGGTTCGCCATCACCAGCCTTTGACGAGAACAGCCCATGCCTATCTTTGGACCAGTGTCCTGCATTCCGTCCGCGTTAAGAACCGTGTAAAGTAGACTGACTAATGGTGTTGAGTCAGAGGCCTTGCTCGCGTGGCAGTTTTCATGCTGGGTCGAGCCTAACCAAATGTCTAAACGGATGGCAACCCTTGGATACCTCGCGTACGTCCAAGACAGGCTAGCTACCTGCGCTTCCCATCTTTGCTGCCGAGATAGGCTACGGTACGGGATGTGAGTGTAAAAATTGACAAGTCAAGAGGCTAGATTCGATCTTTTATTGCGCAGGTCATGCACGGCCTCCACGTGTCAAGTGTCGGGCCCCAGACTCCGAGGGTCCCTGGCTTCGGCATTCGGAACGTCCCAATTTTAAGACCTAAGTTGGGATCCGCTCAGAACAGCTGTCACGGTAATTCGAGTTTATTGGTACACCATTCGTAGACTGCGGTTAAGGATTACCCTACTCATGCTATCAACGGAGGTAGACACTCATGAAGGATACGAAGACCCGATAATACCGCATGGATATCAACATTTGTTGACCATACAACCAGATGCCTGCACCTGATGATTGGGCAGGTCGAAGATCGTGGTTAGTGCACGATAGGCCGAGCTAGACTCGATGCAGAAAGCGTGATTGATTTCGCGCCCATAGACTAAGGTAAACGCACTCTAGTGGATCAGCTTCATCCACACGGAACTCAAGCTGCACCGCAAGTTCTGTGTGATTGAGGACAACCACCCTGGATTCAAATTAGATGAATAGACAACCGTGATTTGACGAGAGGGTGTGGTGTCGAACCACCGACACTTAGACCTACTTTAGTTCCCAAGTAAATGAGGTCTCGAGATTATCCGCCCTGAAAAGTGTCTAGCTCATTGAGTACTTCCAATTGTTATGGATAGATTTCATGGTCCTTCTACTTCCGCTCGAATAGAGCGACAGCCCTGGGCACTCGGCAAATGAACCAGAATTAGATTGTACATATCAACATATGTTGATGCCGCACTCACTCGCATTCTCGCTGTTCTAACACCCCGCTTCCATCATCTAGCTACGTCCGGCCAAATATACAGGCTGGAGGCGCTGGCCAGGGGCATCCGCCTGACTGCATTTGATTTGGTGAGCTCTGCTGAAGTTTCCCCATATACCCTCATAAGTACCCTTCTGCACAGGTCTGTGCTTCACATTACAGGGTGCAATCCAATACTGTTGTATGCTGACGCCTTCCTATGTGCGAAAGGCCGTTAACACCGCTGCTGATACTCCTTATGCATTTTGCGATTATCGGTCGAACCAGTTGGAACATGACCGTCTTCCAGCGGCTCTGCGTTAGCACCACCATGAAACTTATTTGATACAGCCCAGCAGGCAAGTCGAAGACGAGGCCCACGGTGACTAGTGTTGTAACACTACCGATCTGATGTCGAAAGGTCAATAACCCAACGTCAGGGATTCGCTAACGAGGGTGTTTTATGTGACAGAAATTACGAATAAAAGAATAGCAGCCTTCCTGGCTCCGCA
TAGCGGAACATATCTTTACGGCCTATGTAAATATACTCCAGGCGAGGGCATAATTATTTCATAATGTTCCTTAGTAGGACAAAATAGCGGAAGACTATAAGTCCAGAGAGAGCGGAGCGCATCATCACCCTCGCGTCCTCTCATCAGGGTGTTCTCGGCCACAGTCCGTTGTTTCATTTTAAGTGCTCATGACTTTGATCTGTCTGCAAATACATGAAATTGATGATGCCGGCGTGATTGGGAATCGCCGTGGAACGTCCGTGTGTGATTCGCGCGAAAGTGTTTGGTCGTTCCTCGCCCAACCAAAGGCAGTGACACAGGTAAAATACTCGGTTAGAGAGGCTCACGAGACTTCGGACACGAAGGGAAAATTAGGGTAGCGCTGACTGCGCCGAGCATACTACGAGTCCCGCGCCCGAGGGACTCGTGAGGGCGACCATCACCGCTGTTTTCGGGGGCCCCACCGTTCATAATTCCACTACGCAAAACTAGGCTTGGGGTGTTGGTGATGGTTCCAGTCTGCGTCCTCTAGCAGGGCTCCCGCCTGGTAAGGCCACTTATACTAACCTTTTTGGTTTTTGCAGGAAACCGTTCTAAGACTGATAATCTTTAATAATCAATGCTTGGCGAGCTAACAGCAATGAGATGCGGGTCACCATAACGGGTTTAAACTCCGAACGAGAGACTGGTCGATGCATACGCACCCTCCCCACGGGGAGGGGAACTGGTACTAAGCTCACACGCGTGCTAACGTCAGCGGGACCTTTTAATCATCAGAGACTTTAGGACCCTACCACCCCGATAGTAATACGCACTATCTGAGGGGTTAGTCCGCGAGCCCTCTCTCTCCTTTGGACTTGGACCTGCCGATTGCAATAATGGTCGGTGTATGGTTTAATGCTACTCTGCGCTCGGACGGTCCTCAAGCTGAGGGGTGGAAGAACGCGTACCATTTCTACACGCCACCGTAGCAATAATTTAGAAGCCGTTTTTTATTCATCGTCAAATTTTTGTTTGCGGCTGTGTCTGACGTCCGTCGGCAGTATGAACATCTATCAGCGCATCGAAATGAGGTGAGATATCTCTCTGTCAATCGACTCCGTTTAGTTCATTGTATTGACGAGATTTTGATGGTCACAACCTCATCGCTATCAAACCTCGGACAGTACCGGCCTAGAGCGCTTCGGTTTTGTGCTAGTAAGGGATAATCGCTAATTAGAGGTCATGGCGATCCCCACAGTACCCGGTAGGCAAGGGAGCTCGTGCCAGCTAAGATAACCGCTAGAATTACAAACCGTGTTGATTCTCCGGCATGTTCCGGGACGTGGAGTAAATCCTGATGTCACCAGGCGACCTTTTCAATAATAATCTCGTCTACAACTCTGTTATCATGTACCAGTGCGCAGAACGCAGGTCACATGTCCTCTCAGCACATAGTACGAATGGAGTTCATTAAACCATGTGATATGAGCGGTTGTTACAAAGACGCGATCCAGCGGATCTTTTTGTCGAGGGCGTCGAGTTATCACGAAAGCGCTGCTCAGGGAAATTCTCGCGTATCTGGTTAGCATCTTTACCGGCACAATGACTAATGCTTCGTAGTCTAACTAGGCCTAGTAACGATCCTATATTCCAACCGGGACGCACGCACCCAAGTCTACAGGTAAATGCATTGTCAATGTGTAAGGTCACATCTCAGTTTACAAGTGTCCTCAGAACTGACGTCTAGATAGCAAACTGTGGAAGCACACGGCGGTCCGGGGTAAAAAGCTGTCCTATAGGAAGATAAAAATACTAATAACGTTGATGGCAGAATATTATTATTTGGTTTAACTGCTCCCGAAGGACGCTATGTTGTTAAAGTTGCCTAAATGCTTCTTTTTGTATAATCGGTTATGACCCTTTCGAAATAATACATGTTCTCCGTACCTTTCAACTTGGAATACACACTGGTCCCATCTCGTATCTATCAAAGCGACGTGTGCCTATCTAGCGTTCCA
GGTGCTAATTAGGAGGACAAGCGGATTGTTGGAATTCGCAGTTAGTAACATGCAGAGCTCAGCCCCGTCCTCCACAGTGGTTATGACTTGGAAAGCCCTATCCGCCAAATAACAAGCTATCCATCAGTGCGGCGATACAAGTGCCAATATGCCACAAATCGACCACCCAAAAGTGGATTAACTGTCATG'
print(rev_complement(sequence))      

# This is actually a fairly simple function. For example, it does not take degenerate bases into account.

# Homework 4.1: What do complement and reverse complement mean in molecular biology? Explain in 3-4 sentences and give an example (you may use markdown).
# Homework 4.2: Also produce the complement and the reverse complement of a DNA sequence that you download from GenBank.
# Homework 4.3: How does Biopython's built-in reverse complement method work? What should we add to our function above so that it also handles degenerate bases?
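
As one possible direction for the degenerate-base question, the sketch below uses a translation table over the IUPAC nucleotide codes (for example R = A or G pairs with Y = C or T, and N pairs with N). This is an assumption-labeled illustration in plain Python, not a description of how Biopython implements its method. `rev_complement_iupac` is a hypothetical name, and characters outside the table are passed through unchanged.

```python
# IUPAC nucleotide codes and their complements, in both cases
_COMPLEMENT = str.maketrans(
    "ACGTRYSWKMBDHVNacgtryswkmbdhvn",
    "TGCAYRSWMKVHDBNtgcayrswmkvhdbn",
)


def rev_complement_iupac(sequence):
    """Reverse complement that also handles IUPAC ambiguity codes."""
    # complement every letter first, then reverse the result
    return sequence.translate(_COMPLEMENT)[::-1]


print(rev_complement_iupac("GATTACA"))
print(rev_complement_iupac("ACGTN"))
print(rev_complement_iupac("AAR"))
```

A table lookup keeps the function short and preserves the case of the input, whereas the if/elif chain has to grow by one branch per extra symbol.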

#### SeqRecord

Another class is SeqRecord (sequence record). It stores a DNA/RNA sequence together with an identifier, a name and a description for that sequence. Biopython's Bio.SeqIO module works with SeqRecord objects: it reads them from, and writes them to, sequence files of different types.