# 1. Source

This problem comes from **Rosalind**: [GenBank Introduction](https://rosalind.info/problems/gbk/)

**Problem**

![GenBank Introduction](gbk_problem.png "GenBank Introduction")

**Sample Dataset**

Anthoxanthum<br>
2003/7/25<br>
2005/12/27

**Sample Output**

7

# 2. Workspace

In [1]:
# read the input file

with open('gbk_test.txt', 'r') as file:
    genus = file.readline().rstrip()
    startDate = file.readline().rstrip()
    endDate = file.readline().rstrip()
    
# print

print(genus)
print(startDate)
print(endDate)

Anthoxanthum
2003/7/25
2005/12/27
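
The cell above trusts that the file holds exactly three well-formed lines. A minimal sketch of a stricter reader follows; the helper name `parse_gbk_input` and its checks are assumptions added for illustration, not part of the original notebook. It keeps the dates as strings, since that is what the search term needs, and only uses `datetime` to validate them.

```python
from datetime import datetime


def parse_gbk_input(text):
    """Split the three-line input into genus, start date and end date."""
    # drop surrounding whitespace and ignore empty lines
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    if len(lines) != 3:
        raise ValueError(f'expected 3 non-empty lines, got {len(lines)}')
    genus, start, end = lines

    # strptime raises ValueError on malformed dates
    start_dt = datetime.strptime(start, '%Y/%m/%d')
    end_dt = datetime.strptime(end, '%Y/%m/%d')
    if start_dt > end_dt:
        raise ValueError('start date is after end date')

    # return the original strings, unchanged
    return genus, start, end


genus, startDate, endDate = parse_gbk_input('Anthoxanthum\n2003/7/25\n2005/12/27\n')
print(genus, startDate, endDate)
```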


In [2]:
# set the contact e-mail address that NCBI asks Entrez users to provide

from Bio import Entrez

Entrez.email = '<your@mail.address>'

In [3]:
# build the search term

genus_filter = f'{genus}[Organism]'
date_filter = f'{startDate}:{endDate}[dp]'
    
# join the two filters with 'AND'

search_term = genus_filter + ' AND ' + date_filter

# print

print(search_term)

Anthoxanthum[Organism] AND 2003/7/25:2005/12/27[dp]
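
The same term can be assembled by a small reusable function. This is only a sketch: the name `build_search_term` is hypothetical, and it assumes a single-word genus and dates already in `YYYY/M/D` form, as in the sample dataset. `[Organism]` restricts the match to the organism field and `[dp]` to the publication date, with `start:end` denoting a range.

```python
def build_search_term(genus, start_date, end_date):
    """Combine an organism filter and a publication-date range filter."""
    genus_filter = f'{genus}[Organism]'
    date_filter = f'{start_date}:{end_date}[dp]'
    # both conditions must hold, so they are joined with AND
    return f'{genus_filter} AND {date_filter}'


print(build_search_term('Anthoxanthum', '2003/7/25', '2005/12/27'))
```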


In [4]:
# query nucleotide database with the given organism name

handle = Entrez.esearch(db = 'nucleotide',
                        term = search_term)

record = Entrez.read(handle)
handle.close()

In [5]:
# inspect the full search result

record

{'Count': '7', 'RetMax': '7', 'RetStart': '0', 'IdList': ['33413983', '33413982', '33413981', '33413980', '71056585', '57283843', '57283791'], 'TranslationSet': [{'From': 'Anthoxanthum[Organism]', 'To': '"Anthoxanthum"[Organism]'}], 'TranslationStack': [{'Term': '"Anthoxanthum"[Organism]', 'Field': 'Organism', 'Count': '85416', 'Explode': 'Y'}, {'Term': '2003/07/25[PDAT]', 'Field': 'PDAT', 'Count': '0', 'Explode': 'N'}, {'Term': '2005/12/27[PDAT]', 'Field': 'PDAT', 'Count': '0', 'Explode': 'N'}, 'RANGE', 'AND'], 'QueryTranslation': '"Anthoxanthum"[Organism] AND 2003/07/25[PDAT] : 2005/12/27[PDAT]'}

In [6]:
# the answer that Rosalind asks for

record['Count']

'7'
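
Note that `Count` comes back as a string, and that it is the field to read rather than the length of `IdList`: `esearch` returns at most `retmax` IDs (20 by default), so the list can be shorter than the total. The sketch below illustrates this with a hand-made dictionary shaped like the result above; `entry_count` is a hypothetical helper and the IDs are placeholders, not real GenBank records.

```python
def entry_count(record):
    """Return the total number of matching entries as an integer."""
    return int(record['Count'])


# placeholder result shaped like an esearch record (IDs are made up)
sample_record = {
    'Count': '1725',
    'RetMax': '20',
    'RetStart': '0',
    'IdList': [str(i) for i in range(20)],
}

print(entry_count(sample_record))    # total number of matches
print(len(sample_record['IdList']))  # only the IDs actually returned
```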

# 3. Implementation

In [7]:
def gbk(filename, mail_address):
    
    '''
    input
        a file containing the genus name, the start date and the end date
        mail_address to identify yourself to NCBI
    process
        query the GenBank nucleotide db for the genus as organism between the dates
    output
        prints the number of entries to the console
        writes the number of entries to a file
    '''
    
    from Bio import Entrez
    
    # read input file
    with open(filename, 'r') as file:
        genus = file.readline().rstrip()
        startDate = file.readline().rstrip()
        endDate = file.readline().rstrip()
        
    # build search term
    search_term = f'{genus}[Organism] AND {startDate}:{endDate}[dp]'
        
    # set the contact e-mail address for NCBI
    Entrez.email = mail_address
    
    # query nucleotide db
    handle = Entrez.esearch(db = 'nucleotide', term = search_term)
    record = Entrez.read(handle)
    handle.close()
    
    # result
    entry_number = record['Count']
    
    # print answer to console
    print('\n\x1B[1mANSWER\x1B[0m\n______\n')
    print(f'{entry_number}')
    
    # write the answer to a file named after the input file
    answer_file = f'{filename.split(".")[0]}_answer.txt'
    with open(answer_file, 'w') as file:
        file.write(f'{entry_number}')
    print('\n\n#! The answer has been written into the file:',
          f'\x1B[1m./{answer_file}\x1B[0m\n')
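
The answer file name is derived with `filename.split(".")[0]`, which works for plain names such as `gbk_test.txt` but not for paths that contain other dots: for `./data/v1.2/rosalind_gbk.txt` the expression yields an empty string. A possible alternative, sketched here with the hypothetical helper `answer_path`, is to let `pathlib` separate the directory, stem and extension.

```python
from pathlib import Path


def answer_path(filename):
    """Return the path of the answer file next to the input file."""
    path = Path(filename)
    # keep the directory, replace only the final name component
    return path.with_name(f'{path.stem}_answer.txt')


print(answer_path('gbk_test.txt'))
print(answer_path('./data/v1.2/rosalind_gbk.txt'))
```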

# 4. Execution

In [8]:
gbk('gbk_test.txt', '<your@mail.address>')


ANSWER
______

7


#! The answer has been written into the file: ./gbk_test_answer.txt



In [9]:
gbk('rosalind_gbk_1_dataset.txt', '<your@mail.address>')


ANSWER
______

1725


#! The answer has been written into the file: ./rosalind_gbk_1_dataset_answer.txt



In [10]:
gbk('rosalind_gbk_2_dataset.txt', '<your@mail.address>')


ANSWER
______

35


#! The answer has been written into the file: ./rosalind_gbk_2_dataset_answer.txt



In [11]:
gbk('rosalind_gbk.txt', '<your@mail.address>')


ANSWER
______

84


#! The answer has been written into the file: ./rosalind_gbk_answer.txt



<p style='text-align: right;'>
    <!--<b><font size = '5'>Contact</font></b><br>-->
    <b>Orcun Tasar</b><br>
    <i>Bioinformatician / Data Scientist</i><br>
    orcuntasar |at@| ogr.iu.edu.tr<br>
    tasar.orcun |at@| gmail.com<br>
    <a href = 'https://www.linkedin.com/in/orçun-taşar-7b5992a1/'>Linkedin</a> | <a href = 'https://www.instagram.com/shatranuchor/'>Instagram</a>
</p>