![rmotr](https://user-images.githubusercontent.com/7065401/52071918-bda15380-2562-11e9-828c-7f95297e4a82.png)
<hr style="margin-bottom: 40px;">

# Python for Genomics 
## Section 5: Writing and Converting Files Exercises 

![purple-divider](https://user-images.githubusercontent.com/7065401/52071927-c1cd7100-2562-11e9-908a-dde91ba14e59.png)

## Let's write your freshly extracted NP sequences to a file using `SeqIO.write()`

Remember the Ituri province NP (nucleoprotein) sequences that you extracted from a GenBank file and saved to a list in the last exercise notebook of Section 4? The code for that is repeated below:

In [3]:
from Bio import SeqIO

ituri_seqs = SeqIO.parse('data/Ituri_sequences.gb', 'genbank')

NP_list = []

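for record in ituri_seqs:
    for feature in record.features:
        # .get() avoids a KeyError if a gene feature has no 'gene' qualifier
        if feature.type == 'gene' and feature.qualifiers.get('gene') == ['NP']:
            # Slice the NP gene sequence out of the full genome record
            NP_gene = feature.extract(record.seq)
            NP_list.append(NP_gene)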

### 1 - Go ahead and take a look at NP_list

In [None]:
# your code goes here...

In [4]:
NP_list

[Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()),
 Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()),
 Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()),
 Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()),
 Seq('TCGTCCTCAGAAAGTTTGGATGACGCCGAATCTCACTGAATCTGACATGGATTA...TGA', IUPACAmbiguousDNA()),
 Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()),
 Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()),
 Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()),
 Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()),
 Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()),
 Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()),

### 2 - What type of objects does that list contain? We'll need `SeqRecord` objects. Can you rewrite the code above to create a new list (with a new name) of `SeqRecord` objects? Check the length of the list and take a look at its contents.

In [None]:
# your code goes here...

In [5]:
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord


ituri_seqs = SeqIO.parse('data/Ituri_sequences.gb', 'genbank')
NP_records = []

for record in ituri_seqs:
    for feature in record.features:
        # .get() avoids a KeyError if a gene feature has no 'gene' qualifier
        if feature.type == 'gene' and feature.qualifiers.get('gene') == ['NP']:
            NP_gene_seq = feature.extract(record.seq)
            # Wrap the sequence in a SeqRecord, reusing the parent record's metadata
            NP_record = SeqRecord(NP_gene_seq,
                                  id=record.id, name=record.name,
                                  description=record.description)
            NP_records.append(NP_record)


print(len(NP_records))
NP_records

59


[SeqRecord(seq=Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()), id='MK163675.1', name='MK163675', description='Zaire ebolavirus isolate Ebola virus/H.sapiens-wt/COD/2018/Ituri-BEN294, partial genome', dbxrefs=[]),
 SeqRecord(seq=Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()), id='MK163674.1', name='MK163674', description='Zaire ebolavirus isolate Ebola virus/H.sapiens-wt/COD/2018/Ituri-BEN292, partial genome', dbxrefs=[]),
 SeqRecord(seq=Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()), id='MK163673.1', name='MK163673', description='Zaire ebolavirus isolate Ebola virus/H.sapiens-wt/COD/2018/Ituri-BEN286, partial genome', dbxrefs=[]),
 SeqRecord(seq=Seq('GAGGAAGATTAATAATTTTCCTCTCATTGAAATTTATATCGGAATTTAAATTGA...AAA', IUPACAmbiguousDNA()), id='MK163672.1', name='MK163672', description='Zaire ebolavirus isolate Ebola virus/H.sapiens-wt/COD/2018/Ituri-BEN274, partial ge
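
To see the difference between the two lists, it can help to compare a bare `Seq` with the `SeqRecord` that wraps it. The sketch below uses a short made-up sequence and a hypothetical ID (`toy_1`), not data from the Ituri file.

```python
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# A made-up sequence standing in for one extracted NP gene (hypothetical data)
toy_seq = Seq("GAGGAAGATTAATAATTT")

# Wrap the bare sequence in a SeqRecord so it carries an ID and a description
toy_record = SeqRecord(toy_seq, id="toy_1", description="toy example record")

print(type(toy_seq).__name__)     # class name of the bare sequence
print(type(toy_record).__name__)  # class name of the wrapped record
print(len(toy_record))            # len() of a SeqRecord is the length of its sequence
```

`SeqIO.write()` works with `SeqRecord` objects because file formats such as GenBank and FASTA need an identifier and description in addition to the sequence itself.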

### 3 - Now use `SeqIO.write()` to write this list to a GenBank file with a name of your choice.

In [None]:
# your code goes here...

In [8]:
SeqIO.write(NP_records, 'data/output/solutions/Ituri_NPgenes.gb', 'genbank')

59
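
The number printed above is the return value of `SeqIO.write()`: the count of records written. Here is a minimal sketch of the same call using an in-memory handle instead of a file, with two made-up records (hypothetical IDs and sequences). It assumes Biopython 1.78 or later, where the GenBank writer expects a `molecule_type` annotation on each record. The older version used in this notebook took that information from the sequence alphabet instead.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# Two made-up records (hypothetical IDs and sequences, not from the Ituri file).
# Assumption: Biopython >= 1.78, which needs 'molecule_type' to write GenBank.
toy_records = [
    SeqRecord(Seq("ATGGATTCT"), id="toy_1", name="toy_1",
              description="toy record 1",
              annotations={"molecule_type": "DNA"}),
    SeqRecord(Seq("ATGGCTTCA"), id="toy_2", name="toy_2",
              description="toy record 2",
              annotations={"molecule_type": "DNA"}),
]

# Write to an in-memory text handle; a filename string works the same way
handle = StringIO()
n_written = SeqIO.write(toy_records, handle, "genbank")

genbank_text = handle.getvalue()
print(n_written)                     # number of records written
print(genbank_text.splitlines()[0])  # first line of the first GenBank entry
```

If you build your own `SeqRecord` objects on a newer Biopython and the write fails with an error about a missing molecule type, adding this annotation is the likely fix.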

### 4 - Convert your newly created GenBank file into a FASTA file.

In [None]:
# your code goes here...

In [9]:
SeqIO.convert('data/output/solutions/Ituri_NPgenes.gb', 'genbank', 'data/output/solutions/Ituri_NPgenes.fasta', 'fasta')

59
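
Like `SeqIO.write()`, `SeqIO.convert()` returns the number of records it processed. The sketch below runs the same GenBank-to-FASTA conversion entirely in memory on one made-up record (hypothetical ID and sequence), again assuming Biopython 1.78 or later for the `molecule_type` annotation.

```python
from io import StringIO

from Bio import SeqIO
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord

# One made-up record (hypothetical data); 'molecule_type' is assumed to be
# required because we assume Biopython >= 1.78
toy_record = SeqRecord(Seq("ATGGATTCT"), id="toy_1", name="toy_1",
                       description="toy record",
                       annotations={"molecule_type": "DNA"})

# Step 1: write the record as GenBank text into an in-memory handle
gb_handle = StringIO()
SeqIO.write([toy_record], gb_handle, "genbank")
gb_handle.seek(0)  # rewind so the handle can be read from the start

# Step 2: convert GenBank to FASTA; handles and filenames are both accepted
fasta_handle = StringIO()
n_converted = SeqIO.convert(gb_handle, "genbank", fasta_handle, "fasta")

fasta_text = fasta_handle.getvalue()
print(n_converted)  # number of records converted
print(fasta_text)   # FASTA header line followed by the sequence
```

Converting is equivalent to parsing the input and writing it back out in the new format, so only the information the target format can hold is kept. FASTA keeps the ID, description, and sequence.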

Great Job! 😃