Adapted from [https://github.com/PacktPublishing/Bioinformatics-with-Python-Cookbook-Second-Edition](https://github.com/PacktPublishing/Bioinformatics-with-Python-Cookbook-Second-Edition), Chapter 2.

First, let us retrieve a sequence from the NCBI nucleotide database. As a test case we use the mRNA reference sequence of [human Lactase](https://www.ncbi.nlm.nih.gov/gene/?term=nm_002299) (LCT), accession NM_002299.

In [None]:
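from Bio import Entrez, Seq, SeqIO
# Bio.Alphabet was removed in Biopython 1.78, so this notebook needs an older release
from Bio.Alphabet import IUPAC

# Change this for any other accession. To fetch several, pass a list of IDs
# and parse the result with SeqIO.parse instead of SeqIO.read.
ID = 'NM_002299'

# NCBI asks for a contact address with every Entrez request
Entrez.email = "jvilla@uic.cat"
hdl = Entrez.efetch(db='nucleotide', id=[ID], rettype='fasta')  # lactase (LCT) mRNA
seq = SeqIO.read(hdl, 'fasta')  # a single SeqRecord
hdl.close()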

We first save the record as a FASTA file.

In [None]:
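import os

os.makedirs('data', exist_ok=True)  # make sure the output directory exists
filename = 'data/' + ID + '.fasta'
with open(filename, 'w') as w_hdl:  # the with block closes the file for us
    SeqIO.write([seq], w_hdl, 'fasta')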
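
To make clear what ends up in the file, here is a minimal pure-Python sketch of the FASTA layout: one header line starting with `>`, followed by the sequence wrapped to a fixed width. The function name `format_fasta` and the demo record are hypothetical and used only for illustration; the 60-column width is an assumption chosen to match the default wrapping of Biopython's FASTA writer. In the notebook itself `SeqIO.write` does this work.

```python
def format_fasta(header, sequence, width=60):
    """Return one FASTA record as text (illustrative helper, not part of Biopython)."""
    # Cut the sequence into chunks of at most `width` characters
    chunks = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + header + "\n" + "\n".join(chunks) + "\n"


# Made-up 80-base demo sequence, not the lactase record
fasta_text = format_fasta("demo_id demo description", "ACGT" * 20)
print(fasta_text)
```

The header line holds the record ID followed by a free-text description, which is what `rec.description` returns when the file is parsed again.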

Let us read the file we just created.

In [None]:
recs = SeqIO.parse(filename, 'fasta')
for rec in recs:
    seq = rec.seq
    print(rec.description)
    print(seq[:50] + '...')  # first 50 bases only
    # A generic SingleLetterAlphabet: FASTA does not record the molecule type
    print(seq.alphabet)

FASTA files do not record the molecule type, so the parser assigned a generic alphabet. Let us change the alphabet so that the object "knows" this is DNA.

In [None]:
seq = Seq.Seq(str(seq), IUPAC.unambiguous_dna)
seq

As it is DNA, we can now transcribe it into RNA.

In [None]:
rna = seq.transcribe()
rna
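
As a rough illustration of what transcription does here, the sketch below works on plain strings. It assumes the input is the coding strand, in which case the mRNA is the same sequence with T replaced by U; no reverse complement is taken. The function name `transcribe_str` and the short demo sequence are hypothetical and not part of Biopython.

```python
def transcribe_str(dna):
    """Coding-strand DNA string -> mRNA string (illustrative helper)."""
    # Keep the case of the input: T -> U and t -> u
    return dna.replace("T", "U").replace("t", "u")


demo_dna = "ATGGAGCTGTCT"  # made-up example, not the lactase sequence
demo_rna = transcribe_str(demo_dna)
print(demo_rna)  # AUGGAGCUGUCU
```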

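And now we translate it into a protein sequence. Note that `translate()` reads codons starting at the first base of the sequence, so the result matches the real protein only if the coding region starts there.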

In [None]:
prot = seq.translate()
prot
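
The effect of the reading frame on translation can be shown with a small pure-Python sketch. It builds the standard genetic code (NCBI translation table 1) and translates a string codon by codon. The names `CODON_TABLE` and `translate_str` and the demo sequences are hypothetical, chosen for illustration only. The sketch silently drops trailing bases that do not fill a codon and maps unknown codons to `X`, which are simplifications and not a description of how Biopython behaves.

```python
from itertools import product

BASES = "TCAG"
# Amino acids of the standard genetic code in TCAG order; '*' marks a stop codon
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate_str(dna, to_stop=False):
    """Translate a DNA (or RNA) string in frame 1 (illustrative helper)."""
    dna = dna.upper().replace("U", "T")
    protein = []
    # Step through complete codons only; leftover bases are ignored
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*" and to_stop:
            break
        protein.append(aa)
    return "".join(protein)


cds = "ATGGAGCTGTCTTAA"       # made-up coding sequence
with_utr = "GC" + cds         # same sequence behind two untranslated bases

print(translate_str(cds))                              # MELS*
print(translate_str(with_utr))                         # AWSCL
print(translate_str(with_utr[with_utr.find("ATG"):]))  # MELS*
```

Two extra bases in front of the start codon shift the frame and give an unrelated peptide. For a real mRNA record it is therefore worth checking where the coding region begins before trusting the translation.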