# Construction of pYPKa_A_EcfabZ

This notebook describes the construction of the _E. coli_ vector [pYPKa_A_EcfabZ](pYPKa_A_EcfabZ.gb).

![pYPKa_A plasmid](pYPK_A.png "pYPKa_A plasmid")

In [22]:
from pydna.readers import read
from pydna.genbank import Genbank
from pydna.parsers import parse_primers
from pydna.amplify import pcr
from pydna.amplify import Anneal

The vector backbone [pYPKa](pYPKa.gb) is read from a local file.

In [23]:
pYPKa = read("pYPKa.gb")

The restriction enzyme [AjiI](http://rebase.neb.com/rebase/enz/AjiI.html) is imported from [Biopython](http://biopython.org).

In [24]:
from Bio.Restriction import AjiI

The plasmid is linearized with the enzyme.

In [25]:
pYPKa_AjiI  = pYPKa.linearize(AjiI)
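
As a plain-Python illustration of what linearizing a circular sequence with a blunt cutter means, the sketch below opens a toy circular sequence at its single recognition site. It assumes that AjiI recognizes CACGTC and cuts bluntly between CAC and GTC. The toy sequence and the `linearize_blunt` helper are hypothetical and are not part of pydna.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def linearize_blunt(circular, site="CACGTC", cut_offset=3):
    """Open a circular sequence at a single blunt recognition site.

    The site is searched on both strands. The returned linear sequence
    starts right after the cut and ends right before it.
    """
    seq = circular.upper()
    n = len(seq)
    # Append the first few bases so a site spanning the origin is found too.
    doubled = seq + seq[: len(site) - 1]
    cuts = set()
    for pattern, offset in ((site, cut_offset),
                            (reverse_complement(site), len(site) - cut_offset)):
        for i in range(n):
            if doubled.startswith(pattern, i):
                cuts.add((i + offset) % n)
    if len(cuts) != 1:
        raise ValueError(f"expected exactly one cut position, found {len(cuts)}")
    cut = cuts.pop()
    return seq[cut:] + seq[:cut]

toy_plasmid = "AAATTTCACGTCGGGCCC"  # made-up circular sequence with one site
linear = linearize_blunt(toy_plasmid)
print(linear)  # GTCGGGCCCAAATTTCAC
```

The linear product keeps every base of the circle. It begins with the GTC half of the site and ends with the CAC half.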

Access to [GenBank](http://www.ncbi.nlm.nih.gov/nuccore) is needed to download the template.
If the email address below is not yours, change it before running this notebook, since you must always give NCBI a way to contact you when using their service.

In [26]:
gb = Genbank("bjornjobb@gmail.com")

The template is downloaded from Genbank below.

In [27]:
template  = gb.nucleotide(" NC_000913 REGION: 202101..202556")
template

In [28]:
template.express("sce")

|    cds    |  len  |  cai  |   gc  |  sta  | stp | n-end | CGA | CGG | CGC | CCG | CTC | GCG |  rare |
|-----------|-------|-------|-------|-------|-----|-------|-----|-----|-----|-----|-----|-----|-------|
| TTG...TGA | 152.0 | 0.599 | 0.504 | 0.069 | 0.3 | >30 h |  0  |  1  |  6  |  6  |  0  |  2  | 0.099 |

The two primers below are used to amplify the insert.

In [29]:
fp,rp =  parse_primers(""">718_EcfabZ_fw
                          aaATGACTACTAACACTCATACTCTGCAG
                          >719_EcfabZ_rv
                          TCAGGCCTCCCGGCTA""")

The gene is amplified using the primers specified above.

In [30]:
ins = pcr(fp, rp, template)

In [44]:
ins[2:].translate()

SeqRecord(seq=Seq('MTTNTHTLQIEEILELLPHRFPFLLVDRVLDFEEGRFLRAVKNVSVNEPFFQGH...EA*'), id='<unknown id>', name='<unknown name>', description='<unknown description>', dbxrefs=[])
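
The slice `ins[2:]` skips the two lowercase `aa` bases at the 5' end of the forward primer so that translation starts at the ATG. The standalone sketch below shows the effect on the forward primer alone, using the standard genetic code. The `translate` helper is a hypothetical stand-in and not the pydna method.

```python
from itertools import product

# Standard genetic code, laid out in the usual TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(dna):
    """Translate complete codons from the first base; trailing bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

forward_primer = "aaATGACTACTAACACTCATACTCTGCAG"

print(translate(forward_primer[2:]))  # in frame:     MTTNTHTLQ
print(translate(forward_primer))      # out of frame: K*LLTLILC
```

The in-frame result matches the first nine residues of the translation shown above.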

The primers anneal on the template like this.

In [31]:
ins.figure()

   5TGACTACTAACACTCATACTCTGCAG...TAGCCGGGAGGCCTGA3
                                 ||||||||||||||||
                                3ATCGGCCCTCCGGACT5
5aaATGACTACTAACACTCATACTCTGCAG3
    ||||||||||||||||||||||||||
   3ACTGATGATTGTGAGTATGAGACGTC...ATCGGCCCTCCGGACT5

A suggested PCR program.

In [32]:
ins.program()

|95°C|95°C               |    |tmf:63.5
|____|_____          72°C|72°C|tmr:68.9
|5min|30s  \ 59.4°C _____|____|30s/kb
|    |      \______/ 0:13|5min|GC 0%
|    |       30s         |    |458bp
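
The 458 bp product size and the 13 s extension step can be checked by hand. The sketch below assumes that the downloaded region spans 202101..202556 inclusive, that the forward primer adds the two `aa` bases upstream of the start codon without otherwise changing the length, and that the extension time is the 30 s/kb rate truncated to whole seconds. The last point is an assumption about how the displayed 0:13 arises.

```python
# Coordinates of the downloaded region (inclusive on both ends).
region_start, region_end = 202101, 202556
template_length = region_end - region_start + 1

# The forward primer carries two extra 5' bases ("aa") before the start codon.
extra_bases = 2
amplicon_length = template_length + extra_bases

# Extension at 30 s per kb, truncated to whole seconds (assumed rounding).
seconds_per_kb = 30
extension_seconds = amplicon_length * seconds_per_kb // 1000

print(template_length)    # 456
print(amplicon_length)    # 458
print(extension_seconds)  # 13
```

A template length of 456 bp also agrees with the 152 codons reported for the coding sequence, since 152 × 3 = 456.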


The final vector is:

In [33]:
pYPKa_A_EcfabZ = (pYPKa_AjiI  + ins).looped().synced(pYPKa)

The vector with the insert in reverse orientation is created below. In theory, this vector makes up
fifty percent of the clones. The PCR strategy below is used to identify the correct clones.

In [34]:
pYPKa_A_EcfabZb = (pYPKa_AjiI  + ins.rc()).looped().synced(pYPKa)
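
The two constructs differ only in whether the insert or its reverse complement is joined to the backbone. The toy illustration below uses plain Python and made-up sequences, assuming blunt ends on both fragments so that neither orientation is favored. The names `vector` and `insert` are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement, preserving upper/lower case."""
    return seq.translate(COMPLEMENT)[::-1]

vector = "GTCTTTTTTCAC"  # made-up linearized backbone
insert = "aaATGGCTTGA"   # made-up insert: aa tail, then ATG ... TGA

# In the real workflow each product is circularized afterwards.
forward_clone = vector + insert
reverse_clone = vector + reverse_complement(insert)

print(forward_clone)  # GTCTTTTTTCACaaATGGCTTGA
print(reverse_clone)  # GTCTTTTTTCACTCAAGCCATtt
```

Both clones have the same length, so they cannot be told apart by size alone. This is why an orientation-specific PCR is used.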

A combination of standard primers and the newly designed primers is
used to identify correct clones.
Standard primers are listed [here](standard_primers.txt).
The standard primers are read into a dictionary in the code cell below.

In [35]:
p = { x.id: x for x in parse_primers("standard_primers.txt") }
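
The dictionary comprehension keys each primer by its FASTA identifier, so `p['577']` returns the primer named 577. The plain-Python sketch below shows the same idea with a hypothetical `parse_fasta` helper. The primer sequences are made up and stand in for the real file.

```python
def parse_fasta(text):
    """Return a list of (identifier, sequence) pairs from FASTA-formatted text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].split()[0], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records

# Made-up sequences; only the identifiers mirror the notebook.
example = """>577
ACGTACGTACGT
>342
TTGGCCAATTGG
"""

primers = dict(parse_fasta(example))
print(primers["577"])  # ACGTACGTACGT
```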

## Diagnostic PCR confirmation of pYPKa_A_EcfabZ
The correct structure of pYPKa_A_EcfabZ is confirmed by a multiplex PCR with three primers:
the vector-specific standard primers 577 and 342 together with the insert-specific
forward primer 718_EcfabZ_fw.

Two PCR products are expected if the insert was successfully cloned, with sizes that depend
on the orientation of the insert.
If the vector is empty, only a single product is formed.

## Expected PCR product sizes

pYPKa_A_EcfabZ with insert in correct orientation.

In [36]:
Anneal( (p['577'], p['342'], fp), pYPKa_A_EcfabZ).products

[Amplicon(1392), Amplicon(1174)]

pYPKa_A_EcfabZ with insert in reverse orientation.

In [37]:
Anneal( (p['577'], p['342'], fp), pYPKa_A_EcfabZb).products

[Amplicon(1392), Amplicon(676)]

Empty clone

In [38]:
Anneal( (p['577'], p['342'], fp), pYPKa).products

[Amplicon(934)]
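
The three expected band patterns can be turned into a small lookup for scoring colony PCR results. The sketch below uses the product sizes computed above. The `classify_clone` helper and the 5% size tolerance are hypothetical choices and not part of the original protocol.

```python
# Expected product sizes in bp, longest first, taken from the outputs above.
EXPECTED = {
    "correct orientation": (1392, 1174),
    "reverse orientation": (1392, 676),
    "empty vector": (934,),
}

def classify_clone(observed_sizes, tolerance=0.05):
    """Match observed band sizes (bp) against the expected patterns.

    A band matches when it lies within `tolerance` (as a fraction) of the
    expected size. The tolerance value is an arbitrary illustration.
    """
    observed = sorted(observed_sizes, reverse=True)
    for label, expected in EXPECTED.items():
        if len(observed) != len(expected):
            continue
        if all(abs(o - e) <= tolerance * e for o, e in zip(observed, expected)):
            return label
    return "unexpected pattern"

print(classify_clone([1400, 1150]))  # correct orientation
print(classify_clone([1380, 690]))   # reverse orientation
print(classify_clone([930]))         # empty vector
```

The longer product (1392 bp) equals the empty-vector product (934 bp) plus the 458 bp insert, so only the shorter band distinguishes the two orientations.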

The cseguid checksum for the resulting plasmid is calculated for future reference.
The [cseguid checksum](http://pydna.readthedocs.org/en/latest/pydna.html#pydna.utils.cseguid) 
uniquely identifies a circular double stranded sequence.

In [39]:
pYPKa_A_EcfabZ.cseguid()

ey9WAsmXdh9qWnQq0F5d0k8LMjc
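
A checksum with these properties can be built by hashing a canonical form of the circular sequence. The sketch below illustrates the idea and is not pydna's implementation. It assumes the canonical form is the lexicographically smallest rotation taken over both strands, hashed with SHA-1 and encoded as URL-safe base64 without padding. It is not guaranteed to reproduce the value printed above.

```python
import base64
import hashlib

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def smallest_rotation(seq):
    """Naive O(n^2) search; fine for a sketch, slow for real plasmids."""
    return min(seq[i:] + seq[:i] for i in range(len(seq)))

def circular_checksum(seq):
    """Checksum that ignores the origin and the strand of a circular sequence."""
    seq = seq.upper()
    canonical = min(smallest_rotation(seq),
                    smallest_rotation(reverse_complement(seq)))
    digest = hashlib.sha1(canonical.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")

toy = "ATGCCGTA"               # made-up circular sequence
rotated = toy[3:] + toy[:3]    # same circle, different origin

print(circular_checksum(toy) == circular_checksum(rotated))                  # True
print(circular_checksum(toy) == circular_checksum(reverse_complement(toy)))  # True
```

Because the digest is 20 bytes, the unpadded base64 string is 27 characters long, the same length as the checksum shown above.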

The sequence is given a locus name based on the cloned insert, truncated to at most 16 characters.

In [40]:
pYPKa_A_EcfabZ.locus = "pYPKa_A_EcfabZ"[:16]

The sequence is stamped with the cseguid checksum.
This can be used to verify the integrity of the sequence file.

In [41]:
pYPKa_A_EcfabZ.stamp()

cSEGUID_ey9WAsmXdh9qWnQq0F5d0k8LMjc

The sequence is written to a local file.

In [42]:
pYPKa_A_EcfabZ.write("pYPKa_A_EcfabZ.gb")