# Construction of pYPKa_A_EcfabF

This notebook describes the construction of the _E. coli_ vector [pYPKa_A_EcfabF](pYPKa_A_EcfabF.gb).

![pYPKa_A plasmid](pYPK_A.png "pYPKa_A plasmid")

In [22]:
from pydna.readers import read
from pydna.genbank import Genbank
from pydna.parsers import parse_primers
from pydna.amplify import pcr
from pydna.amplify import Anneal

The vector backbone [pYPKa](pYPKa.gb) is read from a local file.

In [23]:
pYPKa = read("pYPKa.gb")

The restriction enzyme [AjiI](http://rebase.neb.com/rebase/enz/AjiI.html) is imported from [Biopython](http://biopython.org).

In [24]:
from Bio.Restriction import AjiI

The plasmid is linearized with the enzyme.

In [25]:
pYPKa_AjiI  = pYPKa.linearize(AjiI)
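Linearization opens the circular plasmid at the single enzyme site, so the two new ends sit on either side of the cut. The sketch below illustrates the idea in plain Python. It assumes that AjiI recognises CACGTC and cuts bluntly in the middle (CAC^GTC), as listed in REBASE; the helper `linearize_blunt` and the 16 bp toy plasmid are hypothetical and are not part of pydna.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def linearize_blunt(circular, site="CACGTC", cut_offset=3):
    """Open a circular sequence at a unique blunt cut site (toy model)."""
    circular = circular.upper()
    n = len(circular)
    # Extend the string so that a site spanning the origin is also found.
    extended = circular + circular[: len(site) - 1]
    # The assumed site is not palindromic, so both strands are searched.
    patterns = {site, reverse_complement(site)}
    hits = [i for i in range(n) for p in patterns if extended.startswith(p, i)]
    if len(hits) != 1:
        raise ValueError(f"expected exactly one site, found {len(hits)}")
    cut = (hits[0] + cut_offset) % n
    # Rotate the circle so that the linear molecule starts at the cut.
    return circular[cut:] + circular[:cut]


toy_plasmid = "TTGACACGTCAAGGCC"  # made-up 16 bp circle with one site
linear = linearize_blunt(toy_plasmid)
print(linear)  # GTCAAGGCCTTGACAC
```

The linear molecule ends in CAC and begins with GTC, the two halves of the assumed recognition site.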

Access to [Genbank](http://www.ncbi.nlm.nih.gov/nuccore) is needed in order to download the template.
If the email address below is not yours, change it before executing this script, since NCBI must always be given a way to contact you when you use their service.

In [26]:
gb = Genbank("bjornjobb@gmail.com")

The template is downloaded from Genbank below.

In [27]:
template  = gb.nucleotide(" NC_000913 REGION: 1151939..1153180")
template

In [28]:
template.express("sce")

|    cds    |  len  |  cai  |   gc  | sta  | stp  | n-end | CGA | CGG | CGC | CCG | CTC | GCG |  rare |
|-----------|-------|-------|-------|------|------|-------|-----|-----|-----|-----|-----|-----|-------|
| GTG...TAA | 414.0 | 0.618 | 0.533 | None | 0.47 | >30 h |  0  |  0  |  7  |  10 |  2  |  17 | 0.087 |

The two primers below are used to amplify the insert.

In [29]:
fp,rp =  parse_primers(""">712_EcfabF_fw
                          aaATGTCTAAGCGTCGTGTAGTTGT
                          >713_EcfabF_rv
                          TTAGATCTTTTTAAAGATCAAAGAAC""")

The gene is amplified using the primers specified above.

In [30]:
ins = pcr(fp, rp, template)

In [31]:
ins[2:].translate()

SeqRecord(seq=Seq('MSKRRVVVTGLGMLSPVGNTVESTWKALLAGQSGISLIDHFDTSAYATKFAGLV...KI*'), id='<unknown id>', name='<unknown name>', description='<unknown description>', dbxrefs=[])
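The slice `ins[2:]` skips the two lower-case `aa` bases at the 5' end of the forward primer, so that translation starts at the ATG. A minimal, self-contained sketch of why the offset matters is shown below. It uses only the forward primer sequence from this notebook and a hand-built standard codon table; the `translate` helper is hypothetical and is not pydna's method.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate complete codons from the first base; ignore leftovers."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


forward_primer = "aaATGTCTAAGCGTCGTGTAGTTGT"

in_frame = translate(forward_primer[2:])   # start at the ATG
out_of_frame = translate(forward_primer)   # start at the 5' tail

print(in_frame)  # MSKRRVV
print(out_of_frame)
```

The in-frame result matches the first residues of the protein shown above, while the unshifted reading runs into a stop codon within the primer.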

The primers anneal on the template like this.

In [10]:
ins.figure()

   5TGTCTAAGCGTCGTGTAGTTGT...GTTCTTTGATCTTTAAAAAGATCTAA3
                             ||||||||||||||||||||||||||
                            3CAAGAAACTAGAAATTTTTCTAGATT5
5aaATGTCTAAGCGTCGTGTAGTTGT3
    ||||||||||||||||||||||
   3ACAGATTCGCAGCACATCAACA...CAAGAAACTAGAAATTTTTCTAGATT5
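The figure can be cross-checked by hand: the reverse primer pairs with the top strand, so its reverse complement should equal the 3' end of the top strand, and the forward primer should end with the 22 nt stretch that is paired in the lower panel. The sketch below does this in plain Python. The footprint string is copied from the figure; the variable names are illustrative only.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement, preserving letter case."""
    return seq.translate(COMPLEMENT)[::-1]


forward_primer = "aaATGTCTAAGCGTCGTGTAGTTGT"
reverse_primer = "TTAGATCTTTTTAAAGATCAAAGAAC"

# Top-strand sequence that the reverse primer pairs with.
top_strand_3prime_end = reverse_complement(reverse_primer)

# Paired part of the forward primer, copied from the figure.
footprint = "TGTCTAAGCGTCGTGTAGTTGT"
tail = forward_primer[: len(forward_primer) - len(footprint)]

print(top_strand_3prime_end)  # GTTCTTTGATCTTTAAAAAGATCTAA
print(tail, len(footprint))   # aaA 22
```

The three unpaired 5' bases (`aaA`) would be consistent with the primer replacing the GTG start codon listed in the expression table with ATG, although the notebook does not state this explicitly.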

A suggested PCR program.

In [11]:
ins.program()

|95°C|95°C               |    |tmf:63.2
|____|_____          72°C|72°C|tmr:54.6
|5min|30s  \ 58.2°C _____|____|30s/kb
|    |      \______/ 0:37|5min|GC 0%
|    |       30s         |    |1244bp
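The extension time in the program follows from the stated rate of 30 s/kb and the 1244 bp product length. A small sketch of that arithmetic is given below; rounding to the nearest second is an assumption, and the `extension_time` helper is hypothetical rather than the routine pydna uses.

```python
def extension_time(product_bp, seconds_per_kb=30):
    """Format the extension time as m:ss for a given product length."""
    seconds = round(product_bp / 1000 * seconds_per_kb)
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


print(extension_time(1244))  # 0:37
```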


The final vector is:

In [12]:
pYPKa_A_EcfabF = (pYPKa_AjiI  + ins).looped().synced(pYPKa)

The vector with the insert in reverse orientation is created below. In theory, this vector makes up
fifty percent of the clones. The PCR strategy described below is used to identify the correct clones.

In [13]:
pYPKa_A_EcfabFb = (pYPKa_AjiI  + ins.rc()).looped().synced(pYPKa)

A combination of standard primers and the newly designed primers is
used in the strategy to identify correct clones.
Standard primers are listed [here](standard_primers.txt).
The standard primers are read into a dictionary in the code cell below.

In [14]:
p = { x.id: x for x in parse_primers("standard_primers.txt") }

## Diagnostic PCR confirmation of pYPKa_A_EcfabF
The correct structure of pYPKa_A_EcfabF is confirmed by a multiplex PCR with three primers:
the vector-specific standard primers 577 and 342 together with the insert-specific
EcfabFfw primer.

Two PCR products are expected if the insert was successfully cloned, with sizes depending
on the orientation of the insert.
If the vector is empty, only one short product is formed.
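The logic of the three-primer reaction can be written as simple arithmetic. The two vector primers flank the cloning site and always amplify across it, while the insert-specific forward primer pairs with whichever vector primer points toward it. The sketch below is a toy model with made-up distances; `expected_band_sizes`, the flank names, and the assignment of flanks to orientations are assumptions for illustration.

```python
def expected_band_sizes(insert_len, upstream_flank, downstream_flank):
    """Toy model of band sizes for a three-primer diagnostic PCR.

    upstream_flank / downstream_flank: distance in bp from each vector
    primer's 5' end to the cloning site (hypothetical values).
    The insert primer is assumed to sit at the very 5' end of the insert.
    """
    flanking = upstream_flank + insert_len + downstream_flank
    return {
        # Insert primer points downstream and pairs with the downstream primer.
        "correct": sorted([flanking, insert_len + downstream_flank], reverse=True),
        # Insert primer points upstream and pairs with the upstream primer.
        "reverse": sorted([flanking, upstream_flank + insert_len], reverse=True),
        # No insert: only the two vector primers give a product.
        "empty": [upstream_flank + downstream_flank],
    }


toy = expected_band_sizes(insert_len=1000, upstream_flank=100, downstream_flank=300)
for label, sizes in toy.items():
    print(label, sizes)
```

As long as the two flanks differ in length, the smaller band distinguishes the two orientations.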

## Expected PCR product sizes

pYPKa_A_EcfabF with insert in correct orientation.

In [15]:
Anneal( (p['577'], p['342'], fp), pYPKa_A_EcfabF).products

[Amplicon(2178), Amplicon(1960)]

pYPKa_A_EcfabF with insert in reverse orientation.

In [16]:
Anneal( (p['577'], p['342'], fp), pYPKa_A_EcfabFb).products

[Amplicon(2178), Amplicon(1462)]

Empty clone

In [17]:
Anneal( (p['577'], p['342'], fp), pYPKa).products

[Amplicon(934)]
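When reading a gel, the observed bands can be compared with the three expected patterns above. The sketch below does this with a relative tolerance; the 5% tolerance and the `classify_clone` helper are assumptions chosen for illustration, not part of the original protocol. As a consistency check, the difference between the long product and the empty-vector product equals the 1244 bp insert length reported in the PCR program.

```python
# Expected product sizes (bp) taken from the Anneal results above.
EXPECTED = {
    "correct orientation": [2178, 1960],
    "reverse orientation": [2178, 1462],
    "empty vector": [934],
}


def classify_clone(observed_bands, tolerance=0.05):
    """Match estimated band sizes to an expected pattern (toy classifier)."""
    observed = sorted(observed_bands, reverse=True)
    for label, expected in EXPECTED.items():
        same_count = len(observed) == len(expected)
        if same_count and all(
            abs(o - e) <= tolerance * e for o, e in zip(observed, expected)
        ):
            return label
    return "inconclusive"


print(classify_clone([2200, 1950]))  # correct orientation
print(classify_clone([1500, 2150]))  # reverse orientation
print(classify_clone([950]))         # empty vector
print(2178 - 934)                    # 1244
```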

The cseguid checksum for the resulting plasmid is calculated for future reference.
The [cseguid checksum](http://pydna.readthedocs.org/en/latest/pydna.html#pydna.utils.cseguid)
uniquely identifies a circular double-stranded sequence.

In [18]:
pYPKa_A_EcfabF.cseguid()

ovENIp7X3u4jp91e4_1mK-o-Tt0
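A checksum for a circular double-stranded molecule should not depend on where the sequence starts or on which strand is written. One way to get this, sketched below, is to hash a canonical form: the lexicographically smallest rotation of either strand. This is an illustration of the idea under that assumption, not pydna's implementation, and `circular_checksum` is a hypothetical name; it is not guaranteed to reproduce cseguid values such as the one above.

```python
import base64
import hashlib

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def smallest_rotation(seq):
    """Lexicographically smallest rotation (quadratic, fine for a sketch)."""
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


def circular_checksum(seq):
    """URL-safe base64 SHA-1 of a canonical form of a circular sequence."""
    seq = seq.upper()
    canonical = min(
        smallest_rotation(seq),
        smallest_rotation(reverse_complement(seq)),
    )
    digest = hashlib.sha1(canonical.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


toy = "TTGACACGTCAAGGCC"          # made-up circular sequence
rotated = toy[5:] + toy[:5]       # same circle, different start
flipped = reverse_complement(toy) # same circle, other strand

print(circular_checksum(toy) == circular_checksum(rotated))  # True
print(circular_checksum(toy) == circular_checksum(flipped))  # True
```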

The sequence is given a locus name based on the cloned insert, truncated to 16 characters.

In [19]:
pYPKa_A_EcfabF.locus = "pYPKa_A_EcfabF"[:16]

The sequence is stamped with the cseguid checksum.
This can be used to verify the integrity of the sequence file.

In [20]:
pYPKa_A_EcfabF.stamp()

cSEGUID_ovENIp7X3u4jp91e4_1mK-o-Tt0

The sequence is written to a local file.

In [21]:
pYPKa_A_EcfabF.write("pYPKa_A_EcfabF.gb")