# Construction of pYPK0_A_Athmod1

This notebook describes the construction of the _E. coli_ vector [pYPK0_A_Athmod1](pYPK0_A_Athmod1.gb).
Primers needed for the amplification of the insert are designed in this notebook.

In [27]:
from pydna.readers import read
from pydna.genbank import Genbank
from pydna.parsers import parse_primers
from pydna.design  import primer_design
from pydna.amplify import pcr
from pydna.assembly import Assembly
from pydna.amplify import Anneal

The vector backbone [pYPK0](pYPK0.gb) is read from a local file.

In [28]:
pYPK0 = read("pYPK0.gb")

The restriction enzyme [AscI](http://rebase.neb.com/rebase/enz/AscI.html) is imported from [Biopython](http://biopython.org).

In [29]:
from Bio.Restriction import AscI

The plasmid is linearized with the enzyme.

In [30]:
pYPK0_AscI  = pYPK0.linearize(AscI)
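
What linearization does to the sequence can be illustrated without pydna. The cell below is a minimal sketch, assuming a circular sequence with exactly one AscI site (recognition sequence GG^CGCGCC). The function name `linearize_at_asci` and the toy plasmid are hypothetical, and the sketch follows the top strand only, ignoring the single-stranded overhangs that the real enzyme leaves.

```python
ASCI_SITE = "GGCGCGCC"  # AscI recognition sequence (palindromic)
ASCI_CUT_OFFSET = 2     # the top strand is cut after GG: GG^CGCGCC


def linearize_at_asci(circular_seq: str) -> str:
    """Open a circular sequence at its single AscI site (top strand only)."""
    seq = circular_seq.upper()
    # Extend the sequence so that a site spanning the origin is also found.
    extended = seq + seq[: len(ASCI_SITE) - 1]
    starts = [i for i in range(len(seq)) if extended.startswith(ASCI_SITE, i)]
    if len(starts) != 1:
        raise ValueError(f"expected exactly one AscI site, found {len(starts)}")
    cut = (starts[0] + ASCI_CUT_OFFSET) % len(seq)
    # Rotate so that the linear molecule starts right after the cut.
    return seq[cut:] + seq[:cut]


toy_plasmid = "ATATATGGCGCGCCTTTTAAAA"  # hypothetical toy sequence
linear = linearize_at_asci(toy_plasmid)
print(linear)
```

The check for exactly one site mirrors why a single-cutter is chosen here: a sequence with zero or several sites cannot be opened into one linear fragment.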

Access to [Genbank](http://www.ncbi.nlm.nih.gov/nuccore) is needed in order to download the template.
If the email address below is not yours, change it before executing this script as you must always give 
NCBI a way to contact you when using their service.

In [31]:
gb = Genbank("bjornjobb@gmail.com")

The template is downloaded from Genbank below.

In [32]:
template  = gb.nucleotide("NM_126612 REGION: 198..1370")
template

Primers are needed to PCR amplify the insert. The forward primer adds two adenines in front of the start codon,
a feature commonly found in highly expressed _S. cerevisiae_ genes.

In [33]:
fp_tail = "actttctcactagtgacctgcagccgacAATG"
rp_tail = "atcctgatgcgtttgtctgcacagatggCAC"

Primers are designed in the code cell below.

In [34]:
ins = primer_design(template)
fp = fp_tail + ins.forward_primer
rp = rp_tail + ins.reverse_primer

The primers are appended to the [new_primers.txt](new_primers.txt) file.

In [35]:
print(fp.format("fasta"))
print(rp.format("fasta"))
with open("new_primers.txt", "a+") as f:
    f.write(fp.format("fasta"))
    f.write(rp.format("fasta"))

>f1173 NM_126612.3
actttctcactagtgacctgcagccgacAATGATGGCGGCTACAG

>r1173 NM_126612.3
atcctgatgcgtttgtctgcacagatggCACCTAATTCTTGCTGTTAAGGT
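
The structure of the two primers can be checked with plain Python. The cell below is a small sketch that uses only the sequences printed above; the helper `reverse_complement` and the variable names `fwd_anneal`, `rev_anneal`, `fp_seq`, `rp_seq` and `product_end` are introduced here for illustration and are not part of pydna.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, preserving case."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


# Tails and annealing parts copied from the cells and output above.
fp_tail = "actttctcactagtgacctgcagccgacAATG"
rp_tail = "atcctgatgcgtttgtctgcacagatggCAC"
fwd_anneal = "ATGGCGGCTACAG"         # identical to the start of the template
rev_anneal = "CTAATTCTTGCTGTTAAGGT"  # reverse complement of the template end

# Each primer is a 5' tail followed by the part that anneals to the template.
fp_seq = fp_tail + fwd_anneal
rp_seq = rp_tail + rev_anneal

# Reading the reverse primer on the top strand gives the end of the PCR product.
product_end = reverse_complement(rp_seq)

print(fp_seq)
print(rp_seq)
print(product_end)
```

The lowercase tails carry no template sequence; only the uppercase 3' parts anneal in the first PCR cycle.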



The newly designed primers are used to amplify the insert.

In [36]:
ins = pcr(fp, rp, template)

In [37]:
ins

The primers anneal on the template like this.

In [38]:
ins.figure()

                                5ATGGCGGCTACAG...ACCTTAACAGCAAGAATTAG3
                                                 ||||||||||||||||||||
                                                3TGGAATTGTCGTTCTTAATCCACggtagacacgtctgtttgcgtagtccta5
5actttctcactagtgacctgcagccgacAATGATGGCGGCTACAG3
                                 |||||||||||||
                                3TACCGCCGATGTC...TGGAATTGTCGTTCTTAATC5

A suggested PCR program.

In [39]:
ins.program()

|95°C|95°C               |    |tmf:54.1
|____|_____          72°C|72°C|tmr:54.8
|5min|30s  \ 55.9°C _____|____|30s/kb
|    |      \______/ 0:37|5min|GC 45%
|    |       30s         |    |1236bp
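
The product size and the extension time in the program can be reproduced by simple arithmetic. The cell below is a sketch that assumes the extension time is the product length multiplied by the 30 s/kb rate shown in the program and rounded to whole seconds; the variable names are chosen here for illustration.

```python
fp_tail = "actttctcactagtgacctgcagccgacAATG"
rp_tail = "atcctgatgcgtttgtctgcacagatggCAC"

# The template covers positions 198..1370 of NM_126612, both ends included.
template_length = 1370 - 198 + 1

# Each primer tail adds its own length to the amplified region.
product_length = len(fp_tail) + template_length + len(rp_tail)

# Extension time at the 30 s/kb rate shown in the program above.
extension_seconds = round(product_length / 1000 * 30)

print(product_length)     # 1236
print(extension_seconds)  # 37
```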


The linearized vector and the insert are joined by homologous recombination.

In [40]:
asm = Assembly((pYPK0_AscI,ins))
asm

Assembly
fragments..: 5766bp 1236bp
limit(bp)..: 25
G.nodes....: 4
algorithm..: common_sub_strings
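
The `limit(bp)` value above is the minimum length of shared sequence needed for two fragments to be joined. The cell below is a simplified sketch of that idea: it only looks for a suffix of one fragment that is also a prefix of the next, which is a stand-in for, not a copy of, the `common_sub_strings` search used by pydna. The function `terminal_overlap` and the vector end are hypothetical; the insert start is the forward primer shown earlier.

```python
LIMIT = 25  # minimum shared length, as reported in the Assembly output


def terminal_overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    left, right = left.upper(), right.upper()
    for n in range(min(len(left), len(right)), 0, -1):
        if left[-n:] == right[:n]:
            return n
    return 0


# Hypothetical vector end: arbitrary bases followed by the primer tail sequence.
vector_end = "GGGTTTAAACCC" + "actttctcactagtgacctgcagccgac"
insert_start = "actttctcactagtgacctgcagccgacAATGATGGCGGCTACAG"

overlap = terminal_overlap(vector_end, insert_start)
can_recombine = overlap >= LIMIT
print(overlap, can_recombine)
```

With a shared stretch shorter than the limit, the two fragments would not be considered joinable in this sketch.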

Usually two equally sized products are formed.

In [41]:
circular_products = asm.assemble_circular()
circular_products

[Contig(o6939), Contig(o6939)]

The first sequence is chosen.

In [42]:
candidate = circular_products[0]

The final vector is:

In [43]:
pYPK0_A_Athmod1 = candidate.synced(pYPK0)

A combination of standard primers and the gene-specific primers is
used to identify correct clones.
Standard primers are listed [here](standard_primers.txt).
The standard primers are read into a dictionary in the code cell below.

In [44]:
p = { x.id: x for x in parse_primers("standard_primers.txt") }
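
The dictionary maps each primer name to its record, so primers can be looked up by name. The cell below is a minimal sketch of the same idea in plain Python, assuming the primer file is FASTA-formatted like the primer output shown earlier. The function `parse_fasta` and the placeholder sequences are hypothetical and do not reflect the real contents of standard_primers.txt.

```python
def parse_fasta(text: str) -> dict:
    """Map the first word of each FASTA header to its sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = ""
        elif name is not None:
            # Sequence lines are concatenated under the current header.
            records[name] += line
    return records


# Placeholder records; the sequences are not the real standard primers.
toy_primers = """>577 placeholder
ACGTACGTACGTACGTAC
>342 placeholder
TTGGCCAATTGGCCAA
TTGG
"""

primers = parse_fasta(toy_primers)
print(primers["577"])
print(primers["342"])
```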

## Diagnostic PCR confirmation of pYPK0_A_Athmod1

The correct structure of pYPK0_A_Athmod1 is confirmed by a multiplex PCR with three primers:
the vector-specific standard primers 577 and 342 together with the insert-specific primer Athmod1fw.

Two PCR products are expected if the insert was successfully cloned; their sizes depend
on the orientation of the insert.
If the vector is empty, only one short product is formed.

## Expected PCR product sizes:

pYPK0_A_Athmod1 with insert in correct orientation.

In [45]:
Anneal( (p['577'], p['342'], fp), pYPK0_A_Athmod1).products

[Amplicon(2111), Amplicon(1921)]

Empty clone

In [46]:
Anneal( (p['577'], p['342'], fp), pYPK0).products

[Amplicon(934)]
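
The expected sizes above can be turned into a simple lookup for interpreting a gel. The cell below is a sketch only: the function `classify_clone` and the 50 bp tolerance are assumptions made for illustration, and only the two patterns calculated above are included.

```python
# Expected band sizes (bp) taken from the two Anneal results above.
EXPECTED = {
    "insert, correct orientation": (2111, 1921),
    "empty vector": (934,),
}


def classify_clone(observed_bp, tolerance_bp=50):
    """Return the label whose expected bands all match the observed bands."""
    observed = sorted(observed_bp)
    for label, expected in EXPECTED.items():
        wanted = sorted(expected)
        if len(wanted) == len(observed) and all(
            abs(o - w) <= tolerance_bp for o, w in zip(observed, wanted)
        ):
            return label
    return "unexpected pattern"


print(classify_clone([1900, 2100]))
print(classify_clone([950]))
print(classify_clone([1500]))
```

Any pattern that matches neither entry, for example bands from an insert in the opposite orientation, is reported as unexpected and needs to be checked by hand.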

The cseguid checksum for the resulting plasmid is calculated for future reference.
The [cseguid checksum](http://pydna.readthedocs.org/en/latest/pydna.html#pydna.utils.cseguid) 
uniquely identifies a circular double stranded sequence.

In [47]:
pYPK0_A_Athmod1.cseguid()

toFBuJo7mRXEpZBG0y9CPIWM2jo
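
A checksum for a circular double-stranded sequence must not depend on where the sequence starts or which strand is written down. The cell below is a sketch of one way to achieve this, assuming the checksum is the URL-safe base64 form of the SHA-1 digest of the lexicographically smallest rotation of either strand. It illustrates the principle and is not guaranteed to match pydna's implementation in every detail; the function names are chosen here.

```python
import base64
import hashlib


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def smallest_rotation(seq: str) -> str:
    """Lexicographically smallest rotation (naive, fine for short sequences)."""
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


def circular_checksum(seq: str) -> str:
    """Checksum that is the same for all rotations and for both strands."""
    seq = seq.upper()
    canonical = min(
        smallest_rotation(seq),
        smallest_rotation(reverse_complement(seq)),
    )
    digest = hashlib.sha1(canonical.encode("ascii")).digest()
    # URL-safe base64 of a 20-byte digest, with the trailing "=" removed.
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


print(circular_checksum("GATTACA") == circular_checksum("TTACAGA"))
print(circular_checksum("GATTACA") == circular_checksum("TGTAATC"))
```

Both comparisons use the same toy circle, first rotated by two positions and then written as the opposite strand.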

The sequence is given a locus name based on the cloned insert.

In [48]:
pYPK0_A_Athmod1.locus = "pYPK0_A_Athmod1"[:16]

The sequence is stamped with the cseguid checksum.
This can be used to verify the integrity of the sequence file.

In [49]:
pYPK0_A_Athmod1.stamp()

cSEGUID_toFBuJo7mRXEpZBG0y9CPIWM2jo

The sequence is written to a local file.

In [51]:
pYPK0_A_Athmod1.write("pYPK0_A_Athmod1.gb")