# pYPKa_Z_TEF1 and pYPKa_E_TEF1

This Jupyter notebook describes the construction of E. coli vectors [pYPKa_Z_TEF1](pYPKa_Z_TEF1.gb) and [pYPKa_E_TEF1](pYPKa_E_TEF1.gb).
These two vectors share backbone and insert, but in the former, the insert is cloned using the restriction
enzyme [ZraI](http://rebase.neb.com/rebase/enz/ZraI.html) while in the latter [EcoRV](http://rebase.neb.com/rebase/enz/EcoRV.html) is used.

An insert cloned in the ZraI site is meant to function as a promoter, while an insert cloned in the EcoRV site is meant to be used as a terminator.

Links to the sequence of each vector in Genbank format can be found at the bottom of this document.

![pYPKa_Z and pYPKa_E](figure_pYPKa_ZE.png "pYPKa_Z or pYPKa_E plasmid")

The Python package [pydna](https://pypi.python.org/pypi/pydna/) is imported in the code cell below to provide
the cloning functionality. There is a [publication](http://www.biomedcentral.com/1471-2105/16/142) describing pydna, and
[documentation](http://pydna.readthedocs.io/) is available online. Pydna is developed on [GitHub](https://github.com/BjornFJohansson/pydna).

In [1]:
from pydna.readers import read
from pydna.parsers import parse_primers
from pydna.genbank import Genbank
from pydna.amplify import pcr
from pydna.amplify import Anneal

The vector backbone pYPKa is read from a local [file](pYPKa.gb).

In [2]:
pYPKa = read("pYPKa.gb")

Both restriction enzymes are imported from [Biopython](http://biopython.org/wiki/Main_Page).

In [3]:
from Bio.Restriction import ZraI, EcoRV

The vector is cut with both enzymes.

In [4]:
pYPKa_ZraI  = pYPKa.linearize(ZraI)
pYPKa_EcoRV = pYPKa.linearize(EcoRV)
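ZraI (GAC^GTC) and EcoRV (GAT^ATC) both cut in the middle of a six-base palindromic site and leave blunt ends, which is why each linearized backbone can later accept the PCR product in either orientation. The sketch below illustrates the linearization step on a made-up 20 bp circular sequence using only the standard library. The function name `linearize_blunt` and the toy sequence are illustrative assumptions; this is not how pydna implements `linearize`, and the sequence is not pYPKa.

```python
def linearize_blunt(circular_seq, site, cut_offset):
    """Open a circular sequence at its single occurrence of `site`.

    `cut_offset` is the number of bases of the site that lie before the cut.
    Only one strand is scanned, which is enough for palindromic sites.
    """
    seq = circular_seq.upper()
    n = len(seq)
    # Append the first bases again so a site spanning the origin is found.
    wrapped = seq + seq[: len(site) - 1]
    hits = [i for i in range(n) if wrapped.startswith(site, i)]
    if len(hits) != 1:
        raise ValueError(f"expected exactly one {site} site, found {len(hits)}")
    cut = (hits[0] + cut_offset) % n
    return seq[cut:] + seq[:cut]


# Hypothetical toy plasmid with one ZraI site and one EcoRV site.
toy_plasmid = "AAGACGTCTTGGGATATCCC"

z_cut = linearize_blunt(toy_plasmid, "GACGTC", 3)  # ZraI:  GAC^GTC
e_cut = linearize_blunt(toy_plasmid, "GATATC", 3)  # EcoRV: GAT^ATC

print(z_cut)  # begins with GTC and ends with GAC
print(e_cut)  # begins with ATC and ends with GAT
```

Each linear product keeps the full length of the circle; only the position of the ends differs between the two enzymes.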

The template below comes from a Genbank [record](http://www.ncbi.nlm.nih.gov/nuccore/BK006949.2).
Access to Genbank is needed in order to download the template.
If you execute this notebook, change the email address below to your own.
Always tell Genbank who you are when using their web service.

In [5]:
gb = Genbank("bjornjobb@gmail.com")

The template is downloaded from Genbank below.

In [6]:
template = gb.nucleotide("BK006949.2 REGION: 700015..700593")

The template is a 579 bp linear DNA fragment.

In [7]:
template

The insert has the sequence shown below.

In [8]:
print(str(template.seq))

ACAATGCATACTTTGTACGTTCAAAATACAATGCAGTAGATATATTTATGCATATTACATATAATACATATCACATAGGAAGCAACAGGCGCGTTGGACTTTTAATTTTCGAGGACCGCGAATCCTTACATCACACCCAATCCCCCACAAGTGATCCCCCACACACCATAGCTTCAAAATGTTTCTACTCCTTTTTTACTCTTCCAGATTTTCTCGGACTCCGCGCATCGCCGTACCACTTCAAAACACCCAAGCACAGCATACTAAATTTCCCCTCTTTCTTCCTCTAGGGTGTCGTTAATTACCCGTACTAAAGGTTTGGAAAAGAAAAAAGAGACCGCCTCGTTTCTTTTTCTTCGTCGAAAAAGGCAATAAAAATTTTTATCACGTTTCTTTTTCTTGAAAATTTTTTTTTTTGATTTTTTTCTCTTTCGATGACCTCCCATTGATATTTAAGTTAATAAACGGTCTTCAATTTCTCAAGTTTCAGTTTCATTTTTCTTGTTCTATTACAACTTTTTTTACTTCTTGCTCATTAGAAAGAAAGCATAGCAATCTAATCTAAGTTTTAATTACAAA


The seguid checksum of the template should be

```aV8DVOYw0hNwHWvnBDmtevIV3F8```

In [9]:
template.seguid()

aV8DVOYw0hNwHWvnBDmtevIV3F8
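For readers unfamiliar with SEGUID, the sketch below shows the idea behind such a checksum, assuming the classic definition: the Base64-encoded SHA-1 digest of the uppercase sequence with the trailing padding removed. The function name `seguid_sketch` is hypothetical, and the pydna version used in this notebook may differ in detail, so the sketch is not guaranteed to reproduce the value above.

```python
import base64
import hashlib


def seguid_sketch(seq):
    """Base64 of the SHA-1 digest of the uppercase sequence, padding stripped.

    Assumption: this follows the classic SEGUID definition; it is not
    pydna's own code.
    """
    digest = hashlib.sha1(seq.upper().encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii").rstrip("=")


# The first 23 nt of the template, used here only as a short example.
toy = "ACAATGCATACTTTGTACGTTCA"
checksum = seguid_sketch(toy)

print(checksum)
print(len(checksum))  # a 20-byte SHA-1 digest always encodes to 27 characters
```

Because the sequence is uppercased before hashing, the checksum does not depend on letter case, which is useful when primer tails are written in lowercase.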

Two primers are used to amplify the insert:

In [10]:
fp,rp = parse_primers(""">417_ScTEF1tpf
                         TTAAATAACAATGCATACTTTGTACGTTCA
                         >626_ScTEF1tpr_PacI
                         taattaaTTTGTAATTAAAACTTAGATTAGATTGC""")

The insert is created by PCR using the primers above.

In [11]:
prd = pcr(fp, rp, template)
assert str(fp.seq) in prd

The PCR product has this length in bp.

In [12]:
len(prd)

593
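The product length can be checked by hand: the template spans positions 700015 to 700593, and each primer adds a 5' tail that does not anneal. The sketch below redoes this arithmetic with the standard library, using only the two ends of the template as they appear in the sequence above. The helper names `revcomp` and `footprint_length` are illustrative, not pydna functions.

```python
def revcomp(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def footprint_length(primer, target_start):
    """Length of the longest 3' part of `primer` equal to the start of `target_start`."""
    primer, target_start = primer.upper(), target_start.upper()
    for n in range(min(len(primer), len(target_start)), 0, -1):
        if target_start.startswith(primer[-n:]):
            return n
    return 0


fwd = "TTAAATAACAATGCATACTTTGTACGTTCA"       # 417_ScTEF1tpf
rev = "taattaaTTTGTAATTAAAACTTAGATTAGATTGC"  # 626_ScTEF1tpr_PacI

template_start = "ACAATGCATACTTTGTACGTTCA"     # first 23 nt of the template
template_end = "GCAATCTAATCTAAGTTTTAATTACAAA"  # last 28 nt of the template

template_length = 700593 - 700015 + 1  # inclusive coordinates

# The reverse primer anneals to the bottom strand, so compare it with the
# reverse complement of the template end.
fwd_tail = len(fwd) - footprint_length(fwd, template_start)
rev_tail = len(rev) - footprint_length(rev, revcomp(template_end))

product_length = template_length + fwd_tail + rev_tail
print(template_length, fwd_tail, rev_tail, product_length)  # 579 7 7 593
```

The two 7 nt tails account for the 14 bp difference between the 579 bp template and the 593 bp product.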

A figure of the primers annealing on template.

In [13]:
prd.figure()

       5ACAATGCATACTTTGTACGTTCA...GCAATCTAATCTAAGTTTTAATTACAAA3
                                  ||||||||||||||||||||||||||||
                                 3CGTTAGATTAGATTCAAAATTAATGTTTaattaat5
5TTAAATAACAATGCATACTTTGTACGTTCA3
        |||||||||||||||||||||||
       3TGTTACGTATGAAACATGCAAGT...CGTTAGATTAGATTCAAAATTAATGTTT5

A suggested PCR program.

In [14]:
prd.program()

|95°C|95°C               |    |tmf:59.9
|____|_____          72°C|72°C|tmr:56.6
|5min|30s  \ 52.9°C _____|____|60s/kb
|    |      \______/ 0:35|5min|GC 34%
|    |       30s         |    |593bp
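The extension time shown in the program is consistent with the stated rate of 60 s/kb applied to the 593 bp product. The short check below assumes the displayed value is truncated to whole seconds; that truncation is an inference from the output, not documented behaviour.

```python
product_bp = 593
rate_s_per_kb = 60  # taken from the "60s/kb" label in the program

extension_s = product_bp * rate_s_per_kb / 1000  # 35.58 s
minutes, seconds = divmod(int(extension_s), 60)  # assumed truncation
label = f"{minutes}:{seconds:02d}"

print(label)  # 0:35
```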

The final vectors are:

In [15]:
pYPKa_Z_TEF1 = (pYPKa_ZraI  + prd).looped().synced(pYPKa)
pYPKa_E_TEF1 = (pYPKa_EcoRV + prd).looped().synced(pYPKa)

The final vectors with reverse inserts are created below. These vectors theoretically make up
fifty percent of the clones. The PCR strategy outlined below can be used to identify clones with the insert
in the correct orientation.

In [16]:
pYPKa_Z_TEF1b = (pYPKa_ZraI  + prd.rc()).looped().synced(pYPKa)
pYPKa_E_TEF1b = (pYPKa_EcoRV + prd.rc()).looped().synced(pYPKa)

A combination of standard yeast pathway kit primers and the primers above is
used to identify correct clones.
The standard primers used in the yeast pathway kit are listed [here](standard_primers.txt).

In [17]:
p = { x.id: x for x in parse_primers(""">577
                                         gttctgatcctcgagcatcttaagaattc
                                         >578
                                         gttcttgtctcattgccacattcataagt
                                         >468
                                         gtcgaggaacgccaggttgcccact
                                         >467
                                         ATTTAAatcctgatgcgtttgtctgcacaga
                                         >567
                                         GTcggctgcaggtcactagtgag
                                         >568
                                         GTGCcatctgtgcagacaaacg
                                         >775
                                         gcggccgctgacTTAAAT
                                         >778
                                         ggtaaatccggatTAATTAA
                                         >342
                                         CCTTTTTACGGTTCCTGGCCT""") }

## Diagnostic PCR confirmation

The correct structure of each vector is confirmed by multiplex PCR with three primers
in a single reaction: the vector-specific standard primers 577 and 342, and the
insert-specific forward primer 417_ScTEF1tpf (fp above).

Two PCR products are expected if the insert was cloned, and their sizes depend
on the orientation of the insert. If the vector is empty or contains another insert, only one
product is formed.

#### Expected PCR product sizes from pYPKa_Z_TEF1:

pYPKa_Z_TEF1 with insert in correct orientation.

In [18]:
Anneal( (p['577'], p['342'], fp), pYPKa_Z_TEF1).products

[Amplicon(1527), Amplicon(1359)]

pYPKa_Z_TEF1 with insert in reverse orientation.

In [19]:
Anneal( (p['577'], p['342'], fp), pYPKa_Z_TEF1b).products

[Amplicon(1527), Amplicon(761)]

Empty pYPKa clone.

In [20]:
Anneal( (p['577'], p['342'], fp), pYPKa).products

[Amplicon(934)]
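The band sizes above can be cross-checked with simple arithmetic. The 934 bp product comes from the two vector primers alone, so a 593 bp insert between them should give 1527 bp in either orientation. The second band comes from the insert primer pairing with whichever vector primer faces it. On that reading, the two orientation-dependent bands should add up to the common band plus one insert length. The derivation of this second identity is my own interpretation of the numbers, not a statement from pydna.

```python
insert_bp = 593      # length of the PCR product cloned into the vector
empty_band = 934     # 577 + 342 on empty pYPKa
common_band = 1527   # 577 + 342 across the insert
forward_band = 1359  # insert primer product, correct orientation (ZraI site)
reverse_band = 761   # insert primer product, reverse orientation (ZraI site)

# The vector-primer product grows by exactly one insert length.
print(empty_band + insert_bp == common_band)  # True

# The two orientation-dependent bands together span the common product
# plus one extra copy of the insert.
print(forward_band + reverse_band == common_band + insert_bp)  # True
```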

#### Expected PCR product sizes from pYPKa_E_TEF1:

pYPKa_E_TEF1 with insert in correct orientation.

In [21]:
Anneal( (p['577'], p['342'], fp), pYPKa_E_TEF1).products

[Amplicon(1527), Amplicon(1278)]

pYPKa_E_TEF1 with insert in reverse orientation.

In [22]:
Anneal( (p['577'], p['342'], fp), pYPKa_E_TEF1b).products


[Amplicon(1527), Amplicon(842)]
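When reading a gel, the expected sizes above can be collected in a small lookup table. The sketch below is one way to do this. The function name `classify_clone` and the 50 bp tolerance are arbitrary illustrative choices, and the empty-vector size is assumed to apply to both constructs because the same primers and backbone are used.

```python
# Expected band sizes in bp, copied from the outputs above.
EXPECTED_BANDS = {
    "pYPKa_Z_TEF1": {"correct": (1527, 1359), "reverse": (1527, 761), "empty": (934,)},
    "pYPKa_E_TEF1": {"correct": (1527, 1278), "reverse": (1527, 842), "empty": (934,)},
}


def classify_clone(vector, observed_bp, tolerance_bp=50):
    """Match band sizes estimated from a gel against the expected patterns."""
    observed = sorted(observed_bp)
    for label, expected in EXPECTED_BANDS[vector].items():
        expected_sorted = sorted(expected)
        same_count = len(observed) == len(expected_sorted)
        if same_count and all(
            abs(o - e) <= tolerance_bp for o, e in zip(observed, expected_sorted)
        ):
            return label
    return "unexpected"


print(classify_clone("pYPKa_Z_TEF1", [1500, 1350]))  # correct
print(classify_clone("pYPKa_E_TEF1", [1530, 850]))   # reverse
print(classify_clone("pYPKa_Z_TEF1", [930]))         # empty
```

The orientation-dependent bands differ by about 600 bp within each vector, so they should be easy to tell apart on a standard agarose gel.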

The cseguid checksums of the resulting plasmids are calculated for future reference.
This checksum uniquely describes a circular double-stranded
sequence. See this [blog post](https://ochsavidare.blogspot.com/2016/02/checksum-for-circular-biological.html) for more info.
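The key property of a circular checksum is that it must not depend on where the sequence file starts or which strand is written. A minimal sketch of that idea follows, assuming the checksum is taken over the lexicographically smallest rotation of either strand. The function names are hypothetical, and the encoding (URL-safe Base64 of a SHA-1 digest) is an assumption, so the sketch is not guaranteed to reproduce the values printed below.

```python
import base64
import hashlib


def revcomp(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def smallest_rotation(seq):
    """Lexicographically smallest rotation (naive, fine for short examples)."""
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


def circular_checksum_sketch(seq):
    """Checksum that ignores the origin and the strand of a circular sequence."""
    seq = seq.upper()
    canonical = min(smallest_rotation(seq), smallest_rotation(revcomp(seq)))
    digest = hashlib.sha1(canonical.encode("ascii")).digest()
    # Assumption: URL-safe Base64 with the padding stripped.
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


toy = "AAGACGTCTTGGGATATCCC"    # made-up circular sequence
rotated = toy[7:] + toy[:7]     # same circle, different origin
flipped = revcomp(toy)          # same circle, other strand

print(circular_checksum_sketch(toy) == circular_checksum_sketch(rotated))  # True
print(circular_checksum_sketch(toy) == circular_checksum_sketch(flipped))  # True
```

The naive rotation search is quadratic in sequence length; for plasmid-sized sequences a dedicated minimal-rotation algorithm would be a better fit.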

pYPKa_Z_TEF1.cseguid() should be inRfvhISKV2ATGWH7MRU60tr5sk

In [23]:
print(pYPKa_Z_TEF1.cseguid())
assert pYPKa_Z_TEF1.cseguid() == "inRfvhISKV2ATGWH7MRU60tr5sk"

inRfvhISKV2ATGWH7MRU60tr5sk


pYPKa_E_TEF1.cseguid() should be 9SZBF1DJW78geb7jSQon9GsXwPQ

In [24]:
print(pYPKa_E_TEF1.cseguid())
assert pYPKa_E_TEF1.cseguid() == "9SZBF1DJW78geb7jSQon9GsXwPQ"

9SZBF1DJW78geb7jSQon9GsXwPQ


The sequences are named based on the name of the cloned insert.

In [25]:
# The locus name is truncated to 16 characters.
pYPKa_Z_TEF1.locus = "pYPKa_Z_TEF1"[:16]
pYPKa_E_TEF1.locus = "pYPKa_E_TEF1"[:16]

The sequences are stamped with their cseguid checksums. The stamp can be used to verify the
integrity of the sequence file.

In [26]:
pYPKa_Z_TEF1.stamp()
pYPKa_E_TEF1.stamp()

cSEGUID_9SZBF1DJW78geb7jSQon9GsXwPQ

pYPKa_Z_TEF1 is written to a local file:

In [27]:
pYPKa_Z_TEF1.write("pYPKa_Z_TEF1.gb")

pYPKa_E_TEF1 is written to a local file:

In [28]:
pYPKa_E_TEF1.write("pYPKa_E_TEF1.gb")