# pYPK0_TPI1_EcfabA_PMP3

This notebook describes the assembly of the [_Saccharomyces cerevisiae_](http://www.yeastgenome.org)
single gene expression vector pYPK0_TPI1_EcfabA_PMP3.

It is made by _in-vivo_ homologous recombination between three PCR products and one linear vector fragment.
The PCR products are a promoter generated from a pYPKa_Z vector, a gene from a pYPKa_A vector and
a terminator from a pYPKa_E vector. The three PCR products are joined with
a linearized [pYPKpw](https://github.com/BjornFJohansson/ypk-xylose-pathways/blob/master/notebooks/pYPKpw.ipynb) 
backbone vector that has the [URA3](http://www.yeastgenome.org/locus/S000000747/overview) 
marker and a _S. cerevisiae_ [2 micron](http://blog.addgene.org/plasmids-101-yeast-vectors) origin of replication.

The four linear DNA fragments are joined by homologous recombination in a 
[_Saccharomyces cerevisiae_](http://wiki.yeastgenome.org/index.php/Commonly_used_strains) ura3 mutant.

![pYPK0_promoter_gene_terminator](tp_g_tp.png "pYPK0_promoter_gene_terminator")

Parts of the [pydna](https://pypi.python.org/pypi/pydna/) package are imported in the code cell below.

In [1]:
from pydna.parsers import parse_primers
from pydna.readers import read
from pydna.amplify import pcr
from pydna.assembly import Assembly

The Yeast Pathway Kit [standard primers](standard_primers.txt) are read into a dictionary in the code cell below.

In [2]:
p = { x.id: x for x in parse_primers("standard_primers.txt") }
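
The later cells look up six primers by id (577, 567, 468, 467, 568 and 578). A minimal sketch of a guard that lists any of these ids missing from the dictionary, so that an incomplete primer file is noticed before the PCR cells run. `missing_primers` is a hypothetical helper, not part of pydna, and the stand-in dictionary replaces the real parsed primers.

```python
# Hypothetical helper, not part of pydna.
REQUIRED = ("577", "567", "468", "467", "568", "578")

def missing_primers(primers, required=REQUIRED):
    """Return the required primer ids that are absent from the dictionary."""
    return [pid for pid in required if pid not in primers]

# Stand-in for the dictionary p built above; the values are placeholders.
demo = {pid: None for pid in ("577", "567", "468", "467")}
print(missing_primers(demo))
```

In the notebook itself the call would be `missing_primers(p)`, and an empty list means all six primers were found.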

The backbone vector [pYPKpw](pYPKpw.gb) is read from a local file in the code cell below.

In [3]:
pYPKpw = read("pYPKpw.gb")

The backbone vector is linearized by digestion with [EcoRV](http://rebase.neb.com/rebase/enz/EcoRV.html).
The restriction enzyme functionality is provided by [biopython](http://biopython.org).

In [4]:
from Bio.Restriction import EcoRV

pYPK_EcoRV = pYPKpw.linearize(EcoRV)
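
EcoRV recognizes GATATC and cuts between GAT and ATC, leaving blunt ends. The sketch below illustrates what linearizing a circular sequence at a single blunt site means, using plain strings. It is not how pydna implements `linearize`; `linearize_blunt` and the ten-base toy sequence are hypothetical.

```python
SITE = "GATATC"   # EcoRV recognition sequence (palindromic)
CUT_OFFSET = 3    # blunt cut between GAT and ATC

def linearize_blunt(circular, site=SITE, offset=CUT_OFFSET):
    """Open a circular sequence at its single blunt-cutting site."""
    seq = circular.upper()
    # Append the first len(site) - 1 bases so a site spanning the origin is found.
    doubled = seq + seq[: len(site) - 1]
    hits = [i for i in range(len(seq)) if doubled.startswith(site, i)]
    if len(hits) != 1:
        raise ValueError(f"expected exactly one site, found {len(hits)}")
    cut = (hits[0] + offset) % len(seq)
    # The linear molecule starts right after the cut and ends right before it.
    return seq[cut:] + seq[:cut]

toy_plasmid = "AAGATATCTT"  # hypothetical circular sequence with one site
print(linearize_blunt(toy_plasmid))
```

Because the site is palindromic, scanning one strand is enough. The function raises an error when the site is absent or present more than once, since neither case yields a single linear fragment.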

The pYPKa derived _E. coli_ plasmids containing [promoter](pYPKa_Z_TPI1.gb), [gene](pYPKa_A_EcfabA.gb) and [terminator](pYPKa_E_PMP3.gb)
are read into three variables below.

In [5]:
promoter_template   = read("pYPKa_Z_TPI1.gb")
gene_template       = read("pYPKa_A_EcfabA.gb")
terminator_template = read("pYPKa_E_PMP3.gb")

The construction of the three vectors above is described in the [pYPKa_ZE_TPI1](pYPKa_ZE_TPI1.ipynb) and [pYPKa_A_EcfabA](pYPKa_A_EcfabA.ipynb) notebooks.

Three DNA fragments are PCR amplified using [standard primers](standard_primers.txt). Suggested PCR programs can be found at the end of this document.

In [6]:
prom = pcr( p['577'], p['567'], promoter_template)
gene = pcr( p['468'], p['467'], gene_template)
term = pcr( p['568'], p['578'], terminator_template)

The four linear DNA fragments are mixed and transformed
into a _Saccharomyces cerevisiae_ ura3 mutant.

The fragments will be assembled by _in-vivo_ [homologous recombination](http://www.ncbi.nlm.nih.gov/pubmed/2828185):

In [7]:
asm = Assembly( (pYPK_EcoRV, prom, gene, term), limit=31 )

asm

Assembly
fragments..: 5603bp 814bp 608bp 1096bp
limit(bp)..: 31
G.nodes....: 8
algorithm..: common_sub_strings

The representation of the asm object above should normally indicate one circular product only.
More than one circular product might indicate an incorrect assembly strategy or represent
by-products that might arise in the assembly process.
The largest recombination product is chosen as candidate for the pYPK0_TPI1_EcfabA_PMP3 vector.

In [8]:
candidate = asm.assemble_circular()[0]

candidate.figure()

 -|pYPKpw_lin|124
|             \/
|             /\
|             124|814bp_PCR_prod|50
|                                \/
|                                /\
|                                50|608bp_PCR_prod|37
|                                                  \/
|                                                  /\
|                                                  37|1096bp_PCR_prod|242
|                                                                     \/
|                                                                     /\
|                                                                     242-
|                                                                        |
 ------------------------------------------------------------------------
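
As a sanity check on the figure above, the length of a circular recombination product can be estimated as the sum of the fragment lengths minus the overlaps, because each shared region appears only once in the product. The sketch assumes the numbers in the figure are overlap lengths in base pairs and takes the fragment lengths from the Assembly output; `circular_length` is a hypothetical helper.

```python
# Fragment lengths (bp) from the Assembly output above.
fragment_lengths = [5603, 814, 608, 1096]
# Overlap lengths (bp) read from the figure; assumed to be shared regions.
overlap_lengths = [124, 50, 37, 242]

def circular_length(fragments, overlaps):
    """Each overlap is present in two fragments but only once in the product."""
    return sum(fragments) - sum(overlaps)

expected_bp = circular_length(fragment_lengths, overlap_lengths)
print(expected_bp)
```

The value can be compared with `len(candidate)` in the notebook; a mismatch would suggest that the overlaps were read incorrectly or that a different product was selected.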

The candidate vector is synchronized to the backbone vector. This means that
the plasmid origin is shifted so that it matches the pYPKpw backbone vector.

In [9]:
result = candidate.synced(pYPKpw)
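
Synchronizing means rotating the circular sequence, and flipping the strand if needed, so that it starts at the same position as the reference. A minimal sketch of that idea on plain strings follows. It is not pydna's implementation: `sync_to_reference`, the seed length and the toy sequences are all hypothetical, and the seed is assumed to occur only once in the candidate.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def sync_to_reference(candidate, reference, seed=10):
    """Rotate a circular candidate so it begins with the reference's first bases."""
    probe = reference[:seed].upper()
    candidate = candidate.upper()
    # Try the given strand first, then the reverse complement.
    for seq in (candidate, reverse_complement(candidate)):
        doubled = seq + seq[: seed - 1]   # lets the probe span the origin
        start = doubled.find(probe)
        if start != -1:
            return seq[start:] + seq[:start]
    raise ValueError("reference seed not found in candidate")

reference = "GATTACAGGCCTTAAC"                   # toy backbone
assembled = "GATTACAGGCCT" + "TTTTT" + "TAAC"    # toy backbone with an insert
rotated = assembled[7:] + assembled[:7]          # same circle, shifted origin
print(sync_to_reference(rotated, reference, seed=8) == assembled)
```

A fixed origin makes two versions of the same plasmid directly comparable, which is why the rotation is done before the record is written to file.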

### Diagnostic PCR confirmation

The structure of the final vector is confirmed by two
separate PCR reactions, one for the promoter and gene and
one for the gene and terminator.

PCR using standard primers 577 and 467 to amplify promoter and gene.

In [10]:
product = pcr( p['577'], p['467'], result)

A correct clone should give this size in base pairs:

In [11]:
print(len(product))

1372


If the promoter is missing from the assembly, the PCR product will have this size in base pairs:

In [12]:
print(len(product) - len(prom))

558


If the gene is missing from the assembly, the PCR product will have this size in base pairs:

In [13]:
print(len(product) - len(gene))

764


PCR using standard primers 468 and 578 to amplify gene and terminator.

In [14]:
product2 = pcr( p['468'], p['578'], result)

A correct clone should give this size in base pairs:

In [15]:
print(len(product2))

1667


If the gene is missing from the assembly, the PCR product will have this size in base pairs:

In [16]:
print(len(product2) - len(gene))

1059


If the terminator is missing from the assembly, the PCR product will have this size in base pairs:

In [17]:
print(len(product2) - len(term))

571
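
The expected sizes above can be collected in one lookup table to interpret a band seen on a gel. The sketch below uses the sizes printed in this notebook; the `interpret` helper and the 10 % size tolerance are assumptions made for illustration and should be adjusted to the resolution of the gel.

```python
PROM, GENE, TERM = 814, 608, 1096   # PCR product sizes (bp) from this notebook

EXPECTED = {
    "577/467": {
        "correct": 1372,
        "promoter missing": 1372 - PROM,
        "gene missing": 1372 - GENE,
    },
    "468/578": {
        "correct": 1667,
        "gene missing": 1667 - GENE,
        "terminator missing": 1667 - TERM,
    },
}

def interpret(primer_pair, observed_bp, tolerance=0.1):
    """Return the outcomes whose expected size lies within the tolerance."""
    return [
        outcome
        for outcome, size in EXPECTED[primer_pair].items()
        if abs(observed_bp - size) <= tolerance * size
    ]

print(interpret("577/467", 1400))
print(interpret("468/578", 600))
```

An empty list means the band matches none of the anticipated outcomes, which would call for sequencing or a repeat of the PCR rather than a guess.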


The cseguid checksum for the resulting plasmid is calculated for future reference.
The [cseguid checksum](http://pydna.readthedocs.org/en/latest/pydna.html#pydna.utils.cseguid) 
uniquely identifies a circular double stranded sequence.

In [18]:
result.cseguid()

b69UJ57c-OMvguc2CZReyYqmq4E
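
A checksum for a circular double stranded sequence has to give the same value regardless of where the sequence starts and which strand is written. The sketch below illustrates one way to obtain that property: pick a canonical rotation across both strands, then hash it. It assumes a SHA-1 digest with URL-safe base64 encoding and is not guaranteed to reproduce the value pydna reports; `circular_checksum` is a hypothetical name.

```python
import base64
import hashlib

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def smallest_rotation(seq):
    """Lexicographically smallest rotation (simple quadratic approach)."""
    return min(seq[i:] + seq[:i] for i in range(len(seq)))

def circular_checksum(seq):
    """Rotation- and strand-independent checksum of a circular sequence."""
    seq = seq.upper()
    canonical = min(
        smallest_rotation(seq),
        smallest_rotation(reverse_complement(seq)),
    )
    digest = hashlib.sha1(canonical.encode("ascii")).digest()
    # URL-safe base64 without the trailing padding character.
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")

toy = "GATTACAGGCCTTAAC"              # hypothetical circular sequence
shifted = toy[5:] + toy[:5]           # same circle, different origin
print(circular_checksum(toy) == circular_checksum(shifted))
```

The quadratic rotation search is acceptable for a sketch; a plasmid-scale implementation would normally use a linear-time minimal rotation algorithm.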

The file is named based on the names of the promoter, gene and terminator.

In [19]:
result.locus = "pYPK0_tp_g_tp"
result.definition = "pYPK0_TPI1_EcfabA_PMP3"

The sequence is stamped with its cseguid checksum. This can be used to verify the
integrity of the sequence file.

In [20]:
result.stamp()

cSEGUID_b69UJ57c-OMvguc2CZReyYqmq4E

Write sequence to a local file.

In [21]:
result.write("pYPK0_TPI1_EcfabA_PMP3.gb")

## PCR programs for the amplification of Promoter, Gene and Terminator

See cell 6 above, where the three PCR products are defined.

Promoter

In [22]:
prom.program()

|95°C|95°C               |    |tmf:64.6
|____|_____          72°C|72°C|tmr:69.7
|3min|30s  \ 57.2°C _____|____|45s/kb
|    |      \______/ 0:36|5min|GC 39%
|    |       30s         |    |814bp

Gene

In [23]:
gene.program()

|95°C|95°C               |    |tmf:76.9
|____|_____          72°C|72°C|tmr:67.9
|3min|30s  \ 61.8°C _____|____|45s/kb
|    |      \______/ 0:30|5min|GC 53%
|    |       30s         |    |608bp

Terminator

In [24]:
term.program()

|95°C|95°C               |    |tmf:66.1
|____|_____          72°C|72°C|tmr:65.0
|3min|30s  \ 58.6°C _____|____|45s/kb
|    |      \______/ 0:49|5min|GC 43%
|    |       30s         |    |1096bp
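
The extension times in the three programs are consistent with the stated rate of 45 s/kb applied to the product length. The sketch below reproduces them under one assumption that is inferred from the Gene program rather than documented here: a 30 s minimum, since 608 bp at 45 s/kb would otherwise give 27 s. `extension_seconds` and `as_clock` are hypothetical helpers.

```python
def extension_seconds(length_bp, rate_s_per_kb=45, minimum_s=30):
    """Extension time in whole seconds; the 30 s floor is an assumption."""
    return max(minimum_s, length_bp * rate_s_per_kb // 1000)

def as_clock(seconds):
    """Format seconds as m:ss, matching the program diagrams."""
    return f"{seconds // 60}:{seconds % 60:02d}"

for name, length_bp in (("Promoter", 814), ("Gene", 608), ("Terminator", 1096)):
    print(name, as_clock(extension_seconds(length_bp)))
```

The same function can be used to estimate extension times for the two diagnostic PCR products of 1372 bp and 1667 bp.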