# Pathway pYPK0-FASII π10 

- 13 genes 
- 25932 bp
- cSEGUID: EOIaBvkL78JFvx0IJ3NUV9JPLOI


	pYPK0_
	PDC1_  EcfabH_
	TEF1_  EcfabD_
	FBA1_  EcfabG_
	RPL22A_EcacpP_
	TDH3_  EcfabF_
	UTR2_  EcfabB_
	TPI1_  EcfabA_
	PMP3_  EcfabZ_
	ENO2_  Athmod1_
	RPL5_  AthfatA1_
	RPL16A_AthfatB_
	RPL17A_EcacpH_
	RPL16B_EcacpS_
	TMA19

This notebook describes the assembly of 13 single gene expression cassettes into a single pathway. 
Notebooks describing the single gene expression vectors are linked at the end of this document as are notebooks 
describing pYPKa promoter, gene and terminator vectors. Specific primers needed are also listed below.

![pathway with N genes](pw.png "pathway with N genes")

The assembly performed here is based on the systematic name of the pathway: pYPK0_PDC1_EcfabH_TEF1_EcfabD_FBA1_EcfabG_RPL22A_EcacpP_TDH3_EcfabF_UTR2_EcfabB_TPI1_EcfabA_PMP3_EcfabZ_ENO2_Athmod1_RPL5_AthfatA1_RPL16A_AthfatB_RPL17A_EcacpH_RPL16B_EcacpS_TMA19

Parts of the [pydna](https://pypi.python.org/pypi/pydna/) package are imported in the code cell below.

In [1]:
from pydna.parsers import parse_primers
from pydna.readers import read
from pydna.amplify import pcr
from pydna.assembly import Assembly
from IPython.display import display
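
The systematic name above alternates promoter/terminator and gene identifiers after the backbone prefix, and each terminator also serves as the promoter of the next cassette. As a minimal sketch in plain Python (the helper `split_pathway_name` is hypothetical and not part of pydna), the name can be split into the file names of the 13 single gene expression vectors used later in this notebook:

```python
def split_pathway_name(name: str) -> list[str]:
    """Split a systematic pathway name into single gene vector file names."""
    backbone, *elements = name.split("_")
    # elements alternate: promoter, gene, terminator/promoter, gene, ...
    # so every cassette is a window of three elements, advancing by two.
    return [
        "_".join((backbone, *elements[i:i + 3])) + ".gb"
        for i in range(0, len(elements) - 2, 2)
    ]


pathway_name = (
    "pYPK0_PDC1_EcfabH_TEF1_EcfabD_FBA1_EcfabG_RPL22A_EcacpP_TDH3_EcfabF"
    "_UTR2_EcfabB_TPI1_EcfabA_PMP3_EcfabZ_ENO2_Athmod1_RPL5_AthfatA1"
    "_RPL16A_AthfatB_RPL17A_EcacpH_RPL16B_EcacpS_TMA19"
)

vector_files = split_pathway_name(pathway_name)
for file_name in vector_files:
    print(file_name)
```

The resulting list should correspond to the `cassette_vectors` tuple defined further down; the sketch only manipulates strings and does not read any sequence files.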

Load the standard primers needed to amplify each cassette.
The first cassette in the pathway is amplified with standard
primers 577 and 778, the last with 775 and 578,
and all others with 775 and 778.
Standard primers are listed [here](standard_primers.txt).

In [2]:
p = { x.id: x for x in parse_primers("standard_primers.txt") }
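
The primer rule described above depends only on the position of a cassette in the pathway. The following sketch (the function `primer_pairs` is a hypothetical helper, not part of pydna) lists the primer pair for each of the 13 cassettes. The case of a single cassette is not described in this notebook, so the sketch rejects it rather than guessing.

```python
def primer_pairs(n_cassettes: int) -> list[tuple[str, str]]:
    """Return (forward, reverse) standard primer ids for each cassette position."""
    if n_cassettes < 2:
        # Assumption: the rule in the text is only stated for pathways
        # with a distinct first and last cassette.
        raise ValueError("the rule is only described for two or more cassettes")
    pairs = []
    for index in range(n_cassettes):
        forward = "577" if index == 0 else "775"                # first cassette
        reverse = "578" if index == n_cassettes - 1 else "778"  # last cassette
        pairs.append((forward, reverse))
    return pairs


pairs = primer_pairs(13)
for number, (forward, reverse) in enumerate(pairs, start=1):
    print(f"Cassette {number}: {forward} + {reverse}")
```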

Restriction enzymes are imported from the Biopython package.

In [3]:
from Bio.Restriction import EcoRV, NotI, PacI

The backbone vector pYPKpw is read from a local file. It is linearized with [EcoRV](http://rebase.neb.com/rebase/enz/EcoRV.html) when the assembly is set up further below.

In [4]:
pYPKpw = read("pYPKpw.gb")

The cassette_products variable holds the list of expression cassette PCR products to
be assembled.

In [5]:
cassette_products = []

The expression cassettes come from a series of single gene expression vectors 
held in the template_vectors list.

In [6]:
cassette_vectors = ("pYPK0_PDC1_EcfabH_TEF1.gb",
                    "pYPK0_TEF1_EcfabD_FBA1.gb",
                    "pYPK0_FBA1_EcfabG_RPL22A.gb",
                    "pYPK0_RPL22A_EcacpP_TDH3.gb",
                    "pYPK0_TDH3_EcfabF_UTR2.gb",
                    "pYPK0_UTR2_EcfabB_TPI1.gb",
                    "pYPK0_TPI1_EcfabA_PMP3.gb",
                    "pYPK0_PMP3_EcfabZ_ENO2.gb",
                    "pYPK0_ENO2_Athmod1_RPL5.gb",
                    "pYPK0_RPL5_AthfatA1_RPL16A.gb",
                    "pYPK0_RPL16A_AthfatB_RPL17A.gb",
                    "pYPK0_RPL17A_EcacpH_RPL16B.gb",
                    "pYPK0_RPL16B_EcacpS_TMA19.gb",)
template_vectors = [read(v.strip()) for v in cassette_vectors if v.strip()]
for tv in  template_vectors:
    display(tv)

The first cassette in the pathway is amplified with standard primers 577 and 778. Suggested PCR conditions can be found at the end of this document.

In [7]:
cassette_products.append( pcr( p['577'], p['778'],  template_vectors[0] ) )

Cassettes in the middle of the pathway are amplified with standard primers 775 and 778. Suggested PCR conditions can be found at the end of this document.

In [8]:
cassette_products.extend( pcr( p['775'], p['778'], v) for v in template_vectors[1:-1] ) 

The last cassette in the pathway is amplified with standard primers 775 and 578. Suggested PCR conditions can be found at the end of this document.

In [9]:
cassette_products.append( pcr( p['775'], p['578'], template_vectors[-1] ) )

The cassettes are given names based on their order in the final construct in the code cell below.

In [10]:
for i, cp in enumerate(cassette_products):
	cp.name = "Cassette {}".format(i+1)
	print(cp.name)

Cassette 1
Cassette 2
Cassette 3
Cassette 4
Cassette 5
Cassette 6
Cassette 7
Cassette 8
Cassette 9
Cassette 10
Cassette 11
Cassette 12
Cassette 13


Cassettes and plasmid backbone are joined by homologous recombination in a Saccharomyces cerevisiae ura3 host,
which allows selection for the URA3 gene in pYPKpw.

In [11]:
asm = Assembly( [pYPKpw.linearize(EcoRV)] + cassette_products, limit=167-47-10)
asm

Assembly
fragments..: 5603bp 2780bp 2275bp 1927bp 1497bp 2696bp 2559bp 1993bp 1868bp 2332bp 2053bp 2337bp 2084bp 1901bp
limit(bp)..: 110
G.nodes....: 28
algorithm..: common_sub_strings
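
The `limit` argument in the cell above evaluates to 110 bp, as reported in the output. To illustrate the idea of a minimum homology length, the sketch below checks whether the end of one fragment overlaps the start of the next by at least that many bases. This is a simplified terminal-overlap test on toy sequences and not pydna's `common_sub_strings` algorithm; `terminal_overlap` and all sequences are made up for illustration.

```python
def terminal_overlap(upstream: str, downstream: str) -> int:
    """Length of the longest suffix of `upstream` that is a prefix of `downstream`."""
    max_len = min(len(upstream), len(downstream))
    for size in range(max_len, 0, -1):
        if upstream[-size:] == downstream[:size]:
            return size
    return 0


limit = 167 - 47 - 10  # same arithmetic as in the Assembly call

# Toy fragments sharing a 140 bp region (20 copies of a 7 bp motif).
upstream_long = "ATGCCGTTAGC" + "GATTACA" * 20
downstream_long = "GATTACA" * 20 + "TTGGCCAA"

# Toy fragments sharing only 35 bp.
upstream_short = "ATGCCGTTAGC" + "GATTACA" * 5
downstream_short = "GATTACA" * 5 + "TTGGCCAA"

for up, down in ((upstream_long, downstream_long), (upstream_short, downstream_short)):
    overlap = terminal_overlap(up, down)
    print(f"overlap {overlap} bp, meets limit of {limit} bp: {overlap >= limit}")
```

With a limit of this size, short chance similarities between fragments fall below the threshold and are not treated as recombination sites.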

Normally, only one circular product should be formed since the 
homology limit is quite large (see cell above). More than one 
circular product might indicate an incorrect strategy. 
The largest recombination product is chosen as candidate for 
the pYPK0_PDC1_EcfabH_TEF1_EcfabD_FBA1_EcfabG_RPL22A_EcacpP_TDH3_EcfabF_UTR2_EcfabB_TPI1_EcfabA_PMP3_EcfabZ_ENO2_Athmod1_RPL5_AthfatA1_RPL16A_AthfatB_RPL17A_EcacpH_RPL16B_EcacpS_TMA19 pathway.

In [12]:
candidate = asm.assemble_circular()[0]

This assembly figure shows how the fragments came together.

In [13]:
candidate.figure()

 -|pYPKpw_lin|124
|             \/
|             /\
|             124|Cassette 1|593
|                            \/
|                            /\
|                            593|Cassette 2|644
|                                           \/
|                                           /\
|                                           644|Cassette 3|440
|                                                          \/
|                                                          /\
|                                                          440|Cassette 4|712
|                                                                         \/
|                                                                         /\
|                                                                         712|Cassette 5|634
|                                                                                        \/
|                                                                                        /\
|            

The final pathway is synchronized to the backbone vector. This means that
the plasmid origin is shifted so that it matches the original.

In [14]:
pw = candidate.synced(pYPKpw)
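
As a rough illustration of what shifting the origin means, the sketch below rotates a circular sequence, held as a plain string, so that it starts with the first bases of a reference. This is a toy model under stated assumptions: `sync_to_reference` is a hypothetical helper, it only looks at one strand, and it is not how pydna implements `synced`.

```python
def sync_to_reference(circular: str, reference: str, anchor_len: int = 8) -> str:
    """Rotate `circular` so that it begins with the first `anchor_len` bases of `reference`."""
    anchor = reference[:anchor_len]
    # Append the first bases again so that a match spanning the origin is found.
    doubled = circular + circular[:anchor_len - 1]
    start = doubled.find(anchor)
    if start == -1:
        raise ValueError("reference anchor not found in circular sequence")
    return circular[start:] + circular[:start]


backbone = "ATGCGTACGGATCC"          # toy reference, not pYPKpw
assembled = "CGGATCCTTTTTTTTATGCGTA"  # same circle with an insert, different origin

synced = sync_to_reference(assembled, backbone)
print(synced)
```

After rotation the toy sequence starts at the same position as the reference, which makes the two records easy to compare side by side.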

The cseguid checksum for the resulting plasmid is calculated for future reference.
The [cseguid checksum](http://pydna.readthedocs.org/en/latest/pydna.html#pydna.utils.cseguid) 
uniquely identifies a circular double stranded sequence.

In [15]:
pw.cseguid()

EOIaBvkL78JFvx0IJ3NUV9JPLOI
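
A checksum for a circular double stranded sequence has to give the same value regardless of where the origin is placed and which strand is written down. The sketch below shows one way to get that property: pick a canonical rotation of each strand, take the smaller of the two, and hash it. The choice of SHA-1 with URL-safe base64 is an assumption about how cSEGUID is built; the function `circular_checksum` is hypothetical and is not guaranteed to reproduce the values returned by pydna.

```python
import base64
import hashlib

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(_COMPLEMENT)[::-1]


def smallest_rotation(seq: str) -> str:
    # Naive O(n^2) search, fine for toy sequences only.
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


def circular_checksum(seq: str) -> str:
    """Rotation- and strand-independent checksum of a circular sequence."""
    seq = seq.upper()
    canonical = min(smallest_rotation(seq), smallest_rotation(reverse_complement(seq)))
    digest = hashlib.sha1(canonical.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii").rstrip("=")


plasmid = "ATGCCGTTAGCAAT"             # toy circular sequence
rotated = plasmid[5:] + plasmid[:5]     # same circle, different origin
other_strand = reverse_complement(plasmid)

print(circular_checksum(plasmid) == circular_checksum(rotated))
print(circular_checksum(plasmid) == circular_checksum(other_strand))
```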

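The sequence record is given a short locus name and a definition. The file name used when the sequence is written further below is based on the order of the expressed genes.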

In [16]:
pw.locus = "pw"
pw.definition = "π10"

Stamp sequence with cseguid checksum. This can be used to verify the 
integrity of the sequence file.

In [17]:
pw.stamp()

cSEGUID_EOIaBvkL78JFvx0IJ3NUV9JPLOI

Write sequence to a local file.

In [18]:
pw.write("pYPK0_PDC1_EcfabH_TEF1_EcfabD_FBA1_EcfabG_RPL22A_EcacpP_TDH3_EcfabF_UTR2_EcfabB_TPI1_EcfabA_PMP3_EcfabZ_ENO2_Athmod1_RPL5_AthfatA1_RPL16A_AthfatB_RPL17A_EcacpH_RPL16B_EcacpS_TMA19.gb")

The pathway can be extended by digestion with NotI, PacI, or both, provided that each enzyme used cuts only once in the final pathway sequence.

In [19]:
print("NotI cuts {} time(s) and PacI cuts {} time(s) in the final pathway.".format(len(pw.cut(NotI)), len(pw.cut(PacI))))

NotI cuts 1 time(s) and PacI cuts 2 time(s) in the final pathway.
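
According to the output above, only NotI is a unique cutter in this pathway, so PacI alone would not give a single linear product. The sketch below shows how such a count can be made on a circular sequence in plain Python, including sites that span the origin. The helper `count_sites_circular` and the toy sequence are made up for illustration. Because both recognition sites are palindromic, searching one strand is sufficient.

```python
NOTI_SITE = "GCGGCCGC"
PACI_SITE = "TTAATTAA"


def count_sites_circular(seq: str, site: str) -> int:
    """Count occurrences of `site` in a circular sequence (seq must be at least as long as site)."""
    seq, site = seq.upper(), site.upper()
    # Append the first bases again so that a site spanning the origin is seen.
    extended = seq + seq[:len(site) - 1]
    return sum(extended.startswith(site, i) for i in range(len(seq)))


# Toy circular sequence: one NotI site spanning the origin, two PacI sites.
toy = "CGC" + "ACGTACGT" + PACI_SITE + "GGATCC" + PACI_SITE + "AGCTGCGGC"

for enzyme, site in (("NotI", NOTI_SITE), ("PacI", PACI_SITE)):
    count = count_sites_circular(toy, site)
    print(f"{enzyme} cuts {count} time(s); unique cutter: {count == 1}")
```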


### Single gene expression vectors (pYPK0_promoter_gene_terminator) needed for assembly

Hyperlinks to notebook files describing the single gene expression plasmids needed for the assembly.


[pYPK0_PDC1_EcfabH_TEF1.ipynb](pYPK0_PDC1_EcfabH_TEF1.ipynb)

[pYPK0_TEF1_EcfabD_FBA1.ipynb](pYPK0_TEF1_EcfabD_FBA1.ipynb)

[pYPK0_FBA1_EcfabG_RPL22A.ipynb](pYPK0_FBA1_EcfabG_RPL22A.ipynb)

[pYPK0_RPL22A_EcacpP_TDH3.ipynb](pYPK0_RPL22A_EcacpP_TDH3.ipynb)

[pYPK0_TDH3_EcfabF_UTR2.ipynb](pYPK0_TDH3_EcfabF_UTR2.ipynb)

[pYPK0_UTR2_EcfabB_TPI1.ipynb](pYPK0_UTR2_EcfabB_TPI1.ipynb)

[pYPK0_TPI1_EcfabA_PMP3.ipynb](pYPK0_TPI1_EcfabA_PMP3.ipynb)

[pYPK0_PMP3_EcfabZ_ENO2.ipynb](pYPK0_PMP3_EcfabZ_ENO2.ipynb)

[pYPK0_ENO2_Athmod1_RPL5.ipynb](pYPK0_ENO2_Athmod1_RPL5.ipynb)

[pYPK0_RPL5_AthfatA1_RPL16A.ipynb](pYPK0_RPL5_AthfatA1_RPL16A.ipynb)

[pYPK0_RPL16A_AthfatB_RPL17A.ipynb](pYPK0_RPL16A_AthfatB_RPL17A.ipynb)

[pYPK0_RPL17A_EcacpH_RPL16B.ipynb](pYPK0_RPL17A_EcacpH_RPL16B.ipynb)

[pYPK0_RPL16B_EcacpS_TMA19.ipynb](pYPK0_RPL16B_EcacpS_TMA19.ipynb)


### Suggested PCR conditions
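
The programs printed in this section state an extension rate of 45s/kb together with the product length. As a small worked example, the sketch below recomputes the extension time from the length. Truncating to whole seconds is an inference from the printed values, not a confirmed detail of pydna; `extension_time` is a hypothetical helper.

```python
def extension_time(product_length_bp: int, seconds_per_kb: int = 45) -> str:
    """Extension time as m:ss for a product of the given length."""
    total_seconds = product_length_bp * seconds_per_kb // 1000  # truncate to whole seconds
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"


# Lengths of the first three cassettes as reported in this notebook.
for length in (2780, 2275, 1927):
    print(f"{length}bp -> {extension_time(length)}")
# 2780bp -> 2:05
# 2275bp -> 1:42
# 1927bp -> 1:26
```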

In [20]:
for prd in cassette_products:
	print("\n\n\n\n")
	print("product name:", prd.name)
	print("forward primer", prd.forward_primer.name)
	print("reverse primer", prd.reverse_primer.name)
	print(prd.program())






product name: Cassette 1
forward primer 577
reverse primer 778
|95°C|95°C               |    |tmf:64.6
|____|_____          72°C|72°C|tmr:53.9
|3min|30s  \ 55.5°C _____|____|45s/kb
|    |      \______/ 2:05|5min|GC 43%
|    |       30s         |    |2780bp





product name: Cassette 2
forward primer 775
reverse primer 778
|95°C|95°C               |    |tmf:63.9
|____|_____          72°C|72°C|tmr:53.9
|3min|30s  \ 55.6°C _____|____|45s/kb
|    |      \______/ 1:42|5min|GC 44%
|    |       30s         |    |2275bp





product name: Cassette 3
forward primer 775
reverse primer 778
|95°C|95°C               |    |tmf:63.9
|____|_____          72°C|72°C|tmr:53.9
|3min|30s  \ 54.6°C _____|____|45s/kb
|    |      \______/ 1:26|5min|GC 40%
|    |       30s         |    |1927bp





product name: Cassette 4
forward primer 775
reverse primer 778
|95°C|95°C               |    |tmf:63.9
|____|_____          72°C|72°C|tmr:53.9
|3min|30s  \ 53.2°C _____|____|45s/kb
|    |      \______/ 1:07|5m