# Cookbook for pydna

Björn Johansson
CBMA
University of Minho
Braga
Portugal

![logo](logo.png "logo")

## What is pydna?

Pydna is a Python package that provides functions and data types for working with double-stranded DNA. It depends on Biopython (a Python bioinformatics package), networkx (a graph theory package) and numpy (a mathematics package).

## What does pydna provide?

Pydna provides classes and functions for molecular biology using Python. Notably, PCR, cut-and-paste cloning (sub-cloning) and homologous recombination between linear DNA fragments are supported. Most functionality is implemented as methods of the double-stranded DNA sequence record classes "Dseq" and "Dseqrecord", which are subclasses of the Biopython Seq and SeqRecord classes, respectively.

Pydna was designed to semantically imitate how sub-cloning experiments are typically documented in scientific literature. One use case for pydna is to create executable documentation for a sub-cloning experiment. The pydna code unambiguously describes the experiment, and can be executed to yield the sequence of the resulting DNA molecule(s) and of all intermediate steps. Pydna code describing a sub-cloning is reasonably compact and is also meant to be easily readable.

Look [here](https://github.com/BjornFJohansson/pydna-examples) for examples.

### Example 1: Sub cloning by restriction digestion and ligation

The construction of the vector YEp24PGK_XK is described on page 4250 in the publication below:

[Johansson et al., “Xylulokinase Overexpression in Two Strains of Saccharomyces cerevisiae Also Expressing Xylose Reductase and Xylitol Dehydrogenase and Its Effect on Fermentation of Xylose and Lignocellulosic Hydrolysate” Applied and Environmental Microb](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC93154/)

Briefly, the XKS1 gene from Saccharomyces cerevisiae is amplified by PCR using two primers called primer1 and primer3. The primers add BamHI restriction sites to the ends of the XKS1 gene. The PCR product is digested with BamHI and ligated to the YEp24PGK plasmid, which has previously been digested with BglII, an enzyme that cuts the plasmid at a single location. BamHI and BglII produce compatible cohesive ends, so fragments cut with either enzyme can be ligated together. Fig. 1 shows an image outlining the strategy.
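The compatibility of the two enzymes can be checked with a few lines of plain Python. This is only a sketch and does not use pydna or Biopython: the dictionary `ENZYMES` and the helper `five_prime_overhang` are names made up for this illustration. It relies on the known recognition sites G^GATCC (BamHI) and A^GATCT (BglII), both of which are palindromic.

```python
# Recognition sites and top-strand cut offsets (5'->3').
# Both sites are palindromic, so the bottom strand is cut symmetrically.
ENZYMES = {
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "BglII": ("AGATCT", 1),  # A^GATCT
}


def five_prime_overhang(site, cut):
    """Return the single-stranded 5' overhang left by a palindromic cutter."""
    return site[cut:len(site) - cut]


bam_site, bam_cut = ENZYMES["BamHI"]
bgl_site, bgl_cut = ENZYMES["BglII"]

bam_overhang = five_prime_overhang(bam_site, bam_cut)
bgl_overhang = five_prime_overhang(bgl_site, bgl_cut)

# Junction formed when a BglII-cut end is ligated to a BamHI-cut end:
# left part of the BglII site + shared overhang + right part of the BamHI site.
hybrid_junction = bgl_site[:bgl_cut] + bam_overhang + bam_site[len(bam_site) - bam_cut:]

print(bam_overhang, bgl_overhang, hybrid_junction)
```

Both enzymes leave the same four-nucleotide overhang, which is why the ends can anneal. The hybrid junction is a site for neither enzyme, so the ligated product cannot be re-cut at that position by BamHI or BglII.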

![Figure1](figure1.png)

In [1]:
from pydna.genbank import Genbank

In [2]:
gb = Genbank("myemail@mydomain.com")

In [3]:
YEp24PGK = gb.nucleotide("KC562906")

The representation of the YEp24PGK object includes a link to the record on Genbank.

In [4]:
YEp24PGK

In [5]:
from Bio.Restriction import BglII

In [6]:
yep_bgl = YEp24PGK.linearize(BglII)

In [7]:
yep_bgl

Dseqrecord(-9641)

In [8]:
yep_bgl.seq

Dseq(-9641)
GATCTCCC..AAAA    
    AGGG..TTTTCTAG

In [9]:
from pydna.parsers import parse_primers

In [10]:
p1, p3 = parse_primers('''
>primer1
GCGGATCCTCTAGAATGGTTTGTTCAGTAATTCAG
>primer3
AGATCTGGATCCTTAGATGAGAGTCTTTTCCAG''')

In [11]:
XKS1 = gb.nucleotide("Z72979").rc()

In [12]:
XKS1

In [13]:
from pydna.amplify import pcr

In [14]:
PCR_prod = pcr( p1, p3, XKS1 )

In [15]:
from Bio.Restriction import BamHI, BglII

In [16]:
stuffer1, insert, stuffer2 = PCR_prod.cut(BamHI)

Primer1 and primer3 add restriction sites to the PCR product. The stuffer fragments are removed after digestion.
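How a primer tail ends up in the PCR product can be illustrated with a toy model in plain Python. The sketch below is not how `pydna.amplify.pcr` is implemented. The function `toy_pcr`, the parameter `min_anneal` and all sequences are hypothetical and chosen only to keep the example short. It assumes perfect annealing of the primer 3' ends.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def toy_pcr(fwd, rev, template, min_anneal=12):
    """Build the top strand of a PCR product from two tailed primers.

    Only the 3' part of each primer has to match the template;
    the 5' tails are carried into the product unchanged.
    """
    fwd, rev, template = fwd.upper(), rev.upper(), template.upper()
    rev_rc = reverse_complement(rev)

    # Longest 3' stretch of the forward primer that occurs in the template.
    for n in range(len(fwd), min_anneal - 1, -1):
        start = template.find(fwd[-n:])
        if start != -1:
            break
    else:
        raise ValueError("forward primer does not anneal")

    # Longest 5' stretch of the reverse-complemented reverse primer
    # that occurs downstream of the forward primer footprint.
    for m in range(len(rev_rc), min_anneal - 1, -1):
        end = template.find(rev_rc[:m], start + n)
        if end != -1:
            break
    else:
        raise ValueError("reverse primer does not anneal")

    return fwd + template[start + n:end] + rev_rc


# Hypothetical toy sequences (not XKS1).
orf = "ATGGCTAGCAAGGTCCTGGAAACCGTTCACGGATAA"
template = "CCTTAA" + orf + "CATTGG"
tail = "GCGGATCC"  # contains a BamHI site, GGATCC

fwd_primer = tail + orf[:15]
rev_primer = tail + reverse_complement(orf[-15:])

product = toy_pcr(fwd_primer, rev_primer, template)
print(product)
```

The product is the open reading frame flanked by the two tails, so it carries one BamHI site at each end even though the template has none.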

In [17]:
stuffer1, insert, stuffer2

(Dseqrecord(-7), Dseqrecord(-1819), Dseqrecord(-11))
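The stuffer lengths reported above can be reproduced by hand from the primer sequences. The sketch below is plain Python, not a pydna call, and `stuffer_length` is a name made up for this illustration. It assumes that the reported fragment length counts the four-nucleotide single-stranded overhang, which is consistent with the output shown. BamHI cuts G^GATCC on both strands, so the end fragment extends five nucleotides into the site on the bottom strand.

```python
BAMHI_SITE = "GGATCC"
BOTTOM_STRAND_CUT = 5  # bottom-strand cut, measured from the start of the site


def stuffer_length(primer):
    """Length of the end fragment released by BamHI, overhang included."""
    site_start = primer.upper().find(BAMHI_SITE)
    if site_start == -1:
        raise ValueError("no BamHI site in primer")
    return site_start + BOTTOM_STRAND_CUT


primer1 = "GCGGATCCTCTAGAATGGTTTGTTCAGTAATTCAG"
primer3 = "AGATCTGGATCCTTAGATGAGAGTCTTTTCCAG"

stuffer1_len = stuffer_length(primer1)
stuffer2_len = stuffer_length(primer3)
print(stuffer1_len, stuffer2_len)  # 7 11
```

The BamHI site starts at index 2 in primer1 and at index 6 in primer3, which gives fragments of 7 and 11 nucleotides.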

In [18]:
YEp24PGK = gb.nucleotide("KC562906")

In [19]:
YEp24PGK_BglII = YEp24PGK.linearize(BglII)

In [20]:
YEp24PGK_XK = ( YEp24PGK_BglII + insert ).looped()

In [21]:
YEp24PGK_XK = YEp24PGK_XK.synced(YEp24PGK)

In [22]:
YEp24PGK_XK.write("YEp24PGK_XK.gb")

### Example 2: Sub cloning by homologous recombination

The construction of the vector pGUP1 is described in the publication:

Régine Bosson, Malika Jaquenoud, and Andreas Conzelmann, “GUP1 of Saccharomyces cerevisiae Encodes an O-acyltransferase Involved in Remodeling of the GPI Anchor,” Molecular Biology of the Cell 17, no. 6 (June 2006): 2636–2645.

Our objective is to replicate the cloning steps using pydna so that we can have the final sequence of the plasmid.

The cloning is described in the upper left part of page 2637 of the paper:

"The expression vectors harboring GUP1 or GUP1H447A were obtained as follows: the open reading frame of GUP1 was amplified by PCR using plasmid pBH2178 (kind gift from Morten Kielland-Brandt) as a template and using primers  and , underlined sequences being homologous to the target vector pGREG505 (Jansen et al., 2005). The PCR fragment was purified by a PCR purification kit (QIAGEN, Chatsworth, CA) and introduced into pGREG505 by co transfection into yeast cells thus generating pGUP1 (Jansen et al., 2005)."

Briefly, two primers (GUP1rec1sens and GUP1rec2AS) were used to amplify the GUP1 gene from Saccharomyces cerevisiae chromosomal DNA:

>GUP1rec1sens 
gaattcgatatcaagcttatcgataccgatgtcgctgatcagcatcctgtc

>GUP1rec2AS
gacataactaattacatgactcgaggtcgactcagcattttaggtaaattccg
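The 5' parts of these primers are the sequences that the quoted passage describes as homologous to the target vector. The idea of detecting such shared sequence can be sketched with a simple longest-common-substring search in plain Python. This is a stand-in for illustration only and is not claimed to be the algorithm pydna uses. The function `longest_shared_stretch` and both toy sequences are hypothetical; they are not taken from pGREG505 or GUP1.

```python
def longest_shared_stretch(a, b):
    """Return the longest substring present in both a and b.

    Simple dynamic programming, O(len(a) * len(b)) time.
    """
    a, b = a.upper(), b.upper()
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the match that ended at the previous positions.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


# Hypothetical end of a linearized vector (40 nt).
toy_vector_end = "CCATGGTACGTTAGCCTGAAGTCAGGCTTAACGGATTCAC"

# Hypothetical primer: 25 nt copied from the vector end, then a gene-specific part.
toy_primer = toy_vector_end[-25:] + "ATGTCGCTGATCAGC"

shared = longest_shared_stretch(toy_vector_end, toy_primer)
print(len(shared), shared)
```

In this toy case the shared stretch is exactly the 25-nucleotide tail that was copied from the vector end into the primer.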

Then the vector pGREG505 was digested with the restriction enzyme SalI. This is not mentioned in Bosson et al., but they make a reference to Jansen et al. 2005:

Jansen G, Wu C, Schade B, Thomas DY, Whiteway M. 2005. Drag&Drop cloning in yeast. Gene, 344: 43–51. 

Jansen et al. describe the pGREG505 vector and state that it is digested with SalI before cloning. SalI cuts the vector in two places, so a fragment containing the HIS3 gene is removed.

The SalI sites are visible in the plasmid drawing in Fig. 3.

![figure3](pGREG505.png)
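Why two SalI sites in a circular plasmid yield exactly two linear fragments can be shown with a toy model in plain Python. This sketch does not use pydna. The function `cut_circular` and the toy plasmid are hypothetical, overhangs are ignored, and sites spanning the origin of the sequence are not detected. SalI recognizes G^TCGAC.

```python
SALI_SITE = "GTCGAC"
SALI_CUT = 1  # top-strand cut offset within the site: G^TCGAC


def cut_circular(seq, site=SALI_SITE, cut=SALI_CUT):
    """Return top-strand fragments of a circular sequence cut at every site."""
    seq = seq.upper()
    positions = []
    i = seq.find(site)
    while i != -1:
        positions.append(i + cut)
        i = seq.find(site, i + 1)
    if not positions:
        return []  # uncut plasmid stays circular

    fragments = []
    for k, start in enumerate(positions):
        end = positions[(k + 1) % len(positions)]
        if end > start:
            fragments.append(seq[start:end])
        else:
            # The fragment runs through the origin of the circular sequence.
            fragments.append(seq[start:] + seq[:end])
    return fragments


# Hypothetical 52 nt toy plasmid with two SalI sites.
toy_plasmid = "AAAACCCC" + "GTCGAC" + "TTTTGGGGTTTT" + "GTCGAC" + "ACACACACACACACACACAC"

fragments = cut_circular(toy_plasmid)
toy_stuffer, toy_backbone = sorted(fragments, key=len)
print(len(toy_stuffer), len(toy_backbone))
```

The short fragment between the two sites plays the role of the HIS3 stuffer, and the long fragment, which wraps around the origin, plays the role of the linearized vector.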

In [23]:
from pydna.all import *
gb = Genbank("myemail@mydomain.com")

In [24]:
GUP1rec1sens, GUP1rec2AS = parse_primers('''
>GUP1rec1sens
gaattcgatatcaagcttatcgataccgatgtcgctgatcagcatcctgtc
>GUP1rec2AS
gacataactaattacatgactcgaggtcgactcagcattttaggtaaattccg
''')

In [25]:
pGREG505 = read("pGREG505.gb")

In [26]:
GUP1_locus = gb.nucleotide("Z72606")

In [27]:
insert = pcr(GUP1rec1sens, GUP1rec2AS, GUP1_locus)

In [28]:
from Bio.Restriction import SalI

In [29]:
his3_stuffer,lin_vect = pGREG505.cut(SalI)

In [30]:
his3_stuffer, lin_vect

(Dseqrecord(-1172), Dseqrecord(-8301))

In [31]:
asm = Assembly( (lin_vect, insert) )

In [32]:
asm

## Assembly object ##
fragments....: 8301bp 1742bp
limit(bp)....: 25
G.nodes......: 8
algorithm....: common_sub_strings
linear(4)....: -10013 -10011 -32 -30
circular(1)..: o9981
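The lengths in this output are consistent with each other, which can be checked with simple arithmetic. The interpretation below is an inference from the reported numbers and not a statement about pydna internals. It assumes that each of the two long linear products is the vector and insert joined through one shared region, and that the circular product is joined through both.

```python
# Lengths taken from the Assembly output above.
vector_len, insert_len = 8301, 1742
long_linear_products = [10013, 10011]
reported_circular = 9981
limit = 25

# Length if the fragments were simply placed end to end.
end_to_end = vector_len + insert_len

# Each linear product is shorter than that by one shared region.
overlaps = [end_to_end - n for n in long_linear_products]

# The circular product loses both shared regions.
predicted_circular = end_to_end - sum(overlaps)

print(end_to_end, overlaps, predicted_circular)  # 10043 [30, 32] 9981
```

Under these assumptions the two shared regions are 30 and 32 bp long, both above the 25 bp limit, and the predicted circular length matches the reported 9981 bp.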

In [33]:
pGUP1 = asm.circular_products[0]

In [34]:
pGUP1.cseguid()

0R8hr15t-psjHVuuTj_JufGxOPg

In [35]:
pGUP1 = pGUP1.synced(pGREG505)

In [36]:
pGUP1.write("pGUP1.gb")