# Plasmid: A Python-based tool for gene editing
This demo illustrates the functions of the `Designer` class, which can be used to design primers for extension PCR, Gibson assembly, and Golden Gate assembly.

## Table of contents
1) [Reading and viewing genbank files](#reading_genbank)

    a) [Visualizing genbank records](#visualizing_genbank)

    b) [Selecting and filtering for genomic features](#filtering_genbank)
    
    c) [Gene translation to amino acids](#gene_translation)
    
2) [Creating new genbank records](#gene_concat)

    a) [Writing genbank files](#genbank_write)
    
    b) [Sequence annotation](#gene_annotation)
    
    c) [Searching for open reading frames](#gene_ORF)
    

In [86]:
import plasmid as pge
import importlib
importlib.reload(pge)

<module 'plasmid' from '/home/zchen/Public/python/lib/python3.11/site-packages/plasmid/__init__.py'>

In [None]:
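# RFP, pLac, and RBS are the parts sliced out in the "Extension PCR" section below;
# run that cell first so these names are defined.
print(RFP.translate())  # translate the RFP coding sequence to amino acids

df = pLac + 'gagacc' + RBS + 'ggtctc' + RFP
# annotate the DNA that encodes this peptide and color it orange
aaseq = 'DGALKGEIKMRLKLKDG'
df = df.annotate(name='peptide', sequence=aaseq, color='orange')
df = df.drop_duplicates()
print(df.get_colored())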

MASSEDVIKEFMRFKVRMEGSVNGHEFEIEGEGEGRPYEGTQTAKLKVTKGGPLPFAWDILSPQFQYGSKAYVKHPADIPDYLKLSFPEGFKWERVMNFEDGGVVTVTQDSSLQDGEFIYKVKLRGTNFPSDGPVMQKKTMGWEASTERMYPEDGALKGEIKMRLKLKDGGHYDAEVKTTYMAKKPVQLPGAYKTDIKLDITSHNEDYTIVEQYERAEGRHSTGA*
AATTGACAATGTGAGCGAGTAACAAGATACTGAGCACAgagaccAAAGAGGAGAAAggtctcATGGCGAGTAGCGAAGACGTTATCAAAGAGTTCATGCGTTTCAAAGTTCGTATGGAAGGTTCCGTTAACGGTCACGAGTTCGAAATCGAAGGTGAAGGTGAAGGTCGTCCGTACGAAGGTACCCAGACCGCTAAACTGAAAGTTACCAAAGGTGGTCCGCTGCCGTTCGCTTGGGACATCCTGTCCCCGCAGTTCCAGTACGGTTCCAAAGCTTACGTTAAACACCCGGCTGACATCCCGGACTACCTGAAACTGTCCTTCCCGGAAGGTTTCAAATGGGAACGTGTTATGAACTTCGAAGACGGTGGTGTTGTTACCGTTACCCAGGACTCCTCCCTGCAAGACGGTGAGTTCATCTACAAAGTTAAACTGCGTGGTACCAACTTCCCGTCCGACGGTCCGGTTATGCAGAAAAAAACCATGGGTTGGGAAGCTTCCACCGAACGTATGTACCCGGAAGACGGTGCTCTGAAAGGTGAAATCAAAATGCGTCTGAAACTGAAAGACGGTGGTCACTACGACGCTGAAGTTAAAACCACCTACATGGCTAAAAAACCGGTTCAGCTGCCGGGTGCTTACAAAACCGACATCAAACTGGACATCACCTC
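As a cross-check on the `peptide` annotation, the sketch below translates the 51-nt stretch of RFP that encodes `aaseq`, using the standard genetic code. It is a standalone illustration: `CODON_TABLE` and `translate` are hypothetical helpers written for this example and are not part of `plasmid`, whose own `translate()` may behave differently (for example around stop codons).

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon; a trailing partial codon is ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# DNA segment of RFP matched by the peptide annotation (copied from the output above)
peptide_dna = "GACGGTGCTCTGAAAGGTGAAATCAAAATGCGTCTGAAACTGAAAGACGGT"
aaseq = "DGALKGEIKMRLKLKDG"

print(translate(peptide_dna))
print(translate(peptide_dna) == aaseq)
```

The same peptide appears inside the protein printed by `RFP.translate()`, which is why the annotation call can locate it from the amino-acid sequence alone.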



## Extension PCR
This library provides the `Designer` class for designing the primers needed to clone DNA constructs.

In [125]:
pcr = pge.Designer()
help(pcr.xtPCR)

Help on method xtPCR in module plasmid.designer:

xtPCR(fL, seq, fR=None, padding=[2, 2], niter=3, w=[10, 100, 1, 1, 2], get_cost=False) method of plasmid.designer.Designer instance
    Find primers which can seed and extend a PCR fragment
    fL = flanking sequence on 5' end
    seq = sequence on 3' end which gets amplified
    fR = flanking sequence on 3' end
    padding = number of extra primers to try
    w = weights for cost function
    method = optimization method
    returns list of primers
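
To give a feel for what "seeding" a primer involves, here is a minimal sketch that grows an annealing region from the 5' end of a template until a target melting temperature is reached. It uses the Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough estimate for short oligos. `wallace_tm` and `pick_annealing_region` are hypothetical helpers, and this is not the cost-function optimization that `xtPCR` performs, so lengths and Tm values will differ from the library's output.

```python
def wallace_tm(seq: str) -> int:
    """Rough melting temperature in deg C: 2 per A/T plus 4 per G/C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def pick_annealing_region(template: str, target_tm: float, min_len: int, max_len: int) -> str:
    """Return the shortest 5' prefix within [min_len, max_len] that reaches target_tm.

    Falls back to the longest allowed prefix if the target is never reached.
    """
    upper = min(max_len, len(template))
    for n in range(min_len, upper + 1):
        if wallace_tm(template[:n]) >= target_tm:
            return template[:n]
    return template[:upper]


# First bases of the RFP coding sequence used in this demo
rfp_start = "ATGGCGAGTAGCGAAGACGTTATCAAAGAGTTCATG"
primer = pick_annealing_region(rfp_start, target_tm=55, min_len=15, max_len=60)
print(primer, wallace_tm(primer))
```

The `target_tm`, `min_len`, and `max_len` arguments mirror the roles of the `Tm` and `len` entries in `pcr.params['xtPCR']`, but the mapping is an assumption made for illustration only.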



The following shows how to obtain extension PCR primers that add promoter and RBS sequences to the RFP gene.

In [12]:
# slice out the RFP gene
RFP = pge.read_genbank('../data/dcas9_RFP.gb')
RFP = RFP[RFP['locus_tag'].str.contains('mRFP')].splice()
# slice out the ribosome binding site
RBS = pge.read_genbank('../data/xRFP.gb')
RBS = RBS[RBS['locus_tag'].str.contains('BBa_B0034')].splice()
# slice out the promoter
pLac = pge.read_genbank('../data/xRFP.gb')
pLac = pLac[pLac['locus_tag'].str.contains('pLac')].splice()

# assemble the promoter, rbs, and mRFP
df = pLac + 'gagacc' + RBS + 'ggtctc' + RFP

reading  ../data/dcas9_RFP.gb  as genbank file
reading  ../data/xRFP.gb  as genbank file
reading  ../data/xRFP.gb  as genbank file


In [13]:
pcr = pge.Designer()
pcr.params['xtPCR']['Tm'] = 55         # target annealing temperature for xtPCR
pcr.params['xtPCR']['len'] = [15, 60]  # defines the [min, max] primer lengths
pcr.params['verbose'] = False

insert = pLac + 'gagacc' + RBS + 'ggtctc'
res = pcr.xtPCR(insert, RFP, ' ')
print(res)
print(res.values)

running fwd
running rev
  locus_tag         Tm                                           sequence   
0       0_F  55.381908  AGATACTGAGCACAgagaccAAAGAGGAGAAAggtctc ATGGCGA...  \
1     fin_F  55.348127      AATTGACAATGTGAGCGAGTAACA AGATACTGAGCACAgagacc   
0     fin_R  56.777386                                 TTAAGCACCGGTGGAGTG   

                 annealed  strand  
0  ATGGCGAGTAGCGAAGACGTTA       1  
1    AGATACTGAGCACAgagacc       1  
0      TTAAGCACCGGTGGAGTG      -1  
[['0_F' 55.38190849297598
  'AGATACTGAGCACAgagaccAAAGAGGAGAAAggtctc ATGGCGAGTAGCGAAGACGTTA'
  'ATGGCGAGTAGCGAAGACGTTA' 1]
 ['fin_F' 55.34812743102748
  'AATTGACAATGTGAGCGAGTAACA AGATACTGAGCACAgagacc' 'AGATACTGAGCACAgagacc'
  1]
 ['fin_R' 56.777386483231 '  TTAAGCACCGGTGGAGTG' 'TTAAGCACCGGTGGAGTG' -1]]
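
The printed table can be read as a nested design: the `0_F` seed primer anneals to RFP and carries part of the insert as a tail, and the `fin_F` finisher primer anneals to the start of that tail and adds the rest. The sketch below checks this by stitching the two forward primers together at their shared region and comparing the result with `insert` plus the RFP annealing region. The sequences are copied from the output above with the spacing removed; `stitch` is a hypothetical helper, not a `plasmid` function.

```python
def stitch(outer: str, inner: str, annealed: str) -> str:
    """Merge an outer primer onto an inner primer through the outer primer's annealing region."""
    if not outer.endswith(annealed):
        raise ValueError("outer primer does not end with the annealing region")
    if not inner.startswith(annealed):
        raise ValueError("inner primer does not start with the annealing region")
    return outer + inner[len(annealed):]


# Parts as they appear in this demo
p_lac = "AATTGACAATGTGAGCGAGTAACAAGATACTGAGCACA"
rbs = "AAAGAGGAGAAA"
insert = p_lac + "gagacc" + rbs + "ggtctc"
rfp_anneal = "ATGGCGAGTAGCGAAGACGTTA"  # 'annealed' column of primer 0_F

# Forward primers from the table: tail + annealing region
seed_fwd = "AGATACTGAGCACAgagaccAAAGAGGAGAAAggtctc" + rfp_anneal   # 0_F
fin_fwd = "AATTGACAATGTGAGCGAGTAACA" + "AGATACTGAGCACAgagacc"       # fin_F

product_5p = stitch(fin_fwd, seed_fwd, "AGATACTGAGCACAgagacc")
print(product_5p == insert + rfp_anneal)
```

Splitting the 56-nt insert across two primers keeps each oligo within the `[15, 60]` length limit set in `pcr.params['xtPCR']['len']`.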


## Gibson assembly
The following shows how to design primers for Gibson assembly.

In [14]:
pcr = pge.Designer()
help(pcr.Gibson)

Help on method Gibson in module plasmid.designer:

Gibson(seqlist, w=[10, 1], method='differential_evolution', circular=True) method of plasmid.designer.Designer instance
    Design primers for gibson assembly
    seqlist = list of sequences to assemble via gibson in order 
    circular = assemble fragments into a circular construct
    returns list of primers



In [15]:
def get_parts():
    # slice out the LacI gene
    LacI = pge.read_genbank('../data/xRFP.gb')
    LacI = LacI[LacI['locus_tag'].str.contains('LacI')].splice()

    # slice out the RFP gene
    RFP = pge.read_genbank('../data/dcas9_RFP.gb')
    RFP = RFP[RFP['locus_tag'].str.contains('mRFP')].splice()

    # slice out the origin of replication
    df = pge.read_genbank('../data/xRFP.gb')
    vec = df[df['locus_tag'].str.contains('pSC101')]
    start = vec['start'][0]
    stop = vec['end'][0]
    vec = df[start:stop]
    return LacI, RFP, vec

In [16]:
LacI, RFP, vec = get_parts()
seq = []
seq+= [[' ',LacI,'AAAActttt']]
seq+= [[' ',RFP,'CGCCctttt']]
seq+= [[' ',vec,'GGGGctttt']]

pcr = pge.Designer()
pcr.params['gibson']['Tm'] = 50     # target annealing temperature of gibson fragments    
pcr.params['gibson']['window'] = 30 # +/- window in bp around frag edges to look for gibson overlap
pcr.params['gibson']['len'] = 20    # length of gibson overlap

pcr.params['xtPCR']['Tm'] = 55         # target annealing temperature for xtPCR
pcr.params['xtPCR']['len'] = [15, 60]  # defines the [min, max] primer lengths
pcr.params['xtPCR']['nM'] = [20, 500]  # defines the [seed, finisher] primer conc in nM
pcr.params['verbose'] = False

res = pcr.Gibson(seq)
print(res)

reading  ../data/xRFP.gb  as genbank file
reading  ../data/dcas9_RFP.gb  as genbank file
reading  ../data/xRFP.gb  as genbank file
res.x [10.95407762 17.41633213 26.61257437]
res.fun -57.0
exclude: []
overlaps: ['GCGGGCAGTAAAAAActttt', 'TTAACGCCctttt CTGTCA', 'tttt ATGGTGAATGTGAAA']
Tm overlap: [49.290031925644485, 48.14694596334914, 41.2522012369904]
processing primers for frag 0
running fwd
running rev
processing primers for frag 1
running fwd
running rev
processing primers for frag 2
running fwd
running rev
     locus_tag         Tm                                           sequence   
0  frag0_fin_F  55.851352                      tttt  ATGGTGAATGTGAAACCAGTAAC  \
1  frag0_fin_R  56.106442                        aaaagTTTT TTACTGCCCGCTTTCCA   
2  frag1_fin_F  55.335316            GCGGGCAGTAAAAAActttt  ATGGCGAGTAGCGAAGA   
3  frag1_fin_R  56.777386                TGACAG aaaagGGCG TTAAGCACCGGTGGAGTG   
4  frag2_fin_F  55.363272               TTAACGCCctttt  CTGTCAGACCAAGTTTACGAG   
5  f
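
Adjacent Gibson fragments must end in the same overlap, read on opposite strands. As a sanity check on the (truncated) output above, this sketch confirms that the first reported overlap sits at the 5' end of the `frag1_fin_F` primer and, as its reverse complement, at the 5' end of the `frag0_fin_R` primer. `reverse_complement` is a hypothetical helper written for this example; the sequences are copied from the output with the spacing removed.

```python
# Case-preserving complement table for DNA
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, keeping upper/lower case per base."""
    return seq.translate(_COMPLEMENT)[::-1]


overlap = "GCGGGCAGTAAAAAActttt"                           # first entry of 'overlaps'
frag0_rev = "aaaagTTTT" + "TTACTGCCCGCTTTCCA"              # frag0_fin_R
frag1_fwd = "GCGGGCAGTAAAAAActttt" + "ATGGCGAGTAGCGAAGA"   # frag1_fin_F

print(len(overlap))
print(frag1_fwd.startswith(overlap))
print(frag0_rev.startswith(reverse_complement(overlap)))
```

The overlap is 20 nt long, matching `pcr.params['gibson']['len'] = 20`.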

## Golden gate assembly
The following shows how to design primers for Golden Gate assembly.

In [17]:
pcr = pge.Designer()
help(pcr.GoldenGate)

Help on method GoldenGate in module plasmid.designer:

GoldenGate(seqlist, exclude=[], w=[0, 1], circular=True) method of plasmid.designer.Designer instance
    Design primers for goldengate assembly
    seqlist = list of sequences to assemble
    exclude = sites to exclude
    circular = assemble fragments into a circular construct
    returns list of primers



In [18]:
LacI, RFP, vec = get_parts()
seq = []
seq+= [['',LacI,'AAAActttt']]
seq+= [['',RFP,'CGCCctttt']]
seq+= [['',vec,'GGGGctttt']]

pcr = pge.Designer()
pcr.params['goldengate']['window'] = 20 # +/- window in bp around frag edges to look for overlap
pcr.params['goldengate']['ggN'] = 4     # length of golden gate overlap
pcr.params['goldengate']['ggsite'] = 'GGTCTCc'     # golden gate enzyme site
pcr.params['goldengate']['padding'] = 'atatatatgg' # padding around the golden gate site
pcr.params['xtPCR']['len'] = [15, 60]  # defines the [min, max] primer lengths
pcr.params['xtPCR']['nM'] = [20, 500]  # defines the [seed, finisher] primer conc in nM
pcr.params['xtPCR']['Tm'] = 55         # target annealing temperature for xtPCR

res = pcr.GoldenGate(seq)
print(res)

reading  ../data/xRFP.gb  as genbank file
reading  ../data/dcas9_RFP.gb  as genbank file
reading  ../data/xRFP.gb  as genbank file
res.x [27.21765366 15.28962375 12.57292265]
res.fun -12.0
exclude: []
overlaps: ['GTAG', 'cttt', 'GGGc']
Tm overlap: [-63.72743868625798, -70.43593665137047, -41.01531509599255]
processing primers for frag 0
running fwd


running rev
processing primers for frag 1
running fwd


running rev
processing primers for frag 2
running fwd


running rev
     locus_tag         Tm                                           sequence   
0  frag0_fin_F  55.851352  atatatatggGGTCTCcGGGctttt ATGGTGAATGTGAAACCAGTAAC  \
1  frag0_fin_R  56.106442  atatatatggGGTCTCcCTACTCGCCATaaaagTTTT TTACTGCC...   
2  frag1_fin_F  55.129431           atatatatggGGTCTCc GTAGCGAAGACGTTATCAAAGA   
3  frag1_fin_R  56.777386       atatatatggGGTCTCcaaagGGCG TTAAGCACCGGTGGAGTG   
4  frag2_fin_F  55.363272       atatatatggGGTCTCcctttt CTGTCAGACCAAGTTTACGAG   
5  frag2_fin_R  54.627626     atatatatggGGTCTCcgCCCC GTTACATTGTCGATCTGTTCATG   
6         seq0        NaN  atatatatggGGTCTCcGGGcttttATGGTGAATGTGAAACCAGTA...   
7         seq1        NaN  atatatatggGGTCTCcGTAGCGAAGACGTTATCAAAGAGTTCATG...   
8         seq2        NaN  atatatatggGGTCTCccttttCTGTCAGACCAAGTTTACGAGCTC...   

                  annealed  strand  
0  ATGGTGAATGTGAAACCAGTAAC     1.0  
1        TTACTGCCCGCTTTCCA    -1.0  
2   GTAGCGAAGACGTTATCAAAGA     1.0  
3       TTAAGCACCGGTGGAGTG    -1.0  
4 
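
Each Golden Gate primer in the table has the layout padding + enzyme site + overhang + annealing region. The sketch below pulls out the 4-nt overhang (`ggN = 4`) that follows the `GGTCTCc` site in each primer and checks that the reverse primer of one fragment carries the reverse complement of the next fragment's forward overhang, which is what lets the fragments ligate in order. It assumes the overhang starts immediately after `ggsite`; `overhang` and `reverse_complement` are hypothetical helpers, and the primer sequences are copied from the table with the spacing removed.

```python
GGSITE = "GGTCTCc"  # value of params['goldengate']['ggsite'] in this demo
GGN = 4             # value of params['goldengate']['ggN']

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, keeping upper/lower case per base."""
    return seq.translate(_COMPLEMENT)[::-1]


def overhang(primer: str) -> str:
    """Return the GGN bases that directly follow the first GGSITE in the primer."""
    start = primer.index(GGSITE) + len(GGSITE)
    return primer[start:start + GGN]


fwd = {
    "frag0": "atatatatggGGTCTCcGGGcttttATGGTGAATGTGAAACCAGTAAC",
    "frag1": "atatatatggGGTCTCcGTAGCGAAGACGTTATCAAAGA",
    "frag2": "atatatatggGGTCTCccttttCTGTCAGACCAAGTTTACGAG",
}
rev = {
    "frag0": "atatatatggGGTCTCcCTACTCGCCATaaaagTTTT",  # truncated in the table
    "frag1": "atatatatggGGTCTCcaaagGGCGTTAAGCACCGGTGGAGTG",
    "frag2": "atatatatggGGTCTCcgCCCCGTTACATTGTCGATCTGTTCATG",
}

order = ["frag0", "frag1", "frag2"]
junctions = []
for i, name in enumerate(order):
    nxt = order[(i + 1) % len(order)]  # wrap around for a circular assembly
    ok = overhang(rev[name]) == reverse_complement(overhang(fwd[nxt]))
    junctions.append(ok)
    print(name, "->", nxt, overhang(fwd[nxt]), ok)
```

The forward overhangs recovered this way are the same three sequences listed under `overlaps` in the log.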

