# QUEEN Cloning Script for a Large Catalog of Plasmids
### Samuel King | Yachie Lab | Last updated: January 2023

This QUEEN script constructs the plasmids described in the CloneSelect paper by Digestion-Ligation, Gibson Assembly, and Golden Gate Assembly.

### Setting up QUEEN

* I recommend using Anaconda as a distribution and coding your QUEEN scripts in JupyterLab.
* This tutorial https://biocircuits.github.io/chapters/00_setting_up_python_computing_environment.html describes very clearly how to set up your environment with Anaconda.
* Write out the exact cloning plan for your desired plasmid step-by-step, including template plasmid names, primer names, enzyme names, etc., before starting your QUEEN script.
* If designing many plasmids, a spreadsheet is useful to carry the information. Then, you can code your functions to work with the spreadsheet items.
* The full tutorial for QUEEN can be found here: https://github.com/yachielab/QUEEN
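
As a minimal sketch of the spreadsheet-driven approach, the snippet below parses a small cloning plan and turns each row into one step. The column names, plasmid names, and primer pairings are hypothetical placeholders, not the actual CloneSelect plan. It uses the standard `csv` module; the same table could equally be loaded with `pd.read_csv`.

```python
import csv
import io

# Hypothetical cloning plan: column names and row contents are illustrative assumptions.
PLAN_CSV = """plasmid,method,template,fw_primer,rv_primer
example_plasmid_1,Gibson Assembly,pSI_626,SI1313,SK132
example_plasmid_2,Golden Gate Assembly,pSI_626,SI1304,SK132
"""

def load_plan(text):
    """Parse the plan into a list of dicts, one per plasmid."""
    return list(csv.DictReader(io.StringIO(text)))

def describe_step(row):
    """Turn one spreadsheet row into a readable cloning step."""
    return (f"{row['plasmid']}: PCR {row['template']} with "
            f"{row['fw_primer']}/{row['rv_primer']}, then {row['method']}")

plan = load_plan(PLAN_CSV)
for row in plan:
    print(describe_step(row))
```

In a real script, each row would be passed to the wrapper functions listed below instead of being printed.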

### After setting up Anaconda and getting JupyterLab going, start your script like this:

In [2316]:
%matplotlib agg
#!pip install python-queen  #install QUEEN here!
#!conda install graphviz    #graphviz is not automatically installed by Anaconda and you need to install it yourself here
import sys
from QUEEN.queen import *
set_namespace(globals())
from QUEEN import cutsite as cs
import re
import pandas as pd
import numpy as np
pd.set_option('display.max_colwidth', None)
from Bio import GenBank, SeqIO

### QUEEN Wrapper Functions
Here we provide the following new functions, which build on Python regular expressions (`re`) and QUEEN to extend QUEEN's functionality for a large catalog of plasmids:  
  
__Sequence search functions__:
* `find_primer_binding`: Provides a QUEEN feature list of a primer binding sequence on a template when given primer and template QUEEN objects.
* `primer_binding_object`: Provides a QUEEN feature object of a (18bp+) primer binding sequence on a template when given primer and template QUEEN objects.
* `primer_binding_object_short`: Provides a QUEEN feature object of a short (15bp+) primer binding sequence on a template when given primer and template QUEEN objects.
* `find_primer_binding_str`: Provides a string of a primer binding sequence on a template when given the primer type (specified as 'FW' or 'RV') and the primer and template QUEEN objects.
* `find_primer_binding_str_short`: Provides a string of a primer binding sequence on a template when given the primer type (specified as 'FW' or 'RV') and the primer and template QUEEN objects (with reduced annealing requirement).
* `primer_binding_start_pos`: Provides the start index position of the template sequence of a primer binding on a searched template sequence.
* `primer_binding_end_pos`: Provides the end index position of the template sequence of a primer binding on a searched template sequence.
* `find_primer_binding_primer`: Provides a QUEEN feature list of a primer binding sequence on another primer when given two primer QUEEN objects.
* `find_primer_binding_primer_str`: Provides a string of a primer binding sequence on another primer when given two primer QUEEN objects.
* `find_primer_binding_primer_str_short`: Provides a string of a primer binding sequence on another primer when given two primer QUEEN objects (with reduced annealing requirement).
* `find_primer_binding_primer_start`: Provides the start index position of the binding portion of a primer sequence bound on a searched primer.
* `find_primer_binding_primer_end`: Provides the end index position of the binding portion of a primer sequence bound on a searched primer.
* `find_similar_sequence`: Provides the string of the specified binding portion of a primer that contains a specified number of mismatches to a template/plasmid string.

__DNA modification functions__:
* `generate_complement`: Provides a string of the complementary sequence to the provided string DNA (the complement only; the reverse complement is what `flipdna` provides).
* `generate_dsdna`: Provides a double-stranded QUEEN object when given a single-stranded string DNA.
* `anneal_oligos`: Creates a double-stranded DNA with or without overhangs on either side when given two single-stranded QUEEN objects.
    * __Helper functions__:
        * `five_prime_overhang`: Provides the 5' overhang of a ssDNA with a known annealing sequence to another ssDNA.
        * `three_prime_overhang`: Provides the 3' overhang of a ssDNA with a known annealing sequence to another ssDNA.
        * `primer_overhangs`: Provides both the 5' and 3' overhangs of a ssDNA with a known annealing sequence to another ssDNA.
    * `anneal_oligos_object`: Same as `anneal_oligos` but provides the double-stranded QUEEN object.
    * `anneal_oligos_object_short`: Same as `anneal_oligos_object` but has a reduced annealing requirement.
* `gibson_assembly`: Provides the circular ligated product QUEEN object of 2 overlapping DNA fragments (akin to a Gibson Assembly reaction).
* `stitch_fragments`: Provides the linear ligated product QUEEN object of 2 overlapping DNA fragments.
* `template_free_pcr`: Provides the linear PCR product QUEEN object of 2 annealing primers.
    
__Wrapper cloning functions__:
* `create_pcr_product`: Provides a PCR product QUEEN object when given primer and template QUEEN objects.
* `create_pcr_product_special`: Provides a PCR product QUEEN object when given a *short* Fw primer (15bp+), Rv primer, and template QUEEN objects (special case).
* `create_pcr_product_mismatches`: Provides a PCR product QUEEN object when given a Fw primer with mismatches, a Rv primer with no mismatches, and a template/plasmid QUEEN object.
* `double_digest_insert`: Provides a digested QUEEN object when given two restriction enzymes and the insert QUEEN object.
* `double_digest_backbone`: Provides a digested QUEEN object when given two restriction enzymes and the backbone QUEEN object.
* `typeIIS_digest_insert`: Provides a digested QUEEN object when given a Type-IIS restriction enzyme and the insert QUEEN object.
* `typeIIS_digest_backbone`: Provides a digested QUEEN object when given a Type-IIS restriction enzyme and the backbone QUEEN object. 
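
To make the complement versus reverse-complement distinction concrete, here is a small plain-string sketch. The helper names `complement` and `reverse_complement` are hypothetical and are not the implementation of `generate_complement` or `flipdna`. They only illustrate the two operations on an uppercase or lowercase DNA string.

```python
# Translation table for the four bases, in both cases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def complement(seq):
    """Base-by-base complement, read in the same left-to-right order."""
    return seq.translate(COMPLEMENT)

def reverse_complement(seq):
    """Complement read in the opposite direction (the other strand, 5' to 3')."""
    return complement(seq)[::-1]

print(complement("ATGCCG"))          # TACGGC
print(reverse_complement("ATGCCG"))  # CGGCAT
```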

## Sequence search functions

In [1957]:
def find_primer_binding(primer,    #QUEEN object
                        template   #QUEEN object
                       ):
    """
    Given a primer and template sequence, finds the initial binding site of the primer in the template sequence, whether the primer has an overhang or not.
    Returns the features of the primer binding site including the start position, end position, if it's the + or - strand, and the binding sequence.
    Requires minimum 18bp annealing between primer and template. The 3' end of the binding primer must anneal.
    """
    if len(primer.seq) < 18:                                             #a primer shorter than the 18bp minimum cannot satisfy the annealing requirement
        return 'No primer binding site found.'
    
    n = len(primer.seq)     #length of primer
    binding_site = []       #empty list for candidate binding sites
    i=-18                   #initial binding site length, always minimum 18bp in PCR
    
    while i <= n:
        template.searchsequence(query=primer.seq[i:], product="match")   #cumulatively search for each 3' nucleotide in the primer until the max amount of primer matches the template
        binding_site.append(match)                                       #as i iterates, each candidate match is listed in binding_site
        if match == []:                                                  #match is a feature list object, thus QUEEN creates an empty feature list '[]' if there are no matching characters
            break                                                        #end the loop when there are no more characters in the primer that match the template sequence
        elif -i == n:                                                    #i in our loop is always negative, so when -i == the total length of the primer (n), end the loop
            break                                                        #end the loop when there are no more characters in the primer to analyze
        i = i-1                                                          #iterate back from each 3' nucleotide while searching
    
    if binding_site == [[]]:                                             #if the initial 18bp search finds nothing, binding_site holds only a single empty feature list
        return 'No primer binding site found.'                           #notify that no binding sequence for the primer was found in the template
    elif binding_site[-1] == []:                                         #the while loop ends the binding_site list with an empty feature list if the primer has an overhang
        return template.printfeature(binding_site[-2], seq=True, attribute=["start", "end", "strand"])      #show the last matching binding site for a primer with an overhang
    else:
        return template.printfeature(binding_site[-1], seq=True, attribute=["start", "end", "strand"])      #show the last matching binding site for a primer with no overhang


#Examples
    
QUEEN(record="139107", dbtype="addgene", product="pSI_626")

#Example 1: Rv primer binds to 24 bp in pSI_626, but has 21 bp overhang that actually matches another part of pSI_626
QUEEN(seq="CCACCTTGCGCTTCTTCTTTGGGCCTGGGTTGTCCCGCAGGTACT",product="SK132")
find_primer_binding(SK132, pSI_626) #works!

#Example 2: Fw primer binds to 23 bp in pSI_626 with no overhang
QUEEN(seq="ccaaagaggtgctggacgccacc", product="SI1313")
find_primer_binding(SI1313, pSI_626) #works!

#Example 3: Fw primer binds to 45 bp in pSI_626 with no overhang
QUEEN(seq="cctgtctcagctgggaggtgacGGCGGAGGAGGAACTGGAGGAGG", product="SI1304")
find_primer_binding(SI1304, pSI_626) #works!

#Example 4: The 3'-most 18bp of the fw primer do not bind pSI_626, and the 25bp 5' overhang matches pSI_626
QUEEN(seq="cactatagggagagccgccaccatgggacacactctttacgcc", product="SK130")
find_primer_binding(SK130, pSI_626) #works!

#Example 5: None of the rv primer matches pSI_626
QUEEN(seq="gccagcacccccttcaagtt", product="SK109")
find_primer_binding(SK109, pSI_626) #works!

#Example 6: Primer is shorter than 18bp
QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003") #template
QUEEN(seq="ACACGCTCGGT", product="SI2004cantanneal") #primer
find_primer_binding(SI2004cantanneal, SI2003) #works!

start  end   strand  sequence                  
5341   5365  -       GGCCTGGGTTGTCCCGCAGGTACT  

start  end   strand  sequence                 
4744   4767  +       CCAAAGAGGTGCTGGACGCCACC  

start  end   strand  sequence                                       
4811   4856  +       CCTGTCTCAGCTGGGAGGTGACGGCGGAGGAGGAACTGGAGGAGG  



'No primer binding site found.'
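
The search strategy used above (start from the 3'-most 18 bases and extend toward the 5' end until the match fails) can be sketched on plain strings without QUEEN. This is an illustrative simplification: it checks a single strand, ignores circular templates, and uses made-up sequences. The function name `longest_3prime_match` and the `min_anneal` parameter are assumptions introduced here.

```python
def longest_3prime_match(primer, strand, min_anneal=18):
    """Return the longest 3'-anchored primer suffix found in strand, or None."""
    primer, strand = primer.upper(), strand.upper()
    if len(primer) < min_anneal:
        return None                      # too short to meet the annealing requirement
    best = None
    for length in range(min_anneal, len(primer) + 1):
        suffix = primer[-length:]        # always anchored at the primer's 3' end
        if suffix not in strand:
            break                        # extension failed: the rest is 5' overhang
        best = suffix
    return best

# Made-up sequences for illustration only.
binding = "GCAAGCTTGGCGTAATCATGG"                      # 21 nt annealing region
template = "TTGACCAT" + binding + "TCATAGCTGTTTCC"
primer = "GGATCC" + binding                            # 6 nt 5' overhang

print(longest_3prime_match(primer, template))          # GCAAGCTTGGCGTAATCATGG
```

Exposing the minimum annealing length as a parameter is one way to cover both the 18bp and the reduced 15bp requirement with a single function.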

In [1958]:
def primer_binding_object(primer,   #QUEEN object
                          template  #QUEEN object
                         ):
    """
    Given a primer and template sequence, finds the initial binding site of the primer in the template sequence, whether the primer has an overhang or not.
    Returns the template.searchsequence feature object of the primer binding site that QUEEN can use to perform cropDNA.
    Requires minimum 18bp annealing between primer and template. The 3' end of the binding primer must anneal.
    """
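    if len(primer.seq) < 18:            #guard: a primer shorter than 18bp can never reach -i == n, so the loop below may not terminate
        return 'No primer binding site found.'
    n = len(primer.seq)     #length of primer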
    binding_site = []       #empty list for candidate binding sites
    i=-18                   #initial binding site length, always minimum 18bp in PCR
    
    while i <= n:
        template.searchsequence(query=primer.seq[i:], product="match")   #cumulatively search for each 3' nucleotide in the primer until the max amount of primer matches the template
        binding_site.append(match)                                       #as i iterates, each candidate match is listed in binding_site
        if match == []:                                                  #match is a feature list object, thus QUEEN creates an empty feature list '[]' if there are no matching characters
            break                                                        #end the loop when there are no more characters in the primer that match the template sequence
        elif -i == n:                                                    #i in our loop is always negative, so when -i == the total length of the primer (n), end the loop
            break                                                        #end the loop when there are no more characters in the primer to analyze
        i = i-1                                                          #iterate back from each 3' nucleotide while searching
    
    if binding_site == [[]]:                                             #if the initial 18bp search finds nothing, binding_site holds only a single empty feature list
        return 'No primer binding site found.'                           #notify that no binding sequence for the primer was found in the template
    elif binding_site[-1] == []:                                         #the while loop ends the binding_site list with an empty feature list if the primer has an overhang
        return binding_site[-2]                                          #show the last matching binding site for a primer with an overhang
    else:
        return binding_site[-1]                                          #show the last matching binding site for a primer with no overhang



QUEEN(record="139107", dbtype="addgene", product="pSI_626")

#Example 1: Rv primer binds to 24 bp in pSI_626, but has 21 bp overhang that actually matches another part of pSI_626
#QUEEN(seq="CCACCTTGCGCTTCTTCTTTGGGCCTGGGTTGTCCCGCAGGTACT",product="SK132")
#primer_binding_object(SK132, pSI_626) #works!

#Example 2: Fw primer binds to 23 bp in pSI_626 with no overhang
#QUEEN(seq="ccaaagaggtgctggacgccacc", product="SI1313")
#primer_binding_object(SI1313, pSI_626) #works!

#Example 3: Fw primer binds to 45 bp in pSI_626 with no overhang
#QUEEN(seq="cctgtctcagctgggaggtgacGGCGGAGGAGGAACTGGAGGAGG", product="SI1304")
#primer_binding_object(SI1304, pSI_626) #works!

#Example 4: The 3'-most 18bp of the fw primer do not bind pSI_626, and the 25bp 5' overhang matches pSI_626
#QUEEN(seq="cactatagggagagccgccaccatgggacacactctttacgcc", product="SK130")
#primer_binding_object(SK130, pSI_626) #works!

#Example 5: None of the rv primer matches pSI_626
#QUEEN(seq="gccagcacccccttcaagtt", product="SK109")
#primer_binding_object(SK109, pSI_626) #works!

<queen.QUEEN object; project='pSI_802', length='8725 bp', topology='circular'>

In [1959]:
def primer_binding_object_short(primer,   #QUEEN object
                                template  #QUEEN object
                                ):
    """
    *FUNCTION SPECIALLY MADE FOR BINDING PRIMERS THAT ANNEAL 15 BP OR LONGER.*
    Given a primer and template sequence, finds the initial binding site of the primer in the template sequence, whether the primer has an overhang or not.
    Returns the template.searchsequence feature object of the primer binding site that QUEEN can use to perform cropDNA.
    Requires minimum 15bp annealing between primer and template. The 3' end of the binding primer must anneal.
    """
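    if len(primer.seq) < 15:             #guard: a primer shorter than 15bp can never reach -i == n, so the loop below may not terminate
        return 'No primer binding site found.'
    n = len(primer.seq)      #length of primer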
    binding_site = []        #empty list for candidate binding sites
    i=-15                    #initial binding site length, special for primers with binding sites less than 18 bp long
    
    while i <= n:
        template.searchsequence(query=primer.seq[i:], product="match")   #cumulatively search for each 3' nucleotide in the primer until the max amount of primer matches the template
        binding_site.append(match)                                       #as i iterates, each candidate match is listed in binding_site
        if match == []:                                                  #match is a feature list object, thus QUEEN creates an empty feature list '[]' if there are no matching characters
            break                                                        #end the loop when there are no more characters in the primer that match the template sequence
        elif -i == n:                                                    #i in our loop is always negative, so when -i == the total length of the primer (n), end the loop
            break                                                        #end the loop when there are no more characters in the primer to analyze
        i = i-1                                                          #iterate back from each 3' nucleotide while searching
    
    if binding_site == [[]]:                                             #if the initial 15bp search finds nothing, binding_site holds only a single empty feature list
        return 'No primer binding site found.'                           #notify that no binding sequence for the primer was found in the template
    elif binding_site[-1] == []:                                         #the while loop ends the binding_site list with an empty feature list if the primer has an overhang
        return binding_site[-2]                                          #show the last matching binding site for a primer with an overhang
    else:
        return binding_site[-1]                                          #show the last matching binding site for a primer with no overhang



QUEEN(record="139107", dbtype="addgene", product="pSI_626")
QUEEN(record="36084", dbtype="addgene", product="pLV_mCherry")

#Example 1: Rv primer binds to 24 bp in pSI_626, but has 21 bp overhang that actually matches another part of pSI_626
QUEEN(seq="CCACCTTGCGCTTCTTCTTTGGGCCTGGGTTGTCCCGCAGGTACT",product="SK132")
primer_binding_object_short(SK132, pSI_626) #works!

#Example 2: Fw primer binds to 23 bp in pSI_626 with no overhang
QUEEN(seq="ccaaagaggtgctggacgccacc", product="SI1313")
primer_binding_object_short(SI1313, pSI_626) #works!

#Example 3: Fw primer binds to 45 bp in pSI_626 with no overhang
QUEEN(seq="cctgtctcagctgggaggtgacGGCGGAGGAGGAACTGGAGGAGG", product="SI1304")
primer_binding_object_short(SI1304, pSI_626) #works!

#Example 4: The 3'-most 18bp of the fw primer do not bind pSI_626, and the 25bp 5' overhang matches pSI_626
QUEEN(seq="cactatagggagagccgccaccatgggacacactctttacgcc", product="SK130")
primer_binding_object_short(SK130, pSI_626) #works!

#Example 5: None of the rv primer matches pSI_626
QUEEN(seq="gccagcacccccttcaagtt", product="SK109")
primer_binding_object_short(SK109, pSI_626) #works!

#Example 6: Rv primer matches 21 bp in pLV_mCherry
QUEEN(seq="AATTGGATCCTTACTTGTACAGCTCGTCCA", product="SI680")
primer_binding_object_short(SI680, pLV_mCherry) #works!

#Example 7: Fw primer matches 16 bp in pLV_mCherry
QUEEN(seq="GGTGAATTCCCGAGCGTGTCAGGGTGACCGTGAGCAAGGGCGAGG", product="SI1293")
primer_binding_object_short(SI1293, pLV_mCherry) #works!
#add more examples for primers that have mismatches in their binding site

[DNAfeature(FeatureLocation(ExactPosition(4701), ExactPosition(4717), strand=-1), type='misc_feature')]

In [1960]:
def find_primer_binding_str(primer_type,  #'FW' or 'RV'
                            primer,       #QUEEN object
                            template      #QUEEN object
                           ):
    """
    Given a type of primer ('FW' or 'RV'), the primer, and template sequence, finds the initial binding site of the primer in the template sequence, whether the primer has an overhang or not.
    Returns the string sequence of the portion of the primer that binds the template sequence.
    Requires minimum 18bp annealing between primer and template. The 3' end of the binding primer must anneal.
    """
    # Note: re.search cannot search in the 3' to 5' direction on a QUEEN DNA object, so the function will not work for Rv primers unless the template is flipped
    # For example:
      # match1 = re.search('hell', 'hello') #Finds match='hell'
      # print(match1)
      # match2 = re.search('lleh', 'hello') #Finds None
      # print(match2)
    
    n = len(primer.seq)                  #length of primer
    i=-18                                #starting point of i
    flippedtemplate = flipdna(template)  #reverse complement of template
    binding_site = []
    
    
    if primer_type == 'FW':                                 #for searching templates with Fw primer
        while i <= n:
            match = re.search(primer.seq[i:], template.seq) #search for each character in the primer string until the maximum amount of the primer matches the template string
            binding_site.append(match)                      #add matching string to the list binding_site
            if match == None:                               #end loop when there are no more matches
                break
            elif -i == n:                                   #end loop when there is no more primer string to iterate on
                break
            i = i-1
    elif primer_type == 'RV':                               #for searching templates with Rv primer; template must be reverse complement because re can't search on dsDNA like QUEEN can
        while i <= n:
            match = re.search(primer.seq[i:], flippedtemplate.seq)
            binding_site.append(match)
            if match == None:
                break
            elif -i == n:
                break
            i = i-1
            
    if len(binding_site) == 1 and binding_site[-1] == None:  #if there is no binding sequence, the binding_site list will only contain 1 NoneType object
        return 'No primer binding site found.'               #notify that no binding sequence for the primer was found in the template
    elif binding_site[-1] == None:                           #the while loop ends the binding_site list with a NoneType object if the primer has an overhang
        return binding_site[-2].group()                      #show the last matching binding site for a primer with an overhang
    else:
        return binding_site[-1].group()                      #show the last matching binding site for a primer with no overhang
        

#Examples

QUEEN(record="139107", dbtype="addgene", product="pSI_626")

#Example 1: Rv primer binds to 24 bp in pSI_626, but has 21 bp overhang that actually matches another part of pSI_626
QUEEN(seq="CCACCTTGCGCTTCTTCTTTGGGCCTGGGTTGTCCCGCAGGTACT",product="SK132")  #Rv primer that binds 24bp of pSI_626
find_primer_binding(SK132, pSI_626)                                         #Finds GGCCTGGGTTGTCCCGCAGGTACT 
find_primer_binding_str('RV', SK132, pSI_626)                               #Returns string of binding portion of primer

#Example 2: Fw primer binds to 23 bp in pSI_626 with no overhang
QUEEN(seq="ccaaagaggtgctggacgccacc", product="SI1313")                      #Fw primer that has no overhang and binds all 23bp
find_primer_binding(SI1313, pSI_626)                                        #Finds CCAAAGAGGTGCTGGACGCCACC
find_primer_binding_str('FW', SI1313, pSI_626)                              #Returns string of binding portion of primer; the full primer matches, so the loop ends on a match rather than on None

#Example 3: Fw primer binds to 23 bp in pSI_626 with no overhang but is given with wrong primer type
QUEEN(seq="ccaaagaggtgctggacgccacc", product="SI1313")                      #Fw primer that has no overhang and binds all 23bp
find_primer_binding(SI1313, pSI_626)                                        #Finds CCAAAGAGGTGCTGGACGCCACC
find_primer_binding_str('RV', SI1313, pSI_626)                              #Returns 'No primer binding site found.'

#Example 4: Rv primer binds to a Fw primer
QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003") 
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="SI2004")
find_primer_binding_str('RV', SI2004, SI2003)


start  end   strand  sequence                  
5341   5365  -       GGCCTGGGTTGTCCCGCAGGTACT  

start  end   strand  sequence                 
4744   4767  +       CCAAAGAGGTGCTGGACGCCACC  

start  end   strand  sequence                 
4744   4767  +       CCAAAGAGGTGCTGGACGCCACC  



'CCCTTATGACCCTGACACGCTCGGT'
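
The need to flip the template for Rv primers can be reproduced with plain strings. The sketch below uses the SI2003 and SI2004 sequences from Example 4. The `reverse_complement` helper is a hypothetical stand-in for `flipdna` and works on strings rather than QUEEN objects.

```python
import re

def reverse_complement(seq):
    """Reverse complement of a plain DNA string (stand-in for flipping a QUEEN object)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

template = "CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG"   # SI2003
rv_primer = "CCCTTATGACCCTGACACGCTCGGT"          # SI2004

top_hit = re.search(rv_primer, template)                         # top strand only
bottom_hit = re.search(rv_primer, reverse_complement(template))  # flipped template

print(top_hit)             # None
print(bottom_hit.group())  # CCCTTATGACCCTGACACGCTCGGT
```

Because `re.search` reads only the string it is given, the Rv primer is invisible on the top strand and is found only after the template is reverse-complemented.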

In [1961]:
def find_primer_binding_str_short(primer_type,  #'FW' or 'RV'
                                  primer,       #QUEEN object
                                  template      #QUEEN object
                                 ):
    """
    Given a type of primer ('FW' or 'RV'), the primer, and template sequence, finds the initial binding site of the primer in the template sequence, whether the primer has an overhang or not.
    Returns the string sequence of the portion of the primer that binds the template sequence.
    Requires minimum 15bp annealing between primer and template. The 3' end of the binding primer must anneal.
    """
    # Note: re.search cannot search in the 3' to 5' direction on a QUEEN DNA object, so the function will not work for Rv primers unless the template is flipped
    # For example:
      # match1 = re.search('hell', 'hello') #Finds match='hell'
      # print(match1)
      # match2 = re.search('lleh', 'hello') #Finds None
      # print(match2)
    
    n = len(primer.seq)                  #length of primer
    i=-15                                #starting point of i
    flippedtemplate = flipdna(template)  #reverse complement of template
    binding_site = []
    
    
    if primer_type == 'FW':                                 #for searching templates with Fw primer
        while i <= n:
            match = re.search(primer.seq[i:], template.seq) #search for each character in the primer string until the maximum amount of the primer matches the template string
            binding_site.append(match)                      #add matching string to the list binding_site
            if match == None:                               #end loop when there are no more matches
                break
            elif -i == n:                                   #end loop when there is no more primer string to iterate on
                break
            i = i-1
    elif primer_type == 'RV':                               #for searching templates with Rv primer; template must be reverse complement because re can't search on dsDNA like QUEEN can
        while i <= n:
            match = re.search(primer.seq[i:], flippedtemplate.seq)
            binding_site.append(match)
            if match == None:
                break
            elif -i == n:
                break
            i = i-1
            
    if len(binding_site) == 1 and binding_site[-1] == None:  #if there is no binding sequence, the binding_site list will only contain 1 NoneType object
        return 'No primer binding site found.'               #notify that no binding sequence for the primer was found in the template
    elif binding_site[-1] == None:                           #the while loop ends the binding_site list with a NoneType object if the primer has an overhang
        return binding_site[-2].group()                      #show the last matching binding site for a primer with an overhang
    else:
        return binding_site[-1].group()                      #show the last matching binding site for a primer with no overhang
        

#Examples

QUEEN(record="139107", dbtype="addgene", product="pSI_626")

#Example 1: Rv primer binds to 24 bp in pSI_626, but has 21 bp overhang that actually matches another part of pSI_626
QUEEN(seq="CCACCTTGCGCTTCTTCTTTGGGCCTGGGTTGTCCCGCAGGTACT",product="SK132")  #Rv primer that binds 24bp of pSI_626
find_primer_binding(SK132, pSI_626)                                         #Finds GGCCTGGGTTGTCCCGCAGGTACT 
find_primer_binding_str_short('RV', SK132, pSI_626)                               #Returns string of binding portion of primer

#Example 2: Fw primer binds to 23 bp in pSI_626 with no overhang
QUEEN(seq="ccaaagaggtgctggacgccacc", product="SI1313")                      #Fw primer that has no overhang and binds all 23bp
find_primer_binding(SI1313, pSI_626)                                        #Finds CCAAAGAGGTGCTGGACGCCACC
find_primer_binding_str_short('FW', SI1313, pSI_626)                              #Returns string of binding portion of primer; the full primer matches, so the loop ends on a match rather than on None

#Example 3: Fw primer binds to 23 bp in pSI_626 with no overhang but is given with wrong primer type
QUEEN(seq="ccaaagaggtgctggacgccacc", product="SI1313")                      #Fw primer that has no overhang and binds all 23bp
find_primer_binding(SI1313, pSI_626)                                        #Finds CCAAAGAGGTGCTGGACGCCACC
find_primer_binding_str_short('RV', SI1313, pSI_626)                              #Returns 'No primer binding site found.'

#Example 4: Rv primer binds to a Fw primer
QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003") 
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="SI2004")
find_primer_binding_str_short('RV', SI2004, SI2003)


start  end   strand  sequence                  
5341   5365  -       GGCCTGGGTTGTCCCGCAGGTACT  

start  end   strand  sequence                 
4744   4767  +       CCAAAGAGGTGCTGGACGCCACC  

start  end   strand  sequence                 
4744   4767  +       CCAAAGAGGTGCTGGACGCCACC  



'CCCTTATGACCCTGACACGCTCGGT'

In [1962]:
def primer_binding_start_pos(primer_type,  #'FW' or 'RV'
                             primer,       #QUEEN object
                             template      #QUEEN object
                            ):
    """
    Given a type of primer ('FW' or 'RV'), the primer, and template sequence, finds the initial binding site of the primer in the template sequence, whether the primer has an overhang or not.
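    Returns the start index position of the template sequence from the portion of the primer that binds the template sequence.
    For 'RV' primers, the index is relative to the reverse complement of the template (flipdna(template)), not the original template.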
    Requires minimum 18bp annealing between primer and template. The 3' end of the binding primer must anneal.
    """
    # Note: re.search cannot search in the 3' to 5' direction on a QUEEN DNA object, so the function will not work for Rv primers unless the template is flipped
    # For example:
      # match1 = re.search('hell', 'hello') #Finds match='hell'
      # print(match1)
      # match2 = re.search('lleh', 'hello') #Finds None
      # print(match2)
    
    n = len(primer.seq)                  #length of primer
    i=-18                                #starting point of i
    flippedtemplate = flipdna(template)  #reverse complement of template
    binding_site = []
    
    
    if primer_type == 'FW':                                 #for searching templates with Fw primer
        while i <= n:
            match = re.search(primer.seq[i:], template.seq) #search for each character in the primer string until the maximum amount of the primer matches the template string
            binding_site.append(match)                      #add matching string to the list binding_site
            if match == None:                               #end loop when there are no more matches
                break
            elif -i == n:                                   #end loop when there is no more primer string to iterate on
                break
            i = i-1
    elif primer_type == 'RV':                               #for searching templates with Rv primer; template must be reverse complement because re can't search on dsDNA like QUEEN can
        while i <= n:
            match = re.search(primer.seq[i:], flippedtemplate.seq)
            binding_site.append(match)
            if match == None:
                break
            elif -i == n:
                break
            i = i-1
            
    if len(binding_site) == 1 and binding_site[-1] == None:  #if there is no binding sequence, the binding_site list will only contain 1 NoneType object
        return 'No primer binding site found.'               #notify that no binding sequence for the primer was found in the template
    elif binding_site[-1] == None:                           #the while loop ends the binding_site list with a NoneType object if the primer has an overhang
        return binding_site[-2].start()                      #show the last matching binding site for a primer with an overhang
    else:
        return binding_site[-1].start()                      #show the last matching binding site for a primer with no overhang
        

#Examples

QUEEN(record="139107", dbtype="addgene", product="pSI_626")

#Example 1: Rv primer binds to 24 bp in pSI_626, but has 21 bp overhang that actually matches another part of pSI_626
QUEEN(seq="CCACCTTGCGCTTCTTCTTTGGGCCTGGGTTGTCCCGCAGGTACT",product="SK132")  #Rv primer that binds 24bp of pSI_626
find_primer_binding_str('RV', SK132, pSI_626)                               #Finds GGCCTGGGTTGTCCCGCAGGTACT 
primer_binding_start_pos('RV', SK132, pSI_626)                              #Returns start position of binding portion of primer

#Example 2: Fw primer binds to 23 bp in pSI_626 with no overhang
QUEEN(seq="ccaaagaggtgctggacgccacc", product="SI1313")                      #Fw primer that has no overhang and binds all 23bp
find_primer_binding_str('FW', SI1313, pSI_626)                              #Finds CCAAAGAGGTGCTGGACGCCACC
primer_binding_start_pos('FW', SI1313, pSI_626)                             #Returns start position of binding portion of primer, no opportunity to return None as the last call

#Example 3: Fw primer binds to 23 bp in pSI_626 with no overhang but is given with wrong primer type
QUEEN(seq="ccaaagaggtgctggacgccacc", product="SI1313")                      #Fw primer that has no overhang and binds all 23bp
find_primer_binding_str('RV', SI1313, pSI_626)                              #Should find no binding site because the wrong primer type is given
primer_binding_start_pos('RV', SI1313, pSI_626)                             #Returns 'No primer binding site found.'

#Example 4: Rv primer binds to a Fw primer
QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003") 
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="SI2004")
find_primer_binding_str('RV', SI2004, SI2003)
primer_binding_start_pos('RV', SI2004, SI2003)

4

In [1963]:
def primer_binding_end_pos(primer_type,  #'FW' or 'RV'
                           primer,       #QUEEN object
                           template      #QUEEN object
                          ):
    """
    Given a type of primer ('FW' or 'RV'), the primer, and template sequence, finds the initial binding site of the primer in the template sequence, whether the primer has an overhang or not.
    Returns the end index position of the template sequence from the portion of the primer that binds the template sequence.
    Requires minimum 18bp annealing between primer and template. The 3' end of the binding primer must anneal.
    """
    # Note: re.search cannot search in the 3' to 5' direction on a QUEEN DNA object, so the function will not work for Rv primers unless the template is flipped
    # For example:
      # match1 = re.search('hell', 'hello') #Finds match='hell'
      # print(match1)
      # match2 = re.search('lleh', 'hello') #Finds None
      # print(match2)
    
    n = len(primer.seq)                  #length of primer
    i=-18                                #starting point of i
    flippedtemplate = flipdna(template)  #reverse complement of template
    binding_site = []
    
    
    if primer_type == 'FW':                                 #for searching templates with Fw primer
        while i <= n:
            match = re.search(primer.seq[i:], template.seq) #search for each character in the primer string until the maximum amount of the primer matches the template string
            binding_site.append(match)                      #add matching string to the list binding_site
            if match == None:                               #end loop when there are no more matches
                break
            elif -i == n:                                   #end loop when there is no more primer string to iterate on
                break
            i = i-1
    elif primer_type == 'RV':                               #for searching templates with Rv primer; template must be reverse complement because re can't search on dsDNA like QUEEN can
        while i <= n:
            match = re.search(primer.seq[i:], flippedtemplate.seq)
            binding_site.append(match)
            if match == None:
                break
            elif -i == n:
                break
            i = i-1
            
    if len(binding_site) == 1 and binding_site[-1] == None:  #if there is no binding sequence, the binding_site list will only contain 1 NoneType object
        return 'No primer binding site found.'               #notify that no binding sequence for the primer was found in the template
    elif binding_site[-1] == None:                           #the while loop ends the binding_site list with a NoneType object if the primer has an overhang
        return binding_site[-2].end()                        #show the last matching binding site for a primer with an overhang
    else:
        return binding_site[-1].end()                        #show the last matching binding site for a primer with no overhang
        

#Examples

QUEEN(record="139107", dbtype="addgene", product="pSI_626")

#Example 1: Rv primer binds to 24 bp in pSI_626, but has 21 bp overhang that actually matches another part of pSI_626
QUEEN(seq="CCACCTTGCGCTTCTTCTTTGGGCCTGGGTTGTCCCGCAGGTACT",product="SK132")  #Rv primer that binds 24bp of pSI_626
find_primer_binding_str('RV', SK132, pSI_626)                               #Finds GGCCTGGGTTGTCCCGCAGGTACT 
primer_binding_end_pos('RV', SK132, pSI_626)                                #Returns end position of binding portion of primer

#Example 2: Fw primer binds to 23 bp in pSI_626 with no overhang
QUEEN(seq="ccaaagaggtgctggacgccacc", product="SI1313")                      #Fw primer that has no overhang and binds all 23bp
find_primer_binding_str('FW', SI1313, pSI_626)                              #Finds CCAAAGAGGTGCTGGACGCCACC
primer_binding_end_pos('FW', SI1313, pSI_626)                               #Returns end position of binding portion of primer, no opportunity to return None as the last call

#Example 3: Fw primer binds to 23 bp in pSI_626 with no overhang but is given with wrong primer type
QUEEN(seq="ccaaagaggtgctggacgccacc", product="SI1313")                      #Fw primer that has no overhang and binds all 23bp
find_primer_binding_str('RV', SI1313, pSI_626)                              #Should find no binding site because the wrong primer type is given
primer_binding_end_pos('RV', SI1313, pSI_626)                             #Returns 'No primer binding site found.'

#Example 4: Rv primer binds to a Fw primer
QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003") 
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="SI2004")
find_primer_binding_str('RV', SI2004, SI2003)
primer_binding_end_pos('RV', SI2004, SI2003)


29
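
`primer_binding_start_pos` and `primer_binding_end_pos` run the same search and differ only in the final `.start()` or `.end()` call. As a minimal sketch of that shared logic on plain strings (no QUEEN objects), the hypothetical helper `longest_3prime_match` below grows the primer's 3' suffix one base at a time from 18 bp and keeps the last suffix still present in the template. `reverse_complement` is an illustrative stand-in for `flipdna`, and `str.find` is used in place of `re.search`; both are assumptions of this sketch, not part of the script above.

```python
# Translation table for a plain-string reverse complement (stand-in for flipdna).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a plain DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def longest_3prime_match(primer, template, min_len=18):
    """Return (start, end, matched_sequence) for the longest 3' suffix of
    `primer` found in `template`, or None if fewer than `min_len` bases match."""
    primer, template = primer.upper(), template.upper()
    best = None
    for k in range(min_len, len(primer) + 1):
        suffix = primer[-k:]              # last k bases, i.e. the 3' end
        pos = template.find(suffix)       # first occurrence, like re.search
        if pos == -1:                     # the suffix ran into the 5' overhang
            break
        best = (pos, pos + k, suffix)
    return best

fw_oligo = "CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG"   # SI2003 from the examples above
rv_oligo = "CCCTTATGACCCTGACACGCTCGGT"           # SI2004 from the examples above

# An Rv query is searched on the reverse complement of the template.
hit = longest_3prime_match(rv_oligo, reverse_complement(fw_oligo))
print(hit)   # (4, 29, 'CCCTTATGACCCTGACACGCTCGGT')
```

The start and end values agree with the `4` and `29` returned for Example 4 above. Returning `None` instead of a message string is a design choice of this sketch; it lets callers test the result with `is None`.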

In [1964]:
def find_primer_binding_primer(fw_primer,  #QUEEN object
                               rv_primer   #QUEEN object
                              ):
    """
    Searches for binding sites between two primers (useful for template-free PCR or annealing of two ssDNA oligos).
    Returns feature list of QUEEN search, including start, end, strand, and sequence. 3' end of primer must anneal.
    Requires minimum 18-bp annealing between two oligos.
    """
    if len(fw_primer.seq) > len(rv_primer.seq):             #The smaller primer needs to be the query and the larger primer needs to be the template searched on.
        return find_primer_binding(rv_primer, fw_primer)   
    elif len(fw_primer.seq) < len(rv_primer.seq):
        return find_primer_binding(fw_primer, rv_primer)
    else:
        return find_primer_binding(fw_primer, rv_primer)


#Examples

QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="testfw")
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="testrv") #testrv is the reverse complement of the inner sequence of testfw (testfw without its flanking CTAGs)
find_primer_binding_primer(testfw, testrv)  #Larger primer given as first item - works

find_primer_binding_primer(testrv, testfw)  #Larger primer given as second item - works

QUEEN(seq="AAAACCCTTATGACCCTGACACGCTCGGTAAAA", product="testrv2")
find_primer_binding_primer(testfw, testrv2) #Primers are equal size with the second primer having overhangs that don't match - works


start  end  strand  sequence                   
4      29   -       CCCTTATGACCCTGACACGCTCGGT  

start  end  strand  sequence                   
4      29   -       CCCTTATGACCCTGACACGCTCGGT  



'No primer binding site found.'
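
The three branches in `find_primer_binding_primer` reduce to one rule: the shorter oligo is the query, and on a tie the first argument is the query. A minimal sketch of that rule on plain strings follows; `order_query_template` is a hypothetical name used only for illustration.

```python
def order_query_template(fw_seq, rv_seq):
    """Return (query, template): the shorter oligo is the query,
    and the first argument is the query when lengths are equal."""
    if len(fw_seq) > len(rv_seq):
        return rv_seq, fw_seq
    return fw_seq, rv_seq

testfw = "CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG"    # 33 nt
testrv = "CCCTTATGACCCTGACACGCTCGGT"            # 25 nt
testrv2 = "AAAACCCTTATGACCCTGACACGCTCGGTAAAA"   # 33 nt

# Argument order does not matter when the lengths differ.
print(order_query_template(testfw, testrv) == order_query_template(testrv, testfw))  # True
# On a tie the first argument becomes the query.
print(order_query_template(testfw, testrv2)[0] == testfw)                            # True
```

This is consistent with the first two examples above returning the same feature regardless of argument order.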

In [1965]:
def find_primer_binding_primer_str(fw_primer,  #QUEEN object
                                   rv_primer   #QUEEN object
                                  ):
    """
    Searches for binding sites between two primers (useful for template-free PCR or annealing of two ssDNA oligos).
    Returns string of the binding portion of the primer that anneals to the other primer. 3' end of primer must anneal.
    If primers anneal and are equal size, returns string of the binding portion of the primer provided second.
    Requires minimum 18-bp annealing between two oligos.
    """
    
    #The 'RV' feature must be set for every search because by design, every template (whether it's a Fw primer or Rv primer),
    #will be antisense to the query, unlike when we're searching on a plasmid, which has a sense strand and an antisense strand
    
    if len(fw_primer.seq) > len(rv_primer.seq):                         #The smaller primer needs to be the query and the larger primer needs to be the template searched on.
        return find_primer_binding_str('RV', rv_primer, fw_primer)      
    elif len(fw_primer.seq) < len(rv_primer.seq):
        return find_primer_binding_str('RV', fw_primer, rv_primer)
    elif len(fw_primer.seq) == len(rv_primer.seq):
        return find_primer_binding_str('RV', fw_primer, rv_primer)


#Examples

QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003")
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="SI2004")                #The reverse complement of the inner sequence of SI2003 flanked by each CTAG
find_primer_binding_primer_str(SI2003, SI2004)                          #Larger primer given as first item - works
find_primer_binding_primer_str(SI2004, SI2003)                          #Larger primer given as second item - works
find_primer_binding_primer_str(SI2004, SI2004)                          #Primers are the same size and identical - works (no match because primers aren't complementary)

QUEEN(seq="AAAACCCTTATGACCCTGACACGCTCGGTAAAA", product="test")
find_primer_binding_primer_str(SI2003, test)                            #Primers are equal size with second having overhangs that don't match - works (no match because no 3' annealing)

QUEEN(seq="AAAACCCTTATGACCCTGACACGCTCGGT", product="test2")
find_primer_binding_primer_str(SI2003, test2)                           #Rv primer is smaller with a 5' overhang that doesn't match - works

QUEEN(seq="ACCGAGCGTGTCAGGGTCATAAGGG", product="SI2003trunc")
find_primer_binding_primer_str(SI2004, SI2003trunc)                     #Primers are equal size and anneal fully - works

'CCCTTATGACCCTGACACGCTCGGT'

In [1966]:
def find_primer_binding_primer_str_short(fw_primer,  #QUEEN object
                                         rv_primer   #QUEEN object
                                        ):
    """
    Searches for binding sites between two primers (useful for template-free PCR or annealing of two ssDNA oligos).
    Returns string of the binding portion of the primer that anneals to the other primer. 3' end of primer must anneal.
    If primers anneal and are equal size, returns string of the binding portion of the primer provided second.
    Requires minimum 15-bp annealing between two oligos.
    """
    
    #The 'RV' feature must be set for every search because by design, every template (whether it's a Fw primer or Rv primer),
    #will be antisense to the query, unlike when we're searching on a plasmid, which has a sense strand and an antisense strand
    
    if len(fw_primer.seq) > len(rv_primer.seq):                         #The smaller primer needs to be the query and the larger primer needs to be the template searched on.
        return find_primer_binding_str_short('RV', rv_primer, fw_primer)      
    elif len(fw_primer.seq) < len(rv_primer.seq):
        return find_primer_binding_str_short('RV', fw_primer, rv_primer)
    elif len(fw_primer.seq) == len(rv_primer.seq):
        return find_primer_binding_str_short('RV', fw_primer, rv_primer)


#Examples

QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003")
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="SI2004")                      #The reverse complement of the inner sequence of SI2003 flanked by each CTAG
find_primer_binding_primer_str_short(SI2003, SI2004)                          #Larger primer given as first item - works
find_primer_binding_primer_str_short(SI2004, SI2003)                          #Larger primer given as second item - works
find_primer_binding_primer_str_short(SI2004, SI2004)                          #Primers are the same size and identical - works (no match because primers aren't complementary)

QUEEN(seq="AAAACCCTTATGACCCTGACACGCTCGGTAAAA", product="test")
find_primer_binding_primer_str_short(SI2003, test)                            #Primers are equal size with second having overhangs that don't match - works (no match because no 3' annealing)

QUEEN(seq="AAAACCCTTATGACCCTGACACGCTCGGT", product="test2")
find_primer_binding_primer_str_short(SI2003, test2)                           #Rv primer is smaller with a 5' overhang that doesn't match - works

QUEEN(seq="ACCGAGCGTGTCAGGGTCATAAGGG", product="SI2003trunc")
find_primer_binding_primer_str_short(SI2004, SI2003trunc)                     #Primers are equal size and anneal fully - works

'CCCTTATGACCCTGACACGCTCGGT'

In [1967]:
def find_primer_binding_primer_start(fw_primer,  #QUEEN object
                                     rv_primer   #QUEEN object
                                    ):
    """
    Searches for binding sites between two primers (useful for template-free PCR or annealing of two ssDNA oligos).
    Returns the start index position of the template sequence from the binding portion of the primer that anneals to the other primer. 3' end of primer must anneal.
    Requires minimum 18-bp annealing between two oligos.
    """
    
    #The 'RV' feature must be set for every search because by design, every template (whether it's a Fw primer or Rv primer),
    #will be antisense to the query, unlike when we're searching on a plasmid, which has a sense strand and an antisense strand
    
    if len(fw_primer.seq) > len(rv_primer.seq):                         #The smaller primer needs to be the query and the larger primer needs to be the template searched on.
        return primer_binding_start_pos('RV', rv_primer, fw_primer)      
    elif len(fw_primer.seq) < len(rv_primer.seq):
        return primer_binding_start_pos('RV', fw_primer, rv_primer)
    elif len(fw_primer.seq) == len(rv_primer.seq):
        return primer_binding_start_pos('RV', fw_primer, rv_primer)


#Examples

QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003")
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="SI2004")                #The reverse complement of the inner sequence of SI2003 flanked by each CTAG
find_primer_binding_primer_start(SI2003, SI2004)                        #Larger primer given as first item - works
find_primer_binding_primer_start(SI2004, SI2003)                        #Larger primer given as second item - works
find_primer_binding_primer_start(SI2004, SI2004)                        #Primers are the same size and identical - works (no match because primers aren't complementary)

QUEEN(seq="AAAACCCTTATGACCCTGACACGCTCGGTAAAA", product="test")
find_primer_binding_primer_start(SI2003, test)                           #Primers are equal size with second having overhangs that don't match - works (no match because no 3' annealing)

QUEEN(seq="AAAACCCTTATGACCCTGACACGCTCGGT", product="test2")
find_primer_binding_primer_start(SI2003, test2)                          #Rv primer is smaller with a 5' overhang that doesn't match - works

QUEEN(seq="ACCGAGCGTGTCAGGGTCATAAGGG", product="SI2003trunc")
find_primer_binding_primer_start(SI2004, SI2003trunc)                     #Primers are equal size and anneal fully - works

0

In [1968]:
def find_primer_binding_primer_end(fw_primer,  #QUEEN object
                                   rv_primer   #QUEEN object
                                  ):
    """
    Searches for binding sites between two primers (useful for template-free PCR or annealing of two ssDNA oligos).
    Returns the end index position of the template sequence from the binding portion of the primer that anneals to the other primer. 3' end of primer must anneal.
    Requires minimum 18-bp annealing between two oligos.
    """
    
    #The 'RV' feature must be set for every search because by design, every template (whether it's a Fw primer or Rv primer),
    #will be antisense to the query, unlike when we're searching on a plasmid, which has a sense strand and an antisense strand
    
    if len(fw_primer.seq) > len(rv_primer.seq):                         #The smaller primer needs to be the query and the larger primer needs to be the template searched on.
        return primer_binding_end_pos('RV', rv_primer, fw_primer)      
    elif len(fw_primer.seq) < len(rv_primer.seq):
        return primer_binding_end_pos('RV', fw_primer, rv_primer)
    elif len(fw_primer.seq) == len(rv_primer.seq):
        return primer_binding_end_pos('RV', fw_primer, rv_primer)


#Examples

QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003")
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="SI2004")                #The reverse complement of the inner sequence of SI2003 flanked by each CTAG
find_primer_binding_primer_end(SI2003, SI2004)                        #Larger primer given as first item - works
find_primer_binding_primer_end(SI2004, SI2003)                        #Larger primer given as second item - works
find_primer_binding_primer_end(SI2004, SI2004)                        #Primers are the same size and identical - works (no match because primers aren't complementary)

QUEEN(seq="AAAACCCTTATGACCCTGACACGCTCGGTAAAA", product="test")
find_primer_binding_primer_end(SI2003, test)                           #Primers are equal size with second having overhangs that don't match - works (no match because no 3' annealing)

QUEEN(seq="AAAACCCTTATGACCCTGACACGCTCGGT", product="test2")
find_primer_binding_primer_end(SI2003, test2)                          #Rv primer is smaller with a 5' overhang that doesn't match - works

QUEEN(seq="ACCGAGCGTGTCAGGGTCATAAGGG", product="SI2003trunc")
find_primer_binding_primer_end(SI2004, SI2003trunc)                     #Primers are equal size and anneal fully - works

25

In [2402]:
def find_similar_sequence(seq1,                  #str for Fw primer
                          seq2,                  #str for template/plasmid
                          max_mismatches=3       #user-defined, default is 3
                         ): 
    """
    Given a Fw primer (str) and template/plasmid (str), and the maximum number of mismatches allowed (default is 3),
    finds the best match to the primer query in the template/plasmid.
    Searches in a window size equal to the length of seq1.
    Returns the best match as a substring from the template/plasmid.
    """
    #Function written with help from ChatGPT
    
    # Convert the sequences to uppercase
    seq1 = seq1.upper()
    seq2 = seq2.upper()

    # Initialize the variables to track the best match
    best_match = ""
    best_start = -1
    best_mismatches = max_mismatches + 1

    # Iterate over all possible substrings of seq2 that are the same length as seq1
    for i in range(len(seq2) - len(seq1) + 1):
        # Get the substring of seq2 that starts at position i and has the same length as seq1
        substring = seq2[i:i+len(seq1)]

        # Count the number of mismatches between the substring and seq1
        mismatches = 0
        for j in range(len(substring)):
            if substring[j] != seq1[j]:
                mismatches += 1

        # If this is the best match so far (fewest mismatches), update the best match
        if mismatches < best_mismatches:
            best_match = substring
            best_start = i
            best_mismatches = mismatches

    # Return the best matching substring in the plasmid ('' if no window is within max_mismatches)
    return best_match


#Examples

QUEEN(record="https://benchling.com/s/seq-lyizd9Ah3jyfw9GFwW9v", dbtype="benchling", product="pLV_mCherry")

QUEEN(seq="GGTGAATTCCCGAGCGTGTCAGGGTGACCGTGAGCAAGGGCGAGG", product = "SI1293")
QUEEN(seq="GGTGAATTCCCGAGCGTGTCAGGGTGACCGTGAGCAAGGGCGAGGAGGATAACGCTGCCATC", product="SI1294")
QUEEN(seq="GGTGAATTCCCGAGCGTGTCAGGGTGACCGTGGCCATCATCAAGGAGTTCA", product="SI1295")
QUEEN(seq="GGTGAATTCCCGAGCGTGTCAGGGTGACCATGAGCAAGGGCGAGGAGGATAACGCTGCCATC", product="SI1297") 

pn1='test'
pn2='test2'
pd1='test3'
pd2='test4'

#Example 1: 35bp-annealing primer with 3bp mismatch to pLV_mCherry
find_similar_sequence(SI1294.seq[-18:], pLV_mCherry.seq, max_mismatches = 3) #works

#Example 2: 21bp-annealing primer to pLV_mCherry
find_similar_sequence(SI1295.seq[-18:], pLV_mCherry.seq, max_mismatches = 3) #works

#Example 3: 34bp-annealing primer with 3bp mismatch to pLV_mCherry
find_similar_sequence(SI1297.seq[-18:], pLV_mCherry.seq, max_mismatches = 3) #works

#Example 4: same primer as Example 3, but with only 2 mismatches allowed
find_similar_sequence(SI1297.seq[-18:], pLV_mCherry.seq, max_mismatches = 2) #works, returns an empty string because the allowed number of mismatches is too low

''
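
To make the sliding-window comparison in `find_similar_sequence` concrete, here is a small sketch on toy strings. The name `best_window` and the returned tuple are assumptions for illustration; unlike the function above, this variant also reports where the best window starts and how many mismatches it has, and returns `None` rather than `''` when no window qualifies.

```python
def best_window(query, target, max_mismatches=3):
    """Return (start, window, mismatches) for the window of `target` closest
    to `query`, or None if every window exceeds `max_mismatches`."""
    query, target = query.upper(), target.upper()
    best = None
    for start in range(len(target) - len(query) + 1):
        window = target[start:start + len(query)]
        # Hamming distance: count positions where the two strings differ.
        mismatches = sum(a != b for a, b in zip(query, window))
        # Strict '<' keeps the first window on ties, as in the function above.
        if mismatches <= max_mismatches and (best is None or mismatches < best[2]):
            best = (start, window, mismatches)
    return best

print(best_window("ACGT", "TTACGATT", max_mismatches=1))   # (2, 'ACGA', 1)
print(best_window("ACGT", "TTACGATT", max_mismatches=0))   # None
```

Each call costs roughly `len(query) * len(target)` comparisons, which is fine for an 18-nt query against a plasmid.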

## DNA modification functions

In [1970]:
def generate_complement(dna #str
                       ):
    """
    Given a string of DNA in UPPERCASE letters only, returns complement of the DNA (not the reverse complement).
    """
    complementary_dna = dna.replace('A', 't').replace('T', 'a').replace('G', 'c').replace('C', 'g')
    return complementary_dna


#Examples
generate_complement('ACGT') #makes 'tgca'
generate_complement('acgt') #returns 'acgt' unchanged because only uppercase letters are replaced

QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003")
generate_complement(SI2003.seq) #works!

'gatctggctcgcacagtcccagtattcccgatc'
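
`generate_complement` writes its output in lowercase, presumably so that the chained `replace` calls never touch a base that was already swapped: with uppercase targets, `'A' -> 'T'` followed by `'T' -> 'A'` would undo itself. The sketch below shows that failure and a single-pass alternative built on `str.maketrans` and `str.translate`, which handles both cases and preserves them. `complement_keep_case` is a hypothetical helper, and whether QUEEN accepts its uppercase output in place of the lowercase one is not tested here.

```python
# One translation table covering uppercase and lowercase bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def complement_keep_case(dna):
    """Return the complement (not reverse complement), preserving letter case."""
    return dna.translate(COMPLEMENT)

# Chained uppercase replacements re-swap bases and give a wrong answer.
naive = "ACGT".replace("A", "T").replace("T", "A").replace("G", "C").replace("C", "G")
print(naive)                          # AGGA

print(complement_keep_case("ACGT"))   # TGCA
print(complement_keep_case("acgt"))   # tgca
```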

In [1971]:
def generate_dsdna(dna  #str
                  ):
    """
    Given a single-stranded string of DNA in UPPERCASE letters only, returns the double-stranded QUEEN object.
    """
    sense = dna                                  #sense strand of dna
    antisense = generate_complement(dna)         #antisense strand of dna
    concatenate = sense + '/' + antisense        #concatenate into one string for QUEEN to recognize the object
    dsdna = QUEEN(seq=concatenate)               #generate dsDNA
    return dsdna                                 #return dsDNA


#Examples
generate_dsdna('ACGT') #works!
#generate_dsdna('acgt') #returns an error because the sequence is not in uppercase

QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003")
generate_dsdna(SI2003.seq) #works!

<queen.QUEEN object; project='dna_14165', length='33 bp', sequence='CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG', topology='linear'>

In [1972]:
#Helper function for anneal_oligos.
def five_prime_overhang(string,
                       ):
    """
    Given a string of format 'NNN|NNN' where N can represent any nucleotide or absence of a nucleotide, returns the 5' side of the string (characters before the '|').
    Helper function for anneal_oligos.
    """ 
    i = 0
    string_new = ''
    n = len(string)
    
    while i <= n:
        string_new = string_new + string[i]
        if string[i] == '|':
            break
        elif i == n:
            break
        i = i+1
        
    return string_new[:i]

#Examples
five_prime_overhang('CTAG|CTAG') #Returns 'CTAG', len 4
five_prime_overhang('CTAGAA|')   #Returns 'CTAGAA', len 6
five_prime_overhang('|AAA')      #Returns '', this case doesn't return '|' because [:i] excludes position i (which is '|' here), len 0
five_prime_overhang('|')         #Returns '', this case doesn't return '|' because [:i] excludes position i (which is '|' here), len 0
#five_prime_overhang('')          #Returns IndexError

''

In [1973]:
#Helper function for anneal_oligos.
def three_prime_overhang(string,
                        ):
    """
    Given a string of format 'NNN|NNN' where N can represent any nucleotide or absence of a nucleotide, returns the 3' side of the string (characters after the '|').
    Helper function for anneal_oligos.
    """ 
    i = -1
    string_new = ''
    n = len(string)
    
    while i <= n:
        string_new = string[i] + string_new
        if string[i] == '|':
            break
        elif i == n:
            break
        i = i-1
    
    j = i+1 
    if string_new == '|':   #When '|' is the first character that i sees, then string_new becomes '|' and the loop ends. We remove this to give '' in such cases.
        return ''
    else:
        return string_new[j:]

#Examples
three_prime_overhang('CTAG|CTAG') #Returns 'CTAG', len 4
three_prime_overhang('CTAGAA|')   #Returns '', len 0
three_prime_overhang('|AAA')      #Returns 'AAA', len 3
three_prime_overhang('|')         #Returns '', len 0
#three_prime_overhang('')          #Returns IndexError

''
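
Both helpers split a `'NNN|NNN'` string at the bar, and Python's built-in `str.partition` does that in one call. The sketch below uses a hypothetical helper, `split_overhangs`, and is not a drop-in replacement. It returns `('', '')` for an empty string and `(string, '')` when no `'|'` is present, whereas the loops above index past the end of the string in those cases.

```python
def split_overhangs(marked):
    """Split 'NNN|NNN' into (five_prime, three_prime) around the first '|'."""
    five_prime, _bar, three_prime = marked.partition("|")
    return five_prime, three_prime

print(split_overhangs("CTAG|CTAG"))   # ('CTAG', 'CTAG')
print(split_overhangs("CTAGAA|"))     # ('CTAGAA', '')
print(split_overhangs("|AAA"))        # ('', 'AAA')
print(split_overhangs("|"))           # ('', '')
```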

In [1974]:
#Helper function for anneal_oligos.
def primer_overhangs(primer, #QUEEN object
                     anneal  #str
                    ):
    """
    Given a primer and its known annealing sequence to another primer (*required to anneal*), determines the overhangs on its 5' and 3' ends (if any).
    Returns the overhangs in the format 'NNN|NNN', where '|' separates the 5' overhang (left of the '|') from the 3' overhang (right of the '|').
    There can be 0 or more N's on either side of the '|'.
    Helper function for anneal_oligos.
    """
    p = primer.seq
    
    if anneal in p:                              #annealing sequence is written in the same orientation as the primer
        return p.replace(anneal, '|')
    else:                                        #otherwise try its reverse complement
        return p.replace(flipdna(anneal), '|')


#Examples
QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003")
QUEEN(seq="ACCGAGCGTGTCAGGGTCATAAGGG", product="SI2003inner")
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="SI2004")

primer_overhangs(SI2003, SI2003inner.seq)               #The identified annealing sequence is the inner 25bp of SI2003
primer_overhangs(SI2003, 'CCCTTATGACCCTGACACGCTCGGT')   #The identified annealing sequence is SI2004.seq (written as string because flipdna operates on a pure string or QUEEN object)
primer_overhangs(SI2004, SI2004.seq)                    #The identified annealing sequence is the same as the primer, which leaves no overhangs and thus '|'
primer_overhangs(SI2003, 'ACTGTG')                      #The identified annealing sequence has no match in the primer, and the whole primer is returned

'CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG'
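
The last example shows that `primer_overhangs` returns the whole primer, with no `'|'`, when the annealing sequence is absent in both orientations. A plain-string sketch that makes that case explicit is shown below. `mark_anneal` and `reverse_complement` are hypothetical stand-ins (the latter for `flipdna`), and returning `None` on no match is a choice of this sketch, not the behavior of the function above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Plain-string reverse complement (illustrative stand-in for flipdna)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def mark_anneal(primer_seq, anneal_seq):
    """Replace the annealing region of `primer_seq` with '|'.
    Tries `anneal_seq` as given, then its reverse complement.
    Returns None when neither orientation occurs in the primer."""
    primer_seq, anneal_seq = primer_seq.upper(), anneal_seq.upper()
    for candidate in (anneal_seq, reverse_complement(anneal_seq)):
        if candidate in primer_seq:
            # count=1 so only the first occurrence is replaced
            return primer_seq.replace(candidate, "|", 1)
    return None

si2003 = "CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG"
si2004 = "CCCTTATGACCCTGACACGCTCGGT"

print(mark_anneal(si2003, "ACCGAGCGTGTCAGGGTCATAAGGG"))   # CTAG|CTAG
print(mark_anneal(si2003, si2004))                        # CTAG|CTAG
print(mark_anneal(si2004, si2004))                        # |
print(mark_anneal(si2003, "ACTGTG"))                      # None
```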

In [1975]:
def anneal_oligos(fw_primer,  #QUEEN object
                  rv_primer,  #QUEEN object
                  pn,         #str
                  pd          #str
                 ):
    """
    Given Fw and Rv primers, creates a new fragment through ssDNA annealing of the two primers to each other.
    *Requirement: Primers must anneal with at least 18 bp of binding.
    Leaves overhangs on either, neither, or both sides depending on the design of the two oligos.
    User may specify a process name (pn) and process description (pd) as strings.
    Returns the annealed oligo sequence.
    """
    #
    # To build dsDNAs from two ssDNA oligos annealing (but not extending), we must consider the nine possible combinations of oligos:
    #
    #               Case 1           Case 2                           Case 3                           Case 4                           Case 5
    # Fw primer:    5'-anneal-3'     5'-overhang-anneal-overhang-3'   5'----------anneal----------3'   5'-overhang-anneal----------3'   5'----------anneal-overhang-3'
    # Rv primer:    3'-anneal-5'     3'----------anneal----------5'   3'-overhang-anneal-overhang-5'   3'----------anneal-overhang-5'   3'-overhang-anneal----------5'
    #                  Frag.1            Frag.2          Frag.3           Frag.4          Frag.5   
    #
    #                                Case 6                           Case 7                           Case 8                           Case 9
    # Fw primer:                     5'-overhang-anneal-3'            5'----------anneal-3'            5'-anneal-overhang-3'            5'-anneal----------3'
    # Rv primer:                     3'----------anneal-5'            3'-overhang-anneal-5'            3'-anneal----------5'            3'-anneal-overhang-5'
    #
    #
    # From the two primer sequences we must generate up to 5 unique fragments, each of which exists or not depending on which IF statement is satisfied.
    # Then, we will concatenate fragments depending on the condition satisfied.
    #
    # Approach: Generate each fragment, then concatenate them on the ds_anneal string, and the product should be one of cases 1-9.
    # If an annealed oligo doesn't exist, it should notify the user.
    #
    
    fw_primer_str = fw_primer.seq                                            #String of Fw primer sequence
    rv_primer_str = rv_primer.seq                                            #String of Rv primer sequence
    
    if fw_primer_str == rv_primer_str:                                       #Function can't run if oligos are exactly identical
        return 'Identical ssDNAs cannot anneal. Please revise your entry.'
    elif len(fw_primer_str) < 18:                                            #Function can't run if oligos have <18 bp annealing.
        return 'Oligos could not anneal because at least one is too short. Please revise your entry.'
    elif len(rv_primer_str) < 18:                                            #Function can't run if oligos have <18 bp annealing.
        return 'Oligos could not anneal because at least one is too short. Please revise your entry.'
    else:
        pass
    
    fw_overhang_list = []                                                    #Empty list for Fw overhang(s)
    rv_overhang_list = []                                                    #Empty list for Rv overhang(s)
    fw_fragment_list = []                                                    #Empty list for Fw overhang fragment(s)
    rv_fragment_list = []                                                    #Empty list for Rv overhang fragment(s)
    
    #Generate Frag.1
    anneal = find_primer_binding_primer_str(fw_primer, rv_primer)            #Finds the annealing site between the two primers
    if anneal in fw_primer_str:
        ds_anneal = generate_dsdna(anneal)                                   #Generates the double-stranded sequence of the binding sites of the two primers upon annealing
    else:                                                                    #Error for Case 5 occurs here because the 3' end of the Rv primer doesn't anneal
        ds_anneal = generate_dsdna(flipdna(anneal))                          #Generates the double-stranded sequence of the binding sites of the two primers upon annealing
    
    
    #Generate Frag.2 and Frag.3
    overhangs_fw_primer = primer_overhangs(fw_primer, anneal)
    five_fw = five_prime_overhang(overhangs_fw_primer)                       #Provides the 5' overhang of the Fw primer (if any)
    fw_overhang_list.append(five_fw)
    three_fw = three_prime_overhang(overhangs_fw_primer)                     #Provides the 3' overhang of the Fw primer (if any)
    fw_overhang_list.append(three_fw)
    
    #Position 0 of fw_fragment_list will correspond to Frag.2
    #Position 1 of fw_fragment_list will correspond to Frag.3
    i=0
    while i < 2:
        if fw_overhang_list[i] != '':
            fragment = QUEEN(seq=fw_overhang_list[i] + '/' + ('-' * len(fw_overhang_list[i])))
            fw_fragment_list.append(fragment)
        elif fw_overhang_list[i] == '':
            fw_fragment_list.append('No overhang')
        elif i == 2:
            break
        i = i+1
    
    
    #Generate Frag.4 and Frag.5
    overhangs_rv_primer = primer_overhangs(rv_primer, anneal)
    five_rv = five_prime_overhang(overhangs_rv_primer)                       #Provides the 5' overhang of the Rv primer (if any)
    rv_overhang_list.append(five_rv)
    three_rv = three_prime_overhang(overhangs_rv_primer)                     #Provides the 3' overhang of the Rv primer (if any)
    rv_overhang_list.append(three_rv)
    
    #Position 0 of rv_fragment_list will correspond to Frag.4
    #Position 1 of rv_fragment_list will correspond to Frag.5
    for overhang in rv_overhang_list:
        if overhang != '':
            fragment = QUEEN(seq=overhang + '/' + ('-' * len(overhang)))     #Overhang bases on the top strand, '-' on the bottom strand
            rv_fragment_list.append(fragment)
        else:
            rv_fragment_list.append('No overhang')

    
    if anneal != 'No primer binding site found.':
        
        if fw_fragment_list[0] != 'No overhang':    #Add upstream Fw overhang if possible
            frag2 = QUEEN(seq=fw_fragment_list[0].seq + ds_anneal.seq[0] + '/' + '-' * len(fw_fragment_list[0].seq) + generate_complement(ds_anneal.seq[0]))
            ds_anneal_left_trim = cropdna(ds_anneal, 1, len(ds_anneal.seq))
            anneal_left = joindna(frag2, ds_anneal_left_trim, pn=pn, pd=pd)
            
        elif rv_fragment_list[1] != 'No overhang':  #Add upstream Rv overhang if possible
            frag4 = QUEEN(seq=('-' * len(rv_fragment_list[1].seq) + ds_anneal.seq[0] + '/' + rv_fragment_list[1].seq[::-1] + generate_complement(ds_anneal.seq[0])))
            ds_anneal_left_trim = cropdna(ds_anneal, 1, len(ds_anneal.seq))
            anneal_left = joindna(frag4, ds_anneal_left_trim, pn=pn, pd=pd)
            
        else:
            anneal_left = ds_anneal
    
        if fw_fragment_list[1] != 'No overhang':    #Add downstream Fw overhang if possible
            frag3 = QUEEN(seq=ds_anneal.seq[-1] + fw_fragment_list[1].seq + '/' + generate_complement(ds_anneal.seq[-1]) + ('-' * len(fw_fragment_list[1].seq)))
            ds_anneal_right_trim = cropdna(anneal_left, 0, (len(anneal_left.seq) - 1))
            annealed_oligo = joindna(ds_anneal_right_trim, frag3, pn=pn, pd=pd)
            annealed_oligo.printsequence(display=True)
            
        elif rv_fragment_list[0] != 'No overhang':  #Add downstream Rv overhang if possible
            frag5 = QUEEN(seq= ds_anneal.seq[-1] + ('-' * len(rv_fragment_list[0].seq)) + '/' + generate_complement(ds_anneal.seq[-1]) + rv_fragment_list[0].seq[::-1])
            ds_anneal_right_trim = cropdna(anneal_left, 0, (len(anneal_left.seq) - 1))
            annealed_oligo = joindna(ds_anneal_right_trim, frag5, pn=pn, pd=pd)
            annealed_oligo.printsequence(display=True)
        
        elif fw_fragment_list[0] != 'No overhang':
            anneal_left.printsequence(display=True)
            
        elif rv_fragment_list[1] != 'No overhang':
            anneal_left.printsequence(display=True) 
        
        else:
            annealed_oligo = ds_anneal
            annealed_oligo.printsequence(display=True) #temporarily print sequences but change to QUEEN objects later
            
    else:
        return 'Oligos could not anneal.'
    

#Examples

QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003") 
QUEEN(seq="ACCGAGCGTGTCAGGGTCATAAGGG", product="SI2003truncated") 
QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGG", product = "SI2003shortright")
QUEEN(seq="ACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003shortleft")  
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="SI2004")
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGTCTAG", product="SI2004longright")
QUEEN(seq="CTAGCCCTTATGACCCTGACACGCTCGGT", product="SI2004longleft")
QUEEN(seq="ACACGCTCGGT", product="SI2004cantanneal")
pn1='test'
pn2='test2'
pd1='test3'
pd2='test4'

#anneal_oligos(SI2003, SI2003, pn=pn1, pd=pd2)            #Produces 'Identical ssDNAs cannot anneal. Please revise your entry.' #works
#anneal_oligos(SI2003, SI2004cantanneal, pn=pn1, pd=pd2)  #Produces 'Oligos could not anneal because at least one is too short. Please revise your entry.' #works
#anneal_oligos(SI2004cantanneal, SI2003, pn=pn1, pd=pd2)  #Produces 'Oligos could not anneal because at least one is too short. Please revise your entry.' #works

#anneal_oligos(SI2003truncated, SI2004, pn=pn1, pd=pd2)                     #Produces Case 1 #works
 # 5' ACCGAGCGTGTCAGGGTCATAAGGG 3'
 # 3' TGGCTCGCACAGTCCCAGTATTCCC 5'

#anneal_oligos(SI2004, SI2003truncated, pn=pn1, pd=pd2)                     #Produces Case 1 but treats SI2004 as sense strand #works
 # 5' CCCTTATGACCCTGACACGCTCGGT 3'
 # 3' GGGAATACTGGGACTGTGCGAGCCA 5'

#anneal_oligos(SI2003, SI2004, pn=pn1, pd=pd2)                              #Produces Case 2 #works
 # 5' CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG 3'
 # 3' ----TGGCTCGCACAGTCCCAGTATTCCC---- 5'

#anneal_oligos(SI2004, SI2003, pn=pn1, pd=pd2)                              #Produces Case 3 #works
 # 5' ----CCCTTATGACCCTGACACGCTCGGT---- 3'
 # 3' GATCGGGAATACTGGGACTGTGCGAGCCAGATC 5'

#anneal_oligos(SI2003shortright, SI2004longleft, pn=pn1, pd=pd2)            #Produces Case 4 #works
 # 5' CTAGACCGAGCGTGTCAGGGTCATAAGGG---- 3'
 # 3' ----TGGCTCGCACAGTCCCAGTATTCCCGATC 5'

#anneal_oligos(SI2003shortleft, SI2004longright, pn=pn1, pd=pd2)            #Produces Case 5 #makes error because 3' end of Rv primer doesn't anneal
 # 5' ----ACCGAGCGTGTCAGGGTCATAAGGGCTAG 3'
 # 3' GATCTGGCTCGCACAGTCCCAGTATTCCC---- 5'

#anneal_oligos(SI2003, SI2004longleft, pn=pn1, pd=pd2)                      #Produces Case 6 #works
 # 5' CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG 3'
 # 3' ----TGGCTCGCACAGTCCCAGTATTCCCGATC 5'

#anneal_oligos(SI2004longright, SI2003, pn=pn1, pd=pd2)                     #Produces Case 7 #works
 # 5' ----CCCTTATGACCCTGACACGCTCGGTCTAG 3'
 # 3' GATCGGGAATACTGGGACTGTGCGAGCCAGATC 5'

#anneal_oligos(SI2003, SI2004longright, pn=pn1, pd=pd2)                     #Produces Case 8 #works
 # 5' CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG 3'
 # 3' GATCTGGCTCGCACAGTCCCAGTATTCCC---- 5'

#anneal_oligos(SI2004longleft, SI2003, pn=pn1, pd=pd2)                      #Produces Case 9 #works
 # 5' CTAGCCCTTATGACCCTGACACGCTCGGT---- 3'
 # 3' GATCGGGAATACTGGGACTGTGCGAGCCAGATC 5'
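
The annealing site and the four possible overhangs can also be worked out on plain strings, which is useful for checking a primer pair before building any QUEEN objects. The sketch below is an independent illustration and not the implementation of `find_primer_binding_primer_str` or `primer_overhangs`: the helper name `split_overhangs`, its return keys, and the use of `difflib` to find the longest shared stretch are all assumptions made for this example.

```python
from difflib import SequenceMatcher

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def split_overhangs(fw, rv, min_anneal=18):
    """Hypothetical helper: locate the annealing site of two oligos, both written 5'->3'.

    Returns a dict with the annealing site (in Fw orientation) and the 5'/3' overhangs
    of each oligo, or None if the shared stretch is shorter than min_anneal.
    """
    rc_rv = reverse_complement(rv)                        #Rv oligo rewritten in Fw (top-strand) orientation
    matcher = SequenceMatcher(None, fw, rc_rv, autojunk=False)
    m = matcher.find_longest_match(0, len(fw), 0, len(rc_rv))
    if m.size < min_anneal:
        return None
    return {
        "anneal": fw[m.a:m.a + m.size],
        "fw_5": fw[:m.a],                                 #Fw bases upstream of the annealing site
        "fw_3": fw[m.a + m.size:],                        #Fw bases downstream of the annealing site
        "rv_3": reverse_complement(rc_rv[:m.b]),          #Left of the site in Fw orientation = 3' end of Rv
        "rv_5": reverse_complement(rc_rv[m.b + m.size:]), #Right of the site in Fw orientation = 5' end of Rv
    }

#SI2003 and SI2004 from the examples above
print(split_overhangs("CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", "CCCTTATGACCCTGACACGCTCGGT"))
```

For SI2003 and SI2004 this reports `CTAG` for both Fw overhangs and empty Rv overhangs, which corresponds to Case 2. Which of the four overhang entries are non-empty identifies the case.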

In [1976]:
def anneal_oligos_object(fw_primer,  #QUEEN object
                         rv_primer,  #QUEEN object
                         pn,         #str
                         pd          #str
                        ):
    """
    Given Fw and Rv primers, creates a new fragment through ssDNA annealing of the two primers to each other.
    *Requirement: Primers must anneal with at least 18 bp of binding.
    Leaves overhangs on either, neither, or both sides depending on the design of the two oligos.
    User may specify a process name (pn) and process description (pd) as strings.
    Returns the QUEEN object with name " ". Also provides length and the object's topology (linear in this case).
    """
    #
    # To build dsDNAs from two ssDNA oligos annealing (but not extending), we must consider the nine possible combinations of oligos:
    #
    #               Case 1           Case 2                           Case 3                           Case 4                           Case 5
    # Fw primer:    5'-anneal-3'     5'-overhang-anneal-overhang-3'   5'----------anneal----------3'   5'-overhang-anneal----------3'   5'----------anneal-overhang-3'
    # Rv primer:    3'-anneal-5'     3'----------anneal----------5'   3'-overhang-anneal-overhang-5'   3'----------anneal-overhang-5'   3'-overhang-anneal----------5'
    #                  Frag.1            Frag.2          Frag.3           Frag.4          Frag.5   
    #
    #                                Case 6                           Case 7                           Case 8                           Case 9
    # Fw primer:                     5'-overhang-anneal-3'            5'----------anneal-3'            5'-anneal-overhang-3'            5'-anneal----------3'
    # Rv primer:                     3'----------anneal-5'            3'-overhang-anneal-5'            3'-anneal----------5'            3'-anneal-overhang-5'
    #
    #
    # With the information of the two primer sequences, we must generate 5 unique fragments, each of which will or will not exist depending on which IF statement is satisfied.
    # Then, we will concatenate fragments depending on the condition satisfied.
    #
    # Approach: Generate each fragment, then concatenate them on the ds_anneal string, and the product should be one of cases 1-9.
    # If an annealed oligo doesn't exist, it should notify the user.
    #
    
    fw_primer_str = fw_primer.seq                                            #String of Fw primer sequence
    rv_primer_str = rv_primer.seq                                            #String of Rv primer sequence
    
    if fw_primer_str == rv_primer_str:                                       #Function can't run if oligos are exactly identical
        return 'Identical ssDNAs cannot anneal. Please revise your entry.'
    elif len(fw_primer_str) < 18 or len(rv_primer_str) < 18:                 #An oligo shorter than 18 nt cannot provide the required 18 bp of annealing
        return 'Oligos could not anneal because at least one is too short. Please revise your entry.'
    
    fw_overhang_list = []                                                    #Empty list for Fw overhang(s)
    rv_overhang_list = []                                                    #Empty list for Rv overhang(s)
    fw_fragment_list = []                                                    #Empty list for Fw overhang fragment(s)
    rv_fragment_list = []                                                    #Empty list for Rv overhang fragment(s)
    
    #Generate Frag.1
    anneal = find_primer_binding_primer_str(fw_primer, rv_primer)            #Finds the annealing site between the two primers
    if anneal in fw_primer_str:
        ds_anneal = generate_dsdna(anneal)                                   #Generates the double-stranded sequence of the binding sites of the two primers upon annealing
    else:                                                                    #Error for Case 5 occurs here because the 3' end of the Rv primer doesn't anneal
        ds_anneal = generate_dsdna(flipdna(anneal))                          #Generates the double-stranded sequence of the binding sites of the two primers upon annealing
    
    
    #Generate Frag.2 and Frag.3
    overhangs_fw_primer = primer_overhangs(fw_primer, anneal)
    five_fw = five_prime_overhang(overhangs_fw_primer)                       #Provides the 5' overhang of the Fw primer (if any)
    fw_overhang_list.append(five_fw)
    three_fw = three_prime_overhang(overhangs_fw_primer)                     #Provides the 3' overhang of the Fw primer (if any)
    fw_overhang_list.append(three_fw)
    
    #Position 0 of fw_fragment_list will correspond to Frag.2
    #Position 1 of fw_fragment_list will correspond to Frag.3
    for overhang in fw_overhang_list:
        if overhang != '':
            fragment = QUEEN(seq=overhang + '/' + ('-' * len(overhang)))     #Overhang bases on the top strand, '-' on the bottom strand
            fw_fragment_list.append(fragment)
        else:
            fw_fragment_list.append('No overhang')
    
    
    #Generate Frag.4 and Frag.5
    overhangs_rv_primer = primer_overhangs(rv_primer, anneal)
    five_rv = five_prime_overhang(overhangs_rv_primer)                       #Provides the 5' overhang of the Rv primer (if any)
    rv_overhang_list.append(five_rv)
    three_rv = three_prime_overhang(overhangs_rv_primer)                     #Provides the 3' overhang of the Rv primer (if any)
    rv_overhang_list.append(three_rv)
    
    #Position 0 of rv_fragment_list will correspond to Frag.4
    #Position 1 of rv_fragment_list will correspond to Frag.5
    for overhang in rv_overhang_list:
        if overhang != '':
            fragment = QUEEN(seq=overhang + '/' + ('-' * len(overhang)))     #Overhang bases on the top strand, '-' on the bottom strand
            rv_fragment_list.append(fragment)
        else:
            rv_fragment_list.append('No overhang')

    
    if anneal != 'No primer binding site found.':
        
        if fw_fragment_list[0] != 'No overhang':    #Add upstream Fw overhang if possible
            frag2 = QUEEN(seq=fw_fragment_list[0].seq + ds_anneal.seq[0] + '/' + '-' * len(fw_fragment_list[0].seq) + generate_complement(ds_anneal.seq[0]))
            ds_anneal_left_trim = cropdna(ds_anneal, 1, len(ds_anneal.seq))
            anneal_left = joindna(frag2, ds_anneal_left_trim, pn=pn, pd=pd)
            
        elif rv_fragment_list[1] != 'No overhang':  #Add upstream Rv overhang if possible
            frag4 = QUEEN(seq=('-' * len(rv_fragment_list[1].seq) + ds_anneal.seq[0] + '/' + rv_fragment_list[1].seq[::-1] + generate_complement(ds_anneal.seq[0])))
            ds_anneal_left_trim = cropdna(ds_anneal, 1, len(ds_anneal.seq))
            anneal_left = joindna(frag4, ds_anneal_left_trim, pn=pn, pd=pd)
            
        else:
            anneal_left = ds_anneal
    
        if fw_fragment_list[1] != 'No overhang':    #Add downstream Fw overhang if possible
            frag3 = QUEEN(seq=ds_anneal.seq[-1] + fw_fragment_list[1].seq + '/' + generate_complement(ds_anneal.seq[-1]) + ('-' * len(fw_fragment_list[1].seq)))
            ds_anneal_right_trim = cropdna(anneal_left, 0, (len(anneal_left.seq) - 1))
            annealed_oligo = joindna(ds_anneal_right_trim, frag3, pn=pn, pd=pd)
            return annealed_oligo
            
        elif rv_fragment_list[0] != 'No overhang':  #Add downstream Rv overhang if possible
            frag5 = QUEEN(seq= ds_anneal.seq[-1] + ('-' * len(rv_fragment_list[0].seq)) + '/' + generate_complement(ds_anneal.seq[-1]) + rv_fragment_list[0].seq[::-1])
            ds_anneal_right_trim = cropdna(anneal_left, 0, (len(anneal_left.seq) - 1))
            annealed_oligo = joindna(ds_anneal_right_trim, frag5, pn=pn, pd=pd)
            return annealed_oligo
        
        elif fw_fragment_list[0] != 'No overhang':
            return anneal_left
            
        elif rv_fragment_list[1] != 'No overhang':
            return anneal_left
        
        else:
            annealed_oligo = ds_anneal
            return annealed_oligo
            
    else:
        return 'Oligos could not anneal.'
    

#Examples

QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003") 
QUEEN(seq="ACCGAGCGTGTCAGGGTCATAAGGG", product="SI2003truncated") 
QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGG", product = "SI2003shortright")
QUEEN(seq="ACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003shortleft")  
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="SI2004")
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGTCTAG", product="SI2004longright")
QUEEN(seq="CTAGCCCTTATGACCCTGACACGCTCGGT", product="SI2004longleft")
QUEEN(seq="ACACGCTCGGT", product="SI2004cantanneal")
pn1='test'
pn2='test2'
pd1='test3'
pd2='test4'

#anneal_oligos_object(SI2003, SI2003, pn=pn1, pd=pd2)            #Produces 'Identical ssDNAs cannot anneal. Please revise your entry.' #works
#anneal_oligos_object(SI2003, SI2004cantanneal, pn=pn1, pd=pd2)  #Produces 'Oligos could not anneal because at least one is too short. Please revise your entry.' #works
#anneal_oligos_object(SI2004cantanneal, SI2003, pn=pn1, pd=pd2)  #Produces 'Oligos could not anneal because at least one is too short. Please revise your entry.' #works

#anneal_oligos_object(SI2003truncated, SI2004, pn=pn1, pd=pd2)                     #Produces Case 1 #works
 # 5' ACCGAGCGTGTCAGGGTCATAAGGG 3'
 # 3' TGGCTCGCACAGTCCCAGTATTCCC 5'

#anneal_oligos_object(SI2004, SI2003truncated, pn=pn1, pd=pd2)                     #Produces Case 1 but treats SI2004 as sense strand #works
 # 5' CCCTTATGACCCTGACACGCTCGGT 3'
 # 3' GGGAATACTGGGACTGTGCGAGCCA 5'

#anneal_oligos_object(SI2003, SI2004, pn=pn1, pd=pd2)                              #Produces Case 2 #works
 # 5' CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG 3'
 # 3' ----TGGCTCGCACAGTCCCAGTATTCCC---- 5'

#anneal_oligos_object(SI2004, SI2003, pn=pn1, pd=pd2)                              #Produces Case 3 #works
 # 5' ----CCCTTATGACCCTGACACGCTCGGT---- 3'
 # 3' GATCGGGAATACTGGGACTGTGCGAGCCAGATC 5'

#anneal_oligos_object(SI2003shortright, SI2004longleft, pn=pn1, pd=pd2)            #Produces Case 4 #works
 # 5' CTAGACCGAGCGTGTCAGGGTCATAAGGG---- 3'
 # 3' ----TGGCTCGCACAGTCCCAGTATTCCCGATC 5'

#anneal_oligos_object(SI2003shortleft, SI2004longright, pn=pn1, pd=pd2)            #Produces Case 5 #makes error because 3' end of Rv primer doesn't anneal
 # 5' ----ACCGAGCGTGTCAGGGTCATAAGGGCTAG 3'
 # 3' GATCTGGCTCGCACAGTCCCAGTATTCCC---- 5'

#anneal_oligos_object(SI2003, SI2004longleft, pn=pn1, pd=pd2)                      #Produces Case 6 #works
 # 5' CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG 3'
 # 3' ----TGGCTCGCACAGTCCCAGTATTCCCGATC 5'

#anneal_oligos_object(SI2004longright, SI2003, pn=pn1, pd=pd2)                     #Produces Case 7 #works
 # 5' ----CCCTTATGACCCTGACACGCTCGGTCTAG 3'
 # 3' GATCGGGAATACTGGGACTGTGCGAGCCAGATC 5'

#anneal_oligos_object(SI2003, SI2004longright, pn=pn1, pd=pd2)                     #Produces Case 8 #works
 # 5' CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG 3'
 # 3' GATCTGGCTCGCACAGTCCCAGTATTCCC---- 5'

#anneal_oligos_object(SI2004longleft, SI2003, pn=pn1, pd=pd2)                      #Produces Case 9 #works
 # 5' CTAGCCCTTATGACCCTGACACGCTCGGT---- 3'
 # 3' GATCGGGAATACTGGGACTGTGCGAGCCAGATC 5'
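
The fragments in the cell above are built from strings of the form `top + '/' + bottom`, where `-` marks a position with no base on that strand and the bottom strand is written 3' to 5'. As a rough illustration of how the annealing site and the overhangs combine into one such string, here is a plain-string sketch. The helper `compose_duplex` and its argument names are hypothetical; it mirrors the branch order of the function above (Fw overhang first, otherwise Rv overhang) but does not call QUEEN.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def compose_duplex(anneal, fw_5="", fw_3="", rv_5="", rv_3=""):
    """Hypothetical helper: build a 'top/bottom' string for an annealed oligo pair.

    anneal is the paired region in Fw orientation; each overhang is written 5'->3'
    on its own oligo. The bottom strand is written 3'->5', left to right.
    """
    top = anneal
    bottom = anneal.translate(_COMPLEMENT)

    #Left side: the Fw 5' overhang takes priority, otherwise the Rv 3' overhang
    if fw_5:
        top, bottom = fw_5 + top, "-" * len(fw_5) + bottom
    elif rv_3:
        top, bottom = "-" * len(rv_3) + top, rv_3[::-1] + bottom

    #Right side: the Fw 3' overhang takes priority, otherwise the Rv 5' overhang
    if fw_3:
        top, bottom = top + fw_3, bottom + "-" * len(fw_3)
    elif rv_5:
        top, bottom = top + "-" * len(rv_5), bottom + rv_5[::-1]

    return top + "/" + bottom

#Case 4: SI2003shortright annealed to SI2004longleft
case4 = compose_duplex("ACCGAGCGTGTCAGGGTCATAAGGG", fw_5="CTAG", rv_5="CTAG")
print(case4.replace("/", "\n"))
```

The Rv overhangs are reversed (`[::-1]`) rather than reverse-complemented because they stay on the bottom strand and only need to be read in the 3' to 5' direction.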

In [1977]:
def anneal_oligos_object_short(fw_primer,  #QUEEN object
                               rv_primer,  #QUEEN object
                               pn,         #str
                               pd          #str
                              ):
    """
    Given Fw and Rv primers, creates a new fragment through ssDNA annealing of the two primers to each other.
    *Requirement: Primers must anneal with at least 15 bp of binding.
    Leaves overhangs on either, neither, or both sides depending on the design of the two oligos.
    User may specify a process name (pn) and process description (pd) as strings.
    Returns the QUEEN object with name " ". Also provides length and the object's topology (linear in this case).
    """
    #
    # To build dsDNAs from two ssDNA oligos annealing (but not extending), we must consider the nine possible combinations of oligos:
    #
    #               Case 1           Case 2                           Case 3                           Case 4                           Case 5
    # Fw primer:    5'-anneal-3'     5'-overhang-anneal-overhang-3'   5'----------anneal----------3'   5'-overhang-anneal----------3'   5'----------anneal-overhang-3'
    # Rv primer:    3'-anneal-5'     3'----------anneal----------5'   3'-overhang-anneal-overhang-5'   3'----------anneal-overhang-5'   3'-overhang-anneal----------5'
    #                  Frag.1            Frag.2          Frag.3           Frag.4          Frag.5   
    #
    #                                Case 6                           Case 7                           Case 8                           Case 9
    # Fw primer:                     5'-overhang-anneal-3'            5'----------anneal-3'            5'-anneal-overhang-3'            5'-anneal----------3'
    # Rv primer:                     3'----------anneal-5'            3'-overhang-anneal-5'            3'-anneal----------5'            3'-anneal-overhang-5'
    #
    #
    # With the information of the two primer sequences, we must generate 5 unique fragments, each of which will or will not exist depending on which IF statement is satisfied.
    # Then, we will concatenate fragments depending on the condition satisfied.
    #
    # Approach: Generate each fragment, then concatenate them on the ds_anneal string, and the product should be one of cases 1-9.
    # If an annealed oligo doesn't exist, it should notify the user.
    #
    
    fw_primer_str = fw_primer.seq                                            #String of Fw primer sequence
    rv_primer_str = rv_primer.seq                                            #String of Rv primer sequence
    
    if fw_primer_str == rv_primer_str:                                       #Function can't run if oligos are exactly identical
        return 'Identical ssDNAs cannot anneal. Please revise your entry.'
    elif len(fw_primer_str) < 15 or len(rv_primer_str) < 15:                 #An oligo shorter than 15 nt cannot provide the required 15 bp of annealing
        return 'Oligos could not anneal because at least one is too short. Please revise your entry.'
    
    fw_overhang_list = []                                                    #Empty list for Fw overhang(s)
    rv_overhang_list = []                                                    #Empty list for Rv overhang(s)
    fw_fragment_list = []                                                    #Empty list for Fw overhang fragment(s)
    rv_fragment_list = []                                                    #Empty list for Rv overhang fragment(s)
    
    #Generate Frag.1
    anneal = find_primer_binding_primer_str_short(fw_primer, rv_primer)      #Finds the annealing site between the two primers
    if anneal in fw_primer_str:
        ds_anneal = generate_dsdna(anneal)                                   #Generates the double-stranded sequence of the binding sites of the two primers upon annealing
    else:                                                                    #Error for Case 5 occurs here because the 3' end of the Rv primer doesn't anneal
        ds_anneal = generate_dsdna(flipdna(anneal))                          #Generates the double-stranded sequence of the binding sites of the two primers upon annealing
    
    
    #Generate Frag.2 and Frag.3
    overhangs_fw_primer = primer_overhangs(fw_primer, anneal)
    five_fw = five_prime_overhang(overhangs_fw_primer)                       #Provides the 5' overhang of the Fw primer (if any)
    fw_overhang_list.append(five_fw)
    three_fw = three_prime_overhang(overhangs_fw_primer)                     #Provides the 3' overhang of the Fw primer (if any)
    fw_overhang_list.append(three_fw)
    
    #Position 0 of fw_fragment_list will correspond to Frag.2
    #Position 1 of fw_fragment_list will correspond to Frag.3
    for overhang in fw_overhang_list:
        if overhang != '':
            fragment = QUEEN(seq=overhang + '/' + ('-' * len(overhang)))     #Overhang bases on the top strand, '-' on the bottom strand
            fw_fragment_list.append(fragment)
        else:
            fw_fragment_list.append('No overhang')
    
    
    #Generate Frag.4 and Frag.5
    overhangs_rv_primer = primer_overhangs(rv_primer, anneal)
    five_rv = five_prime_overhang(overhangs_rv_primer)                       #Provides the 5' overhang of the Rv primer (if any)
    rv_overhang_list.append(five_rv)
    three_rv = three_prime_overhang(overhangs_rv_primer)                     #Provides the 3' overhang of the Rv primer (if any)
    rv_overhang_list.append(three_rv)
    
    #Position 0 of rv_fragment_list will correspond to Frag.4
    #Position 1 of rv_fragment_list will correspond to Frag.5
    for overhang in rv_overhang_list:
        if overhang != '':
            fragment = QUEEN(seq=overhang + '/' + ('-' * len(overhang)))     #Overhang bases on the top strand, '-' on the bottom strand
            rv_fragment_list.append(fragment)
        else:
            rv_fragment_list.append('No overhang')

    
    if anneal != 'No primer binding site found.':
        
        if fw_fragment_list[0] != 'No overhang':    #Add upstream Fw overhang if possible
            frag2 = QUEEN(seq=fw_fragment_list[0].seq + ds_anneal.seq[0] + '/' + '-' * len(fw_fragment_list[0].seq) + generate_complement(ds_anneal.seq[0]))
            ds_anneal_left_trim = cropdna(ds_anneal, 1, len(ds_anneal.seq))
            anneal_left = joindna(frag2, ds_anneal_left_trim, pn=pn, pd=pd)
            
        elif rv_fragment_list[1] != 'No overhang':  #Add upstream Rv overhang if possible
            frag4 = QUEEN(seq=('-' * len(rv_fragment_list[1].seq) + ds_anneal.seq[0] + '/' + rv_fragment_list[1].seq[::-1] + generate_complement(ds_anneal.seq[0])))
            ds_anneal_left_trim = cropdna(ds_anneal, 1, len(ds_anneal.seq))
            anneal_left = joindna(frag4, ds_anneal_left_trim, pn=pn, pd=pd)
            
        else:
            anneal_left = ds_anneal
    
        if fw_fragment_list[1] != 'No overhang':    #Add downstream Fw overhang if possible
            frag3 = QUEEN(seq=ds_anneal.seq[-1] + fw_fragment_list[1].seq + '/' + generate_complement(ds_anneal.seq[-1]) + ('-' * len(fw_fragment_list[1].seq)))
            ds_anneal_right_trim = cropdna(anneal_left, 0, (len(anneal_left.seq) - 1))
            annealed_oligo = joindna(ds_anneal_right_trim, frag3, pn=pn, pd=pd)
            return annealed_oligo
            
        elif rv_fragment_list[0] != 'No overhang':  #Add downstream Rv overhang if possible
            frag5 = QUEEN(seq= ds_anneal.seq[-1] + ('-' * len(rv_fragment_list[0].seq)) + '/' + generate_complement(ds_anneal.seq[-1]) + rv_fragment_list[0].seq[::-1])
            ds_anneal_right_trim = cropdna(anneal_left, 0, (len(anneal_left.seq) - 1))
            annealed_oligo = joindna(ds_anneal_right_trim, frag5, pn=pn, pd=pd)
            return annealed_oligo
        
        elif fw_fragment_list[0] != 'No overhang':
            return anneal_left
            
        elif rv_fragment_list[1] != 'No overhang':
            return anneal_left
        
        else:
            annealed_oligo = ds_anneal
            return annealed_oligo
            
    else:
        return 'Oligos could not anneal.'
    

#Examples

QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003") 
QUEEN(seq="ACCGAGCGTGTCAGGGTCATAAGGG", product="SI2003truncated") 
QUEEN(seq="CTAGACCGAGCGTGTCAGGGTCATAAGGG", product = "SI2003shortright")
QUEEN(seq="ACCGAGCGTGTCAGGGTCATAAGGGCTAG", product="SI2003shortleft")  
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGT", product="SI2004")
QUEEN(seq="CCCTTATGACCCTGACACGCTCGGTCTAG", product="SI2004longright")
QUEEN(seq="CTAGCCCTTATGACCCTGACACGCTCGGT", product="SI2004longleft")
QUEEN(seq="ACACGCTCGGT", product="SI2004cantanneal")
pn1='test'
pn2='test2'
pd1='test3'
pd2='test4'

#anneal_oligos_object_short(SI2003, SI2003, pn=pn1, pd=pd2)            #Produces 'Identical ssDNAs cannot anneal. Please revise your entry.' #works
#anneal_oligos_object_short(SI2003, SI2004cantanneal, pn=pn1, pd=pd2)  #Produces 'Oligos could not anneal because at least one is too short. Please revise your entry.' #works
#anneal_oligos_object_short(SI2004cantanneal, SI2003, pn=pn1, pd=pd2)  #Produces 'Oligos could not anneal because at least one is too short. Please revise your entry.' #works

#anneal_oligos_object_short(SI2003truncated, SI2004, pn=pn1, pd=pd2)                     #Produces Case 1 #works
 # 5' ACCGAGCGTGTCAGGGTCATAAGGG 3'
 # 3' TGGCTCGCACAGTCCCAGTATTCCC 5'

#anneal_oligos_object_short(SI2004, SI2003truncated, pn=pn1, pd=pd2)                     #Produces Case 1 but treats SI2004 as sense strand #works
 # 5' CCCTTATGACCCTGACACGCTCGGT 3'
 # 3' GGGAATACTGGGACTGTGCGAGCCA 5'

#anneal_oligos_object_short(SI2003, SI2004, pn=pn1, pd=pd2)                              #Produces Case 2 #works
 # 5' CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG 3'
 # 3' ----TGGCTCGCACAGTCCCAGTATTCCC---- 5'

#anneal_oligos_object_short(SI2004, SI2003, pn=pn1, pd=pd2)                              #Produces Case 3 #works
 # 5' ----CCCTTATGACCCTGACACGCTCGGT---- 3'
 # 3' GATCGGGAATACTGGGACTGTGCGAGCCAGATC 5'

#anneal_oligos_object_short(SI2003shortright, SI2004longleft, pn=pn1, pd=pd2)            #Produces Case 4 #works
 # 5' CTAGACCGAGCGTGTCAGGGTCATAAGGG---- 3'
 # 3' ----TGGCTCGCACAGTCCCAGTATTCCCGATC 5'

#anneal_oligos_object_short(SI2003shortleft, SI2004longright, pn=pn1, pd=pd2)            #Produces Case 5 #makes error because 3' end of Rv primer doesn't anneal
 # 5' ----ACCGAGCGTGTCAGGGTCATAAGGGCTAG 3'
 # 3' GATCTGGCTCGCACAGTCCCAGTATTCCC---- 5'

#anneal_oligos_object_short(SI2003, SI2004longleft, pn=pn1, pd=pd2)                      #Produces Case 6 #works
 # 5' CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG 3'
 # 3' ----TGGCTCGCACAGTCCCAGTATTCCCGATC 5'

#anneal_oligos_object_short(SI2004longright, SI2003, pn=pn1, pd=pd2)                     #Produces Case 7 #works
 # 5' ----CCCTTATGACCCTGACACGCTCGGTCTAG 3'
 # 3' GATCGGGAATACTGGGACTGTGCGAGCCAGATC 5'

#anneal_oligos_object_short(SI2003, SI2004longright, pn=pn1, pd=pd2)                     #Produces Case 8 #works
 # 5' CTAGACCGAGCGTGTCAGGGTCATAAGGGCTAG 3'
 # 3' GATCTGGCTCGCACAGTCCCAGTATTCCC---- 5'

#anneal_oligos_object_short(SI2004longleft, SI2003, pn=pn1, pd=pd2)                      #Produces Case 9 #works
 # 5' CTAGCCCTTATGACCCTGACACGCTCGGT---- 3'
 # 3' GATCGGGAATACTGGGACTGTGCGAGCCAGATC 5'
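
`anneal_oligos_object` and `anneal_oligos_object_short` appear to differ only in the length threshold (18 vs 15) and in the binding-site search they call. One way to avoid maintaining two copies of the input checks is to pass the threshold as a parameter. The sketch below shows only that idea on plain strings; `check_oligo_pair` and `min_len` are hypothetical names, and the messages are copied from the functions above.

```python
IDENTICAL = 'Identical ssDNAs cannot anneal. Please revise your entry.'
TOO_SHORT = 'Oligos could not anneal because at least one is too short. Please revise your entry.'

def check_oligo_pair(fw_seq, rv_seq, min_len=18):
    """Hypothetical helper: return an error message (str), or None if the pair passes."""
    if fw_seq == rv_seq:
        return IDENTICAL
    if min(len(fw_seq), len(rv_seq)) < min_len:   #Checks oligo length, not the length of the annealing site
        return TOO_SHORT
    return None

#A pair of 16 nt oligos fails the 18 nt threshold but passes the 15 nt one
print(check_oligo_pair("ACCGAGCGTGTCAGGG", "CCCTGACACGCTCGGT"))
print(check_oligo_pair("ACCGAGCGTGTCAGGG", "CCCTGACACGCTCGGT", min_len=15))
```

As in the original functions, this only checks the length of each oligo; whether the pair actually shares an annealing site of that length is left to the binding-site search.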

In [2241]:
def gibson_assembly(frag1,         #QUEEN object
                    frag2,         #QUEEN object
                    assembly_name, #str
                    pn,            #str
                    pd             #str
                   ):
    """
    Given two fragments with overlapping 5' and 3' ends, searches the two fragments for their overlapping sequences,
    modifies them by 'digesting' the 5' ends of the overlapping regions (removes the overlapping regions from the larger fragment using modifyends)
    and ligates (using joindna) the modified products to form a single product.
    Returns the product as a circular QUEEN object.
    """
    
    #
    # The input fragments by nature should be double-stranded, so the function will operate on the Fw strands of the fragments when executing.
    #
    # Approach:
    # 1. With the two input fragments, use the smaller fragment (query) to search on the larger fragment (template). The query is likely the insert and the template is likely the backbone.
    #    Distinguish the smaller fragment from the larger fragment.
    # 2. Use the find_similar_sequence function to first search the 5' end of the query for its overlapping region with the template.
    #    Then, use find_similar_sequence to search the 3' end of the query for its overlapping region with the template.
    # 3. Remove the 5' overlap from the 3' end of the template and the 3' overlap from the 5' end of the template
    # 4. Use joindna to piece together the fragments into a circular QUEEN object (plasmid)
    #
    
    #Make the smaller fragment 'fragment1' and the larger fragment 'fragment2'. This gives the user flexibility to enter fragments in any order when calling the function.
    if len(frag1.seq) <= len(frag2.seq):
        fragment1 = frag1
        fragment2 = frag2   #if the fragments are equal in size, they are used in the order entered
    else:
        fragment1 = frag2
        fragment2 = frag1
        
    #Generate the five_prime_overlap sequence and length   
    i = 1
    five_prime_overlap = []
    five_prime_overlap_len = []
    
    while i < 100: #arbitrary limit for i
        match = find_similar_sequence(fragment1.seq[:i], fragment2.seq, max_mismatches=0)
        if match == '':
            break #the previous value of i was the longest 5' overlap
        five_prime_overlap = match
        five_prime_overlap_len = i
        i = i+1
    
    #Generate the three_prime_overlap sequence and length   
    i = 1
    three_prime_overlap = []
    three_prime_overlap_len = []
    
    while i < 100: #arbitrary limit for i
        match = find_similar_sequence(fragment1.seq[-i:], fragment2.seq, max_mismatches=0)
        if match == '':
            break #the previous value of i was the longest 3' overlap
        three_prime_overlap = match
        three_prime_overlap_len = i
        i = i+1
    
    #Remove overlap
    five_prime_overlap_cutsite = len(fragment2.seq) - five_prime_overlap_len
    truncated_template = cropdna(fragment2, three_prime_overlap_len, five_prime_overlap_cutsite) 
    
    #Piece together the fragments
    assembled_product = joindna(truncated_template, fragment1, product=assembly_name, topology="circular", pn=pn, pd=pd)
    
    return assembled_product

#Examples
QUEEN(seq="TGGACGAGCTGTACAAGTAATTAATTAACTAGGTCTTGAAAGGAGTGGGAATTGGC", product="SI415") 
QUEEN(seq="TGAGTAAAGTTTGAAGCCATGAATTCCCTGTGTTCTGGCGGCAAACCCGTTG", product="SI417")
#QUEEN(record="https://benchling.com/s/seq-KzYdG9UVFpnuTv5G3FIP", dbtype="benchling", product="pLV_CS_038")
#QUEEN(record="https://benchling.com/s/seq-AXIf1rdVlbb2QDPoOudT", dbtype="benchling", product="pLV_CS_043")
pn1='test'
pn2='test2'
pd1='test3'
pd2='test4'
#testinsert = create_pcr_product(pLV_CS_038, SI415, SI417, pn1, pd1)
testbackbone = double_digest_insert(pLV_CS_043, 'EcoRI', 'PacI', pn1, pd1) #creates digested product with re1 digest at 5' end and re2 digest at 3' end

#Example 1: pLV_CS_060 construction; 1 insert + 1 digested backbone
gibson_assembly(testinsert, testbackbone, 'test_plasmid', pn1, pd1) #works! generates expected 9842 bp product with exact match to pLV_CS_060

<queen.QUEEN object; project='test_plasmid_6', length='9842 bp', topology='circular'>
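
The overlap-trimming idea behind `gibson_assembly` can be illustrated on plain strings, without QUEEN. The following is a minimal sketch under stated assumptions: both fragments are same-strand strings, and each overlap sits exactly at the fragment ends (the function above instead searches for the query end anywhere on the template). `end_overlap` and `circular_gibson` are hypothetical helper names, not part of QUEEN or of this script.

```python
def end_overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left[-k:] == right[:k]:
            return k
    return 0


def circular_gibson(insert: str, backbone: str) -> str:
    """Join insert and backbone into one circular sequence (returned linearised)."""
    five_len = end_overlap(backbone, insert)   # backbone 3' end == insert 5' end
    three_len = end_overlap(insert, backbone)  # insert 3' end == backbone 5' end
    if five_len == 0 or three_len == 0:
        raise ValueError("fragments do not overlap at both ends")
    # Drop both overlaps from the backbone, then append the full insert.
    trimmed = backbone[three_len:len(backbone) - five_len]
    return trimmed + insert


# Toy fragments (hypothetical sequences): 6-nt overlaps at both ends.
backbone = "AAAACC" + "CATCAT" + "GGTTTT"
insert = "GGTTTT" + "ACGACG" + "AAAACC"
product = circular_gibson(insert, backbone)
print(product, len(product))
```

The product length equals the sum of both fragment lengths minus both overlaps, which is a quick sanity check to apply to the QUEEN output as well.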

In [2384]:
def stitch_fragments(frag1,  #QUEEN object
                     frag2,  #QUEEN object
                    ):
    """
    *Fragments must be entered in the order in which they are stitched (5' to 3'). Function only works with fragments in same sense.*
    Given two fragments with overlapping 5' and 3' ends, searches the two fragments for their overlapping region,
    modifies them by 'digesting' the overlapping region (removes the overlapping region from the 3' end of the first fragment using cropdna)
    and ligates (using joindna) the modified products to form a single product.
    Returns the product as a linear QUEEN object.
    The stitch_fragments function can be used multiple times in a row to attach several inserts together during a multi-fragment assembly.
    """
    
    #Find the overlap sequence and length at the 3' end of frag1 and the 5' end of frag2
    i = 1
    overlap = ''
    overlap_len = 0 #stays 0 if no overlap is found
    
    while i < 100: #arbitrary limit for i
        match = find_similar_sequence(frag1.seq[-i:], frag2.seq, max_mismatches=0) #search once per iteration
        if match == '':
            break
        overlap = match
        overlap_len = i
        i = i+1
    
    #Remove overlap
    if overlap_len == 0:
        raise ValueError('No overlap was found between the two fragments.')
    overlap_cutsite = len(frag1.seq) - overlap_len
    truncated_frag1 = cropdna(frag1, 0, overlap_cutsite)
    
    #Piece together the fragments
    assembled_product = joindna(truncated_frag1, frag2, topology="linear")
    
    return assembled_product

#Examples
QUEEN(seq="TGGACGAGCTGTACAAGTAATTAATTAACTAGGTCTTGAAAGGAGTGGGAATTGGC", product="testoligo1") 
QUEEN(seq="GAAAGGAGTGGGAATTGGCTGAGTAAAGTTTGAAGCCATGAATTCCCTGTGTTCTGGCGGCAAACCCGTTG", product="testoligo2")
QUEEN(seq="GTCTGACAGCGTCTCTAGCCGTGAGAGCGTGTCAGTGTGTGGTGNCCGAGCGTGTCAGGGTGACCGTGGTGNCCGAGTCTGTCTCTCACAGCGTGGTGGAGACGAGCAGAGCT", product="SI739") 
QUEEN(seq="AGCTCTGCTCGTCTCCACCAC", product="SI738") 
pn1='test'
pn2='test2'
pd1='test3'
pd2='test4'

#Example 1: Stitch together testoligo1 and testoligo2, should form TGGACGAGCTGTACAAGTAATTAATTAACTAGGTCTTGAAAGGAGTGGGAATTGGCTGAGTAAAGTTTGAAGCCATGAATTCCCTGTGTTCTGGCGGCAAACCCGTTG
stitch_fragments(testoligo1, testoligo2) #works!

#Example 2: Stitch together SI739 and SI738 (SI738 is the reverse complement of the exact 3' end of SI739)
#Expected upon amplification
#5' GTCTGACAGCGTCTCTAGCCGTGAGAGCGTGTCAGTGTGTGGTGNCCGAGCGTGTCAGGGTGACCGTGGTGNCCGAGTCTGTCTCTCACAGCGTGGTGGAGACGAGCAGAGCT 3'
#3' CAGACTGTCGCAGAGATCGGCACTCTCGCACAGTCACACACCACNGGCTCGCACAGTCCCACTGGCACCACNGGCTCAGACAGAGAGTGTCGCACCACCTCTGCTCGTCTCGA 5'
stitch_fragments(SI739, flipdna(SI738)) #works! stitch_fragments works for stitching a Fw primer and Rv primer if flipdna is used

<queen.QUEEN object; project='SI739_121', length='113 bp', topology='linear'>
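
Because `stitch_fragments` can be applied repeatedly for multi-fragment assemblies, the chaining can be pictured as a left-to-right fold. This is a minimal string-based sketch, assuming same-strand fragments listed in 5' to 3' order with overlaps exactly at their ends; `stitch` and `stitch_all` are hypothetical names used only for illustration.

```python
from functools import reduce


def stitch(left: str, right: str) -> str:
    """Merge two same-strand strings on their longest suffix/prefix overlap."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left[-k:] == right[:k]:
            return left[:-k] + right
    raise ValueError("no overlap between fragment ends")


def stitch_all(fragments):
    """Stitch fragments left to right, in 5'-to-3' order."""
    return reduce(stitch, fragments)


# Toy fragments (hypothetical sequences) sharing 3-nt overlaps.
print(stitch_all(["AAACCC", "CCCGGG", "GGGTTT"]))  # AAACCCGGGTTT
```

With QUEEN objects the same pattern would be `stitch_fragments(stitch_fragments(a, b), c)`, as used in the 3- and 4-fragment Gibson templates.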

In [2348]:
def template_free_pcr(fw_primer,  #QUEEN object
                      rv_primer,  #QUEEN object
                      pn,         #str
                      pd          #str
                     ):
    """
    *Primers should be entered in the order in which they are desired (5' to 3').*
    Given two PCR primers that can anneal to each other (no minimum annealing length required), searches the two primers for their overlapping region,
    removes the overlapping region from the 3' end of the forward primer using cropdna,
    and ligates (using joindna) the modified products to form a single product.
    Returns the product as a linear QUEEN object.
    """
    
    rv_primer_rc = flipdna(rv_primer) #Need to operate on reverse complement of rv_primer to produce PCR product with correct orientation
    
    #Find the overlap sequence and length at the 3' end of fw_primer and the 5' end of rv_primer_rc
    i = 1
    overlap = ''
    overlap_len = 0 #stays 0 if no overlap is found
    
    while i < 100: #arbitrary limit for i
        match = find_similar_sequence(fw_primer.seq[-i:], rv_primer_rc.seq, max_mismatches=0) #search once per iteration
        if match == '':
            break
        overlap = match
        overlap_len = i
        i = i+1
    
    #Remove the overlap from the 3' end of fw_primer
    if overlap_len == 0:
        raise ValueError('No overlap was found between the two primers.')
    overlap_cutoff = len(fw_primer.seq) - overlap_len
    truncated_fw_primer = cropdna(fw_primer, 0, overlap_cutoff)
    assembled_product = joindna(truncated_fw_primer, rv_primer_rc, topology="linear", pn=pn, pd=pd)
    
    return assembled_product

#Examples
QUEEN(seq="GACCTCCATAGAAGACACCGATGAATGCATGTCATGTAATGCTATGAGAGGATCTATGATTTCCGGTGAATTCATGC", product="SI1086") 
QUEEN(seq="GCTCCTCGCCCTTGCTCACACTTTCGGGTGTGGCGGACTCTGAGGTCCCGGGAGTCTCGCTGCCGCTTCCATATGACCCTGACACTTACGGCATGAATTCACCGG", product="SI1087")
QUEEN(seq="GCTCCTCGCCCTTGCTCACACTTTCGGGTGTGGCGGACTCTGAGGTCCCGGGAGTCTCGCTGCCGCTTCCATATGACCCTGATTAGCTCGGCATGAATTCACCGG", product="SI1089")
QUEEN(seq="GTCTGACAGCGTCTCTAGCCGTGAGAGCGTGTCAGTGTGTGGTGNCCGAGCGTGTCAGGGTGACCGTGGTGNCCGAGTCTGTCTCTCACAGCGTGGTGGAGACGAGCAGAGCT", product="SI739") 
QUEEN(seq="AGCTCTGCTCGTCTCCACCAC", product="SI738") 
pn1='test'
pd1='test3'

#Example 1: Template-free PCR of SI1086 and SI1087, should form 
# 5'-GACCTCCATAGAAGACACCGATGAATGCATGTCATGTAATGCTATGAGAGGATCTATGATTTCCGGTGAATTCATGCCGTAAGTGTCAGGGTCATATGGAAGCGGCAGCGAGACTCCCGGGACCTCAGAGTCCGCCACACCCGAAAGTGTGAGCAAGGGCGAGGAGC-3'
# 3'-CTGGAGGTATCTTCTGTGGCTACTTACGTACAGTACATTACGATACTCTCCTAGATACTAAAGGCCACTTAAGTACGGCATTCACAGTCCCAGTATACCTTCGCCGTCGCTCTGAGGGCCCTGGAGTCTCAGGCGGTGTGGGCTTTCACACTCGTTCCCGCTCCTCG-5'
template_free_pcr(SI1086, SI1087, pn1, pd1) #works!

#Example 2: Template-free PCR of SI1086 and SI1089, should form 
# 5'-GACCTCCATAGAAGACACCGATGAATGCATGTCATGTAATGCTATGAGAGGATCTATGATTTCCGGTGAATTCATGCCGAGCTAATCAGGGTCATATGGAAGCGGCAGCGAGACTCCCGGGACCTCAGAGTCCGCCACACCCGAAAGTGTGAGCAAGGGCGAGGAGC-3'
# 3'-CTGGAGGTATCTTCTGTGGCTACTTACGTACAGTACATTACGATACTCTCCTAGATACTAAAGGCCACTTAAGTACGGCTCGATTAGTCCCAGTATACCTTCGCCGTCGCTCTGAGGGCCCTGGAGTCTCAGGCGGTGTGGGCTTTCACACTCGTTCCCGCTCCTCG-5'
template_free_pcr(SI1086, SI1089, pn1, pd1) #works!

#Example 3: Template-free PCR of SI739 and SI738 (SI738 is the reverse complement of the exact 3' end of SI739), should form
#5' GTCTGACAGCGTCTCTAGCCGTGAGAGCGTGTCAGTGTGTGGTGNCCGAGCGTGTCAGGGTGACCGTGGTGNCCGAGTCTGTCTCTCACAGCGTGGTGGAGACGAGCAGAGCT 3'
#3' CAGACTGTCGCAGAGATCGGCACTCTCGCACAGTCACACACCACNGGCTCGCACAGTCCCACTGGCACCACNGGCTCAGACAGAGAGTGTCGCACCACCTCTGCTCGTCTCGA 5'
template_free_pcr(SI739, SI738, pn1, pd1) #works!

<queen.QUEEN object; project='SI739_115', length='113 bp', topology='linear'>
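
The role of `flipdna` in `template_free_pcr` may be easier to see on plain strings: the reverse primer is first expressed on the forward strand, and only then are the two 3' ends compared. The sketch below assumes primers given 5' to 3' as strings over A, C, G, T and N; `reverse_complement` and `template_free_product` are hypothetical helpers, not QUEEN functions.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (N is kept as N)."""
    return seq.translate(COMPLEMENT)[::-1]


def template_free_product(fw: str, rv: str) -> str:
    """Top strand of the duplex formed when fw and rv prime each other."""
    rv_top = reverse_complement(rv)  # express rv on the same strand as fw
    for k in range(min(len(fw), len(rv_top)), 0, -1):
        if fw[-k:] == rv_top[:k]:
            return fw[:-k] + rv_top
    raise ValueError("primers do not anneal at their 3' ends")


# Toy primers (hypothetical sequences); the reverse complement of rv is GGCCAAAA.
fw = "GATTACAGGCC"
rv = "TTTTGGCC"
print(template_free_product(fw, rv))  # GATTACAGGCCAAAA
```

When the reverse primer is the exact reverse complement of the forward primer's 3' end, as with SI739 and SI738 above, the product is simply the forward primer sequence.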

## Wrapper cloning functions

In [1981]:
def create_pcr_product(template,  #QUEEN object
                       fw_primer, #QUEEN object
                       rv_primer, #QUEEN object
                       pn,        #str
                       pd         #str
                      ):
    """
    Given a template sequence for PCR, and the Fw and Rv primers for amplifying a fragment, creates the new fragment using primer binding sites and primer overhangs.
    User may specify a process name (pn) and process description (pd) as strings.
    Returns the PCR product as a linear QUEEN object named "PCR_product".
    """
    FW = primer_binding_object(fw_primer, template)        #Create object for the 3’-end binding sequence of the forward primer.
    RV = primer_binding_object(rv_primer, template)        #Create object for the 3’-end binding sequence of the reverse primer.
    
    cropdna(template, FW[0].end, RV[0].start, product="extract1", pn=pn, pd=pd)          #Crop the internal DNA sequence flanked by the primer annealing sites.
    
    return modifyends(extract1, fw_primer.seq, rv_primer.rcseq, product="PCR_product", pn=pn, pd=pd)    #Add Fw and Rv primer sequences to both ends of the cropped fragment.

In [1982]:
def create_pcr_product_special(template,  #QUEEN object
                               fw_primer, #QUEEN object (shorter annealing primer if needed)
                               rv_primer, #QUEEN object (shorter annealing primer if needed)
                               pn,        #str
                               pd         #str
                              ):
    """
    **Special-case function for primers with short annealing sequences (15 bp or longer), compatible with the primer_binding_object_short function.**
    Given a template sequence for PCR, and the Fw and Rv primers for amplifying a fragment, creates the new fragment using primer binding sites and primer overhangs.
    User may specify a process name (pn) and process description (pd) as strings.
    Returns the PCR product as a linear QUEEN object named "PCR_product".
    """
    FW = primer_binding_object_short(fw_primer, template)        #Create object for the 3’-end binding sequence of the forward primer.
    RV = primer_binding_object_short(rv_primer, template)        #Create object for the 3’-end binding sequence of the reverse primer.
    
    cropdna(template, FW[0].end, RV[0].start, product="extract1", pn=pn, pd=pd)          #Crop the internal DNA sequence flanked by the primer annealing sites.
    
    return modifyends(extract1, fw_primer.seq, rv_primer.rcseq, product="PCR_product", pn=pn, pd=pd)    #Add Fw and Rv primer sequences to both ends of the cropped fragment.

In [1983]:
def create_pcr_product_mismatches(template,  #QUEEN object
                                  fw_primer, #QUEEN object with 3 mismatches
                                  rv_primer, #QUEEN object
                                  pn,        #str
                                  pd         #str
                                 ):
    """
    Given a template sequence for PCR, and the Fw and Rv primers for amplifying a fragment, creates the new fragment using primer binding sites and primer overhangs.
    The last 18 nt of the Fw primer may contain up to 3 mismatches against the template; the Rv primer is located with primer_binding_object.
    User may specify a process name (pn) and process description (pd) as strings.
    Returns the PCR product as a linear QUEEN object named "PCR_product".
    """
    
    FW_str = find_similar_sequence(fw_primer.seq[-18:], template.seq, max_mismatches=3)        #Find the template sequence matching the last 18 nt of the forward primer, allowing up to 3 mismatches.
    RV = primer_binding_object(rv_primer, template)                                            #Create object for the 3’-end binding sequence of the reverse primer.
    
    FW_queen = QUEEN(seq=FW_str)
    FW = template.searchsequence(query=FW_queen.seq)
    
    cropdna(template, FW[0].end, RV[0].start, product="extract1", pn=pn, pd=pd)          #Crop the internal DNA sequence flanked by the primer annealing sites.
    
    return modifyends(extract1, fw_primer.seq, rv_primer.rcseq, product="PCR_product", pn=pn, pd=pd)    #Add Fw and Rv primer sequences to both ends of the cropped fragment.


#Examples

QUEEN(record="https://benchling.com/s/seq-lyizd9Ah3jyfw9GFwW9v", dbtype="benchling", product="pLV_mCherry")

QUEEN(seq="GGTGAATTCCCGAGCGTGTCAGGGTGACCGTGAGCAAGGGCGAGGAGGATAACGCTGCCATC", product="SI1294")
QUEEN(seq="GGTGAATTCCCGAGCGTGTCAGGGTGACCGTGGCCATCATCAAGGAGTTCA", product="SI1295")
QUEEN(seq="GGTGAATTCCCGAGCGTGTCAGGGTGACCATGAGCAAGGGCGAGGAGGATAACGCTGCCATC", product="SI1297") 
QUEEN(seq="AATTGGATCCTTACTTGTACAGCTCGTCCA", product="SI680")

#Example 1: Fw primer contains 3 mismatches
create_pcr_product_mismatches(pLV_mCherry, SI1294, SI680, pn=pn1, pd=pd1) #works

#Example 2: Fw primer contains 0 mismatches
create_pcr_product_mismatches(pLV_mCherry, SI1295, SI680, pn=pn1, pd=pd1) #works

<queen.QUEEN object; project='PCR_product_674', length='723 bp', topology='linear'>
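
The three `create_pcr_product*` functions share one idea: locate the 3' annealing region of each primer on the template, keep the template sequence between the two sites, and add the full primer sequences (including any 5' overhangs) to the ends. A minimal string-based sketch of that idea follows, assuming exact matches and a linear template read on its top strand; `pcr_product` and its `anneal` parameter are hypothetical and only mirror the 18-nt window used above.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA string (N is kept as N)."""
    return seq.translate(COMPLEMENT)[::-1]


def pcr_product(template: str, fw: str, rv: str, anneal: int = 18) -> str:
    """Top strand of the amplicon, including any 5' primer overhangs."""
    fw_site = fw[-anneal:]                      # 3' end of the forward primer
    rv_site = reverse_complement(rv[-anneal:])  # reverse primer 3' end on the top strand
    fw_pos = template.find(fw_site)
    if fw_pos < 0:
        raise ValueError("forward primer does not bind the template")
    fw_end = fw_pos + len(fw_site)
    rv_start = template.find(rv_site, fw_end)   # search downstream of the forward site
    if rv_start < 0:
        raise ValueError("reverse primer does not bind downstream of the forward primer")
    # Full forward primer + internal template + reverse complement of the reverse primer.
    return fw + template[fw_end:rv_start] + reverse_complement(rv)


# Toy template and primers (hypothetical sequences) with 6-nt annealing regions.
template = "TTTT" + "ACGTAC" + "GGGAAA" + "CTTCAG" + "TTTT"
fw = "GAATTC" + "ACGTAC"   # EcoRI overhang + annealing region
rv = "GGATCC" + "CTGAAG"   # BamHI overhang + annealing region (bottom strand)
print(pcr_product(template, fw, rv, anneal=6))
```

Tolerating mismatches, as `create_pcr_product_mismatches` does, would replace the exact `find` for the forward site with an approximate search.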

In [1984]:
def double_digest_insert(dna, #QUEEN object
                         re1, #str
                         re2, #str
                         pn,  #str
                         pd   #str
                        ):
    """
    Given a QUEEN DNA sequence for a fragment/insert, finds the restriction sites and digests at the found sites.
    User may specify a process name (pn) and process description (pd) as strings.
    Returns the digested fragment as a linear QUEEN object named "digested_insert".
    """
    
    dna.searchsequence(cs.lib[re1], product="re1site")                          #Search for restriction site of enzyme 1.
    dna.searchsequence(cs.lib[re2], product="re2site")                          #Search for restriction site of enzyme 2.
    
    return cropdna(dna, re1site[0], re2site[0], product="digested_insert", pn=pn, pd=pd)      #Return the digested DNA.

In [1985]:
def double_digest_backbone(dna,  #QUEEN object
                           re1,  #str
                           re2,  #str
                           pn,   #str
                           pd    #str
                          ):
    """
    Given a QUEEN DNA sequence for a backbone, finds the restriction sites and digests at the found sites.
    User may specify a process name (pn) and process description (pd) as strings.
    Returns the digested backbone as a linear QUEEN object named "digested_backbone".
    """
    
    dna.searchsequence(cs.lib[re1], product="re1site")                          #Search for restriction site of enzyme 1.
    dna.searchsequence(cs.lib[re2], product="re2site")                          #Search for restriction site of enzyme 2.

    return cropdna(dna, re2site[0], re1site[0], product="digested_backbone", pn=pn, pd=pd)    #Return the digested DNA.
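
The only difference between `double_digest_insert` and `double_digest_backbone` is the direction in which the plasmid is read between the two sites. The sketch below shows this on a plain circular string. It is a simplification under stated assumptions: cuts are placed at the start of each recognition site, so sticky-end geometry (which QUEEN's `cropdna` handles through the cut-site definitions) is ignored, and `circular_slice` and `double_digest` are hypothetical names.

```python
def circular_slice(seq: str, start: int, end: int) -> str:
    """Read a circular sequence from start up to end, wrapping past the origin."""
    if start <= end:
        return seq[start:end]
    return seq[start:] + seq[:end]


def double_digest(plasmid: str, site1: str, site2: str):
    """Split a circular sequence at the first occurrence of two recognition sites.

    Returns (insert, backbone): the insert runs from site1 to site2,
    the backbone runs from site2 around the origin back to site1.
    """
    i, j = plasmid.find(site1), plasmid.find(site2)
    if i < 0 or j < 0:
        raise ValueError("recognition site not found")
    return circular_slice(plasmid, i, j), circular_slice(plasmid, j, i)


# Toy plasmid (hypothetical sequence) with one EcoRI and one BamHI site.
plasmid = "AAAA" + "GAATTC" + "CCCC" + "GGATCC" + "TTTT"
insert, backbone = double_digest(plasmid, "GAATTC", "GGATCC")
print(insert)    # GAATTCCCCC
print(backbone)  # GGATCCTTTTAAAA
```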

In [1986]:
def typeIIS_digest_insert(dna,  #QUEEN object
                          re,   #str
                          pn,   #str
                          pd    #str
                         ):
    """
    Given a QUEEN DNA sequence for an insert fragment, finds the Type II-S restriction site(s) and digests at the found sites.
    User may specify a process name (pn) and process description (pd) as strings.
    Returns the cutdna insert object. Also provides length and the object's topology (linear in this case).
    """
    from QUEEN import cutsite #Import a restriction enzyme library 
    
    sites = dna.searchsequence(cutsite.lib[re])
    fragments = cutdna(dna, *sites, pn=pn, pd=pd) #Produces a list of two fragments; the first is the external (backbone) fragment and the second is the internal (usually smaller) fragment.
    return fragments[1] #Keep the insert fragment


#Examples
#QUEEN(record="36083", dbtype="addgene", product="pLV_EGFP")
QUEEN(seq="GTCTGACAGCGTCTCTAGCCGTGAGAGCGTGTCAGTGTGTGGTGNCCGAGCGTGTCAGGGTGACCGTGGTGNCCGAGTCTGTCTCTCACAGCGTGGTGGAGACGAGCAGAGCT", product="SI739") 
QUEEN(seq="AGCTCTGCTCGTCTCCACCAC", product="SI738") 
pn1='test'
pd1='test3'

a = typeIIS_digest_insert(SI739, "BsmBI", pn1, pd1) #works
a.printsequence(display=True)

5' AGCCGTGAGAGCGTGTCAGTGTGTGGTGNCCGAGCGTGTCAGGGTGACCGTGGTGNCCGAGTCTGTCTCTCACAGCG---- 3'
3' ----CACTCTCGCACAGTCACACACCACNGGCTCGCACAGTCCCACTGGCACCACNGGCTCAGACAGAGAGTGTCGCACCA 5'



'AGCCGTGAGAGCGTGTCAGTGTGTGGTGNCCGAGCGTGTCAGGGTGACCGTGGTGNCCGAGTCTGTCTCTCACAGCG----/----CACTCTCGCACAGTCACACACCACNGGCTCGCACAGTCCCACTGGCACCACNGGCTCAGACAGAGAGTGTCGCACCA'

In [1987]:
def typeIIS_digest_backbone(dna,  #QUEEN object
                            re,   #str
                            pn,   #str
                            pd    #str
                           ):
    """
    Given a QUEEN DNA sequence for a backbone, finds the Type II-S restriction site(s) and digests at the found sites.
    User may specify a process name (pn) and process description (pd) as strings.
    Returns the cropdna backbone object. Also provides length and the object's topology (linear in this case).
    """
    from QUEEN import cutsite #Import a restriction enzyme library 
    
    sites = dna.searchsequence(cutsite.lib[re])
    backbone = cropdna(dna, *sites, pn=pn, pd=pd) #Produces the backbone fragment
    return backbone #Keep the backbone fragment


#Examples
#QUEEN(record="158431", dbtype="addgene", product="pSI_356_v2")
#QUEEN(record="https://benchling.com/s/seq-YhxTkuO87rxyYR1L61zC", dbtype="benchling", product="pLV_CS_116")
pn1='test'
pd1='test3'

typeIIS_digest_backbone(pSI_356_v2, "BsmBI", pn1, pd1) #works
typeIIS_digest_backbone(pLV_CS_116, "BsmBI", pn1, pd1) #works

<queen.QUEEN object; project='pLV_CS_684', length='9325 bp', topology='linear'>

# Cloning Templates

There are a total of 18 unique assembly types requiring distinct QUEEN construction procedures for the plasmids described in Table S2:
* __Digestion_Ligation_1__: Involving PCR with normal primers.  
    (1) Digest backbone `double_digest_backbone`, (2) Amplify insert `create_pcr_product`, (3) Digest insert `double_digest_insert`, (4) Ligate `joindna`.
* __Digestion_Ligation_2__: Involving PCR with mismatching primers.  
    (1) Digest backbone `double_digest_backbone`, (2) Amplify insert `create_pcr_product_mismatches`, (3) Digest insert `double_digest_insert`, (4) Ligate `joindna`.
* __Digestion_Ligation_3__: Involving PCR with short primer binding sites.  
    (1) Digest backbone `double_digest_backbone`, (2) Amplify insert `create_pcr_product_special`, (3) Digest insert `double_digest_insert`, (4) Ligate `joindna`.
* __Digestion_Ligation_4__: Involving two ssDNAs annealing.  
    (1) Digest backbone `double_digest_backbone`, (2) Anneal ssDNAs `anneal_oligos_object`, (3) Ligate `joindna`.
* __Digestion_Ligation_5__: Involving PCR with normal primers and requires flipping insert.  
    (1) Digest backbone `double_digest_backbone`, (2) Amplify insert `create_pcr_product` then `flipdna`, (3) Digest insert `double_digest_insert`, (4) Ligate `joindna`.
* __Digestion_Ligation_6__: Involving PCR with a flipped insert and short primer binding sites.  
    (1) Digest backbone `double_digest_backbone`, (2) Amplify insert `flipdna` then `create_pcr_product_special`, (3) Digest insert `double_digest_insert`, (4) Ligate `joindna`.
* __Golden_Gate_Assembly_1__: Involving template-free PCR.  
    (1) Digest backbone `typeIIS_digest_backbone`, (2) Template-free PCR `generate_dsdna`, (3) Digest insert `typeIIS_digest_insert`, (4) Assemble `joindna`.
* __Golden_Gate_Assembly_2__: Involving two ssDNAs annealing.  
    (1) Digest backbone `typeIIS_digest_backbone`, (2) Anneal ssDNAs `anneal_oligos_object`, (3) Assemble `joindna`.
* __Golden_Gate_Assembly_3__: Involving two ssDNAs annealing with short binding sites.  
    (1) Digest backbone `typeIIS_digest_backbone`, (2) Anneal ssDNAs `anneal_oligos_object_short`, (3) Assemble `joindna`.
* __Golden_Gate_Assembly_4__: Involving two ssDNAs annealing with mismatches.  
    (1) Digest backbone `typeIIS_digest_backbone` and modify for mismatch `cropdna` `joindna`, (2) Anneal ssDNAs `anneal_oligos_object`, (3) Assemble `joindna`.
* __Gibson_Assembly_2frag_1__: Involving digestion and PCR.  
    (1) Digest backbone `double_digest_backbone`, (2) Amplify insert `create_pcr_product`, (3) Assemble `gibson_assembly`.
* __Gibson_Assembly_2frag_2__: Involving only PCR.  
    (1) Amplify backbone `create_pcr_product`, (2) Amplify insert `create_pcr_product`, (3) Assemble `gibson_assembly`.
* __Gibson_Assembly_2frag_3__: Involving PCR with short primer binding sites for both the backbone and the insert.  
    (1) Amplify backbone `create_pcr_product_special`, (2) Amplify insert `create_pcr_product_special`, (3) Assemble `gibson_assembly`.
* __Gibson_Assembly_2frag_4__: Involving PCR with short primer binding sites for the backbone only.  
    (1) Amplify backbone `create_pcr_product_special`, (2) Amplify insert `create_pcr_product`, (3) Assemble `gibson_assembly`.
* __Gibson_Assembly_2frag_5__: Involving PCR and template-free PCR.  
    (1) Amplify backbone `create_pcr_product`, (2) Template-free PCR `stitch_fragments`, (3) Assemble `gibson_assembly`.
* __Gibson_Assembly_3frag_1__: Involving digestion and PCR.  
    (1) Digest backbone `double_digest_backbone`, (2) Amplify insert `create_pcr_product`, (3) Amplify insert `create_pcr_product`, (4) Assemble `stitch_fragments` then `gibson_assembly`.
* __Gibson_Assembly_3frag_2__: Involving PCR for the backbone and PCR with short primer binding sites for the inserts.  
    (1) Amplify backbone `create_pcr_product`, (2) Amplify insert `create_pcr_product_special`, (3) Amplify insert `create_pcr_product_special`, (4) Assemble `stitch_fragments` then `gibson_assembly`.
* __Gibson_Assembly_4frag_1__: Involving digestion and PCR.  
    (1) Digest backbone `double_digest_backbone`, (2) Amplify insert `create_pcr_product`, (3) Amplify insert `create_pcr_product`,  
    (4) Amplify insert `create_pcr_product`, (5) Assemble `stitch_fragments` then `stitch_fragments` then `gibson_assembly`.
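
The list above can also be kept as data, which makes it easy to look up the recipe for a given `assembly_method` value from the spreadsheet. This is a minimal sketch, assuming the function names are stored as strings exactly as written in the list; `ASSEMBLY_STEPS` and `steps_for` are hypothetical names that do not appear elsewhere in this script.

```python
ASSEMBLY_STEPS = {
    "Digestion_Ligation_1": ["double_digest_backbone", "create_pcr_product", "double_digest_insert", "joindna"],
    "Digestion_Ligation_2": ["double_digest_backbone", "create_pcr_product_mismatches", "double_digest_insert", "joindna"],
    "Digestion_Ligation_3": ["double_digest_backbone", "create_pcr_product_special", "double_digest_insert", "joindna"],
    "Digestion_Ligation_4": ["double_digest_backbone", "anneal_oligos_object", "joindna"],
    "Digestion_Ligation_5": ["double_digest_backbone", "create_pcr_product", "flipdna", "double_digest_insert", "joindna"],
    "Digestion_Ligation_6": ["double_digest_backbone", "flipdna", "create_pcr_product_special", "double_digest_insert", "joindna"],
    "Golden_Gate_Assembly_1": ["typeIIS_digest_backbone", "generate_dsdna", "typeIIS_digest_insert", "joindna"],
    "Golden_Gate_Assembly_2": ["typeIIS_digest_backbone", "anneal_oligos_object", "joindna"],
    "Golden_Gate_Assembly_3": ["typeIIS_digest_backbone", "anneal_oligos_object_short", "joindna"],
    "Golden_Gate_Assembly_4": ["typeIIS_digest_backbone", "cropdna", "joindna", "anneal_oligos_object", "joindna"],
    "Gibson_Assembly_2frag_1": ["double_digest_backbone", "create_pcr_product", "gibson_assembly"],
    "Gibson_Assembly_2frag_2": ["create_pcr_product", "create_pcr_product", "gibson_assembly"],
    "Gibson_Assembly_2frag_3": ["create_pcr_product_special", "create_pcr_product_special", "gibson_assembly"],
    "Gibson_Assembly_2frag_4": ["create_pcr_product_special", "create_pcr_product", "gibson_assembly"],
    "Gibson_Assembly_2frag_5": ["create_pcr_product", "stitch_fragments", "gibson_assembly"],
    "Gibson_Assembly_3frag_1": ["double_digest_backbone", "create_pcr_product", "create_pcr_product",
                                "stitch_fragments", "gibson_assembly"],
    "Gibson_Assembly_3frag_2": ["create_pcr_product", "create_pcr_product_special", "create_pcr_product_special",
                                "stitch_fragments", "gibson_assembly"],
    "Gibson_Assembly_4frag_1": ["double_digest_backbone", "create_pcr_product", "create_pcr_product",
                                "create_pcr_product", "stitch_fragments", "stitch_fragments", "gibson_assembly"],
}


def steps_for(method: str) -> list:
    """Return the ordered function names for one assembly type."""
    try:
        return ASSEMBLY_STEPS[method]
    except KeyError:
        raise ValueError(f"unknown assembly method: {method}") from None


print(len(ASSEMBLY_STEPS))                 # 18
print(steps_for("Gibson_Assembly_2frag_5"))
```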

In [2411]:
#Load primers, plasmids, and assemblies
file_path = '/Users/samuelking/Yachie Lab Dropbox/Samuel King/SAMUEL_KING.LAB/Projects/CloneSelect/QUEEN/Assemblies Information Table.xlsx' #Update local path name here
oligos = pd.read_excel(file_path, sheet_name=0)
plasmids = pd.read_excel(file_path, sheet_name=1)
digestion_ligation_assemblies = pd.read_excel(file_path, sheet_name=2)
golden_gate_assemblies = pd.read_excel(file_path, sheet_name=3)
gibson_assemblies = pd.read_excel(file_path, sheet_name=4)
processes = pd.read_excel(file_path, sheet_name=5)

#Ensure 'str' is not kept as a variable in your memory, because the str() function is necessary below.
#If you receive 'TypeError: 'str' object is not callable', then use del(str) to delete 'str' from memory.

#Generate oligo QUEEN object list
oligo_list = []
i = 0
while i < len(oligos):
    primer = QUEEN(seq = oligos['seq'][i], product = oligos['primer_id'][i])
    oligo_list.append(primer)
    i = i+1

    
#Generate plasmid QUEEN object list (this takes a while). Note that Benchling URLs passed to QUEEN cannot contain the suffix '?m=slm- ...'
#If the plasmid_list generation stops with an error because a plasmid is not loading properly, enter 'i' to see the index at which the loop stopped,
#and check that plasmid in the spreadsheet.
#The plasmid_list generation may also fail with an HTTPError when too many requests are made. This is a network issue; try re-running the loop
#from the index (i) at which it stopped and continue appending to plasmid_list from there. Make sure to comment out plasmid_list = [] and i = 0 before doing so.

#plasmid_list = []
#i = 0
#while i < len(plasmids):
#    plasmid = QUEEN(record = str(plasmids['record'][i]), dbtype = str(plasmids['dbtype'][i]), product = str(plasmids['plasmid_id'][i]))
#    plasmid_list.append(plasmid)
#    i = i+1

#Create name lists for the plasmids and oligos for later use
oligo_name_list = []
i = 0
while i < len(oligos['primer_id']):
    name = oligos['primer_id'][i]
    oligo_name_list.append(name)
    i = i+1
#
#plasmid_name_list = []
#i = 0
#while i < len(plasmids['plasmid_id']):
#    name = plasmids['plasmid_id'][i]
#    plasmid_name_list.append(name)
#    i = i+1
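
The template cells below find each object with `name_list.index(...)` followed by a positional lookup in the matching object list. A dictionary keyed by identifier is one alternative. The sketch below uses placeholder strings instead of QUEEN objects, and `build_lookup` is a hypothetical helper; it also rejects duplicate identifiers, which `list.index` would silently resolve to the first match.

```python
def build_lookup(names, objects):
    """Map each identifier to its object, refusing duplicate identifiers."""
    lookup = {}
    for name, obj in zip(names, objects):
        if name in lookup:
            raise ValueError(f"duplicate identifier: {name}")
        lookup[name] = obj
    return lookup


oligo_names = ["SI415", "SI417"]                      # stands in for oligo_name_list
oligo_objects = ["<SI415 object>", "<SI417 object>"]  # stands in for oligo_list
oligo_by_name = build_lookup(oligo_names, oligo_objects)
print(oligo_by_name["SI417"])  # <SI417 object>
```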

### Digestion_Ligation_1 Template

In [2281]:
#Set up the data for plasmids constructed by Digestion-Ligation_1
assembly_method = digestion_ligation_assemblies['assembly_method'] #extract the assembly_method column from the digestion_ligations sheet
digestion_ligation_1 = ['Digestion_Ligation_1'] #set up for extracting rows with Digestion_Ligation_1
digestion_ligation_1 = digestion_ligation_assemblies.query('@digestion_ligation_1 in assembly_method') #extract only rows containing Digestion_Ligation_1
digestion_ligation_1.index = range(len(digestion_ligation_1)) #re-index the extracted list to be in the order of only the rows containing Digestion_Ligation_1

In [2282]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][0]
pd1 = processes['pd1'][0]
#Step 2
pn2 = processes['pn2'][0]
pd2 = processes['pd2'][0]
#Step 3
pn3 = processes['pn3'][0]
pd3 = processes['pd3'][0]
#Step 4
pn4 = processes['pn4'][0]
pd4 = processes['pd4'][0]

In [2283]:
#Step 1: Digest backbone

digested_backbones = []
i = 0
while i < len(digestion_ligation_1):
    index_backbone = plasmid_name_list.index(digestion_ligation_1['backbone_id'][i])
    re1 = digestion_ligation_1['re1'][i]
    re2 = digestion_ligation_1['re2'][i]
    digested_backbone = double_digest_backbone(plasmid_list[index_backbone],
                                               re1, re2, pn1, pd1)
    digested_backbones.append(digested_backbone)
    i = i+1

In [2285]:
#Step 2: Amplify insert

pcr_products = []
i = 0
while i < len(digestion_ligation_1):
    index_template = plasmid_name_list.index(digestion_ligation_1['template_id'][i])
    index_fw = oligo_name_list.index(digestion_ligation_1['fw_primer'][i])
    index_rv = oligo_name_list.index(digestion_ligation_1['rv_primer'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                oligo_list[index_fw], 
                                oligo_list[index_rv], 
                                pn2, pd2)
    pcr_products.append(insert)
    i = i+1

In [1858]:
#Step 3: Digest insert

digested_inserts = []
i = 0
while i < len(pcr_products):
    re1 = digestion_ligation_1['re1'][i]
    re2 = digestion_ligation_1['re2'][i]
    digested_insert = double_digest_insert(pcr_products[i], re1, re2, pn3, pd3)
    digested_inserts.append(digested_insert)
    i = i+1

In [1859]:
#Step 4: Ligate
 
ligated_plasmids = []
i = 0
while i < len(digestion_ligation_1):
    ligation = joindna(digested_inserts[i],
                       digested_backbones[i],
                       product = digestion_ligation_1['plasmid_id'][i],
                       topology = "circular", pn=pn4, pd=pd4)
    ligated_plasmids.append(ligation)
    i = i+1

In [751]:
#Step 5: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(digestion_ligation_1):
    new_plasmid_known = QUEEN(record = str(digestion_ligation_1['benchling_of_final_plasmid'][i]),
                              dbtype = 'benchling',
                              product = str(digestion_ligation_1['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(digestion_ligation_1):
    if len(new_plasmids_known[i].seq) == len(ligated_plasmids[i].seq):
        print(digestion_ligation_1['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(digestion_ligation_1['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

pLV_CS_119 is not the same length as its Benchling sequence
pLV_CS_194 is not the same length as its Benchling sequence
pLV_CS_197 is not the same length as its Benchling sequence
pLV_CS_246 is not the same length as its Benchling sequence
pLV_CS_247 is not the same length as its Benchling sequence
pRS138 is the same length as its Benchling sequence.
pRS140 is the same length as its Benchling sequence.
pRS147 is not the same length as its Benchling sequence
pRS148 is not the same length as its Benchling sequence
pRS149 is not the same length as its Benchling sequence
pRS150 is not the same length as its Benchling sequence
pRS151 is not the same length as its Benchling sequence
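
Step 5 compares lengths only. When lengths do match, a sequence-level check has to allow for the two circular records starting at different positions. A minimal sketch of such a check on plain strings follows; it assumes both sequences are on the same strand, and `same_circular_sequence` is a hypothetical helper rather than a QUEEN function.

```python
def same_circular_sequence(a: str, b: str) -> bool:
    """True if b is a rotation of a (same strand), ignoring letter case."""
    a, b = a.upper(), b.upper()
    return len(a) == len(b) and b in a + a


print(same_circular_sequence("ATGCCGTA", "CGTAATGC"))  # True
print(same_circular_sequence("ATGCCGTA", "CGTAATGG"))  # False
```

Doubling the first sequence makes every rotation of it appear as a substring, so a single `in` test covers all possible origins.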


In [1860]:
#Step 6: Output QUEEN gbk file

i = 0
while i < len(digestion_ligation_1):
    ligated_plasmids[i].outputgbk(output = digestion_ligation_1['plasmid_id'][i]+ '.gbk')
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


In [1867]:
#Step 7: Repair gbk locus ID (temporary fix; will be implemented in next update of QUEEN)

i = 0
while i < len(digestion_ligation_1):
    gbk_file = './' + digestion_ligation_1['plasmid_id'][i]+ '.gbk'
    with open(gbk_file, 'r') as handle: #close the file once the record has been read
        record = GenBank.read(handle)
    record.locus = digestion_ligation_1['plasmid_id'][i]
    record.accession = ''
    record.version = ''
    record.definition = 'Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)'
    record.organism = 'Synthetic DNA construct'
    with open(gbk_file, 'w') as fout: #overwrite the original gbk file
        fout.write(str(record))
    i = i+1
print('QUEEN gbk files repaired.')

QUEEN gbk files repaired.


### Digestion_Ligation_2 Template

In [1875]:
#Set up the data for plasmids constructed by Digestion-Ligation_2
assembly_method = digestion_ligation_assemblies['assembly_method'] #extract the assembly_method column from the digestion_ligations sheet
digestion_ligation_2 = ['Digestion_Ligation_2'] #set up for extracting rows with Digestion_Ligation_2
digestion_ligation_2 = digestion_ligation_assemblies.query('@digestion_ligation_2 in assembly_method') #extract only rows containing Digestion_Ligation_2
digestion_ligation_2.index = range(len(digestion_ligation_2)) #re-index the extracted list to be in the order of only the rows containing Digestion_Ligation_2

In [1876]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][0]
pd1 = processes['pd1'][0]
#Step 2
pn2 = processes['pn2'][0]
pd2 = processes['pd2'][0]
#Step 3
pn3 = processes['pn3'][0]
pd3 = processes['pd3'][0]
#Step 4
pn4 = processes['pn4'][0]
pd4 = processes['pd4'][0]

In [1877]:
#Step 1: Digest backbone

digested_backbones = []
i = 0
while i < len(digestion_ligation_2):
    index_backbone = plasmid_name_list.index(digestion_ligation_2['backbone_id'][i])
    re1 = digestion_ligation_2['re1'][i]
    re2 = digestion_ligation_2['re2'][i]
    digested_backbone = double_digest_backbone(plasmid_list[index_backbone],
                                               re1, re2, pn1, pd1)
    digested_backbones.append(digested_backbone)
    i = i+1

In [1878]:
#Step 2: Amplify insert

pcr_products = []
i = 0
while i < len(digestion_ligation_2):
    index_template = plasmid_name_list.index(digestion_ligation_2['template_id'][i])
    index_fw = oligo_name_list.index(digestion_ligation_2['fw_primer'][i])
    index_rv = oligo_name_list.index(digestion_ligation_2['rv_primer'][i])
    insert = create_pcr_product_mismatches(plasmid_list[index_template],
                                           oligo_list[index_fw], 
                                           oligo_list[index_rv], 
                                           pn2, pd2)
    pcr_products.append(insert)
    i = i+1

In [1879]:
#Step 3: Digest insert

digested_inserts = []
i = 0
while i < len(pcr_products):
    re1 = digestion_ligation_2['re1'][i]
    re2 = digestion_ligation_2['re2'][i]
    digested_insert = double_digest_insert(pcr_products[i], re1, re2, pn3, pd3)
    digested_inserts.append(digested_insert)
    i = i+1

In [1880]:
#Step 4: Ligate
 
ligated_plasmids = []
i = 0
while i < len(digestion_ligation_2):
    ligation = joindna(digested_inserts[i],
                       digested_backbones[i],
                       product = digestion_ligation_2['plasmid_id'][i],
                       topology = "circular", pn=pn4, pd=pd4)
    ligated_plasmids.append(ligation)
    i = i+1

In [759]:
#Step 5: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(digestion_ligation_2):
    new_plasmid_known = QUEEN(record = str(digestion_ligation_2['benchling_of_final_plasmid'][i]),
                              dbtype = 'benchling',
                              product = str(digestion_ligation_2['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(digestion_ligation_2):
    if len(new_plasmids_known[i].seq) == len(ligated_plasmids[i].seq):
        print(digestion_ligation_2['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(digestion_ligation_2['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

pLV_CS_193 is not the same length as its Benchling sequence
pLV_CS_196 is not the same length as its Benchling sequence
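
Equal length does not by itself show that two plasmids carry the same sequence, and a circular plasmid can be written starting from any position and on either strand. The sketch below is one stricter comparison in plain Python. It is offered as an assumption-laden helper, not a QUEEN feature: `same_circular_sequence` is a hypothetical name, and it expects plain sequence strings, which is what the `.seq` values compared above are assumed to be.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def same_circular_sequence(seq_a, seq_b):
    """True if seq_b is a rotation of seq_a, on either strand (case-insensitive)."""
    a = seq_a.upper()
    b = seq_b.upper()
    if len(a) != len(b):
        return False
    if not a:
        return True
    doubled = a + a  # every rotation of a is a substring of a + a
    return b in doubled or reverse_complement(b) in doubled


rotated_match = same_circular_sequence("ATGCCG", "GCCGAT")   # same circle, different start
flipped_match = same_circular_sequence("ATGCCG", "CGGCAT")   # same circle, other strand
mismatch = same_circular_sequence("ATGCCG", "ATGCCC")        # same length, different sequence
```

In this notebook it could be called as `same_circular_sequence(new_plasmids_known[i].seq, ligated_plasmids[i].seq)`. The doubled-string search is simple rather than fast, which should be acceptable for plasmid-sized sequences.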


In [1881]:
#Step 6: Output QUEEN gbk file

i = 0
while i < len(digestion_ligation_2):
    ligated_plasmids[i].outputgbk(output = digestion_ligation_2['plasmid_id'][i]+ '.gbk')
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.
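
Each file is named `<plasmid_id>.gbk` and written to the working directory, so a quick check that every expected file exists can catch a failed export before the repair step. A minimal sketch using only the standard library follows. `missing_gbk_files` is a hypothetical helper, not part of QUEEN, and accepts any iterable of plasmid IDs.

```python
from pathlib import Path


def missing_gbk_files(plasmid_ids, directory="."):
    """Return the plasmid IDs whose '<plasmid_id>.gbk' file is absent from directory."""
    folder = Path(directory)
    return [pid for pid in plasmid_ids if not (folder / f"{pid}.gbk").is_file()]
```

For example, `missing_gbk_files(digestion_ligation_2['plasmid_id'])` would return an empty list if every export succeeded.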


In [1882]:
#Step 7: Repair gbk locus ID (temporary fix; will be implemented in next update of QUEEN)

i = 0
while i < len(digestion_ligation_2):
    gbk_file = './' + digestion_ligation_2['plasmid_id'][i]+ '.gbk'
    with open(gbk_file, 'r') as handle: #close the input file before it is overwritten
        record = GenBank.read(handle)
    record.locus = digestion_ligation_2['plasmid_id'][i]
    record.accession = ''
    record.version = ''
    record.definition = 'Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)'
    record.organism = 'Synthetic DNA construct'
    with open(gbk_file, 'w') as fout: #overwrite the original gbk file
        fout.write(str(record))
    i = i+1
print('QUEEN gbk files repaired.')

QUEEN gbk files repaired.


### Digestion_Ligation_3 Template

In [1883]:
#Set up the data for plasmids constructed by Digestion_Ligation_3
assembly_method = digestion_ligation_assemblies['assembly_method'] #extract the assembly_method column from the digestion_ligations sheet
digestion_ligation_3 = ['Digestion_Ligation_3'] #set up for extracting rows with Digestion_Ligation_3
digestion_ligation_3 = digestion_ligation_assemblies.query('@digestion_ligation_3 in assembly_method') #extract only rows containing Digestion_Ligation_3
digestion_ligation_3.index = range(len(digestion_ligation_3)) #re-index the extracted list to be in the order of only the rows containing Digestion_Ligation_3

In [1884]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][0]
pd1 = processes['pd1'][0]
#Step 2
pn2 = processes['pn2'][0]
pd2 = processes['pd2'][0]
#Step 3
pn3 = processes['pn3'][0]
pd3 = processes['pd3'][0]
#Step 4
pn4 = processes['pn4'][0]
pd4 = processes['pd4'][0]

In [1885]:
#Step 1: Digest backbone

digested_backbones = []
i = 0
while i < len(digestion_ligation_3):
    index_backbone = plasmid_name_list.index(digestion_ligation_3['backbone_id'][i])
    re1 = digestion_ligation_3['re1'][i]
    re2 = digestion_ligation_3['re2'][i]
    digested_backbone = double_digest_backbone(plasmid_list[index_backbone],
                                               re1, re2, pn1, pd1)
    digested_backbones.append(digested_backbone)
    i = i+1

In [1886]:
#Step 2: Amplify insert

pcr_products = []
i = 0
while i < len(digestion_ligation_3):
    index_template = plasmid_name_list.index(digestion_ligation_3['template_id'][i])
    index_fw = oligo_name_list.index(digestion_ligation_3['fw_primer'][i])
    index_rv = oligo_name_list.index(digestion_ligation_3['rv_primer'][i])
    insert = create_pcr_product_special(plasmid_list[index_template],
                                        oligo_list[index_fw], 
                                        oligo_list[index_rv], 
                                        pn2, pd2)
    pcr_products.append(insert)
    i = i+1

In [1887]:
#Step 3: Digest insert

digested_inserts = []
i = 0
while i < len(pcr_products):
    re1 = digestion_ligation_3['re1'][i]
    re2 = digestion_ligation_3['re2'][i]
    digested_insert = double_digest_insert(pcr_products[i], re1, re2, pn3, pd3)
    digested_inserts.append(digested_insert)
    i = i+1

In [1888]:
#Step 4: Ligate
 
ligated_plasmids = []
i = 0
while i < len(digestion_ligation_3):
    ligation = joindna(digested_inserts[i],
                       digested_backbones[i],
                       product = digestion_ligation_3['plasmid_id'][i],
                       topology = "circular", pn=pn4, pd=pd4)
    ligated_plasmids.append(ligation)
    i = i+1

In [767]:
#Step 5: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(digestion_ligation_3):
    new_plasmid_known = QUEEN(record = str(digestion_ligation_3['benchling_of_final_plasmid'][i]),
                              dbtype = 'benchling',
                              product = str(digestion_ligation_3['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(digestion_ligation_3):
    if len(new_plasmids_known[i].seq) == len(ligated_plasmids[i].seq):
        print(digestion_ligation_3['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(digestion_ligation_3['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

pLV_CS_192 is not the same length as its Benchling sequence
pLV_CS_195 is not the same length as its Benchling sequence


In [1889]:
#Step 6: Output QUEEN gbk file

i = 0
while i < len(digestion_ligation_3):
    ligated_plasmids[i].outputgbk(output = digestion_ligation_3['plasmid_id'][i]+ '.gbk')
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


In [1890]:
#Step 7: Repair gbk locus ID (temporary fix; will be implemented in next update of QUEEN)

i = 0
while i < len(digestion_ligation_3):
    gbk_file = './' + digestion_ligation_3['plasmid_id'][i]+ '.gbk'
    with open(gbk_file, 'r') as handle: #close the input file before it is overwritten
        record = GenBank.read(handle)
    record.locus = digestion_ligation_3['plasmid_id'][i]
    record.accession = ''
    record.version = ''
    record.definition = 'Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)'
    record.organism = 'Synthetic DNA construct'
    with open(gbk_file, 'w') as fout: #overwrite the original gbk file
        fout.write(str(record))
    i = i+1
print('QUEEN gbk files repaired.')

QUEEN gbk files repaired.


### Digestion_Ligation_4 Template

In [1891]:
#Set up the data for plasmids constructed by Digestion_Ligation_4
assembly_method = digestion_ligation_assemblies['assembly_method'] #extract the assembly_method column from the digestion_ligations sheet
digestion_ligation_4 = ['Digestion_Ligation_4'] #set up for extracting rows with Digestion_Ligation_4
digestion_ligation_4 = digestion_ligation_assemblies.query('@digestion_ligation_4 in assembly_method') #extract only rows containing Digestion_Ligation_4
digestion_ligation_4.index = range(len(digestion_ligation_4)) #re-index the extracted list to be in the order of only the rows containing Digestion_Ligation_4

In [1892]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][1]
pd1 = processes['pd1'][1]
#Step 2
pn2 = processes['pn2'][1]
pd2 = processes['pd2'][1]
#Step 3
pn3 = processes['pn3'][1]
pd3 = processes['pd3'][1]

In [1893]:
#Step 1: Digest backbone

#The second RE site in QUEEN's cutsite library is stored in the reverse orientation, which prevented correct ligation of the backbone and insert fragments.
#Because every assembly in this template uses an XbaI- and BmtI-digested backbone, both digestions are coded manually in the correct orientation below.

digested_backbones = []
i = 0
while i < len(digestion_ligation_4):
    index_backbone = plasmid_name_list.index(digestion_ligation_4['backbone_id'][i])

    plasmid_list[index_backbone].searchsequence("T^CTAG_A", product='re1site') #search for XbaI cut site
    plasmid_list[index_backbone].searchsequence("G_CTAG^C", product='re2site') #search for BmtI cut site
    digested_backbone = cropdna(plasmid_list[index_backbone],
                                re2site[0], re1site[0],
                                product="digested_backbone", pn=pn1, pd=pd1)
    digested_backbones.append(digested_backbone)
    i = i+1
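
The two patterns passed to `searchsequence` use QUEEN's cut-site notation, in which `^` marks the cut on the top strand and `_` marks the cut on the bottom strand. The plain-Python sketch below reads that notation and reports the overhang each pattern describes. `describe_cutsite` is a hypothetical helper, not a QUEEN function. It illustrates why the XbaI and BmtI patterns both leave `CTAG` overhangs but with opposite polarity.

```python
def describe_cutsite(pattern):
    """Parse a pattern such as 'T^CTAG_A' into cut offsets and the resulting overhang."""
    if pattern.count("^") != 1 or pattern.count("_") != 1:
        raise ValueError("pattern must contain exactly one '^' and one '_'")
    site = pattern.replace("^", "").replace("_", "")
    # Offsets are counted in bases from the 5' end of the top strand
    top_cut = len(pattern[:pattern.index("^")].replace("_", ""))
    bottom_cut = len(pattern[:pattern.index("_")].replace("^", ""))
    start, end = sorted((top_cut, bottom_cut))
    if top_cut < bottom_cut:
        end_type = "5' overhang"
    elif top_cut > bottom_cut:
        end_type = "3' overhang"
    else:
        end_type = "blunt"
    return {
        "site": site,
        "top_cut": top_cut,
        "bottom_cut": bottom_cut,
        "overhang": site[start:end],
        "end_type": end_type,
    }


xbai = describe_cutsite("T^CTAG_A")   # pattern used for XbaI above
bmti = describe_cutsite("G_CTAG^C")   # pattern used for BmtI above
```

Here `xbai["end_type"]` is a 5' overhang and `bmti["end_type"]` is a 3' overhang, so the orientation of each pattern matters when the cropped fragment is later joined to the insert.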

In [1894]:
#Step 2: Anneal ssDNAs

annealed_DNA = []
i = 0
while i < len(digestion_ligation_4):
    index_fw = oligo_name_list.index(digestion_ligation_4['fw_primer'][i])
    index_rv = oligo_name_list.index(digestion_ligation_4['rv_primer'][i])
    insert = anneal_oligos_object(oligo_list[index_fw],
                                  oligo_list[index_rv],
                                  pn2, pd2)
    annealed_DNA.append(insert)
    i = i+1

In [1895]:
#Step 3: Ligate

ligated_plasmids = []
i = 0
while i < len(digestion_ligation_4):
    ligation = joindna(annealed_DNA[i],
                       digested_backbones[i],
                       product = digestion_ligation_4['plasmid_id'][i],
                       topology = "circular", pn=pn3, pd=pd3)
    ligated_plasmids.append(ligation)
    i = i+1

In [1301]:
#Step 4: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(digestion_ligation_4):
    new_plasmid_known = QUEEN(record = str(digestion_ligation_4['benchling_of_final_plasmid'][i]),
                              dbtype = 'benchling',
                              product = str(digestion_ligation_4['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(digestion_ligation_4):
    if len(new_plasmids_known[i].seq) == len(ligated_plasmids[i].seq):
        print(digestion_ligation_4['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(digestion_ligation_4['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

pLV_CS_127 is the same length as its Benchling sequence.
pLV_CS_128 is the same length as its Benchling sequence.
pLV_CS_129 is the same length as its Benchling sequence.
pLV_CS_137 is the same length as its Benchling sequence.
pLV_CS_138 is the same length as its Benchling sequence.
pLV_CS_139 is the same length as its Benchling sequence.
pLV_CS_251 is the same length as its Benchling sequence.
pLV_CS_252 is the same length as its Benchling sequence.
pLV_CS_253 is the same length as its Benchling sequence.


In [1896]:
#Step 5: Output QUEEN gbk file

i = 0
while i < len(digestion_ligation_4):
    ligated_plasmids[i].outputgbk(output = digestion_ligation_4['plasmid_id'][i]+ '.gbk')
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


In [1897]:
#Step 6: Repair gbk locus ID (temporary fix; will be implemented in next update of QUEEN)

i = 0
while i < len(digestion_ligation_4):
    gbk_file = './' + digestion_ligation_4['plasmid_id'][i]+ '.gbk'
    with open(gbk_file, 'r') as handle: #close the input file before it is overwritten
        record = GenBank.read(handle)
    record.locus = digestion_ligation_4['plasmid_id'][i]
    record.accession = ''
    record.version = ''
    record.definition = 'Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)'
    record.organism = 'Synthetic DNA construct'
    with open(gbk_file, 'w') as fout: #overwrite the original gbk file
        fout.write(str(record))
    i = i+1
print('QUEEN gbk files repaired.')

QUEEN gbk files repaired.


### Digestion_Ligation_5 Template

In [1898]:
#Set up the data for plasmids constructed by Digestion_Ligation_5
assembly_method = digestion_ligation_assemblies['assembly_method'] #extract the assembly_method column from the digestion_ligations sheet
digestion_ligation_5 = ['Digestion_Ligation_5'] #set up for extracting rows with Digestion_Ligation_5
digestion_ligation_5 = digestion_ligation_assemblies.query('@digestion_ligation_5 in assembly_method') #extract only rows containing Digestion_Ligation_5
digestion_ligation_5.index = range(len(digestion_ligation_5)) #re-index the extracted list to be in the order of only the rows containing Digestion_Ligation_5

In [1899]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][0]
pd1 = processes['pd1'][0]
#Step 2
pn2 = processes['pn2'][0]
pd2 = processes['pd2'][0]
#Step 3
pn3 = processes['pn3'][0]
pd3 = processes['pd3'][0]
#Step 4
pn4 = processes['pn4'][0]
pd4 = processes['pd4'][0]

In [1900]:
#Step 1: Digest backbone

digested_backbones = []
i = 0
while i < len(digestion_ligation_5):
    index_backbone = plasmid_name_list.index(digestion_ligation_5['backbone_id'][i])
    re1 = digestion_ligation_5['re1'][i]
    re2 = digestion_ligation_5['re2'][i]
    digested_backbone = double_digest_backbone(plasmid_list[index_backbone],
                                               re1, re2, pn1, pd1)
    digested_backbones.append(digested_backbone)
    i = i+1

In [1901]:
#Step 2: Amplify insert

pcr_products = []
i = 0
while i < len(digestion_ligation_5):
    index_template = plasmid_name_list.index(digestion_ligation_5['template_id'][i])
    index_fw = oligo_name_list.index(digestion_ligation_5['fw_primer'][i])
    index_rv = oligo_name_list.index(digestion_ligation_5['rv_primer'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                oligo_list[index_fw], 
                                oligo_list[index_rv], 
                                pn2, pd2)
    pcr_products.append(insert)
    i = i+1
    
flipped_pcr_products = []
i = 0
while i < len(digestion_ligation_5):
    flipped_insert = flipdna(pcr_products[i])
    flipped_pcr_products.append(flipped_insert)
    i = i+1
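
`flipdna` returns the reverse complement of a QUEEN object, so each PCR product is reoriented before it is digested. At the sequence level, flipping also moves every annotated region to a mirrored position on the opposite strand. The sketch below shows that bookkeeping on plain strings. `flip_interval` and the example sequence are hypothetical and only illustrate the coordinate change; they are not how QUEEN implements it.

```python
def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def flip_interval(start, end, strand, length):
    """Map a 0-based, end-exclusive interval onto the reverse-complemented sequence."""
    return length - end, length - start, -strand


example_seq = "ATGGCCTTAA"               # made-up sequence for illustration
flipped_seq = reverse_complement(example_seq)

# A region at [2, 5) on the top strand of the original sequence...
new_start, new_end, new_strand = flip_interval(2, 5, 1, len(example_seq))

# ...reads as the reverse complement of the same bases after flipping
original_region = example_seq[2:5]
flipped_region = flipped_seq[new_start:new_end]
```

The flipped region is the reverse complement of the original region, which is why restriction sites still match after `flipdna` but the fragment now joins the backbone in the other orientation.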

In [1902]:
#Step 3: Digest insert

digested_inserts = []
i = 0
while i < len(flipped_pcr_products):
    re1 = digestion_ligation_5['re1'][i]
    re2 = digestion_ligation_5['re2'][i]
    digested_insert = double_digest_insert(flipped_pcr_products[i], re1, re2, pn3, pd3)
    digested_inserts.append(digested_insert)
    i = i+1

In [1903]:
#Step 4: Ligate
 
ligated_plasmids = []
i = 0
while i < len(digestion_ligation_5):
    ligation = joindna(digested_inserts[i],
                       digested_backbones[i],
                       product = digestion_ligation_5['plasmid_id'][i],
                       topology = "circular", pn=pn4, pd=pd4)
    ligated_plasmids.append(ligation)
    i = i+1

In [1333]:
#Step 5: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(digestion_ligation_5):
    new_plasmid_known = QUEEN(record = str(digestion_ligation_5['benchling_of_final_plasmid'][i]),
                              dbtype = 'benchling',
                              product = str(digestion_ligation_5['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(digestion_ligation_5):
    if len(new_plasmids_known[i].seq) == len(ligated_plasmids[i].seq):
        print(digestion_ligation_5['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(digestion_ligation_5['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

pRS271 is the same length as its Benchling sequence.
pRS272 is the same length as its Benchling sequence.


In [1904]:
#Step 6: Output QUEEN gbk file

i = 0
while i < len(digestion_ligation_5):
    ligated_plasmids[i].outputgbk(output = digestion_ligation_5['plasmid_id'][i]+ '.gbk')
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


In [1905]:
#Step 7: Repair gbk locus ID (temporary fix; will be implemented in next update of QUEEN)

i = 0
while i < len(digestion_ligation_5):
    gbk_file = './' + digestion_ligation_5['plasmid_id'][i]+ '.gbk'
    with open(gbk_file, 'r') as handle: #close the input file before it is overwritten
        record = GenBank.read(handle)
    record.locus = digestion_ligation_5['plasmid_id'][i]
    record.accession = ''
    record.version = ''
    record.definition = 'Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)'
    record.organism = 'Synthetic DNA construct'
    with open(gbk_file, 'w') as fout: #overwrite the original gbk file
        fout.write(str(record))
    i = i+1
print('QUEEN gbk files repaired.')

QUEEN gbk files repaired.


### Digestion_Ligation_6 Template

In [1906]:
#Set up the data for plasmids constructed by Digestion_Ligation_6
assembly_method = digestion_ligation_assemblies['assembly_method'] #extract the assembly_method column from the digestion_ligations sheet
digestion_ligation_6 = ['Digestion_Ligation_6'] #set up for extracting rows with Digestion_Ligation_6
digestion_ligation_6 = digestion_ligation_assemblies.query('@digestion_ligation_6 in assembly_method') #extract only rows containing Digestion_Ligation_6
digestion_ligation_6.index = range(len(digestion_ligation_6)) #re-index the extracted list to be in the order of only the rows containing Digestion_Ligation_6

In [1907]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][0]
pd1 = processes['pd1'][0]
#Step 2
pn2 = processes['pn2'][0]
pd2 = processes['pd2'][0]
#Step 3
pn3 = processes['pn3'][0]
pd3 = processes['pd3'][0]
#Step 4
pn4 = processes['pn4'][0]
pd4 = processes['pd4'][0]

In [1908]:
#Step 1: Digest backbone

digested_backbones = []
i = 0
while i < len(digestion_ligation_6):
    index_backbone = plasmid_name_list.index(digestion_ligation_6['backbone_id'][i])
    re1 = digestion_ligation_6['re1'][i]
    re2 = digestion_ligation_6['re2'][i]
    digested_backbone = double_digest_backbone(plasmid_list[index_backbone],
                                               re1, re2, pn1, pd1)
    digested_backbones.append(digested_backbone)
    i = i+1

In [1909]:
#Step 2: Amplify insert

pcr_products = []
i = 0
while i < len(digestion_ligation_6):
    index_template = plasmid_name_list.index(digestion_ligation_6['template_id'][i])
    index_fw = oligo_name_list.index(digestion_ligation_6['fw_primer'][i])
    index_rv = oligo_name_list.index(digestion_ligation_6['rv_primer'][i])
    insert = create_pcr_product_special(flipdna(plasmid_list[index_template]),
                                        oligo_list[index_fw], 
                                        oligo_list[index_rv], 
                                        pn2, pd2)
    pcr_products.append(insert)
    i = i+1

In [1910]:
#Step 3: Digest insert

digested_inserts = []
i = 0
while i < len(pcr_products):
    re1 = digestion_ligation_6['re1'][i]
    re2 = digestion_ligation_6['re2'][i]
    digested_insert = double_digest_insert(pcr_products[i], re1, re2, pn3, pd3)
    digested_inserts.append(digested_insert)
    i = i+1

In [1911]:
#Step 4: Ligate
 
ligated_plasmids = []
i = 0
while i < len(digestion_ligation_6):
    ligation = joindna(digested_inserts[i],
                       digested_backbones[i],
                       product = digestion_ligation_6['plasmid_id'][i],
                       topology = "circular", pn=pn4, pd=pd4)
    ligated_plasmids.append(ligation)
    i = i+1

In [1697]:
#Step 5: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(digestion_ligation_6):
    new_plasmid_known = QUEEN(record = str(digestion_ligation_6['benchling_of_final_plasmid'][i]),
                              dbtype = 'benchling',
                              product = str(digestion_ligation_6['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(digestion_ligation_6):
    if len(new_plasmids_known[i].seq) == len(ligated_plasmids[i].seq):
        print(digestion_ligation_6['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(digestion_ligation_6['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

pKI076 is the same length as its Benchling sequence.
pKI077 is the same length as its Benchling sequence.


In [1912]:
#Step 6: Output QUEEN gbk file

i = 0
while i < len(digestion_ligation_6):
    ligated_plasmids[i].outputgbk(output = digestion_ligation_6['plasmid_id'][i]+ '.gbk')
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


In [1913]:
#Step 7: Repair gbk locus ID (temporary fix; will be implemented in next update of QUEEN)

i = 0
while i < len(digestion_ligation_6):
    gbk_file = './' + digestion_ligation_6['plasmid_id'][i]+ '.gbk'
    with open(gbk_file, 'r') as handle: #close the input file before it is overwritten
        record = GenBank.read(handle)
    record.locus = digestion_ligation_6['plasmid_id'][i]
    record.accession = ''
    record.version = ''
    record.definition = 'Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)'
    record.organism = 'Synthetic DNA construct'
    with open(gbk_file, 'w') as fout: #overwrite the original gbk file
        fout.write(str(record))
    i = i+1
print('QUEEN gbk files repaired.')

QUEEN gbk files repaired.


### Golden_Gate_Assembly_1 Template

In [1914]:
#Set up the data for plasmids constructed by Golden_Gate_Assembly_1
assembly_method = golden_gate_assemblies['assembly_method'] #extract the assembly_method column from the golden_gate sheet
golden_gate_assembly_1 = ['Golden_Gate_Assembly_1'] #set up for extracting rows with Golden_Gate_Assembly_1
golden_gate_assembly_1 = golden_gate_assemblies.query('@golden_gate_assembly_1 in assembly_method') #extract only rows containing Golden_Gate_Assembly_1
golden_gate_assembly_1.index = range(len(golden_gate_assembly_1)) #re-index the extracted list to be in the order of only the rows containing Golden_Gate_Assembly_1

In [1915]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][2]
pd1 = processes['pd1'][2]
#Step 2
pn2 = processes['pn2'][2]
pd2 = processes['pd2'][2]
#Step 3
pn3 = processes['pn3'][2]
pd3 = processes['pd3'][2]
#Step 4
pn4 = processes['pn4'][2]
pd4 = processes['pd4'][2]

In [1916]:
#Step 1: Digest backbone

digested_backbones = []
i = 0
while i < len(golden_gate_assembly_1):
    index_backbone = plasmid_name_list.index(golden_gate_assembly_1['backbone_id'][i])
    re1 = golden_gate_assembly_1['re1'][i]
    digested_backbone = typeIIS_digest_backbone(plasmid_list[index_backbone], re1, pn1, pd1)
    digested_backbones.append(digested_backbone)
    i = i+1
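
Type IIS enzymes cut at a fixed distance outside their recognition site, so the overhang is set by the neighbouring bases and not by the site itself. The sketch below illustrates this on a plain string. It is not the `typeIIS_digest_backbone` function used above, and the enzyme is an assumption: BsaI (`GGTCTC`, cutting 1 base downstream on the top strand and 5 on the bottom) is used purely as an example, whereas the enzyme actually applied comes from the `re1` column. `find_typeIIS_overhangs` and the example sequence are hypothetical.

```python
def find_typeIIS_overhangs(seq, site="GGTCTC", top_offset=1, bottom_offset=5):
    """List (top_cut_position, overhang) for forward-oriented sites on the top strand.

    Simplifications: the sequence is treated as linear, and sites on the
    bottom strand (reverse orientation) are not searched.
    """
    seq = seq.upper()
    results = []
    start = seq.find(site)
    while start != -1:
        site_end = start + len(site)
        top_cut = site_end + top_offset        # cut position on the top strand
        bottom_cut = site_end + bottom_offset  # cut position on the bottom strand
        if bottom_cut <= len(seq):
            results.append((top_cut, seq[top_cut:bottom_cut]))
        start = seq.find(site, start + 1)
    return results


example_seq = "AAGGTCTCAATGCTTT"   # made-up sequence with one forward site
overhangs = find_typeIIS_overhangs(example_seq)
```

In this example the site ends at position 8, so the top strand is cut at position 9 and the four bases up to the bottom-strand cut form the overhang.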

In [1917]:
#Step 2: Amplify insert by template-free PCR

pcr_products = []
i = 0
while i < len(golden_gate_assembly_1):
    index_fw = oligo_name_list.index(golden_gate_assembly_1['fw_primer'][i])
    index_rv = oligo_name_list.index(golden_gate_assembly_1['rv_primer'][i])
    insert = template_free_pcr(oligo_list[index_fw], 
                               oligo_list[index_rv], 
                               pn2, pd2)
    pcr_products.append(insert)
    i = i+1

In [1918]:
#Step 3: Digest insert

digested_inserts = []
i = 0
while i < len(golden_gate_assembly_1):
    re1 = golden_gate_assembly_1['re1'][i]
    digested_insert = typeIIS_digest_insert(pcr_products[i], re1, pn3, pd3)
    digested_inserts.append(digested_insert)
    i = i+1

In [1919]:
#Step 4: Assemble
 
assembled_plasmids = []
i = 0
while i < len(golden_gate_assembly_1):
    assembly = joindna(digested_inserts[i],
                       digested_backbones[i],
                       product = golden_gate_assembly_1['plasmid_id'][i],
                       topology = "circular", pn=pn4, pd=pd4)
    assembled_plasmids.append(assembly)
    i = i+1

In [1558]:
#Step 5: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(golden_gate_assembly_1):
    new_plasmid_known = QUEEN(record = str(golden_gate_assembly_1['benchling_of_final_plasmid'][i]),
                              dbtype = 'benchling',
                              product = str(golden_gate_assembly_1['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(golden_gate_assembly_1):
    if len(new_plasmids_known[i].seq) == len(assembled_plasmids[i].seq):
        print(golden_gate_assembly_1['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(golden_gate_assembly_1['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

pLV_CS_117 is the same length as its Benchling sequence.


In [1920]:
#Step 6: Output QUEEN gbk file

i = 0
while i < len(golden_gate_assembly_1):
    assembled_plasmids[i].outputgbk(output = golden_gate_assembly_1['plasmid_id'][i]+ '.gbk')
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


In [1921]:
#Step 7: Repair gbk locus ID (temporary fix; will be implemented in next update of QUEEN)

i = 0
while i < len(golden_gate_assembly_1):
    gbk_file = './' + golden_gate_assembly_1['plasmid_id'][i]+ '.gbk'
    with open(gbk_file, 'r') as handle: #close the input file before it is overwritten
        record = GenBank.read(handle)
    record.locus = golden_gate_assembly_1['plasmid_id'][i]
    record.accession = ''
    record.version = ''
    record.definition = 'Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)'
    record.organism = 'Synthetic DNA construct'
    with open(gbk_file, 'w') as fout: #overwrite the original gbk file
        fout.write(str(record))
    i = i+1
print('QUEEN gbk files repaired.')

QUEEN gbk files repaired.


### Golden_Gate_Assembly_2 Template

In [2404]:
#Set up the data for plasmids constructed by Golden_Gate_Assembly_2
assembly_method = golden_gate_assemblies['assembly_method'] #extract the assembly_method column from the golden_gate sheet
golden_gate_assembly_2 = ['Golden_Gate_Assembly_2'] #set up for extracting rows with Golden_Gate_Assembly_2
golden_gate_assembly_2 = golden_gate_assemblies.query('@golden_gate_assembly_2 in assembly_method') #extract only rows containing Golden_Gate_Assembly_2
golden_gate_assembly_2.index = range(len(golden_gate_assembly_2)) #re-index the extracted list to be in the order of only the rows containing Golden_Gate_Assembly_2

In [2405]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][3]
pd1 = processes['pd1'][3]
#Step 2
pn2 = processes['pn2'][3]
pd2 = processes['pd2'][3]
#Step 3
pn3 = processes['pn3'][3]
pd3 = processes['pd3'][3]

In [2406]:
#Step 1: Digest backbone

digested_backbones = []
i = 0
while i < len(golden_gate_assembly_2):
    index_backbone = plasmid_name_list.index(golden_gate_assembly_2['backbone_id'][i])
    re1 = golden_gate_assembly_2['re1'][i]
    digested_backbone = typeIIS_digest_backbone(plasmid_list[index_backbone], re1, pn1, pd1)
    digested_backbones.append(digested_backbone)
    i = i+1

In [2407]:
#Step 2: Anneal ssDNAs

annealed_DNA = []
i = 0
while i < len(golden_gate_assembly_2):
    index_fw = oligo_name_list.index(golden_gate_assembly_2['fw_primer'][i])
    index_rv = oligo_name_list.index(golden_gate_assembly_2['rv_primer'][i])
    insert = anneal_oligos_object(oligo_list[index_fw],
                                  oligo_list[index_rv],
                                  pn2, pd2)
    annealed_DNA.append(insert)
    i = i+1

In [2408]:
#Step 3: Assemble
 
assembled_plasmids = []
i = 0
while i < len(golden_gate_assembly_2):
    assembly = joindna(annealed_DNA[i],
                       digested_backbones[i],
                       product = golden_gate_assembly_2['plasmid_id'][i],
                       topology = "circular", pn=pn3, pd=pd3)
    assembled_plasmids.append(assembly)
    i = i+1

In [2409]:
#Step 4: Output QUEEN gbk file

i = 0
while i < len(golden_gate_assembly_2):
    assembled_plasmids[i].outputgbk(output = golden_gate_assembly_2['plasmid_id'][i]+ '.gbk')
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


In [2410]:
#Step 5: Repair gbk locus ID (temporary fix; will be implemented in next update of QUEEN)

i = 0
while i < len(golden_gate_assembly_2):
    gbk_file = './' + golden_gate_assembly_2['plasmid_id'][i]+ '.gbk'
    with open(gbk_file, 'r') as handle: #close the input file before it is overwritten
        record = GenBank.read(handle)
    record.locus = golden_gate_assembly_2['plasmid_id'][i]
    record.accession = ''
    record.version = ''
    record.definition = 'Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)'
    record.organism = 'Synthetic DNA construct'
    with open(gbk_file, 'w') as fout: #overwrite the original gbk file
        fout.write(str(record))
    i = i+1
print('QUEEN gbk files repaired.')

QUEEN gbk files repaired.


### Golden_Gate_Assembly_3 Template

In [1930]:
#Set up the data for plasmids constructed by Golden_Gate_Assembly_3
assembly_method = golden_gate_assemblies['assembly_method'] #extract the assembly_method column from the golden_gate sheet
golden_gate_assembly_3 = ['Golden_Gate_Assembly_3'] #set up for extracting rows with Golden_Gate_Assembly_3
golden_gate_assembly_3 = golden_gate_assemblies.query('@golden_gate_assembly_3 in assembly_method') #extract only rows containing Golden_Gate_Assembly_3
golden_gate_assembly_3.index = range(len(golden_gate_assembly_3)) #re-index the extracted list to be in the order of only the rows containing Golden_Gate_Assembly_3

In [1931]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][3]
pd1 = processes['pd1'][3]
#Step 2
pn2 = processes['pn2'][3]
pd2 = processes['pd2'][3]
#Step 3
pn3 = processes['pn3'][3]
pd3 = processes['pd3'][3]

In [1932]:
#Step 1: Digest backbone

digested_backbones = []
i = 0
while i < len(golden_gate_assembly_3):
    index_backbone = plasmid_name_list.index(golden_gate_assembly_3['backbone_id'][i])
    re1 = golden_gate_assembly_3['re1'][i]
    digested_backbone = typeIIS_digest_backbone(plasmid_list[index_backbone], re1, pn1, pd1)
    digested_backbones.append(digested_backbone)
    i = i+1

In [1933]:
#Step 2: Anneal ssDNAs

annealed_DNA = []
i = 0
while i < len(golden_gate_assembly_3):
    index_fw = oligo_name_list.index(golden_gate_assembly_3['fw_primer'][i])
    index_rv = oligo_name_list.index(golden_gate_assembly_3['rv_primer'][i])
    insert = anneal_oligos_object_short(oligo_list[index_fw],
                                        oligo_list[index_rv],
                                        pn2, pd2)
    annealed_DNA.append(insert)
    i = i+1

In [1934]:
#Step 3: Assemble
 
assembled_plasmids = []
i = 0
while i < len(golden_gate_assembly_3):
    assembly = joindna(annealed_DNA[i],
                       digested_backbones[i],
                       product = golden_gate_assembly_3['plasmid_id'][i],
                       topology = "circular", pn=pn3, pd=pd3)
    assembled_plasmids.append(assembly)
    i = i+1

In [1935]:
#Step 4: Output QUEEN gbk file

i = 0
while i < len(golden_gate_assembly_3):
    assembled_plasmids[i].outputgbk(output = golden_gate_assembly_3['plasmid_id'][i]+ '.gbk')
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


In [1936]:
#Step 5: Repair gbk locus ID (temporary fix; will be implemented in next update of QUEEN)

i = 0
while i < len(golden_gate_assembly_3):
    gbk_file = './' + golden_gate_assembly_3['plasmid_id'][i]+ '.gbk'
    with open(gbk_file, 'r') as handle: #close the input file before it is overwritten
        record = GenBank.read(handle)
    record.locus = golden_gate_assembly_3['plasmid_id'][i]
    record.accession = ''
    record.version = ''
    record.definition = 'Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)'
    record.organism = 'Synthetic DNA construct'
    with open(gbk_file, 'w') as fout: #overwrite the original gbk file
        fout.write(str(record))
    i = i+1
print('QUEEN gbk files repaired.')

QUEEN gbk files repaired.


### Golden_Gate_Assembly_4 Template

In [1937]:
#Set up the data for plasmids constructed by Golden_Gate_Assembly_4
assembly_method = golden_gate_assemblies['assembly_method'] #extract the assembly_method column from the golden_gate sheet
golden_gate_assembly_4 = ['Golden_Gate_Assembly_4'] #set up for extracting rows with Golden_Gate_Assembly_4
golden_gate_assembly_4 = golden_gate_assemblies.query('@golden_gate_assembly_4 in assembly_method') #extract only rows containing Golden_Gate_Assembly_4
golden_gate_assembly_4.index = range(len(golden_gate_assembly_4)) #re-index the extracted list to be in the order of only the rows containing Golden_Gate_Assembly_4
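
Each template begins with the same four lines that pull one assembly method out of the sheet and re-index the result. A possible way to factor that pattern out is sketched below. `select_method_rows` is a hypothetical helper, and the toy sheet uses made-up plasmid IDs. Only the `assembly_method` and `plasmid_id` column names are taken from this notebook.

```python
import pandas as pd

def select_method_rows(sheet, method):
    """Return only the rows whose assembly_method equals `method`, re-indexed from 0."""
    mask = sheet["assembly_method"] == method
    return sheet[mask].reset_index(drop=True)

# Toy sheet (assumed data, not the real catalog)
toy_sheet = pd.DataFrame({
    "plasmid_id": ["pX1", "pX2", "pX3"],
    "assembly_method": ["Golden_Gate_Assembly_3",
                        "Golden_Gate_Assembly_4",
                        "Golden_Gate_Assembly_4"],
})

subset = select_method_rows(toy_sheet, "Golden_Gate_Assembly_4")
print(list(subset["plasmid_id"]))
```

Re-indexing from 0 matters here because the later `while` loops address rows as `sheet['column'][i]` with `i` starting at 0.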

In [1938]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][3]
pd1 = processes['pd1'][3] + ' Position 5516 in the backbone plasmid pKI188 was manually modified from G -> A to account for an error in the insert upstream overhang design (which is ATAA instead of ATAG).'
#Step 2
pn2 = processes['pn2'][3]
pd2 = processes['pd2'][3]
#Step 3
pn3 = processes['pn3'][3]
pd3 = processes['pd3'][3]

In [1939]:
#Step 1: Digest backbone

digested_backbones = []
i = 0
while i < len(golden_gate_assembly_4):
    re1 = golden_gate_assembly_4['re1'][i]
    digested_backbone = typeIIS_digest_backbone(pKI188, re1, pn1, pd1)
    digested_backbones.append(digested_backbone)
    i = i+1
    
#Modify position 5516 in the backbone plasmid pKI188 from G -> A, to account for an error in the insert overhang design (which is ATAA, instead of ATAG).

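#Because set_namespace(globals()) was called at the top of the script, product='modification' also makes this object available as the variable `modification` used below.
QUEEN(seq='T----/ATATT', product='modification')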
modified_backbones = []
i = 0
while i < len(golden_gate_assembly_4):
    modified_backbone_1 = cropdna(digested_backbones[i], 0, 10735, pn=pn1, pd=pd1)
    modified_backbone_2 = joindna(modified_backbone_1, modification, pn=pn1, pd=pd1)
    modified_backbones.append(modified_backbone_2)
    i = i+1
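
The backbone edit above is needed because two sticky ends only ligate when their single-stranded overhangs are reverse complements of each other. The sketch below checks this on plain strings, using the ATAA/ATAG pair named in the comment. Which strand carries each overhang in pKI188 is an assumption made for illustration, and `overhangs_anneal` is a hypothetical helper rather than a QUEEN function.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def overhangs_anneal(overhang_a, overhang_b):
    """True if two single-stranded overhangs, each written 5'->3', can base-pair."""
    return overhang_a.upper() == reverse_complement(overhang_b)

insert_overhang = "ATAA"                        # overhang actually present on the insert
original_partner = reverse_complement("ATAG")   # backbone end designed to pair with ATAG
modified_partner = reverse_complement("ATAA")   # backbone end after the G -> A change

pairs_with_original = overhangs_anneal(insert_overhang, original_partner)
pairs_with_modified = overhangs_anneal(insert_overhang, modified_partner)
print(pairs_with_original, pairs_with_modified)
```

Under these assumptions the insert overhang pairs only with the modified backbone end, which is the situation the manual edit reproduces.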

In [1940]:
#Step 2: Anneal ssDNAs

annealed_DNA = []
i = 0
while i < len(golden_gate_assembly_4):
    index_fw = oligo_name_list.index(golden_gate_assembly_4['fw_primer'][i])
    index_rv = oligo_name_list.index(golden_gate_assembly_4['rv_primer'][i])
    insert = anneal_oligos_object(oligo_list[index_fw],
                                  oligo_list[index_rv],
                                  pn2, pd2)
    annealed_DNA.append(insert)
    i = i+1

In [1941]:
#Step 3: Assemble
 
assembled_plasmids = []
i = 0
while i < len(golden_gate_assembly_4):
    assembly = joindna(annealed_DNA[i],
                       modified_backbones[i],
                       product = golden_gate_assembly_4['plasmid_id'][i],
                       topology = "circular", pn=pn3, pd=pd3)
    assembled_plasmids.append(assembly)
    i = i+1

In [1942]:
#Step 4: Output QUEEN gbk file

i = 0
while i < len(golden_gate_assembly_4):
    assembled_plasmids[i].outputgbk(output = golden_gate_assembly_4['plasmid_id'][i]+ '.gbk')
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


In [None]:
#Step 5: Repair gbk locus ID (temporary fix; will be implemented in next update of QUEEN)

i = 0
while i < len(golden_gate_assembly_4):
    gbk_file = './' + golden_gate_assembly_4['plasmid_id'][i]+ '.gbk'
    with open(gbk_file, 'r') as handle: #read the gbk file written in Step 4 and close it afterwards
        record = GenBank.read(handle)
    record.locus = golden_gate_assembly_4['plasmid_id'][i]
    record.accession = ''
    record.version = ''
    record.definition = 'Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)'
    record.organism = 'Synthetic DNA construct'
    with open(gbk_file, 'w') as fout: #overwrite the same file with the repaired record
        fout.write(str(record))
    i = i+1
print('QUEEN gbk files repaired.')

### Gibson_Assembly_2frag_1 Template

In [2293]:
#Set up the data for plasmids constructed by Gibson_Assembly_2frag_1
assembly_method = gibson_assemblies['assembly_method'] #extract the assembly_method column from the gibson sheet
gibson_assembly_2frag_1 = ['Gibson_Assembly_2frag_1'] #set up for extracting rows with Gibson_Assembly_2frag_1
gibson_assembly_2frag_1 = gibson_assemblies.query('@gibson_assembly_2frag_1 in assembly_method') #extract only rows containing Gibson_Assembly_2frag_1
gibson_assembly_2frag_1.index = range(len(gibson_assembly_2frag_1)) #re-index the extracted list to be in the order of only the rows containing Gibson_Assembly_2frag_1

In [2294]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][4]
pd1 = processes['pd1'][4]
#Step 2
pn2 = processes['pn2'][4]
pd2 = processes['pd2'][4]
#Step 3
pn3 = processes['pn3'][4]
pd3 = processes['pd3'][4]

In [2295]:
#Step 1: Digest backbone

digested_backbones = []
i = 0
while i < len(gibson_assembly_2frag_1):
    index_backbone = plasmid_name_list.index(gibson_assembly_2frag_1['backbone_id'][i])
    re1 = gibson_assembly_2frag_1['re1'][i]
    re2 = gibson_assembly_2frag_1['re2'][i]
    digested_backbone = double_digest_backbone(plasmid_list[index_backbone],
                                               re1, re2, pn1, pd1)
    digested_backbones.append(digested_backbone)
    i = i+1

In [2296]:
#Step 2: Amplify insert

pcr_products = []
i = 0
while i < len(gibson_assembly_2frag_1):
    index_template = plasmid_name_list.index(gibson_assembly_2frag_1['template_id'][i])
    index_fw = oligo_name_list.index(gibson_assembly_2frag_1['fw_primer'][i])
    index_rv = oligo_name_list.index(gibson_assembly_2frag_1['rv_primer'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                oligo_list[index_fw],
                                oligo_list[index_rv],
                                pn2, pd2)
    pcr_products.append(insert)
    i = i+1

In [2297]:
#Step 3: Assemble

assembled_plasmids = []
i = 0
while i < len(gibson_assembly_2frag_1):
    gibson = gibson_assembly(pcr_products[i],
                             digested_backbones[i],
                             assembly_name = gibson_assembly_2frag_1['plasmid_id'][i],
                             pn=pn3, pd=pd3)
    assembled_plasmids.append(gibson)
    i = i+1

In [2010]:
#Step 4: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(gibson_assembly_2frag_1):
    new_plasmid_known = QUEEN(record = str(gibson_assembly_2frag_1['benchling_of_final_plasmid'][i]),
                                  dbtype = 'benchling',
                                  product = str(gibson_assembly_2frag_1['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(gibson_assembly_2frag_1):
    if len(new_plasmids_known[i].seq) == len(assembled_plasmids[i].seq):
        print(gibson_assembly_2frag_1['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(gibson_assembly_2frag_1['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

pSI_218 is the same length as its Benchling sequence.
pLV_CS_060 is the same length as its Benchling sequence.
pLV_CS_154 is not the same length as its Benchling sequence
pLV_CS_155 is not the same length as its Benchling sequence
pRS232 is the same length as its Benchling sequence.
pRS233 is the same length as its Benchling sequence.
pRS234 is the same length as its Benchling sequence.
pKI086 is the same length as its Benchling sequence.
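
A length match is a quick sanity check, but it cannot detect substitutions, and two records of the same circular plasmid may start at different positions or be written on opposite strands. A stricter comparison on plain sequence strings might look like the sketch below. `same_circular_sequence` is a hypothetical helper, and the toy sequences are invented for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def same_circular_sequence(seq_a, seq_b):
    """True if two circular sequences match up to rotation or strand."""
    a, b = seq_a.upper(), seq_b.upper()
    if len(a) != len(b):
        return False
    doubled = b + b  # every rotation of b is a substring of b + b
    return a in doubled or reverse_complement(a) in doubled

rotated_match = same_circular_sequence("ATGCCG", "CCGATG")    # same circle, different origin
same_length_only = same_circular_sequence("ATGCCG", "ATGCGC")  # equal length, different bases
print(rotated_match, same_length_only)
```

With the objects from the cell above, the strings passed in would be the `.seq` values already used for the length comparison.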


In [2017]:
#Step 5: Output QUEEN gbk file

annot_dict = {
              "version"   :"", 
              "accession" :"", 
              "organism"  :"Synthetic DNA construct",
              "molecule_type":"DNA",
              "topology": "circular"
              }

i = 0
while i < len(gibson_assembly_2frag_1):
    assembled_plasmids[i].record.description = "Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)"
    assembled_plasmids[i].outputgbk(output = gibson_assembly_2frag_1['plasmid_id'][i]+ '.gbk', record_id = gibson_assembly_2frag_1['plasmid_id'][i], annotation=annot_dict)
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


### Gibson_Assembly_2frag_2 Template

In [2019]:
#Set up the data for plasmids constructed by Gibson_Assembly_2frag_2
assembly_method = gibson_assemblies['assembly_method'] #extract the assembly_method column from the gibson sheet
gibson_assembly_2frag_2 = ['Gibson_Assembly_2frag_2'] #set up for extracting rows with Gibson_Assembly_2frag_2
gibson_assembly_2frag_2 = gibson_assemblies.query('@gibson_assembly_2frag_2 in assembly_method') #extract only rows containing Gibson_Assembly_2frag_2
gibson_assembly_2frag_2.index = range(len(gibson_assembly_2frag_2)) #re-index the extracted list to be in the order of only the rows containing Gibson_Assembly_2frag_2

In [2020]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][5]
pd1 = processes['pd1'][5]
#Step 2
pn2 = processes['pn2'][5]
pd2 = processes['pd2'][5]
#Step 3
pn3 = processes['pn3'][5]
pd3 = processes['pd3'][5]

In [2021]:
#Step 1: Amplify backbone

pcr_products1 = []
i = 0
while i < len(gibson_assembly_2frag_2):
    index_template = plasmid_name_list.index(gibson_assembly_2frag_2['template_id'][i])
    index_fw = oligo_name_list.index(gibson_assembly_2frag_2['fw_primer'][i])
    index_rv = oligo_name_list.index(gibson_assembly_2frag_2['rv_primer'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                oligo_list[index_fw],
                                oligo_list[index_rv],
                                pn1, pd1)
    pcr_products1.append(insert)
    i = i+1

In [2022]:
#Step 2: Amplify insert

pcr_products2 = []
i = 0
while i < len(gibson_assembly_2frag_2):
    index_template = plasmid_name_list.index(gibson_assembly_2frag_2['template_id2'][i])
    index_fw = oligo_name_list.index(gibson_assembly_2frag_2['fw_primer2'][i])
    index_rv = oligo_name_list.index(gibson_assembly_2frag_2['rv_primer2'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                oligo_list[index_fw],
                                oligo_list[index_rv],
                                pn2, pd2)
    pcr_products2.append(insert)
    i = i+1

In [2023]:
#Step 3: Assemble

assembled_plasmids = []
i = 0
while i < len(gibson_assembly_2frag_2):
    gibson = gibson_assembly(pcr_products1[i],
                             pcr_products2[i],
                             assembly_name = gibson_assembly_2frag_2['plasmid_id'][i],
                             pn=pn3, pd=pd3)
    assembled_plasmids.append(gibson)
    i = i+1

In [None]:
#Step 4: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(gibson_assembly_2frag_2):
    new_plasmid_known = QUEEN(record = str(gibson_assembly_2frag_2['benchling_of_final_plasmid'][i]),
                                  dbtype = 'benchling',
                                  product = str(gibson_assembly_2frag_2['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(gibson_assembly_2frag_2):
    if len(new_plasmids_known[i].seq) == len(assembled_plasmids[i].seq):
        print(gibson_assembly_2frag_2['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(gibson_assembly_2frag_2['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

In [2025]:
#Step 5: Output QUEEN gbk file

annot_dict = {
              "version"   :"", 
              "accession" :"", 
              "organism"  :"Synthetic DNA construct",
              "molecule_type":"DNA",
              "topology": "circular"
              }

i = 0
while i < len(gibson_assembly_2frag_2):
    assembled_plasmids[i].record.description = "Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)"
    assembled_plasmids[i].outputgbk(output = gibson_assembly_2frag_2['plasmid_id'][i]+ '.gbk', record_id = gibson_assembly_2frag_2['plasmid_id'][i], annotation=annot_dict)
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


### Gibson_Assembly_2frag_3 Template

In [2096]:
#Set up the data for plasmids constructed by Gibson_Assembly_2frag_3
assembly_method = gibson_assemblies['assembly_method'] #extract the assembly_method column from the gibson sheet
gibson_assembly_2frag_3 = ['Gibson_Assembly_2frag_3'] #set up for extracting rows with Gibson_Assembly_2frag_3
gibson_assembly_2frag_3 = gibson_assemblies.query('@gibson_assembly_2frag_3 in assembly_method') #extract only rows containing Gibson_Assembly_2frag_3
gibson_assembly_2frag_3.index = range(len(gibson_assembly_2frag_3)) #re-index the extracted list to be in the order of only the rows containing Gibson_Assembly_2frag_3

In [2097]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][5]
pd1 = processes['pd1'][5]
#Step 2
pn2 = processes['pn2'][5]
pd2 = processes['pd2'][5]
#Step 3
pn3 = processes['pn3'][5]
pd3 = processes['pd3'][5]

In [2098]:
#Step 1: Amplify backbone

pcr_products1 = []
i = 0
while i < len(gibson_assembly_2frag_3):
    index_template = plasmid_name_list.index(gibson_assembly_2frag_3['template_id'][i])
    index_fw = oligo_name_list.index(gibson_assembly_2frag_3['fw_primer'][i])
    index_rv = oligo_name_list.index(gibson_assembly_2frag_3['rv_primer'][i])
    insert = create_pcr_product_special(plasmid_list[index_template],
                                        oligo_list[index_fw], 
                                        oligo_list[index_rv], 
                                        pn1, pd1)
    pcr_products1.append(insert)
    i = i+1

In [2099]:
#Step 2: Amplify insert

pcr_products2 = []
i = 0
while i < len(gibson_assembly_2frag_3):
    index_template = plasmid_name_list.index(gibson_assembly_2frag_3['template_id2'][i])
    index_fw = oligo_name_list.index(gibson_assembly_2frag_3['fw_primer2'][i])
    index_rv = oligo_name_list.index(gibson_assembly_2frag_3['rv_primer2'][i])
    insert = create_pcr_product_special(plasmid_list[index_template],
                                        oligo_list[index_fw], 
                                        oligo_list[index_rv], 
                                        pn2, pd2)
    pcr_products2.append(insert)
    i = i+1

In [2100]:
#Step 3: Assemble

assembled_plasmids = []
i = 0
while i < len(gibson_assembly_2frag_3):
    gibson = gibson_assembly(pcr_products1[i],
                             pcr_products2[i],
                             assembly_name = gibson_assembly_2frag_3['plasmid_id'][i],
                             pn=pn3, pd=pd3)
    assembled_plasmids.append(gibson)
    i = i+1

In [2095]:
#Step 4: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(gibson_assembly_2frag_3):
    new_plasmid_known = QUEEN(record = str(gibson_assembly_2frag_3['benchling_of_final_plasmid'][i]),
                                  dbtype = 'benchling',
                                  product = str(gibson_assembly_2frag_3['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(gibson_assembly_2frag_3):
    if len(new_plasmids_known[i].seq) == len(assembled_plasmids[i].seq):
        print(gibson_assembly_2frag_3['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(gibson_assembly_2frag_3['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

pKI241 is the same length as its Benchling sequence.
pKI242 is the same length as its Benchling sequence.


In [2102]:
#Step 5: Output QUEEN gbk file

annot_dict = {
              "version"   :"", 
              "accession" :"", 
              "organism"  :"Synthetic DNA construct",
              "molecule_type":"DNA",
              "topology": "circular"
              }

i = 0
while i < len(gibson_assembly_2frag_3):
    assembled_plasmids[i].record.description = "Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)"
    assembled_plasmids[i].outputgbk(output = gibson_assembly_2frag_3['plasmid_id'][i]+ '.gbk', record_id = gibson_assembly_2frag_3['plasmid_id'][i], annotation=annot_dict)
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


### Gibson_Assembly_2frag_4 Template

In [2143]:
#Set up the data for plasmids constructed by Gibson_Assembly_2frag_4
assembly_method = gibson_assemblies['assembly_method'] #extract the assembly_method column from the gibson sheet
gibson_assembly_2frag_4 = ['Gibson_Assembly_2frag_4'] #set up for extracting rows with Gibson_Assembly_2frag_4
gibson_assembly_2frag_4 = gibson_assemblies.query('@gibson_assembly_2frag_4 in assembly_method') #extract only rows containing Gibson_Assembly_2frag_4
gibson_assembly_2frag_4.index = range(len(gibson_assembly_2frag_4)) #re-index the extracted list to be in the order of only the rows containing Gibson_Assembly_2frag_4

In [2144]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][5]
pd1 = processes['pd1'][5]
#Step 2
pn2 = processes['pn2'][5]
pd2 = processes['pd2'][5]
#Step 3
pn3 = processes['pn3'][5]
pd3 = processes['pd3'][5]

In [2145]:
#Step 1: Amplify backbone

pcr_products1 = []
i = 0
while i < len(gibson_assembly_2frag_4):
    index_template = plasmid_name_list.index(gibson_assembly_2frag_4['template_id'][i])
    index_fw = oligo_name_list.index(gibson_assembly_2frag_4['fw_primer'][i])
    index_rv = oligo_name_list.index(gibson_assembly_2frag_4['rv_primer'][i])
    insert = create_pcr_product_special(plasmid_list[index_template],
                                        oligo_list[index_fw], 
                                        oligo_list[index_rv], 
                                        pn1, pd1)
    pcr_products1.append(insert)
    i = i+1

In [2146]:
#Step 2: Amplify insert

pcr_products2 = []
i = 0
while i < len(gibson_assembly_2frag_4):
    index_template = plasmid_name_list.index(gibson_assembly_2frag_4['template_id2'][i])
    index_fw = oligo_name_list.index(gibson_assembly_2frag_4['fw_primer2'][i])
    index_rv = oligo_name_list.index(gibson_assembly_2frag_4['rv_primer2'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                oligo_list[index_fw],
                                oligo_list[index_rv],
                                pn2, pd2)
    pcr_products2.append(insert)
    i = i+1

In [2147]:
#Step 3: Assemble

assembled_plasmids = []
i = 0
while i < len(gibson_assembly_2frag_4):
    gibson = gibson_assembly(pcr_products1[i],
                             pcr_products2[i],
                             assembly_name = gibson_assembly_2frag_4['plasmid_id'][i],
                             pn=pn3, pd=pd3)
    assembled_plasmids.append(gibson)
    i = i+1

In [2114]:
#Step 4: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(gibson_assembly_2frag_4):
    new_plasmid_known = QUEEN(record = str(gibson_assembly_2frag_4['benchling_of_final_plasmid'][i]),
                                  dbtype = 'benchling',
                                  product = str(gibson_assembly_2frag_4['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(gibson_assembly_2frag_4):
    if len(new_plasmids_known[i].seq) == len(assembled_plasmids[i].seq):
        print(gibson_assembly_2frag_4['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(gibson_assembly_2frag_4['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

pKI185 is the same length as its Benchling sequence.


In [2150]:
#Step 5: Output QUEEN gbk file

annot_dict = {
              "version"   :"", 
              "accession" :"", 
              "organism"  :"Synthetic DNA construct",
              "molecule_type":"DNA",
              "topology": "circular"
              }

i = 0
while i < len(gibson_assembly_2frag_4):
    assembled_plasmids[i].record.description = "Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)"
    assembled_plasmids[i].outputgbk(output = gibson_assembly_2frag_4['plasmid_id'][i]+ '.gbk', record_id = gibson_assembly_2frag_4['plasmid_id'][i], annotation=annot_dict)
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


### Gibson_Assembly_2frag_5 Template

In [2243]:
#Set up the data for plasmids constructed by Gibson_Assembly_2frag_5
assembly_method = gibson_assemblies['assembly_method'] #extract the assembly_method column from the gibson sheet
gibson_assembly_2frag_5 = ['Gibson_Assembly_2frag_5'] #set up for extracting rows with Gibson_Assembly_2frag_5
gibson_assembly_2frag_5 = gibson_assemblies.query('@gibson_assembly_2frag_5 in assembly_method') #extract only rows containing Gibson_Assembly_2frag_5
gibson_assembly_2frag_5.index = range(len(gibson_assembly_2frag_5)) #re-index the extracted list to be in the order of only the rows containing Gibson_Assembly_2frag_5

In [2244]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][6]
pd1 = processes['pd1'][6]
#Step 2
pn2 = processes['pn2'][6]
pd2 = processes['pd2'][6]
#Step 3
pn3 = processes['pn3'][6]
pd3 = processes['pd3'][6]

In [2245]:
#Step 1: Amplify backbone

pcr_products1 = []
i = 0
while i < len(gibson_assembly_2frag_5):
    index_template = plasmid_name_list.index(gibson_assembly_2frag_5['template_id'][i])
    index_fw = oligo_name_list.index(gibson_assembly_2frag_5['fw_primer'][i])
    index_rv = oligo_name_list.index(gibson_assembly_2frag_5['rv_primer'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                oligo_list[index_fw],
                                oligo_list[index_rv],
                                pn1, pd1)
    pcr_products1.append(insert)
    i = i+1

In [2246]:
#Step 2: Amplify insert by template-free PCR

pcr_products2 = []
i = 0
while i < len(gibson_assembly_2frag_5):
    index_fw = oligo_name_list.index(gibson_assembly_2frag_5['fw_primer2'][i])
    index_rv = oligo_name_list.index(gibson_assembly_2frag_5['rv_primer2'][i])
    insert = template_free_pcr(oligo_list[index_fw], 
                               oligo_list[index_rv], 
                               pn2, pd2)
    pcr_products2.append(insert)
    i = i+1
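
In a template-free PCR, the two oligos anneal to each other through complementary 3' ends, and the polymerase fills in both strands. The string-level sketch below illustrates that idea. It is not the `template_free_pcr` function used in this notebook, and the helper name, the `min_overlap` default, and the toy oligos are all assumptions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def extend_overlapping_oligos(fw_oligo, rv_oligo, min_overlap=4):
    """Top strand of the duplex made when two oligos anneal at their 3' ends and are filled in."""
    fw = fw_oligo.upper()
    rv_top = reverse_complement(rv_oligo)  # reverse oligo expressed on the top strand
    # Look for the longest suffix of fw that is also a prefix of rv_top
    for size in range(min(len(fw), len(rv_top)), min_overlap - 1, -1):
        if fw[-size:] == rv_top[:size]:
            return fw + rv_top[size:]
    raise ValueError("no 3' overlap of at least %d nt found" % min_overlap)

# Toy oligos (both written 5'->3') sharing a 6-nt complementary 3' end
toy_fw = "ATGGCTAGCAAG"
toy_rv = "TCCGAACTTGCT"
toy_product = extend_overlapping_oligos(toy_fw, toy_rv)
print(toy_product, len(toy_product))
```

The product length equals the two oligo lengths minus the overlap, which gives a quick way to sanity-check an expected insert size.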

In [2247]:
#Step 3: Assemble

assembled_plasmids = []
i = 0
while i < len(gibson_assembly_2frag_5):
    gibson = gibson_assembly(pcr_products2[i],
                             pcr_products1[i],
                             assembly_name = gibson_assembly_2frag_5['plasmid_id'][i],
                             pn=pn3, pd=pd3)
    assembled_plasmids.append(gibson)
    i = i+1

In [None]:
#Step 4: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(gibson_assembly_2frag_5):
    new_plasmid_known = QUEEN(record = str(gibson_assembly_2frag_5['benchling_of_final_plasmid'][i]),
                                  dbtype = 'benchling',
                                  product = str(gibson_assembly_2frag_5['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(gibson_assembly_2frag_5):
    if len(new_plasmids_known[i].seq) == len(assembled_plasmids[i].seq):
        print(gibson_assembly_2frag_5['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(gibson_assembly_2frag_5['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

In [2249]:
#Step 5: Output QUEEN gbk file

i = 0
while i < len(gibson_assembly_2frag_5):
    assembled_plasmids[i].outputgbk(output = gibson_assembly_2frag_5['plasmid_id'][i]+ '.gbk')
    i = i+1
print('Downloaded QUEEN gbk files.')

ValueError: dictionary update sequence element #0 has length 1; 2 is required
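
The traceback is not shown in full, so it is unclear where inside the `outputgbk` call this error arises. In general, Python raises this message when `dict()` or `dict.update()` receives an iterable whose items are not key/value pairs. The sketch below reproduces the message in isolation. `try_build_dict` is a hypothetical helper and makes no claim about QUEEN's internals.

```python
def try_build_dict(pairs):
    """Return dict(pairs), or the error message if the items are not key/value pairs."""
    try:
        return dict(pairs)
    except ValueError as err:
        return str(err)

# A list of 2-tuples converts cleanly
good = try_build_dict([("organism", "Synthetic DNA construct")])

# A list holding a 1-character string does not: the string is treated as a
# sequence of length 1 where a (key, value) pair of length 2 is expected
bad = try_build_dict(["x"])

print(good)
print(bad)
```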

In [None]:
#Step 6: Output QUEEN gbk file with explicit annotations (rerun of Step 5, which raised the ValueError above)

annot_dict = {
              "version"   :"", 
              "accession" :"", 
              "organism"  :"Synthetic DNA construct",
              "molecule_type":"DNA",
              "topology": "circular"
              }

i = 0
while i < len(gibson_assembly_2frag_5):
    assembled_plasmids[i].record.description = "Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)"
    assembled_plasmids[i].outputgbk(output = gibson_assembly_2frag_5['plasmid_id'][i]+ '.gbk', record_id = gibson_assembly_2frag_5['plasmid_id'][i], annotation=annot_dict)
    i = i+1
print('Downloaded QUEEN gbk files.')

In [None]:
quine(pLV_CS_146, process_description=True)
visualizeflow(pLV_CS_146, sf=True, ip=True, grouping=True, pd=True)

### Gibson_Assembly_3frag_1 Template

In [2385]:
#Set up the data for plasmids constructed by Gibson_Assembly_3frag_1
assembly_method = gibson_assemblies['assembly_method'] #extract the assembly_method column from the gibson sheet
gibson_assembly_3frag_1 = ['Gibson_Assembly_3frag_1'] #set up for extracting rows with Gibson_Assembly_3frag_1
gibson_assembly_3frag_1 = gibson_assemblies.query('@gibson_assembly_3frag_1 in assembly_method') #extract only rows containing Gibson_Assembly_3frag_1
gibson_assembly_3frag_1.index = range(len(gibson_assembly_3frag_1)) #re-index the extracted list to be in the order of only the rows containing Gibson_Assembly_3frag_1

In [2386]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][7]
pd1 = processes['pd1'][7]
#Step 2
pn2 = processes['pn2'][7]
pd2 = processes['pd2'][7]
#Step 3
pn3 = processes['pn3'][7]
pd3 = processes['pd3'][7]
#Step 4
pn4 = processes['pn4'][7]
pd4 = processes['pd4'][7]

In [2387]:
#Step 1: Digest backbone

digested_backbones = []
i = 0
while i < len(gibson_assembly_3frag_1):
    index_backbone = plasmid_name_list.index(gibson_assembly_3frag_1['backbone_id'][i])
    re1 = gibson_assembly_3frag_1['re1'][i]
    re2 = gibson_assembly_3frag_1['re2'][i]
    digested_backbone = double_digest_backbone(plasmid_list[index_backbone],
                                               re1, re2, pn1, pd1)
    digested_backbones.append(digested_backbone)
    i = i+1

In [2388]:
#Step 2: Amplify first insert

pcr_products1 = []
i = 0
while i < len(gibson_assembly_3frag_1):
    index_template = plasmid_name_list.index(gibson_assembly_3frag_1['template_id'][i])
    index_fw = oligo_name_list.index(gibson_assembly_3frag_1['fw_primer'][i])
    index_rv = oligo_name_list.index(gibson_assembly_3frag_1['rv_primer'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                oligo_list[index_fw],
                                oligo_list[index_rv],
                                pn2, pd2)
    pcr_products1.append(insert)
    i = i+1

In [2389]:
#Step 3: Amplify second insert

pcr_products2 = []
i = 0
while i < len(gibson_assembly_3frag_1):
    index_template = plasmid_name_list.index(gibson_assembly_3frag_1['template_id2'][i])
    index_fw = oligo_name_list.index(gibson_assembly_3frag_1['fw_primer2'][i])
    index_rv = oligo_name_list.index(gibson_assembly_3frag_1['rv_primer2'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                oligo_list[index_fw],
                                oligo_list[index_rv],
                                pn3, pd3)
    pcr_products2.append(insert)
    i = i+1

In [2390]:
#Step 4: Assemble

stitched_inserts = []
i = 0
while i < len(gibson_assembly_3frag_1):
    stitched = stitch_fragments(pcr_products1[i],
                                pcr_products2[i])
    stitched_inserts.append(stitched)
    i = i+1

assembled_plasmids = []
i = 0
while i < len(gibson_assembly_3frag_1):
    gibson = gibson_assembly(stitched_inserts[i],
                             digested_backbones[i],
                             assembly_name = gibson_assembly_3frag_1['plasmid_id'][i],
                             pn=pn4, pd=pd4)
    assembled_plasmids.append(gibson)
    i = i+1
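
The three-fragment assembly above first stitches the two inserts and then closes the circle with the backbone. At the sequence level this relies on each fragment's end sharing homology with the start of the next. The sketch below shows that idea on plain strings. It is not the notebook's `stitch_fragments` or `gibson_assembly`, and the helper names, the `min_overlap` default, and the toy fragments are assumptions.

```python
def shared_end(left, right, min_overlap=4, max_overlap=None):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    limit = min(len(left), len(right))
    if max_overlap is not None:
        limit = min(limit, max_overlap)
    for size in range(limit, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return size
    raise ValueError("no terminal homology of at least %d nt" % min_overlap)

def assemble_circular(fragments, min_overlap=4):
    """Join fragments in order through terminal homology, then close the circle."""
    product = fragments[0].upper()
    for fragment in fragments[1:]:
        fragment = fragment.upper()
        product += fragment[shared_end(product, fragment, min_overlap):]
    # The end of the last fragment repeats the start of the first; drop that repeat
    closing = shared_end(product, product, min_overlap, max_overlap=len(product) // 2)
    return product[:-closing]

# Toy fragments with 5-nt homologies between neighbours (assumed sequences)
toy_fragments = ["ATGCAAGGTTCC", "GTTCCGATACGT", "TACGTTAGCAATGCA"]
toy_plasmid = assemble_circular(toy_fragments)
print(toy_plasmid, len(toy_plasmid))
```

The circular product is shorter than the summed fragment lengths by the total homology, which is one reason a length mismatch in the verification step can point to a missing or duplicated overlap.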

In [2057]:
#Step 5: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(gibson_assembly_3frag_1):
    new_plasmid_known = QUEEN(record = str(gibson_assembly_3frag_1['benchling_of_final_plasmid'][i]),
                                  dbtype = 'benchling',
                                  product = str(gibson_assembly_3frag_1['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(gibson_assembly_3frag_1):
    if len(new_plasmids_known[i].seq) == len(assembled_plasmids[i].seq):
        print(gibson_assembly_3frag_1['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(gibson_assembly_3frag_1['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

pSI_336 is the same length as its Benchling sequence.
pSI_337 is not the same length as its Benchling sequence


In [2392]:
#Step 6: Output QUEEN gbk file

annot_dict = {
              "version"   :"", 
              "accession" :"", 
              "organism"  :"Synthetic DNA construct",
              "molecule_type":"DNA",
              "topology": "circular"
              }

i = 0
while i < len(gibson_assembly_3frag_1):
    assembled_plasmids[i].record.description = "Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)"
    assembled_plasmids[i].outputgbk(output = gibson_assembly_3frag_1['plasmid_id'][i]+ '.gbk', record_id = gibson_assembly_3frag_1['plasmid_id'][i], annotation=annot_dict)
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


### Gibson_Assembly_3frag_2 Template

In [2393]:
#Set up the data for plasmids constructed by Gibson_Assembly_3frag_2
assembly_method = gibson_assemblies['assembly_method'] #extract the assembly_method column from the gibson sheet
gibson_assembly_3frag_2 = ['Gibson_Assembly_3frag_2'] #set up for extracting rows with Gibson_Assembly_3frag_2
gibson_assembly_3frag_2 = gibson_assemblies.query('@gibson_assembly_3frag_2 in assembly_method') #extract only rows containing Gibson_Assembly_3frag_2
gibson_assembly_3frag_2.index = range(len(gibson_assembly_3frag_2)) #re-index the extracted list to be in the order of only the rows containing Gibson_Assembly_3frag_2

In [2394]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][8]
pd1 = processes['pd1'][8]
#Step 2
pn2 = processes['pn2'][8]
pd2 = processes['pd2'][8]
#Step 3
pn3 = processes['pn3'][8]
pd3 = processes['pd3'][8]
#Step 4
pn4 = processes['pn4'][8]
pd4 = processes['pd4'][8]

In [2395]:
#Step 1: Amplify backbone

pcr_products1 = []
i = 0
while i < len(gibson_assembly_3frag_2):
    index_template = plasmid_name_list.index(gibson_assembly_3frag_2['template_id'][i])
    index_fw = oligo_name_list.index(gibson_assembly_3frag_2['fw_primer'][i])
    index_rv = oligo_name_list.index(gibson_assembly_3frag_2['rv_primer'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                oligo_list[index_fw],
                                oligo_list[index_rv],
                                pn1, pd1)
    pcr_products1.append(insert)
    i = i+1

In [2396]:
#Step 2: Amplify first insert

pcr_products2 = []
i = 0
while i < len(gibson_assembly_3frag_2):
    index_template = plasmid_name_list.index(gibson_assembly_3frag_2['template_id2'][i])
    index_fw = oligo_name_list.index(gibson_assembly_3frag_2['fw_primer2'][i])
    index_rv = oligo_name_list.index(gibson_assembly_3frag_2['rv_primer2'][i])
    insert = create_pcr_product_special(plasmid_list[index_template],
                                        oligo_list[index_fw], 
                                        oligo_list[index_rv], 
                                        pn2, pd2)
    pcr_products2.append(insert)
    i = i+1

In [2397]:
#Step 3: Amplify second insert

pcr_products3 = []
i = 0
while i < len(gibson_assembly_3frag_2):
    index_template = plasmid_name_list.index(gibson_assembly_3frag_2['template_id3'][i])
    index_fw = oligo_name_list.index(gibson_assembly_3frag_2['fw_primer3'][i])
    index_rv = oligo_name_list.index(gibson_assembly_3frag_2['rv_primer3'][i])
    insert = create_pcr_product_special(plasmid_list[index_template],
                                        oligo_list[index_fw], 
                                        oligo_list[index_rv], 
                                        pn3, pd3)
    pcr_products3.append(insert)
    i = i+1

In [2398]:
#Step 4: Assemble

stitched_inserts = []
i = 0
while i < len(gibson_assembly_3frag_2):
    stitched = stitch_fragments(pcr_products2[i],
                                pcr_products3[i])
    stitched_inserts.append(stitched)
    i = i+1

assembled_plasmids = []
i = 0
while i < len(gibson_assembly_3frag_2):
    gibson = gibson_assembly(stitched_inserts[i],
                             pcr_products1[i],
                             assembly_name = gibson_assembly_3frag_2['plasmid_id'][i],
                             pn=pn4, pd=pd4)
    assembled_plasmids.append(gibson)
    i = i+1
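
For intuition about what the assembly step does at the sequence level, here is a string-only sketch of overlap-based joining: two linear fragments are fused where the end of one matches the start of the other, and the resulting molecule is closed into a circle at its matching ends. This is not QUEEN's implementation or this notebook's `gibson_assembly` wrapper. The function names and the short 8 bp overlaps are assumptions chosen for readability, and the sketch ignores strandedness.

```python
def terminal_overlap(left, right, min_len=4):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for n in range(min(len(left), len(right)), min_len - 1, -1):
        if left[-n:] == right[:n]:
            return n
    return 0

def join_by_overlap(left, right, min_len=4):
    """Fuse two linear fragments at their shared end, keeping the overlap once."""
    n = terminal_overlap(left, right, min_len)
    if n == 0:
        raise ValueError("fragments share no terminal overlap")
    return left + right[n:]

def circularize(seq, min_len=4):
    """Close a linear molecule whose two ends overlap, dropping the duplicate copy."""
    for n in range(len(seq) // 2, min_len - 1, -1):
        if seq[-n:] == seq[:n]:
            return seq[:-n]
    raise ValueError("ends do not overlap")

# Toy fragments: each end of the insert matches one end of the backbone.
overlap_a = "AAGGTTCC"
overlap_b = "CTTGAGCA"
insert = overlap_a + "GATTACA" + overlap_b
backbone = overlap_b + "TTTTGGGGCCCC" + overlap_a

linear = join_by_overlap(insert, backbone, min_len=8)
plasmid = circularize(linear, min_len=8)
```

Each overlap appears exactly once in `plasmid`, which is why the assembled length is shorter than the summed fragment lengths.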

In [2071]:
#Step 5: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(gibson_assembly_3frag_2):
    new_plasmid_known = QUEEN(record = str(gibson_assembly_3frag_2['benchling_of_final_plasmid'][i]),
                                  dbtype = 'benchling',
                                  product = str(gibson_assembly_3frag_2['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

i = 0
while i < len(gibson_assembly_3frag_2):
    if len(new_plasmids_known[i].seq) == len(assembled_plasmids[i].seq):
        print(gibson_assembly_3frag_2['plasmid_id'][i] + ' is the same length as its Benchling sequence.')
    else:
        print(gibson_assembly_3frag_2['plasmid_id'][i] + ' is not the same length as its Benchling sequence')
    i = i+1

pKI188 is the same length as its Benchling sequence.
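
Equal length is a quick sanity check, but two different sequences can have the same length. A stricter comparison for circular plasmids is sketched below: it treats two sequences as equal if one is a rotation of the other, on either strand. The function names are hypothetical, and the inputs are plain strings; in this notebook they would presumably be the `.seq` attributes already used in the length check.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return seq.translate(str.maketrans("ACGTacgt", "TGCAtgca"))[::-1]

def same_circular_sequence(a, b):
    """True if circular sequences a and b match up to rotation and strand."""
    a, b = a.upper(), b.upper()
    if len(a) != len(b):
        return False
    doubled = a + a  # every rotation of a is a substring of a + a
    return b in doubled or reverse_complement(b) in doubled

reference = "ATGCCGTA"
rotated = reference[4:] + reference[:4]   # same circle, different start point
mutated = "ATGCCGTT"                      # same length, one base changed

print(same_circular_sequence(reference, rotated))
print(same_circular_sequence(reference, mutated))
```

The first call reports a match and the second does not, even though all three strings have the same length.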


In [2400]:
#Step 6: Output QUEEN gbk file

annot_dict = {
              "version"   :"", 
              "accession" :"", 
              "organism"  :"Synthetic DNA construct",
              "molecule_type":"DNA",
              "topology": "circular"
              }

i = 0
while i < len(gibson_assembly_3frag_2):
    assembled_plasmids[i].record.description = "Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)"
    assembled_plasmids[i].outputgbk(output = gibson_assembly_3frag_2['plasmid_id'][i]+ '.gbk', record_id = gibson_assembly_3frag_2['plasmid_id'][i], annotation=annot_dict)
    i = i+1
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.


### Gibson_Assembly_4frag_1 Template

In [2371]:
#Set up the data for plasmids constructed by Gibson_Assembly_4frag_1
assembly_method = gibson_assemblies['assembly_method'] #extract the assembly_method column from the gibson sheet
gibson_assembly_4frag_1 = ['Gibson_Assembly_4frag_1'] #set up for extracting rows with Gibson_Assembly_4frag_1
gibson_assembly_4frag_1 = gibson_assemblies.query('@gibson_assembly_4frag_1 in assembly_method') #extract only rows containing Gibson_Assembly_4frag_1
gibson_assembly_4frag_1.index = range(len(gibson_assembly_4frag_1)) #re-index the extracted rows from 0 so they can be addressed by position in the loops below

In [2372]:
#Set assembly steps

#Step 1
pn1 = processes['pn1'][9]
pd1 = processes['pd1'][9]
#Step 2
pn2 = processes['pn2'][9]
pd2 = processes['pd2'][9]
#Step 3
pn3 = processes['pn3'][9]
pd3 = processes['pd3'][9]
#Step 4
pn4 = processes['pn4'][9]
pd4 = processes['pd4'][9]
#Step 5
pn5 = processes['pn5'][9]
pd5 = processes['pd5'][9]
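
The repeated `pn`/`pd` assignments can also be gathered in one pass. The sketch below assumes `processes` can be indexed as `processes['pn1'][row]`, as in the cell above, and uses a small dictionary as a hypothetical stand-in for the spreadsheet. The helper name `step_labels` and the sample labels are illustrative only; `pn` and `pd` appear to be the process name and process description passed to QUEEN operations.

```python
# Hypothetical stand-in for the `processes` sheet: each column maps to a list of
# per-row values, indexed the same way as processes['pn1'][9] in the notebook.
processes = {
    "pn1": ["Digestion"], "pd1": ["Digest the backbone"],
    "pn2": ["PCR"],       "pd2": ["Amplify the first insert"],
    "pn3": ["PCR"],       "pd3": ["Amplify the second insert"],
}

def step_labels(processes, row, n_steps):
    """Return {step number: (pn, pd)} for one row of the processes sheet."""
    return {step: (processes[f"pn{step}"][row], processes[f"pd{step}"][row])
            for step in range(1, n_steps + 1)}

labels = step_labels(processes, row=0, n_steps=3)
pn2, pd2 = labels[2]  # unpack a single step when a call needs it
```

Keeping the labels in one dictionary avoids reusing a stale `pn4`/`pd4` from an earlier template when a later template has more steps.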

In [2373]:
#Step 1: Digest backbone

digested_backbones = []
i = 0
while i < len(gibson_assembly_4frag_1):
    index_backbone = plasmid_name_list.index(gibson_assembly_4frag_1['backbone_id'][i])
    re1 = gibson_assembly_4frag_1['re1'][i]
    re2 = gibson_assembly_4frag_1['re2'][i]
    digested_backbone = double_digest_backbone(plasmid_list[index_backbone],
                                               re1, re2, pn1, pd1)
    digested_backbones.append(digested_backbone)
    i = i+1
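
The ends left by a digest depend on where each enzyme cuts within its recognition site, which matters later when those ends are ligated. As a small illustration, the sketch below derives the single-stranded overhang for a palindromic cutter from the top-strand cut position. The function name is hypothetical. The cut positions are the standard ones for ClaI (AT^CGAT), EcoRI (G^AATTC), and EcoRV (GAT^ATC).

```python
def sticky_end(site, top_cut):
    """Overhang left by a palindromic cutter.

    `top_cut` is the number of bases before the cut on the top strand.
    Because the site is palindromic, the bottom strand is cut at the
    mirrored position, and the bases between the two cuts stay single-stranded.
    """
    bottom_cut = len(site) - top_cut
    lo, hi = sorted((top_cut, bottom_cut))
    return site[lo:hi]

clai = sticky_end("ATCGAT", 2)    # AT^CGAT
ecori = sticky_end("GAATTC", 1)   # G^AATTC
ecorv = sticky_end("GATATC", 3)   # GAT^ATC, blunt cutter

print(len(clai), len(ecori), len(ecorv))
```

ClaI leaves only a 2-base overhang, compared with 4 bases for EcoRI and none for the blunt cutter EcoRV.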

In [2374]:
#Step 2: Amplify first insert fragment

pcr_products1 = []
i = 0
while i < len(gibson_assembly_4frag_1):
    index_template = plasmid_name_list.index(gibson_assembly_4frag_1['template_id'][i])
    index_fw = oligo_name_list.index(gibson_assembly_4frag_1['fw_primer'][i])
    index_rv = oligo_name_list.index(gibson_assembly_4frag_1['rv_primer'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                        oligo_list[index_fw], 
                                        oligo_list[index_rv], 
                                        pn2, pd2)
    pcr_products1.append(insert)
    i = i+1

In [2375]:
#Step 3: Amplify second insert fragment

pcr_products2 = []
i = 0
while i < len(gibson_assembly_4frag_1):
    index_template = plasmid_name_list.index(gibson_assembly_4frag_1['template_id2'][i])
    index_fw = oligo_name_list.index(gibson_assembly_4frag_1['fw_primer2'][i])
    index_rv = oligo_name_list.index(gibson_assembly_4frag_1['rv_primer2'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                        oligo_list[index_fw], 
                                        oligo_list[index_rv], 
                                        pn3, pd3)
    pcr_products2.append(insert)
    i = i+1

In [2376]:
#Step 4: Amplify third insert fragment

pcr_products3 = []
i = 0
while i < len(gibson_assembly_4frag_1):
    index_template = plasmid_name_list.index(gibson_assembly_4frag_1['template_id3'][i])
    index_fw = oligo_name_list.index(gibson_assembly_4frag_1['fw_primer3'][i])
    index_rv = oligo_name_list.index(gibson_assembly_4frag_1['rv_primer3'][i])
    insert = create_pcr_product(plasmid_list[index_template],
                                        oligo_list[index_fw], 
                                        oligo_list[index_rv], 
                                        pn4, pd4)
    pcr_products3.append(insert)
    i = i+1

In [2380]:
#Step 5: Assemble

stitched_inserts_1 = []
i = 0
while i < len(gibson_assembly_4frag_1):
    stitched = stitch_fragments(pcr_products1[i],
                                pcr_products2[i])
    stitched_inserts_1.append(stitched)
    i = i+1
    
stitched_inserts_2 = []
i = 0
while i < len(gibson_assembly_4frag_1):
    stitched = stitch_fragments(stitched_inserts_1[i],
                                pcr_products3[i])
    stitched_inserts_2.append(stitched)
    i = i+1

#Manual assembly: gibson_assembly and digestion + joindna both failed for this plasmid (see the commented-out attempts below)
#cut_backbone = cutdna(pPBbsr_MCS, 2723, 3877) #index 1 is the backbone piece we want to keep
cut_backbone = cropdna(digested_backbones[0], 33, 5665)
assembled_plasmid = joindna(stitched_inserts_2[0], cut_backbone, product = 'pNM1325', topology = "circular", pn=pn5, pd=pd5)
assembled_plasmid
    
#digested_inserts = []
#assembled_plasmids = []
#i = 0
#while i < len(gibson_assembly_4frag_1):
#    re1 = gibson_assembly_4frag_1['re1'][i]
#    re2 = gibson_assembly_4frag_1['re2'][i]
#    digested_insert = double_digest_insert(stitched_inserts_2[i], re1, re2, pn4, pd4)
#    digested_inserts.append(digested_insert)
#    ligation = joindna(digested_inserts[i],
#                       digested_backbones[i],
#                       product = gibson_assembly_4frag_1['plasmid_id'][i],
#                       topology = "circular", pn=pn4, pd=pd4)
#    assembled_plasmids.append(ligation)
#    i = i+1
    
#assembled_plasmids = []
#Note: joindna inside the gibson_assembly function does not work here because it does not recognize the overlapping compatible ends (possibly because of ClaI).
#Digestion followed by joindna also failed, apparently because QUEEN cannot ligate the ClaI sticky end, which is only 2 bp long, so the assembly was done manually above.
#i = 0
#while i < len(gibson_assembly_4frag_1):
#    gibson = gibson_assembly(stitched_inserts_2[i],
#                             digested_backbones[i],
#                             assembly_name = gibson_assembly_4frag_1['plasmid_id'][i],
#                             pn=pn4, pd=pd4)
#    assembled_plasmids.append(gibson)
#    i = i+1

<queen.QUEEN object; project='pNM1325_13', length='11534 bp', topology='circular'>

In [970]:
#Step 6: Verify new plasmid sequence lengths (optional)

new_plasmids_known = []
i = 0
while i < len(gibson_assembly_4frag_1):
    new_plasmid_known = QUEEN(record = str(gibson_assembly_4frag_1['benchling_of_final_plasmid'][i]),
                                  dbtype = 'benchling',
                                  product = str(gibson_assembly_4frag_1['plasmid_id'][i]))
    new_plasmids_known.append(new_plasmid_known)
    i = i+1

#Step 5 built a single plasmid manually (assembled_plasmid), so compare it against row 0 only
if len(new_plasmids_known[0].seq) == len(assembled_plasmid.seq):
    print(gibson_assembly_4frag_1['plasmid_id'][0] + ' is the same length as its Benchling sequence.')
else:
    print(gibson_assembly_4frag_1['plasmid_id'][0] + ' is not the same length as its Benchling sequence')

pNM1325 is the same length as its Benchling sequence.


In [2383]:
#Step 7: Output QUEEN gbk file

annot_dict = {
              "version"   :"", 
              "accession" :"", 
              "organism"  :"Synthetic DNA construct",
              "molecule_type":"DNA",
              "topology": "circular"
              }

assembled_plasmid.record.description = "Plasmid was generated by QUEEN v1.2.0 (https://github.com/yachielab/QUEEN)"
assembled_plasmid.outputgbk(output = gibson_assembly_4frag_1['plasmid_id'][0]+ '.gbk', record_id = gibson_assembly_4frag_1['plasmid_id'][0], annotation=annot_dict)
print('Downloaded QUEEN gbk files.')

Downloaded QUEEN gbk files.
