# Design


## Environment

- conda
- docker
- colab

- Mount Google Drive so that it can be used as the Colab working directory
```python
from google.colab import drive
drive.mount('/content/drive')
```

- Change working directory

```python
import os
os.chdir('/content/drive/MyDrive/design_build')
```

```shell
!pwd
!wget https://media.addgene.org/snapgene-media/v1.7.9-0-g88a3305/sequences/222046/51c2cfab-a3b4-4d62-98df-0c77ec21164e/addgene-plasmid-50005-sequence-222046.gbk
```

In [1]:
!ls

README.md    data	   docs    index.qmd
_quarto.yml  design.ipynb  images  seq_analysis.ipynb


![](images/puc19.png){width=500px}

## Biopython

[Biopython](https://biopython.org/) is a collection of freely available Python tools for computational biology and bioinformatics.

In [4]:
from Bio import SeqIO
from pandas import DataFrame

records = SeqIO.read("data/parts/addgene-plasmid-50005-sequence-222046.gbk", "genbank")

features = []
for feature in records.features:
    features.append({
        "Label": feature.qualifiers.get("label", [""])[0],
        "Strand": feature.strand,
        "Start": feature.location.start,
        "End": feature.location.end, 
        "Type": feature.type
    })
print(features)
DataFrame(features)

[{'Label': '', 'Strand': 1, 'Start': ExactPosition(0), 'End': ExactPosition(2686), 'Type': 'source'}, {'Label': 'pBR322ori-F', 'Strand': 1, 'Start': ExactPosition(117), 'End': ExactPosition(137), 'Type': 'primer_bind'}, {'Label': 'L4440', 'Strand': 1, 'Start': ExactPosition(370), 'End': ExactPosition(388), 'Type': 'primer_bind'}, {'Label': 'CAP binding site', 'Strand': 1, 'Start': ExactPosition(504), 'End': ExactPosition(526), 'Type': 'protein_bind'}, {'Label': 'lac promoter', 'Strand': 1, 'Start': ExactPosition(540), 'End': ExactPosition(571), 'Type': 'promoter'}, {'Label': 'lac operator', 'Strand': 1, 'Start': ExactPosition(578), 'End': ExactPosition(595), 'Type': 'protein_bind'}, {'Label': 'M13/pUC Reverse', 'Strand': 1, 'Start': ExactPosition(583), 'End': ExactPosition(606), 'Type': 'primer_bind'}, {'Label': 'M13 rev', 'Strand': 1, 'Start': ExactPosition(602), 'End': ExactPosition(619), 'Type': 'primer_bind'}, {'Label': 'M13 Reverse', 'Strand': 1, 'Start': ExactPosition(602), 'End'



Unnamed: 0,Label,Strand,Start,End,Type
0,,1,0,2686,source
1,pBR322ori-F,1,117,137,primer_bind
2,L4440,1,370,388,primer_bind
3,CAP binding site,1,504,526,protein_bind
4,lac promoter,1,540,571,promoter
5,lac operator,1,578,595,protein_bind
6,M13/pUC Reverse,1,583,606,primer_bind
7,M13 rev,1,602,619,primer_bind
8,M13 Reverse,1,602,619,primer_bind
9,lacZ-alpha,1,614,938,CDS
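
The `Start` and `End` values in the table are 0-based, with `End` exclusive, so a feature's length is simply `End - Start`. The following is a minimal plain-Python sketch of that arithmetic, using three rows copied from the table; the helper names `feature_length` and `features_of_type` are made up for this illustration and are not part of Biopython.

```python
# Rows copied from the feature table above (a subset, for illustration only).
features = [
    {"Label": "pBR322ori-F", "Start": 117, "End": 137, "Type": "primer_bind"},
    {"Label": "lac promoter", "Start": 540, "End": 571, "Type": "promoter"},
    {"Label": "lacZ-alpha", "Start": 614, "End": 938, "Type": "CDS"},
]

def feature_length(row):
    """Length in bp; Start is 0-based and End is exclusive, so no +1 is needed."""
    return row["End"] - row["Start"]

def features_of_type(rows, feature_type):
    """Labels of all rows whose Type matches feature_type."""
    return [row["Label"] for row in rows if row["Type"] == feature_type]

lengths = {row["Label"]: feature_length(row) for row in features}
print(lengths)
print(features_of_type(features, "primer_bind"))
```

The lengths computed this way (20, 31 and 324 bp) agree with the `Len` column that pydna reports for the same plasmid later in this document.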


## Primers

[primers](https://github.com/Lattice-Automation/primers) is a Python package for designing PCR primers. It focuses on DNA assembly workflows such as Gibson Assembly and Golden Gate cloning, and it can design primers while adding extra sequence to their 5' ends.

In [5]:
from primers import create, primers
from pandas import DataFrame
from random import choices

myseq_list = choices(["A", "T", "G", "C"], k=100)
myseq = "".join(myseq_list)
print(myseq)

fwd, rev = create(myseq, add_fwd = "GGGG", add_rev = "TTTT")
# p1, p2 = primers(myseq, add_fwd = "GGGG", add_rev = "TTTT")
print(fwd)
print(rev)

## display in table form (the last field, the nested scoring object, is dropped)
DataFrame(list(fwd.dict().values())[:-1], index = list(fwd.dict().keys())[:-1])

## default argument values 
create

CGTGCCTCCATAAATAACTTGCAAGATTCTCACCATTCGAAGGTTCTCGACAAGGGGCGGGGGGTAAAAATAGCATTACTAGTTCGGATAAATCTGCCCT
Primer(seq='GGGGCGTGCCTCCATAAATAACTTG', len=25, tm=63.3, tm_total=70.6, gc=0.5, dg=0, fwd=True, off_target_count=0, scoring=Scoring(penalty=1.8, penalty_tm=1.3, penalty_tm_diff=0, penalty_gc=0.0, penalty_len=0.5, penalty_dg=0.0, penalty_off_target=0.0))
Primer(seq='TTTTAGGGCAGATTTATCCGAACTAGT', len=27, tm=63.4, tm_total=63.9, gc=0.4, dg=-0.37, fwd=False, off_target_count=0, scoring=Scoring(penalty=4.6, penalty_tm=1.4, penalty_tm_diff=0, penalty_gc=2.0, penalty_len=0.5, penalty_dg=0.7, penalty_off_target=0.0))


<function primers.primers.primers(seq: str, add_fwd: str = '', add_rev: str = '', add_fwd_len: Tuple[int, int] = (-1, -1), add_rev_len: Tuple[int, int] = (-1, -1), offtarget_check: str = '', optimal_tm: float = 62.0, optimal_gc: float = 0.5, optimal_len: int = 22, penalty_tm: float = 1.0, penalty_gc: float = 0.2, penalty_len: float = 0.5, penalty_tm_diff: float = 1.0, penalty_dg: float = 2.0, penalty_off_target: float = 20.0) -> Tuple[primers.primers.Primer, primers.primers.Primer]>
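
The two primers printed above can be reproduced by hand, which shows what `add_fwd` and `add_rev` do: the forward primer is the 5' tail followed by the start of the template, and the reverse primer is the 5' tail followed by the reverse complement of the end of the template. The sketch below is plain Python, not the `primers` algorithm; the annealing lengths (21 and 23) are simply read off the printed result, and `tailed_primers`, `fwd_len` and `rev_len` are hypothetical names.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Reverse complement of a DNA string written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def tailed_primers(template, fwd_len, rev_len, add_fwd="", add_rev=""):
    """Build a primer pair with fixed annealing lengths and optional 5' tails."""
    fwd = add_fwd + template[:fwd_len]
    rev = add_rev + reverse_complement(template[-rev_len:])
    return fwd, rev

# Random template printed by the cell above.
template = (
    "CGTGCCTCCATAAATAACTTGCAAGATTCTCACCATTCGAAGGTTCTCGA"
    "CAAGGGGCGGGGGGTAAAAATAGCATTACTAGTTCGGATAAATCTGCCCT"
)

# Annealing lengths 21 and 23 are taken from the printed primers (len minus tail).
fwd_seq, rev_seq = tailed_primers(template, 21, 23, add_fwd="GGGG", add_rev="TTTT")
print(fwd_seq)  # GGGGCGTGCCTCCATAAATAACTTG
print(rev_seq)  # TTTTAGGGCAGATTTATCCGAACTAGT
```

What the library adds on top of this is the search over candidate lengths: it scores each candidate pair and returns the one with the lowest penalty.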

- Default arguments and values

```plain
<function primers.primers.primers(seq: str, add_fwd: str = '', add_rev: str = '', add_fwd_len: Tuple[int, int] = (-1, -1), add_rev_len: Tuple[int, int] = (-1, -1), offtarget_check: str = '', optimal_tm: float = 62.0, optimal_gc: float = 0.5, optimal_len: int = 22, penalty_tm: float = 1.0, penalty_gc: float = 0.2, penalty_len: float = 0.5, penalty_tm_diff: float = 1.0, penalty_dg: float = 2.0, penalty_off_target: float = 20.0) -> Tuple[primers.primers.Primer, primers.primers.Primer]>
```
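
The `Scoring` values printed with each primer appear to follow directly from these defaults. The sketch below is an inference from the printed numbers, not code taken from the library: each term is the distance from the optimum multiplied by its penalty weight, GC is treated as a percentage, and the length term uses the annealing length (without the 5' tail). The off-target and Tm-difference terms are left out because both are zero in the printed examples. `primer_penalty` is a hypothetical helper.

```python
def primer_penalty(tm, gc, anneal_len, dg,
                   optimal_tm=62.0, optimal_gc=0.5, optimal_len=22,
                   penalty_tm=1.0, penalty_gc=0.2, penalty_len=0.5,
                   penalty_dg=2.0):
    """Recompute the penalty terms from the printed primer properties."""
    parts = {
        "tm": abs(tm - optimal_tm) * penalty_tm,
        "gc": abs(gc - optimal_gc) * 100 * penalty_gc,   # GC as a percentage
        "len": abs(anneal_len - optimal_len) * penalty_len,
        "dg": abs(dg) * penalty_dg,
    }
    parts["total"] = sum(parts.values())
    return {name: round(value, 1) for name, value in parts.items()}

# Values copied from the two Primer(...) lines printed above.
fwd_penalty = primer_penalty(tm=63.3, gc=0.5, anneal_len=21, dg=0)
rev_penalty = primer_penalty(tm=63.4, gc=0.4, anneal_len=23, dg=-0.37)
print(fwd_penalty)  # total 1.8, as printed by the library
print(rev_penalty)  # total 4.6, as printed by the library
```

Under these assumptions the totals match the printed `penalty=1.8` and `penalty=4.6`, which suggests that a lower penalty simply means a primer closer to the optimal Tm, GC content and length.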

- Off-target check

In [6]:
from primers import create
from random import choices

def print_primer_info(x):
    from pandas import DataFrame
    df = DataFrame(list(x.dict().values())[:-1], index = list(x.dict().keys())[:-1])
    print(df)

primer_binding_seq = "GTCATATGCATTCGATGCGTTAGG"
rnd_seq1 = "".join(choices(["A", "T", "G", "C"], k=100))
rnd_seq2 = "".join(choices(["A", "T", "G", "C"], k=100))

myseq = primer_binding_seq+rnd_seq1
print(myseq)
len(myseq)

fwd, rev = create(myseq)
print_primer_info(fwd)

## primer considering offtargets
myseq2 = primer_binding_seq+rnd_seq1+primer_binding_seq+rnd_seq2
fwd2, rev = create(myseq2)
print_primer_info(fwd2)

## optimal_tm = 62 is already the default, so this call returns the same primer
fwd2, rev = create(myseq2, optimal_tm = 62)
print_primer_info(fwd2)

GTCATATGCATTCGATGCGTTAGGGACACCCATGGCAACATGTGGATATAACTCGGTGCTGAGGAAAACTTCATACGCTCTTGACGTTCTATGCATGAAGCCCTTTCAACGCACGCTTCATTCA
                                     0
seq               GTCATATGCATTCGATGCGT
len                                 20
tm                                62.5
tm_total                          62.5
gc                                 0.5
dg                               -0.56
fwd                               True
off_target_count                     0
                                           0
seq               GTCATATGCATTCGATGCGTTAGGGA
len                                       26
tm                                      68.0
tm_total                                68.0
gc                                       0.5
dg                                     -0.56
fwd                                     True
off_target_count                           0
                                           0
seq               GTCATATGCATTCGATGCGTTAGGGA
len                          
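
The idea behind the off-target check can be sketched in plain Python: when the binding site occurs twice in the template, a short primer has two places to anneal, and extending it into the unique flanking sequence removes the ambiguity. This is a simplification and not the library's algorithm: it counts only exact matches of the last 10 bases, and the filler sequences are fixed, hypothetical stand-ins for the random ones used above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def count_binding_sites(primer, template, seed_len=10):
    """Count exact matches of the primer's 3' seed on both strands."""
    seed = primer[-seed_len:]
    count = 0
    for strand in (template, reverse_complement(template)):
        pos = strand.find(seed)
        while pos != -1:
            count += 1
            pos = strand.find(seed, pos + 1)
    return count

binding = "GTCATATGCATTCGATGCGTTAGG"   # primer_binding_seq from the cell above
filler1 = "ACGT" * 25                   # hypothetical stand-in for rnd_seq1
filler2 = "TGCA" * 25                   # hypothetical stand-in for rnd_seq2

single = binding + filler1
double = binding + filler1 + binding + filler2

short_primer = binding[:20]             # ends inside the repeated site
long_primer = binding + filler1[:2]     # extends into the unique flank

print(count_binding_sites(short_primer, single))  # 1
print(count_binding_sites(short_primer, double))  # 2
print(count_binding_sites(long_primer, double))   # 1
```

This mirrors the result above, where the primer chosen for the duplicated template is longer (26 nt instead of 20 nt) and still reports `off_target_count` of 0.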

## pydna

- [pydna](https://github.com/BjornFJohansson/pydna): the pydna Python package provides human-readable formal descriptions of 🧬 cloning and genetic assembly strategies in Python 🐍, which allows them to be simulated and verified.

In [7]:
from pydna.dseqrecord import Dseqrecord

dsr = Dseqrecord("ATGCGTTGC")
dsr.figure()

Dseqrecord(-9)
ATGCGTTGC
TACGCAACG
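
The figure shows the top strand written 5' to 3' and, beneath it, the base-paired bottom strand written 3' to 5'. As a minimal sketch independent of pydna, the same two lines can be derived with a translation table (`duplex` is a hypothetical helper name):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def duplex(top):
    """Return the top strand (5'->3') and the paired bottom strand (3'->5')."""
    return top, top.translate(COMPLEMENT)

top, bottom = duplex("ATGCGTTGC")
print(top)           # ATGCGTTGC
print(bottom)        # TACGCAACG
print(bottom[::-1])  # GCAACGCAT, the bottom strand read 5'->3'
```

Reading the bottom line backwards gives the reverse complement, which is the form in which reverse primers are written.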

In [9]:
from pydna.readers import read

p = read("data/parts/addgene-plasmid-50005-sequence-222046.gbk")
p.list_features()



Ft#,Label or Note,Dir,Sta,End,Len,type,orf?
0,nd,-->,0,2686,2686,source,no
1,L:pBR322ori-F,-->,117,137,20,primer_bind,no
2,L:L4440,-->,370,388,18,primer_bind,no
3,L:CAP binding si,-->,504,526,22,protein_bind,no
4,L:lac promoter,-->,540,571,31,promoter,no
5,L:lac operator,-->,578,595,17,protein_bind,no
6,L:M13/pUC Revers,-->,583,606,23,primer_bind,no
7,L:M13 rev,-->,602,619,17,primer_bind,no
8,L:M13 Reverse,-->,602,619,17,primer_bind,no
9,L:lacZ-alpha,-->,614,938,324,CDS,yes


In [123]:
extracted_site = p.extract_feature(10)
extracted_site.seq

Dseq(-57)
AAGC..ATTC
TTCG..TAAG

## Parts

- pUC19 from Addgene
- Remove BsaI site
- Insert a part into MCS

- Hinz, Aaron J., Benjamin Stenzler, and Alexandre J. Poulain. "Golden gate assembly of aerobic and anaerobic microbial bioreporters." Applied and environmental microbiology 88.1 (2022): e01485-21.

![plasmid pUC19](images/pUC19.png){width=500px}

![bsaI replacement](images/amp_bsaI1.png){height=200px}
![bsaI replacement2](images/amp_bsaI2.png){height=200px}

### List of parts

- pUC19-J23100.gb
- pUC19-RB0030.gb
- pUC19-L2U3H03.gb
- pUC19-egfp.gb

![](images/puc19_egfp.png){width=500px}

## Feature amplification

### Primer design


In [148]:
from pydna.readers import read
from primers import create

egfp = read("data/parts/pUC19-egfp.gb")

egfp_feature_list = egfp.list_features()
display(egfp_feature_list)

myseq = egfp.extract_feature(13)
str(myseq.seq)

fwd, rev = create(str(myseq.seq), add_fwd = "GGGG", add_rev = "TTTT")
print(fwd)

'LOCUS       pUC19-egfp        3371 bp DNA     circular SYN 18-MAY-2024\n'
Found locus 'pUC19-egfp' size '3371' residue_type 'DNA'
Some fields may be wrong.


Ft#,Label or Note,Dir,Sta,End,Len,type,orf?
0,nd,-->,0,3371,3371,source,no
1,L:pBR322ori-F,-->,117,137,20,primer_bind,no
2,L:L4440,-->,370,388,18,primer_bind,no
3,L:CAP binding si,-->,504,526,22,protein_bind,no
4,L:lac promoter,-->,540,571,31,promoter,no
5,L:lac operator,-->,578,595,17,protein_bind,no
6,L:M13/pUC Revers,-->,583,606,23,primer_bind,no
7,L:M13 rev,-->,602,619,17,primer_bind,no
8,L:M13 Reverse,-->,602,619,17,primer_bind,no
9,L:lacZ-alpha,-->,614,1623,1009,CDS,no


Primer(seq='GGGGATGGTGAGCAAGGGCG', len=20, tm=65.1, tm_total=72.1, gc=0.7, dg=0, fwd=True, off_target_count=0, scoring=Scoring(penalty=10.1, penalty_tm=3.1, penalty_tm_diff=0, penalty_gc=4.0, penalty_len=3.0, penalty_dg=0.0, penalty_off_target=0.0))


### PCR 

In [149]:
from pydna.all import pcr

pcr_product = pcr(fwd.seq, rev.seq, egfp.extract_feature(13))
pcr_product.figure()


    5ATGGTGAGCAAGGGCG...CATGGACGAGCTGTACAAGTAA3
                        ||||||||||||||||||||||
                       3GTACCTGCTCGACATGTTCATTTTTT5
5GGGGATGGTGAGCAAGGGCG3
     ||||||||||||||||
    3TACCACTCGTTCCCGC...GTACCTGCTCGACATGTTCATT5
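
The figure shows that the 5' tails do not anneal to the template but still end up in the product. A rough plain-Python sketch of that logic follows; it is not how pydna implements `pcr`. It assumes exact matching, a single binding site per primer, and a hypothetical minimum annealing length of 12 nt. The template is a shortened stand-in built from the first and last bases of the EGFP sequence shown above, and the primers are the ones from the figure.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def anneal(primer, strand, min_len=12):
    """Find the longest 3' part of primer present in strand; return (tail, start)."""
    for n in range(len(primer), min_len - 1, -1):
        start = strand.find(primer[-n:])
        if start != -1:
            return primer[:len(primer) - n], start
    raise ValueError("primer does not anneal")

def simulate_pcr(fwd, rev, template, min_len=12):
    """Top strand of the product: fwd tail + amplified region + rc(rev tail)."""
    fwd_tail, fwd_start = anneal(fwd, template, min_len)
    # The reverse primer anneals to the top strand, so search the other strand.
    rev_tail, rev_start = anneal(rev, reverse_complement(template), min_len)
    end = len(template) - rev_start   # convert back to top-strand coordinates
    if end <= fwd_start:
        raise ValueError("primers do not face each other")
    return fwd_tail + template[fwd_start:end] + reverse_complement(rev_tail)

# Shortened stand-in template: start and end of the EGFP CDS only.
template = "ATGGTGAGCAAGGGCGAGGAGCTGTTCACC" + "GGCATGGACGAGCTGTACAAGTAA"
fwd = "GGGGATGGTGAGCAAGGGCG"            # forward primer from the figure
rev = "TTTTTTACTTGTACAGCTCGTCCATG"      # reverse primer from the figure, 5'->3'

product = simulate_pcr(fwd, rev, template)
print(product[:10], product[-10:])
```

The product starts with `GGGGATGGTG` and ends with `AAGTAAAAAA`: the `GGGG` tail is added as is, while the `TTTT` tail of the reverse primer appears as `AAAA` on the top strand.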

## Golden Gate Assembly I

- Target sequence including overhang and BsaI site
- Generate primers
- PCR the target sequence

In [152]:
from primers import create
from pydna.all import pcr

frag1 = egfp.extract_feature(10)
frag2 = egfp.extract_feature(12)
frag3 = egfp.extract_feature(13)
frag4 = egfp.extract_feature(18)
frag5 = egfp.extract_feature(19)

targetseq = frag1+frag2+frag3+frag4+frag5
fwd, rev = create(str(targetseq.seq))

pcr_product = pcr(fwd.seq, rev.seq, targetseq)
pcr_product.figure()


5GGTCTCAGTCAATGGTGA...TACAAGTAAGGGATGAGACC3
                      ||||||||||||||||||||
                     3ATGTTCATTCCCTACTCTGG5
5GGTCTCAGTCAATGGTGA3
 ||||||||||||||||||
3CCAGAGTCAGTTACCACT...ATGTTCATTCCCTACTCTGG5

### Enzyme cutting 

In [153]:
from Bio.Restriction import BsaI

cut_product = pcr_product.cut(BsaI)
print(len(cut_product))
display(cut_product[0].figure())
print()
display(cut_product[1].figure())
print()
display(cut_product[2].figure())

3


Dseqrecord(-11)
GGTCTCA
CCAGAGTCAGT




Dseqrecord(-728)
GTCAATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAA
    TACCACTCGTTCCCGCTCCTCGACAAGTGGCCCCACCACGGGTAGGACCAGCTCGACCTGCCGCTGCATTTGCCGGTGTTCAAGTCGCACAGGCCGCTCCCGCTCCCGCTACGGTGGATGCCGTTCGACTGGGACTTCAAGTAGACGTGGTGGCCGTTCGACGGGCACGGGACCGGGTGGGAGCACTGGTGGGACTGGATGCCGCACGTCACGAAGTCGGCGATGGGGCTGGTGTA




Dseqrecord(-11)
GGGATGAGACC
    ACTCTGG
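
The three fragments follow from how BsaI cuts: it recognises `GGTCTC` and cuts 1 nt downstream on that strand and 5 nt downstream on the opposite strand, leaving a 4-nt 5' overhang. The sketch below works on the top strand only and assumes one forward site upstream and one reverse site (`GAGACC`) downstream of the insert. It is a plain-Python illustration rather than the Biopython or pydna implementation, and `bsai_release` is a hypothetical name. The test sequence keeps the flanks of the PCR product above but shortens the EGFP body.

```python
SITE_FWD = "GGTCTC"   # BsaI site as read on the top strand
SITE_REV = "GAGACC"   # the same site when it lies on the bottom strand

def bsai_release(top):
    """Return (left_overhang, top_strand, right_overhang) of the released insert."""
    i = top.find(SITE_FWD)
    j = top.rfind(SITE_REV)
    if i == -1 or j == -1:
        raise ValueError("need a forward and a reverse BsaI site")
    left_cut = i + len(SITE_FWD) + 1   # top-strand cut, 1 nt past the site
    right_cut = j - 5                  # top-strand cut, 5 nt before the reverse site
    if right_cut < left_cut + 4:
        raise ValueError("sites are too close together")
    left_overhang = top[left_cut:left_cut + 4]
    right_overhang = top[right_cut:right_cut + 4]
    return left_overhang, top[left_cut:right_cut], right_overhang

# Flanks as in the PCR product above; the EGFP body is shortened for illustration.
pcr_top = "GGTCTCA" + "GTCA" + "ATGGTGAGCAAGTAA" + "GGGA" + "T" + "GAGACC"
left, insert_top, right = bsai_release(pcr_top)
print(left, insert_top, right)
```

The overhangs `GTCA` and `GGGA` are the ones visible at the ends of the 728-bp fragment above. The recognition sites themselves stay on the two 11-bp end fragments, which is why the released insert can no longer be cut.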

### Assembling the fragments back together

In [140]:
from pydna.assembly import Assembly
from pydna.common_sub_strings import terminal_overlap

asm = Assembly((cut_product[0], cut_product[1], cut_product[2]), algorithm = terminal_overlap, limit = 4)

candidate = asm.assemble_linear()

print(candidate[0])

candidate[0].figure()   

Dseqrecord
circular: False
size: 742
ID: id
Name: name
Description: description
Number of features: 13
/molecule_type=DNA
Dseq(-742)
GGTC..GACC
CCAG..CTGG


name| 4
     \/
     /\
      4|name| 4
             \/
             /\
              4|name

## Golden Gate Assembly II

### Find positions of a specific feature label

In [34]:
from pydna.readers import read

## read the parts
egfp = read("data/parts/pUC19-egfp.gb")
promoter = read("data/parts/pUC19-J23100.gb")
terminator = read("data/parts/pUC19-L2U3H03.gb")
rbs = read("data/parts/pUC19-RB0030.gb")

display(promoter.list_features())
display(rbs.list_features())
display(egfp.list_features())
display(terminator.list_features())


'LOCUS       pUC19-egfp        3371 bp DNA     circular SYN 18-MAY-2024\n'
Found locus 'pUC19-egfp' size '3371' residue_type 'DNA'
Some fields may be wrong.
'LOCUS       pUC19-J23100        2686 bp DNA     circular SYN 01-JUN-2024\n'
Found locus 'pUC19-J23100' size '2686' residue_type 'DNA'
Some fields may be wrong.
'LOCUS       pUC19-L2U3H03        2688 bp DNA     circular SYN 01-JUN-2024\n'
Found locus 'pUC19-L2U3H03' size '2688' residue_type 'DNA'
Some fields may be wrong.
'LOCUS       pUC19-RB0030        2750 bp DNA     circular SYN 01-JUN-2024\n'
Found locus 'pUC19-RB0030' size '2750' residue_type 'DNA'
Some fields may be wrong.


Ft#,Label or Note,Dir,Sta,End,Len,type,orf?
0,nd,-->,0,2686,2686,source,no
1,L:pBR322ori-F,-->,117,137,20,primer_bind,no
2,L:L4440,-->,370,388,18,primer_bind,no
3,L:CAP binding si,-->,504,526,22,protein_bind,no
4,L:lac promoter,-->,540,571,31,promoter,no
5,L:lac operator,-->,578,595,17,protein_bind,no
6,L:M13/pUC Revers,-->,583,606,23,primer_bind,no
7,L:M13 rev,-->,602,619,17,primer_bind,no
8,L:M13 Reverse,-->,602,619,17,primer_bind,no
9,L:lacZ-alpha,-->,614,938,324,CDS,no


Ft#,Label or Note,Dir,Sta,End,Len,type,orf?
0,nd,-->,0,2750,2750,source,no
1,L:pBR322ori-F,-->,117,137,20,primer_bind,no
2,L:L4440,-->,370,388,18,primer_bind,no
3,L:CAP binding si,-->,504,526,22,protein_bind,no
4,L:lac promoter,-->,540,571,31,promoter,no
5,L:lac operator,-->,578,595,17,protein_bind,no
6,L:M13/pUC Revers,-->,583,606,23,primer_bind,no
7,L:M13 rev,-->,602,619,17,primer_bind,no
8,L:M13 Reverse,-->,602,619,17,primer_bind,no
9,L:BsaI-F,-->,631,638,7,misc_feature,no


Ft#,Label or Note,Dir,Sta,End,Len,type,orf?
0,nd,-->,0,3371,3371,source,no
1,L:pBR322ori-F,-->,117,137,20,primer_bind,no
2,L:L4440,-->,370,388,18,primer_bind,no
3,L:CAP binding si,-->,504,526,22,protein_bind,no
4,L:lac promoter,-->,540,571,31,promoter,no
5,L:lac operator,-->,578,595,17,protein_bind,no
6,L:M13/pUC Revers,-->,583,606,23,primer_bind,no
7,L:M13 rev,-->,602,619,17,primer_bind,no
8,L:M13 Reverse,-->,602,619,17,primer_bind,no
9,L:lacZ-alpha,-->,614,1623,1009,CDS,no


Ft#,Label or Note,Dir,Sta,End,Len,type,orf?
0,nd,-->,0,2688,2688,source,no
1,L:pBR322ori-F,-->,117,137,20,primer_bind,no
2,L:L4440,-->,370,388,18,primer_bind,no
3,L:CAP binding si,-->,504,526,22,protein_bind,no
4,L:lac promoter,-->,540,571,31,promoter,no
5,L:lac operator,-->,578,595,17,protein_bind,no
6,L:M13/pUC Revers,-->,583,606,23,primer_bind,no
7,L:M13 rev,-->,602,619,17,primer_bind,no
8,L:M13 Reverse,-->,602,619,17,primer_bind,no
9,L:BsaI-F,-->,631,638,7,misc_feature,no


- Define helper functions that search for a feature by label and return its position

In [35]:
## search features by label and return the location
def get_positions_by_label(record, label):
    """Return start/end of the feature labelled `label` (None if not found)."""
    pos = {'start': None, 'end': None}
    for feature in record.features:
        # qualifier values are stored as lists, e.g. {'label': ['BsaI-F']}
        if feature.qualifiers.get('label') == [label]:
            pos['start'] = int(feature.location.start)
            pos['end'] = int(feature.location.end)
    return pos

def get_record_between_labels(record, label1, label2):
    """Slice `record` from the start of label1 to the end of label2."""
    start = get_positions_by_label(record, label1)['start']
    end = get_positions_by_label(record, label2)['end']
    if start is None or end is None:
        print("label not found")
        return None
    return record[start:end]

target_record = get_record_between_labels(egfp, "BsaI-F", "BsaI-R")
# print("pos:", target_record)
display(target_record.figure())
target_record.seq

Dseqrecord(-742)
GGTCTCAGTCAATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAAGGGATGAGACC
CCAGAGTCAGTTACCACTCGTTCCCGCTCCTCGACAAGTGGCCCCACCACGGGTAGGACCAGCTCGACCTGCCGCTGCATTTGCCGGTGTTCAAGTCGCACAGGCCGCTCCCGCTCCCGCTACGGTGGATGCCGTTCGACTGGGACTTCAAGTAGACGTGGTGGCCGTTCGACGGGCACGGGACCGGGTGGGAGCACTGGTGGGACTGGATGCCGCACGTCACGAA

Dseq(-742)
GGTC..GACC
CCAG..CTGG

### Part preparation by PCR

- Amplify each part from its "BsaI-F" label to its "BsaI-R" label

In [67]:
from pydna.readers import read
from primers import create, primers
from pydna.all import pcr

label1 = "BsaI-F"
label2 = "BsaI-R"

promoter_target = get_record_between_labels(promoter, label1, label2)
rbs_target = get_record_between_labels(rbs, label1, label2)
egfp_target = get_record_between_labels(egfp, label1, label2)
terminator_target = get_record_between_labels(terminator, label1, label2)

fwd, rev = create(str(promoter_target.seq))
promoter_pcr_product = pcr(fwd.seq, rev.seq, promoter)
display(promoter_pcr_product.figure())

fwd, rev = create(str(rbs_target.seq))
rbs_pcr_product = pcr(fwd.seq, rev.seq, rbs)
display(rbs_pcr_product.figure())

fwd, rev = create(str(egfp_target.seq))
egfp_pcr_product = pcr(fwd.seq, rev.seq, egfp)
display(egfp_pcr_product.figure())

fwd, rev = create(str(terminator_target.seq))
terminator_pcr_product = pcr(fwd.seq, rev.seq, terminator)
display(terminator_pcr_product.figure())


5GGTCTCAAAGCTTGACG...CTAGCCTCCAGAGACC3
                     ||||||||||||||||
                    3GATCGGAGGTCTCTGG5
5GGTCTCAAAGCTTGACG3
 |||||||||||||||||
3CCAGAGTTTCGAACTGC...GATCGGAGGTCTCTGG5

5GGTCTCACTCCAGCTG...GAGGAGAAATAGTCATGAGACC3
                    ||||||||||||||||||||||
                   3CTCCTCTTTATCAGTACTCTGG5
5GGTCTCACTCCAGCTG3
 ||||||||||||||||
3CCAGAGTGAGGTCGAC...CTCCTCTTTATCAGTACTCTGG5

5GGTCTCAGTCAATGGTGA...TACAAGTAAGGGATGAGACC3
                      ||||||||||||||||||||
                     3ATGTTCATTCCCTACTCTGG5
5GGTCTCAGTCAATGGTGA3
 ||||||||||||||||||
3CCAGAGTCAGTTACCACT...ATGTTCATTCCCTACTCTGG5

5GGTCTCAGGGATAGCG...TTGTTGAGCGAATGAGACC3
                    |||||||||||||||||||
                   3AACAACTCGCTTACTCTGG5
5GGTCTCAGGGATAGCG3
 ||||||||||||||||
3CCAGAGTCCCTATCGC...AACAACTCGCTTACTCTGG5

### Enzyme cut and assembly

In [69]:
from Bio.Restriction import BsaI
from pydna.assembly import Assembly
from pydna.common_sub_strings import terminal_overlap

promoter_cut_product = promoter_pcr_product.cut(BsaI)
print(len(promoter_cut_product))
promoter_cut_product[1].name = "promoter"
display(promoter_cut_product[1].figure())

rbs_cut_product = rbs_pcr_product.cut(BsaI)
print(len(rbs_cut_product))
rbs_cut_product[1].name = "rbs"
display(rbs_cut_product[1].figure())

egfp_cut_product = egfp_pcr_product.cut(BsaI)
print(len(egfp_cut_product))
egfp_cut_product[1].name = "egfp"
display(egfp_cut_product[1].figure())

terminator_cut_product = terminator_pcr_product.cut(BsaI)
print(len(terminator_cut_product))
terminator_cut_product[1].name = "terminator"
display(terminator_cut_product[1].figure())


asm = Assembly((promoter_cut_product[1], rbs_cut_product[1], egfp_cut_product[1], terminator_cut_product[1]), algorithm = terminal_overlap, limit = 4)

candidate = asm.assemble_linear()
print(len(candidate))

candidate[0].figure()   



3


Dseqrecord(-43)
AAGCTTGACGGCTAGCTCAGTCCTAGGTACAGTGCTAGC
    AACTGCCGATCGAGTCAGGATCCATGTCACGATCGGAGG

3


Dseqrecord(-107)
CTCCAGCTGTCACCGGATGTGCTTTCCGGTCTGATGAGTCCGTGAGGACGAAACAGCCTCTACAAATAATTTTGTTTAATCTAGAGATTAAAGAGGAGAAATA
    TCGACAGTGGCCTACACGAAAGGCCAGACTACTCAGGCACTCCTGCTTTGTCGGAGATGTTTATTAAAACAAATTAGATCTCTAATTTCTCCTCTTTATCAGT

3


Dseqrecord(-728)
GTCAATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGGCGATGCCACCTACGGCAAGCTGACCCTGAAGTTCATCTGCACCACCGGCAAGCTGCCCGTGCCCTGGCCCACCCTCGTGACCACCCTGACCTACGGCGTGCAGTGCTTCAGCCGCTACCCCGACCACATGAAGCAGCACGACTTCTTCAAGTCCGCCATGCCCGAAGGCTACGTCCAGGAGCGCACCATCTTCTTCAAGGACGACGGCAACTACAAGACCCGCGCCGAGGTGAAGTTCGAGGGCGACACCCTGGTGAACCGCATCGAGCTGAAGGGCATCGACTTCAAGGAGGACGGCAACATCCTGGGGCACAAGCTGGAGTACAACTACAACAGCCACAACGTCTATATCATGGCCGACAAGCAGAAGAACGGCATCAAGGTGAACTTCAAGATCCGCCACAACATCGAGGACGGCAGCGTGCAGCTCGCCGACCACTACCAGCAGAACACCCCCATCGGCGACGGCCCCGTGCTGCTGCCCGACAACCACTACCTGAGCACCCAGTCCGCCCTGAGCAAAGACCCCAACGAGAAGCGCGATCACATGGTCCTGCTGGAGTTCGTGACCGCCGCCGGGATCACTCTCGGCATGGACGAGCTGTACAAGTAA
    TACCACTCGTTCCCGCTCCTCGACAAGTGGCCCCACCACGGGTAGGACCAGCTCGACCTGCCGCTGCATTTGCCGGTGTTCAAGTCGCACAGGCCGCTCCCGCTCCCGCTACGGTGGATGCCGTTCGACTGGGACTTCAAGTAGACGTGGTGGCCGTTCGACGGGCACGGGACCGGGTGGGAGCACTGGTGGGACTGGATGCCGCACGTCACGAAGTCGGCGATGGGGCTGGTGTA

3


Dseqrecord(-45)
GGGATAGCGTGACCGGCGCATCGGTCACGCTATTTGTTGAG
    ATCGCACTGGCCGCGTAGCCAGTGCGATAAACAACTCGCTT

1


promoter| 4
         \/
         /\
          4|rbs| 4
                \/
                /\
                 4|egfp| 4
                        \/
                        /\
                         4|terminator
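
The assembly figure shows each junction sharing a 4-nt overlap. Because every junction uses a different overhang, the order of the parts is fully determined by the overhangs alone. The sketch below illustrates this with the overhangs read from the cut products and PCR products above (written as top-strand sequence); it is a plain-Python illustration, not the pydna `Assembly` algorithm, and `order_by_overhangs` is a hypothetical helper.

```python
def order_by_overhangs(parts, first):
    """Chain parts so that each right overhang equals the next left overhang."""
    by_left = {left: name for name, (left, right) in parts.items()}
    order = [first]
    while True:
        right = parts[order[-1]][1]
        nxt = by_left.get(right)
        if nxt is None or nxt in order:   # end of chain, or a loop
            break
        order.append(nxt)
    return order

# name -> (left overhang, right overhang), deliberately listed out of order
parts = {
    "terminator": ("GGGA", "CGAA"),
    "egfp": ("GTCA", "GGGA"),
    "promoter": ("AAGC", "CTCC"),
    "rbs": ("CTCC", "GTCA"),
}

order = order_by_overhangs(parts, first="promoter")
print(order)  # ['promoter', 'rbs', 'egfp', 'terminator']
```

This is the design principle of Golden Gate assembly: the input order does not matter, as long as each overhang pair is unique.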

### Assembly with wrong fragments

In [76]:
asm = Assembly((promoter_cut_product[0], rbs_cut_product[1], egfp_cut_product[1], terminator_cut_product[1]), algorithm = terminal_overlap, limit = 4)

candidate = asm.assemble_linear()
print(len(candidate))


0
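
No product is found because `promoter_cut_product[0]` is the short end fragment that carries the BsaI site, not the promoter itself. Its only sticky end is `AAGC`, which does not match the `CTCC` overhang at the start of the RBS fragment. A minimal sketch of such a junction check, with overhangs read from the figures above and a hypothetical helper `first_bad_junction`:

```python
def first_bad_junction(fragments):
    """Return (index, right, left) for the first mismatching junction, else None."""
    for i in range(len(fragments) - 1):
        right = fragments[i][2]
        left = fragments[i + 1][1]
        if right != left:
            return i, right, left
    return None

# (name, left overhang, right overhang); "" marks a blunt end
good = [
    ("promoter", "AAGC", "CTCC"),
    ("rbs", "CTCC", "GTCA"),
    ("egfp", "GTCA", "GGGA"),
    ("terminator", "GGGA", "CGAA"),
]
# Same list, but with the BsaI-site end fragment in place of the promoter.
wrong = [("promoter end fragment", "", "AAGC")] + good[1:]

print(first_bad_junction(good))   # None
print(first_bad_junction(wrong))  # (0, 'AAGC', 'CTCC')
```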
