# Integration of basicsynbio and DNA Chisel

This notebook explores integrating DNA Chisel into basicsynbio for linker design.

## Aims and objectives for the cells below

- [ ] Try out DNA Chisel with easy to implement constraints.
- [ ] Make a Bowtie 2 file for sequences that will be present in the basicsynbio PartLinkerCollections.
- [ ] Run DNA Chisel to generate the backbone linker for the Addgene collection.

In [1]:
from dnachisel import (
    AvoidMatches,
    AvoidPattern,
    DnaOptimizationProblem,
    EnforceGCContent,
    EnforceMeltingTemperature,
    EnzymeSitePattern,
    random_dna_sequence,
)

In [9]:
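# Starting point: fixed 5' (GGCTCG) and 3' (GTCC) flanks around 45 random bases (55 nt total).
linker_base_sequence = "GGCTCG" + random_dna_sequence(45, seed=10) + "GTCC"
constraints = (
    # Melting temperature window, applied only to the 13-34 region.
    EnforceMeltingTemperature(mini=55, maxi=65, location=(13, 34)),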
    AvoidPattern(EnzymeSitePattern("EcoRI")),
    AvoidPattern(EnzymeSitePattern("SpeI")),
    AvoidPattern(EnzymeSitePattern("XbaI")),
    AvoidPattern(EnzymeSitePattern("PstI")),
    AvoidPattern(EnzymeSitePattern("BsaI")),
    AvoidPattern(EnzymeSitePattern("BsmBI")),
    AvoidPattern("TTGACA"),  # E. coli sig70 -35 site
    AvoidPattern("TATAAT"),  # E. coli sig70 -10 site
    AvoidPattern("TTGNNNNNNNNNNNNNNNNNNNNTATNNT"),  # E. coli sig70 promoter weak consensus
    AvoidPattern("TGGCACGNNNNTTGC"),  # E. coli sig54 promoter consensus
    AvoidPattern("GAACTNNNNNNNNNNNNNNNNGTCNNA"),  # E. coli sig24 promoter consensus
    AvoidPattern("AAAGA"),  # RBS
    AvoidPattern("AGGAGG"),  # Shine-Dalgarno sequence or 2xArg bad codon
    AvoidPattern("ATG"),  # Start codon
    AvoidPattern("TTATNCACA"),  # DnaA binding sites
    AvoidPattern("TGTGANNNNNNTCACANT"),  # CAP binding sites
    AvoidPattern("NGCTNAGCN"),  # IS10 insertion site
    AvoidPattern("GGGNNNNNCCC"),  # IS231 insertion site
    AvoidPattern("(G{3,}[ATGC]{1,7}){3,}G{3,}"),  # G-quadruplex
    AvoidPattern("GGGG"),  # G-quadruplex
)
problem = DnaOptimizationProblem(
    sequence=linker_base_sequence,
    constraints=constraints
)
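# Mutate the sequence until every constraint passes, then report the result.
problem.resolve_constraints()
print(problem.constraints_text_summary())
# Print the sequence before and after constraint resolution for comparison.
print(linker_base_sequence)
print(problem.sequence)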





===> SUCCESS - all constraints evaluations pass
✔PASS ┍ EnforceMeltingTemperature[13-34]
      │ Tm = 57.8
✔PASS ┍ AvoidPattern[0-55](pattern:EcoRI(GAATTC))
      │ Passed. Pattern not found !
✔PASS ┍ AvoidPattern[0-55](pattern:SpeI(ACTAGT))
      │ Passed. Pattern not found !
✔PASS ┍ AvoidPattern[0-55](pattern:XbaI(TCTAGA))
      │ Passed. Pattern not fo
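
As an independent sanity check on the summary above, the sketch below scans a sequence for IUPAC-style patterns on both strands using only the standard library. It is an illustration of what a pattern-avoidance check amounts to, not DNA Chisel's implementation. The helper names (`iupac_to_regex`, `find_pattern`, `violations`) and the example sequence are hypothetical, the enzyme sites are the ones printed in the summary, and regex-style patterns such as the G-quadruplex expression are not handled.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase ACGT sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def iupac_to_regex(pattern):
    """Translate an IUPAC pattern such as 'TTATNCACA' into a regex string."""
    return "".join(IUPAC[base] for base in pattern)


def find_pattern(seq, pattern):
    """Return sorted (start, strand) hits of an IUPAC pattern on either strand.

    Starts are 0-based coordinates on the given (forward) sequence.
    """
    # A lookahead lets overlapping occurrences be reported as well.
    regex = re.compile("(?=" + iupac_to_regex(pattern) + ")")
    hits = {(m.start(), "+") for m in regex.finditer(seq)}
    rc = reverse_complement(seq)
    for m in regex.finditer(rc):
        # Convert a reverse-strand position back to forward coordinates.
        hits.add((len(seq) - m.start() - len(pattern), "-"))
    return sorted(hits)


def violations(seq, patterns):
    """Map each pattern name to its hits, keeping only patterns that occur."""
    found = {}
    for name, pattern in patterns.items():
        hits = find_pattern(seq, pattern)
        if hits:
            found[name] = hits
    return found


# Hypothetical subset of the patterns used in the cell above.
forbidden = {
    "EcoRI": "GAATTC",
    "SpeI": "ACTAGT",
    "XbaI": "TCTAGA",
    "start codon": "ATG",
    "DnaA box": "TTATNCACA",
}

# Made-up sequence for illustration; substitute problem.sequence in the notebook.
example = "GGCTCGGAATTCAAATGGTCC"
print(violations(example, forbidden))
```

An empty dictionary from `violations(problem.sequence, forbidden)` would agree with the PASS lines reported by DNA Chisel for the same patterns.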

In [11]:
help(AvoidMatches)

Help on class AvoidMatches in module dnachisel.builtin_specifications.AvoidMatches:

class AvoidMatches(dnachisel.Specification.Specification.Specification)
 |  AvoidMatches(match_length, bowtie_index=None, sequences=None, mismatches=0, location=None, boost=1)
 |  
 |  Enforce that the sequence has no matches longer than N in a given index.
 |  
 |  This specification can be used to ensure that a sequence has no matches
 |  with a given organism, or a set of sequences, which can be useful to
 |  create orthogonal sequences or primer-friendly regions.
 |  
 |  This specification uses Bowtie in the background and requires Bowtie
 |  installed on your machine (it can be as simple as ``apt install bowtie``
 |  on Ubuntu).
 |  
 |  It allows you to specify the ``match_length`` such that no subsegment of
 |  size match_length or more has any homology in the given bowtie index (which
 |  can be built from genomes using e.g. the genome_collector library). An
 |  homology can mean either perfec