# SeqFlipper

This notebook describes how to flip sections of assemblies:

  ![alt text](SeqFlipper.png "SeqFlipper")

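This is useful when we already have the parts for a section but need them in the opposite orientation. We define an **overhang** as the single-stranded sequence that has no complement, read in the 5'->3' direction. A **negative overhang** is the sequence missing from a strand (also read in the 5'->3' direction), opposite the protruding complementary strand. In the example below, CTAC is an overhang and gctc is the negative overhang opposite CGAG: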

    Overhang          (gctc) Negative overhang
      CTAC.............
          .............CGAG
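
To make the two definitions concrete, here is a minimal sketch that recovers the negative overhang from the protruding bottom strand in the diagram. The helper names `complement` and `negative_overhang` are illustrative and not part of SeqFlipper, and the bottom strand is assumed to be written left to right as drawn (3'->5').

```python
# Illustrative helpers; not part of the SeqFlipper module.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq):
    """Return the base-by-base complement, keeping the written order."""
    return seq.translate(COMPLEMENT)


def negative_overhang(bottom_strand_3to5):
    """Sequence missing from the top strand, read 5'->3'.

    The bottom strand is given as drawn in the diagram (left to right,
    i.e. 3'->5'), so its complement is already in 5'->3' order.
    """
    return complement(bottom_strand_3to5)


print(negative_overhang("CGAG"))  # GCTC
```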

### Explanation

The original sequence is put together in the following way:

    -->         -->                   -->
    ...     CTAC.............     GCTC...
    ...GATG     .............CGAG     ...


The arrows show the orientation of the sense sequence of each section.
What should we put in place of '____' so that the middle section is flipped?

    -->                               -->
    ...     ____.............     GCTC...
    ...GATG     .............____     ...

Answer: the following overhangs will flip the section:

    -->      (1)  -->                     -->
    ...       GAGC.............       GCTC...
    ...GATG       .............CATC       ...
         (2)                  (2)       (1)



Result:

    -->                     -->
    ...CTAC.............GCTC...
    ...GATG.............CGAG...
                     <--


**Summary** of modifications for the middle, flipped, part:

* Left overhang (GAGC): put the reverse complement of the *next* section's overhang (as read 5'->3')
* Right side (gtag): put the reverse complement of the *previous* section's negative overhang (onto the 5'->3' strand)

```
          -->
5'   GAGC.............gtag   3'

3'   ____.............CATC   5'
```
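
The two rules in the summary can be sketched in a few lines of Python. This is an illustration rather than the SeqFlipper implementation: the function names are hypothetical, and each end of a part is represented by its 4-nt junction sequence read 5'->3' on the top strand (so CTAC and GCTC for the original middle part).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (case is normalised to upper)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def overhangs_for_flipped_part(previous_negative_overhang, next_overhang):
    """Return (left, right) junction sequences for the part to be flipped.

    Hypothetical helper that applies the two summary rules.
    """
    left = reverse_complement(next_overhang)
    right = reverse_complement(previous_negative_overhang)
    return left, right


def rotate_part(left, right):
    """Junction sequences of the same part after turning it by 180 degrees."""
    return reverse_complement(right), reverse_complement(left)


left, right = overhangs_for_flipped_part("CTAC", "GCTC")
print(left, right)  # GAGC GTAG
print(rotate_part(left, right))  # ('CTAC', 'GCTC')
```

Rotating the modified part gives back the junctions CTAC and GCTC, which is why it fits between the unchanged neighbouring sections in the opposite orientation.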


### Example

In [1]:
%load_ext autoreload
%autoreload 2

In [2]:
import pandas
import genedom
import easy_dna
import proglog
proglog.notebook()

We use an example assembly to demonstrate this. The method requires that we can choose the left overhang of the first part and the right overhang of the last part of the flipped section. It is therefore best practice to flank the section with connectors.

The original map looked like this:

![alt text](Assembly_1_original_map.png "Map")

We will flip the orientation of section p13-p17. 

**SeqFlipper** is a Python module that helps with this. Its `FlipSeq()` function takes the EMMA table that we use with genedom, loaded from its CSV file, and returns a modified table. This is the original table:

In [56]:
emma = pandas.read_csv('EMMA_original.csv')
emma

| Unnamed: 0 | slot_name | left_overhang | right_overhang | enzyme | left_addition | right_addition | extra_avoided_sites | description |
|---|---|---|---|---|---|---|---|---|
| 0 | p1 | TAGG | ATGG | BsmBI | | | | For slot p1 of the EMMA standard |
| 1 | p2 | ATGG | GACT | BsmBI | | | | For slot p2 of the EMMA standard |
| 2 | p3 | GACT | GGAC | BsmBI | | | | For slot p3 of the EMMA standard |
| 3 | p4 | GGAC | TCCG | BsmBI | | | | For slot p4 of the EMMA standard |
| 4 | p5 | TCCG | CCAG | BsmBI | | | | For slot p5 of the EMMA standard |
| 5 | p6 | CCAG | CAGC | BsmBI | | | | For slot p6 of the EMMA standard |
| 6 | p7 | CAGC | AGGC | BsmBI | | GG | | For slot p7 of the EMMA standard |
| 7 | p8 | AGGC | GCGT | BsmBI | | | | For slot p8 of the EMMA standard |
| 8 | p8a | AGGC | ATCC | BsmBI | | | | For slot p8a of the EMMA standard |
| 9 | p8b | ATCC | GCGT | BsmBI | | | | For slot p8b of the EMMA standard |


In [57]:
import seqflipper

In [58]:
left = 'p13'
right = 'p17'

In [59]:
new_emma = seqflipper.FlipSeq(left, right, emma)
new_emma

| Unnamed: 0 | slot_name | left_overhang | right_overhang | enzyme | left_addition | right_addition | extra_avoided_sites | description |
|---|---|---|---|---|---|---|---|---|
| 0 | p1 | TAGG | ATGG | BsmBI | | | | For slot p1 of the EMMA standard |
| 1 | p2 | ATGG | GACT | BsmBI | | | | For slot p2 of the EMMA standard |
| 2 | p3 | GACT | GGAC | BsmBI | | | | For slot p3 of the EMMA standard |
| 3 | p4 | GGAC | TCCG | BsmBI | | | | For slot p4 of the EMMA standard |
| 4 | p5 | TCCG | CCAG | BsmBI | | | | For slot p5 of the EMMA standard |
| 5 | p6 | CCAG | CAGC | BsmBI | | | | For slot p6 of the EMMA standard |
| 6 | p7 | CAGC | AGGC | BsmBI | | GG | | For slot p7 of the EMMA standard |
| 7 | p8 | AGGC | GCGT | BsmBI | | | | For slot p8 of the EMMA standard |
| 8 | p8a | AGGC | ATCC | BsmBI | | | | For slot p8a of the EMMA standard |
| 9 | p8b | ATCC | GCGT | BsmBI | | | | For slot p8b of the EMMA standard |


Note the modified rows 14 and 18 in the new EMMA table. The output shown above stops at row 9, so these rows are not displayed here.
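
As a rough illustration of what such a table modification amounts to, the sketch below applies the rules from the summary to a list of row dictionaries. It is not the `seqflipper.FlipSeq()` implementation, which may differ: `flip_section_sketch` is a hypothetical name, plain dictionaries stand in for the DataFrame, and the section p2-p4 is used only because its overhangs appear in the rows displayed above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (case is normalised to upper)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def flip_section_sketch(left_slot, right_slot, rows):
    """Return a copy of rows with the outer overhangs of a section changed.

    Hypothetical stand-in for illustration; seqflipper.FlipSeq may differ.
    Only the first slot's left overhang and the last slot's right overhang
    are modified; the input rows are left untouched.
    """
    new_rows = [dict(row) for row in rows]
    by_name = {row["slot_name"]: row for row in new_rows}
    first, last = by_name[left_slot], by_name[right_slot]
    old_left, old_right = first["left_overhang"], last["right_overhang"]
    first["left_overhang"] = reverse_complement(old_right)
    last["right_overhang"] = reverse_complement(old_left)
    return new_rows


# Overhangs copied from rows p2-p4 of the table shown above.
rows = [
    {"slot_name": "p2", "left_overhang": "ATGG", "right_overhang": "GACT"},
    {"slot_name": "p3", "left_overhang": "GACT", "right_overhang": "GGAC"},
    {"slot_name": "p4", "left_overhang": "GGAC", "right_overhang": "TCCG"},
]
flipped = flip_section_sketch("p2", "p4", rows)
print(flipped[0]["left_overhang"], flipped[2]["right_overhang"])  # CGGA CCAT
```

Only two rows change, one at each end of the section, and the overhangs between the inner slots stay as they were.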

In [49]:
new_emma.to_csv('EMMA_flipped.csv', index=False)

#### Part domestication

In [50]:
records_to_domesticate = easy_dna.records_from_data_files(folder="files")
EMMA_PLUS = genedom.GoldenGateDomesticator.standard_from_spreadsheet("EMMA_flipped.csv")
genedom.batch_domestication(
    records=records_to_domesticate, standard=EMMA_PLUS, target="domestication_report_flipped/")




(0, None)

Ideally, you would use a new assembly plan here. This example reuses the original one (`assembly_plan_original.csv`).

In [51]:
import dnacauldron as dc
repository = dc.SequenceRepository()
repository.import_records(folder="domestication_report_flipped/domesticated/")
repository.import_records(folder="original/")
assembly_plan = dc.AssemblyPlan.from_spreadsheet(
    name="seqflipper",
    path="assembly_plan_original.csv",
    assembly_class=dc.Type2sRestrictionAssembly
)
simulation = assembly_plan.simulate(sequence_repository=repository)

Simulating assembly plan seqflipper...





#### Cloning simulation

In [52]:
stats = simulation.compute_stats()
print(stats)
report_writer = dc.AssemblyReportWriter(
    include_assembly_plots=True,
    include_mix_graphs=True
)
simulation.write_report("flipped_predicted_constructs/", assembly_report_writer=report_writer)

{'cancelled_assemblies': 0, 'errored_assemblies': 0, 'valid_assemblies': 1}
Generating assemblies reports...
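
The printed statistics can also be checked programmatically before relying on the report. A minimal sketch, assuming `stats` is a dictionary with the three keys shown in the output above; the helper name `simulation_succeeded` is hypothetical.

```python
def simulation_succeeded(stats):
    """True if at least one assembly is valid and none errored or were cancelled."""
    return (
        stats["valid_assemblies"] > 0
        and stats["errored_assemblies"] == 0
        and stats["cancelled_assemblies"] == 0
    )


# Values copied from the output above.
stats = {"cancelled_assemblies": 0, "errored_assemblies": 0, "valid_assemblies": 1}
print(simulation_succeeded(stats))  # True
```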






#### Results

The new map:

![alt text](Assembly_1_flipped.png "Map")

Note that the orientation of section p13-p17 has changed, compared to the original map.