# Getting assembly data and running RSs-free codon harmonization

The best assembly version is selected using two fields: the date of the last modification and the Status, which gives the assembly level (Complete Genome, Scaffold, Contig).

As Bart suggested, it makes sense to prioritize the assembly level: the codon frequencies computed from a contig can be very different from those of the whole genome.


I'll run a test using Bacillus methanolicus, a species for which UniProt has a curated entry for ALCD1 (methanol dehydrogenase).

The schematic of the algorithm looks something like this:

Bacillus methanolicus

<pre>
Read tsv file from prokaryotes using csv package:
    For row in file:
        If a match is found within #Organism/Name column:
            Write a dict with {'species': …, 'date': …, 'status': …, 'assembly': …}
            The dict should be added to a list [{ , , }, { , , }, { , , }]
    date_list = []
    For dictionary in list:
        date_list.append(dictionary['date'])
    latest = max(date_list)
    For dictionary in list:
        If dictionary['date'] == latest:
            assembly = dictionary['assembly']
</pre>
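The schematic above only compares dates. A minimal Python sketch that also ranks by assembly level first, as discussed earlier, could look like the following. It is not the pipeline's actual code: the function name `pick_best_assembly`, the column names (`Status`, `Modify Date`, `Assembly Accession`), the status labels and the date format are all assumptions about the layout of the prokaryotes report and may need adjusting.

```python
import csv
from datetime import datetime

# Lower rank = more complete assembly. The labels are assumed to match the
# values found in the Status column of the report.
LEVEL_RANK = {"Complete Genome": 0, "Chromosome": 1, "Scaffold": 2, "Contig": 3}


def pick_best_assembly(handle, species,
                       name_col="#Organism/Name", status_col="Status",
                       date_col="Modify Date", assembly_col="Assembly Accession",
                       date_format="%Y/%m/%d"):
    """Return a dict describing the best assembly for `species`, or None."""
    candidates = []
    for row in csv.DictReader(handle, delimiter="\t"):
        # Case-insensitive substring match on the organism name.
        if species.lower() in row[name_col].lower():
            candidates.append({
                "species": row[name_col],
                "date": datetime.strptime(row[date_col], date_format),
                "status": row[status_col],
                "assembly": row[assembly_col],
            })
    if not candidates:
        return None
    # Best level first; among equal levels, the most recent modification date.
    # Unknown status labels are ranked after all known ones.
    return min(
        candidates,
        key=lambda c: (LEVEL_RANK.get(c["status"], len(LEVEL_RANK)),
                       -c["date"].toordinal()),
    )
```

Assuming the report is saved locally, it could be called as `pick_best_assembly(open(path, newline=""), "Bacillus methanolicus")`, where `path` is wherever the TSV file lives.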

In [1]:
from pipeline.scripts.to_and_from_biobrick_wikidata import run_harmonization_per_model

In [7]:
input_wiki = [[{'reaction': {'EC': '1.14.18.3', 'KEGG': 'R09518'}},
  {'reaction': {'EC': '1.1.1.244', 'KEGG': 'R00605'}},
  {'reaction': {'EC': '1.2.1.22', 'KEGG': None}}],
 [{'reaction': {'EC': '1.14.18.3', 'KEGG': 'R09518'}},
  {'reaction': {'EC': '1.1.1.244', 'KEGG': 'R00605'}},
  {'reaction': {'EC': '1.1.1.27', 'KEGG': 'R00703'}}]]

In [1]:
output_Riemer = [{'1.1.1.244':{'BB_name': 'BBa_K1619001', 'BB_Seq': 'ATGAAAAACACTCAAAGTGCATTTTACATGCCTTCAGTCAATCTATTTGGTGCAGGCTCTGTTAATGAGGTTGGAACTCGATTAGCTGGTCTTGGTGTGAAAAAAGCTTTATTAGTTACAGATGCTGGTCTTCACAGTTTAGGCCTTTCTGAAAAAATTGCCGGTATCATTCGTGAAGCTGGTGTGGAAGTAGCTATTTTTCCAAAAGCCGAACCAAATCCAACTGATAAAAACGTCGCAGAAGGTTTAGAAGCGTATAACGCTGAAAACTGTGACAGCATTGTCACTCTTGGCGGCGGAAGCTCACATGATGCTGGAAAAGCCATTGCATTAGTAGCTGCTAACGGTGGAACAATTCACGATTATGAAGGTGTCGATGTATCAAAAAAACCAATGGTCCCTCTAATTGCGATTAATACAACAGCTGGTACAGGCAGTGAATTAACTAAATTCACAATCATCACAGATACTGAACGCAAAGTGAAAATGGCCATTGTTGATAAACATGTAACACCTACACTTTCAATCAATGACCCAGAGCTAATGGTTGGAATGCCTCCGTCCTTAACAGCTGCTACTGGATTAGATGCATTAACTCATGCGATTGAAGCATATGTTTCAACTGGTGCTACTCCAATTACAGATGCACTTGCAATTCAGGCGATCAAAATTATTTCTAAATACTTGCCGCGTGCAGTTGCAAATGGAAAAGACATTGAAGCACGTGAACAAATGGCCTTCGCACAATCATTAGCTGGCATGGCATTCAATAACGCGGGTTTAGGCTATGTTCATGCGATTGCACACCAATTAGGAGGATTCTACAACTTCCCTCATGGCGTTTGCAATGCGATCCTTCTGCCGCATGTTTGTCGTTTCAACTTAATTTCTAAAGTGGAACGTTATGCAGAAATCGCTGCTTTTCTTGGTGAAAATGTCGACGGCCTAAGCACCTACGAAGCAGCTGAAAAAGCTATTAAAGCGATCGAAAGAATGGCTAGAGACCTTAACATTCCAAAAGGCTTTAAAGAACTAGGTGCTAAAGAAGAAGATATTGAGACTTTAGCTAAAAATGCGATGAATGATGCATGTGCATTAACAAATCCTCGTAAACCTAAGTTAGAAGAAGTCATCCAAATTATTAAAAATGCTATGTAA', 'species': 'Bacillus methanolicus'}, 
 '1.14.18.3':{'BB_name': 'BBa_K1390019' , 'BB_Seq': 'ATGGCAGCAACAACCATTGGTGGTGCAGCTGCGGCGGAAGCGCCGCTGCTGGACAAGAAGTGGCTCACGTTCGCACTGGCGATTTACACCGTGTTCTACCTGTGGGTGCGGTGGTACGAAGGTGTCTATGGCTGGTCCGCCGGACTGGACTCGTTCGCGCCGGAGTTCGAGACCTACTGGATGAATTTCCTGTACACCGAGATCGTCCTGGAGATCGTGACGGCTTCGATCCTGTGGGGCTATCTCTGGAAGACCCGCGACCGCAACCTGGCCGCGCTGACCCCGCGTGAAGAGCTGCGCCGCAACTTCACCCACCTGGTGTGGCTGGTGGCCTACGCCTGGGCCATCTACTGGGGCGCATCCTACTTCACCGAGCAGGACGGCACCTGGCATCAGACGATCGTGCGCGACACCGACTTCACGCCGTCGCACATCATCGAGTTCTATCTGAGCTACCCGATCTACATCATCACCGGTTTTGCGGCGTTCATCTACGCCAAGACGCGTCTGCCGTTCTTCGCGAAGGGCATCTCGCTGCCGTACCTGGTGCTGGTGGTGGGTCCGTTCATGATTCTGCCGAACGTGGGTCTGAACGAATGGGGCCACACCTTCTGGTTCATGGAAGAGCTGTTCGTGGCGCCGCTGCACTACGGCTTCGTGATCTTCGGCTGGCTGGCACTGGCCGTCATGGGCACCCTGACCCAGACCTTCTACAGATTCGCTCAGGGAGGGCTGGGGCAGTCGCTCTGTGAAGCCGTGGACGAAGGCTTGATCGCGAAATAA' , 'species': 'Methylococcus capsulatus'}, 
'1.2.1.22':{'BB_name': 'BBa_K1324000', 'BB_Seq': 'ATGACCAATAATCCCCCTTCAGCACAGATTAAGCCCGGCGAGTATGGTTTCCCCCTCAAGTTAAAAGCCCGCTATGACAACTTTATTGGCGGCGAATGGGTAGCCCCTGCCGACGGCGAGTATTACCAGAATCTGACGCCGGTGACCGGGCAGCTGCTGTGCGAAGTGGCGTCTTCGGGCAAACGAGACATCGATCTGGCGCTGGATGCTGCGCACAAAGTGAAAGATAAATGGGCGCACACCTCGGTGCAGGATCGTGCGGCGATTCTGTTTAAGATTGCCGATCGAATGGAACAAAACCTCGAGCTGTTAGCGACAGCTGAAACCTGGGATAACGGCAAACCCATTCGCGAAACCAGTGCTGCTGATGTACCGCTGGCGATTGACCATTTCCGCTATTTCGCCTCGTGTATCCGGGCACAGGAAGGCGGTATCAGTGAAGTTGATAGCGAAACCGTGGCCTATCATTTCCACGAACCGTTAGGCGTGGTGGGGCAGATTATTCCGTGGAACTTCCCGCTGCTGATGGCGAGCTGGAAAATGGCTCCCGCGCTGGCGGCGGGCAACTGTGTGGTGCTGAAACCCGCACGTCTTACCCCGCTTTCTGTACTGCTGCTAATGGAAATCGTCGGTGATTTACTGCCGCCGGGCGTGGTGAACGTGGTCAACGGCGCAGGTGGGGAAATTGGCGAATATCTGGCGACCTCGAAACGCATCGCCAAAGTGGCGTTTACCGGCTCAACGGAAGTGGGCCAACAAATTATGCAATACGCCACGCAAAACATTATTCCGGTGACGCTGGAGCTGGGCGGCAAATCGCCAAATATCTTCTTTGCTGATGTGATGGATGAAGAAGATGCCTTTTTCGATAAAGCGCTGGAAGGCTTTGCACTGTTTGCCTTTAACCAGGGCGAAGTTTGCACCTGTCCGAGTCGTGCTTTAGTGCAGGAATCTATCTACGAACGCTTTATGGAACGCGCCATCCGCCGTGTCGAAAGCATTCGTAGCGGTAACCCGCTCGACAGCGTGACGCAAATGGGCGCGCAGGTTTCTCACGGGCAACTGGAAACCATCCTCAACTACATTGATATCGGTAAAAAAGAGGGCGCTGACGTGCTCACAGGCGGGCGGCGCAAGCTGCTGGAAGGTGAACTGAAAGACGGCTACTACCTCGAACCGACGATTCTGTTTGGTCAGAACAATATGCGGGTGTTCCAGGAGGAGATTTTTGGCCCGGTGCTGGCGGTGACCACCTTCAAAACGATGGAAGAAGCGCTGGAGCTGGCGAACGATACGCAATATGGCCTGGGCGCGGGCGTCTGGAGCCGCAACGGTAATCTGGCCTATAAGATGGGGCGCGGCATACAGGCTGGGCGCGTGTGGACCAACTGTTATCACGCTTACCCGGCACATGCGGCGTTTGGTGGCTACAAACAATCAGGTATCGGTCGCGAAACCCACAAGATGATGCTGGAGCATTACCAGCAAACCAAGTGCCTGCTGGTGAGCTACTCGGATAAACCGTTGGGGCTGTTCTGA', 'species':'Escherichia coli'}},
{'1.1.1.244':{'BB_name': 'BBa_K1619001', 'BB_Seq': 'ATGAAAAACACTCAAAGTGCATTTTACATGCCTTCAGTCAATCTATTTGGTGCAGGCTCTGTTAATGAGGTTGGAACTCGATTAGCTGGTCTTGGTGTGAAAAAAGCTTTATTAGTTACAGATGCTGGTCTTCACAGTTTAGGCCTTTCTGAAAAAATTGCCGGTATCATTCGTGAAGCTGGTGTGGAAGTAGCTATTTTTCCAAAAGCCGAACCAAATCCAACTGATAAAAACGTCGCAGAAGGTTTAGAAGCGTATAACGCTGAAAACTGTGACAGCATTGTCACTCTTGGCGGCGGAAGCTCACATGATGCTGGAAAAGCCATTGCATTAGTAGCTGCTAACGGTGGAACAATTCACGATTATGAAGGTGTCGATGTATCAAAAAAACCAATGGTCCCTCTAATTGCGATTAATACAACAGCTGGTACAGGCAGTGAATTAACTAAATTCACAATCATCACAGATACTGAACGCAAAGTGAAAATGGCCATTGTTGATAAACATGTAACACCTACACTTTCAATCAATGACCCAGAGCTAATGGTTGGAATGCCTCCGTCCTTAACAGCTGCTACTGGATTAGATGCATTAACTCATGCGATTGAAGCATATGTTTCAACTGGTGCTACTCCAATTACAGATGCACTTGCAATTCAGGCGATCAAAATTATTTCTAAATACTTGCCGCGTGCAGTTGCAAATGGAAAAGACATTGAAGCACGTGAACAAATGGCCTTCGCACAATCATTAGCTGGCATGGCATTCAATAACGCGGGTTTAGGCTATGTTCATGCGATTGCACACCAATTAGGAGGATTCTACAACTTCCCTCATGGCGTTTGCAATGCGATCCTTCTGCCGCATGTTTGTCGTTTCAACTTAATTTCTAAAGTGGAACGTTATGCAGAAATCGCTGCTTTTCTTGGTGAAAATGTCGACGGCCTAAGCACCTACGAAGCAGCTGAAAAAGCTATTAAAGCGATCGAAAGAATGGCTAGAGACCTTAACATTCCAAAAGGCTTTAAAGAACTAGGTGCTAAAGAAGAAGATATTGAGACTTTAGCTAAAAATGCGATGAATGATGCATGTGCATTAACAAATCCTCGTAAACCTAAGTTAGAAGAAGTCATCCAAATTATTAAAAATGCTATGTAA', 'species': 'Bacillus methanolicus'}, 
'1.14.18.3':{'BB_name': 'BBa_K1390019' , 'BB_Seq': 'ATGGCAGCAACAACCATTGGTGGTGCAGCTGCGGCGGAAGCGCCGCTGCTGGACAAGAAGTGGCTCACGTTCGCACTGGCGATTTACACCGTGTTCTACCTGTGGGTGCGGTGGTACGAAGGTGTCTATGGCTGGTCCGCCGGACTGGACTCGTTCGCGCCGGAGTTCGAGACCTACTGGATGAATTTCCTGTACACCGAGATCGTCCTGGAGATCGTGACGGCTTCGATCCTGTGGGGCTATCTCTGGAAGACCCGCGACCGCAACCTGGCCGCGCTGACCCCGCGTGAAGAGCTGCGCCGCAACTTCACCCACCTGGTGTGGCTGGTGGCCTACGCCTGGGCCATCTACTGGGGCGCATCCTACTTCACCGAGCAGGACGGCACCTGGCATCAGACGATCGTGCGCGACACCGACTTCACGCCGTCGCACATCATCGAGTTCTATCTGAGCTACCCGATCTACATCATCACCGGTTTTGCGGCGTTCATCTACGCCAAGACGCGTCTGCCGTTCTTCGCGAAGGGCATCTCGCTGCCGTACCTGGTGCTGGTGGTGGGTCCGTTCATGATTCTGCCGAACGTGGGTCTGAACGAATGGGGCCACACCTTCTGGTTCATGGAAGAGCTGTTCGTGGCGCCGCTGCACTACGGCTTCGTGATCTTCGGCTGGCTGGCACTGGCCGTCATGGGCACCCTGACCCAGACCTTCTACAGATTCGCTCAGGGAGGGCTGGGGCAGTCGCTCTGTGAAGCCGTGGACGAAGGCTTGATCGCGAAATAA' , 'species': 'Methylococcus capsulatus'}, 
'1.1.1.27': {'BB_name': 'BBa_K1696003' , 'BB_Seq': 'GTGGCAAGTATTACGGATAAGGATCACCAAAAAGTTATTCTCGTTGGTGACGGCGCCGTTGGTTCAAGTTATGCCTATGCAATGGTATTGCAAGGTATTGCACAAGAAATCGGGATCGTTGACATTTTTAAGGACAAGACGAAGGGTGACGCGATTGACTTAAGCAACGCGCTGCCATTCACCAGCCCAAAGAAGATTTATTCAGCTGAATACAGCGATGCCAAGGATGCTGATCTGGTTGTTATCACTGCTGGTGCTCCTCAGAAGCCAGGCGAAACCCGCTTGGATCTGGTTAACAAGAACTTGAAGATCTTGAAGTCCATTGTTGATCCGATTGTGGATTCTGGCTTTAACGGTATCTTCTTGGTTGCTGCCAACCCAGTTGATATCTTGACCTATGCAACTTGGAAACTTTCCGGCTTCCCGAAGAACCGGGTTGTTGGTTCAGGTACTTCATTGGATACCGCACGTTTCCGTCAGTCCATTGCTGAAATGGTTAACGTTGATGCACGTTCGGTCCACGCTTACATCATGGGTGAACATGGTGACACTGAATTCCCTGTATGGTCACACGCTAACATCGGTGGCGTTACCATTGCCGAATGGGTTAAAGCACATCCGGAAATCAAGGAAGACAAGCTTGTTAAGATGTTTGAAGACGTTCGTGACGCTGCTTACGAAATCATCAAACTCAAGGGCGCAACCTTCTATGGTATCGCAACTGCTTTGGCACGTATCTCCAAGGCTATCCTGAACGATGAAAATGCTGTTCTGCCACTGTCCGTTTACATGGATGGTCAATATGGCTTGAACGACATCTACATCGGTACCCCAGCTGTGATCAACCGCAATGGTATCCAGAACATTCTGGAAATTCCATTGACCGACCACGAAGAGGAATCCATGCAGAAATCTGCTTCACAATTGAAGAAGGTTCTGACTGATGCCTTCGCGAAGAACGACATCGAAACCCGTCAGTAA' ,'species': 'Lactobacillus casei'}}]


In [2]:
output_Riemer2 = [{'1.1.1.244':{'BB_name': 'BBa_K1619001', 'BB_Seq': 'ATGAAAAACACTCAAAGTGCATTTTACATGCCTTCAGTCAATCTATTTGGTGCAGGCTCTGTTAATGAGGTTGGAACTCGATTAGCTGGTCTTGGTGTGAAAAAAGCTTTATTAGTTACAGATGCTGGTCTTCACAGTTTAGGCCTTTCTGAAAAAATTGCCGGTATCATTCGTGAAGCTGGTGTGGAAGTAGCTATTTTTCCAAAAGCCGAACCAAATCCAACTGATAAAAACGTCGCAGAAGGTTTAGAAGCGTATAACGCTGAAAACTGTGACAGCATTGTCACTCTTGGCGGCGGAAGCTCACATGATGCTGGAAAAGCCATTGCATTAGTAGCTGCTAACGGTGGAACAATTCACGATTATGAAGGTGTCGATGTATCAAAAAAACCAATGGTCCCTCTAATTGCGATTAATACAACAGCTGGTACAGGCAGTGAATTAACTAAATTCACAATCATCACAGATACTGAACGCAAAGTGAAAATGGCCATTGTTGATAAACATGTAACACCTACACTTTCAATCAATGACCCAGAGCTAATGGTTGGAATGCCTCCGTCCTTAACAGCTGCTACTGGATTAGATGCATTAACTCATGCGATTGAAGCATATGTTTCAACTGGTGCTACTCCAATTACAGATGCACTTGCAATTCAGGCGATCAAAATTATTTCTAAATACTTGCCGCGTGCAGTTGCAAATGGAAAAGACATTGAAGCACGTGAACAAATGGCCTTCGCACAATCATTAGCTGGCATGGCATTCAATAACGCGGGTTTAGGCTATGTTCATGCGATTGCACACCAATTAGGAGGATTCTACAACTTCCCTCATGGCGTTTGCAATGCGATCCTTCTGCCGCATGTTTGTCGTTTCAACTTAATTTCTAAAGTGGAACGTTATGCAGAAATCGCTGCTTTTCTTGGTGAAAATGTCGACGGCCTAAGCACCTACGAAGCAGCTGAAAAAGCTATTAAAGCGATCGAAAGAATGGCTAGAGACCTTAACATTCCAAAAGGCTTTAAAGAACTAGGTGCTAAAGAAGAAGATATTGAGACTTTAGCTAAAAATGCGATGAATGATGCATGTGCATTAACAAATCCTCGTAAACCTAAGTTAGAAGAAGTCATCCAAATTATTAAAAATGCTATGTAA', 'species': 'Bacillus methanolicus'}, 
 '1.14.18.3':{'BB_name': 'BBa_K1390019' , 'BB_Seq': 'ATGGCAGCAACAACCATTGGTGGTGCAGCTGCGGCGGAAGCGCCGCTGCTGGACAAGAAGTGGCTCACGTTCGCACTGGCGATTTACACCGTGTTCTACCTGTGGGTGCGGTGGTACGAAGGTGTCTATGGCTGGTCCGCCGGACTGGACTCGTTCGCGCCGGAGTTCGAGACCTACTGGATGAATTTCCTGTACACCGAGATCGTCCTGGAGATCGTGACGGCTTCGATCCTGTGGGGCTATCTCTGGAAGACCCGCGACCGCAACCTGGCCGCGCTGACCCCGCGTGAAGAGCTGCGCCGCAACTTCACCCACCTGGTGTGGCTGGTGGCCTACGCCTGGGCCATCTACTGGGGCGCATCCTACTTCACCGAGCAGGACGGCACCTGGCATCAGACGATCGTGCGCGACACCGACTTCACGCCGTCGCACATCATCGAGTTCTATCTGAGCTACCCGATCTACATCATCACCGGTTTTGCGGCGTTCATCTACGCCAAGACGCGTCTGCCGTTCTTCGCGAAGGGCATCTCGCTGCCGTACCTGGTGCTGGTGGTGGGTCCGTTCATGATTCTGCCGAACGTGGGTCTGAACGAATGGGGCCACACCTTCTGGTTCATGGAAGAGCTGTTCGTGGCGCCGCTGCACTACGGCTTCGTGATCTTCGGCTGGCTGGCACTGGCCGTCATGGGCACCCTGACCCAGACCTTCTACAGATTCGCTCAGGGAGGGCTGGGGCAGTCGCTCTGTGAAGCCGTGGACGAAGGCTTGATCGCGAAATAA' , 'species': 'Methylococcus capsulatus'}, 
'1.2.1.22':{'BB_name': None, 'BB_Seq': None, 'species':None}},
{'1.1.1.244':{'BB_name': 'BBa_K1619001', 'BB_Seq': 'ATGAAAAACACTCAAAGTGCATTTTACATGCCTTCAGTCAATCTATTTGGTGCAGGCTCTGTTAATGAGGTTGGAACTCGATTAGCTGGTCTTGGTGTGAAAAAAGCTTTATTAGTTACAGATGCTGGTCTTCACAGTTTAGGCCTTTCTGAAAAAATTGCCGGTATCATTCGTGAAGCTGGTGTGGAAGTAGCTATTTTTCCAAAAGCCGAACCAAATCCAACTGATAAAAACGTCGCAGAAGGTTTAGAAGCGTATAACGCTGAAAACTGTGACAGCATTGTCACTCTTGGCGGCGGAAGCTCACATGATGCTGGAAAAGCCATTGCATTAGTAGCTGCTAACGGTGGAACAATTCACGATTATGAAGGTGTCGATGTATCAAAAAAACCAATGGTCCCTCTAATTGCGATTAATACAACAGCTGGTACAGGCAGTGAATTAACTAAATTCACAATCATCACAGATACTGAACGCAAAGTGAAAATGGCCATTGTTGATAAACATGTAACACCTACACTTTCAATCAATGACCCAGAGCTAATGGTTGGAATGCCTCCGTCCTTAACAGCTGCTACTGGATTAGATGCATTAACTCATGCGATTGAAGCATATGTTTCAACTGGTGCTACTCCAATTACAGATGCACTTGCAATTCAGGCGATCAAAATTATTTCTAAATACTTGCCGCGTGCAGTTGCAAATGGAAAAGACATTGAAGCACGTGAACAAATGGCCTTCGCACAATCATTAGCTGGCATGGCATTCAATAACGCGGGTTTAGGCTATGTTCATGCGATTGCACACCAATTAGGAGGATTCTACAACTTCCCTCATGGCGTTTGCAATGCGATCCTTCTGCCGCATGTTTGTCGTTTCAACTTAATTTCTAAAGTGGAACGTTATGCAGAAATCGCTGCTTTTCTTGGTGAAAATGTCGACGGCCTAAGCACCTACGAAGCAGCTGAAAAAGCTATTAAAGCGATCGAAAGAATGGCTAGAGACCTTAACATTCCAAAAGGCTTTAAAGAACTAGGTGCTAAAGAAGAAGATATTGAGACTTTAGCTAAAAATGCGATGAATGATGCATGTGCATTAACAAATCCTCGTAAACCTAAGTTAGAAGAAGTCATCCAAATTATTAAAAATGCTATGTAA', 'species': 'Bacillus methanolicus'}, 
'1.14.18.3':{'BB_name': 'BBa_K1390019' , 'BB_Seq': 'ATGGCAGCAACAACCATTGGTGGTGCAGCTGCGGCGGAAGCGCCGCTGCTGGACAAGAAGTGGCTCACGTTCGCACTGGCGATTTACACCGTGTTCTACCTGTGGGTGCGGTGGTACGAAGGTGTCTATGGCTGGTCCGCCGGACTGGACTCGTTCGCGCCGGAGTTCGAGACCTACTGGATGAATTTCCTGTACACCGAGATCGTCCTGGAGATCGTGACGGCTTCGATCCTGTGGGGCTATCTCTGGAAGACCCGCGACCGCAACCTGGCCGCGCTGACCCCGCGTGAAGAGCTGCGCCGCAACTTCACCCACCTGGTGTGGCTGGTGGCCTACGCCTGGGCCATCTACTGGGGCGCATCCTACTTCACCGAGCAGGACGGCACCTGGCATCAGACGATCGTGCGCGACACCGACTTCACGCCGTCGCACATCATCGAGTTCTATCTGAGCTACCCGATCTACATCATCACCGGTTTTGCGGCGTTCATCTACGCCAAGACGCGTCTGCCGTTCTTCGCGAAGGGCATCTCGCTGCCGTACCTGGTGCTGGTGGTGGGTCCGTTCATGATTCTGCCGAACGTGGGTCTGAACGAATGGGGCCACACCTTCTGGTTCATGGAAGAGCTGTTCGTGGCGCCGCTGCACTACGGCTTCGTGATCTTCGGCTGGCTGGCACTGGCCGTCATGGGCACCCTGACCCAGACCTTCTACAGATTCGCTCAGGGAGGGCTGGGGCAGTCGCTCTGTGAAGCCGTGGACGAAGGCTTGATCGCGAAATAA' , 'species': 'Methylococcus capsulatus'}, 
'1.1.1.27': {'BB_name': 'BBa_K1696003' , 'BB_Seq': 'GTGGCAAGTATTACGGATAAGGATCACCAAAAAGTTATTCTCGTTGGTGACGGCGCCGTTGGTTCAAGTTATGCCTATGCAATGGTATTGCAAGGTATTGCACAAGAAATCGGGATCGTTGACATTTTTAAGGACAAGACGAAGGGTGACGCGATTGACTTAAGCAACGCGCTGCCATTCACCAGCCCAAAGAAGATTTATTCAGCTGAATACAGCGATGCCAAGGATGCTGATCTGGTTGTTATCACTGCTGGTGCTCCTCAGAAGCCAGGCGAAACCCGCTTGGATCTGGTTAACAAGAACTTGAAGATCTTGAAGTCCATTGTTGATCCGATTGTGGATTCTGGCTTTAACGGTATCTTCTTGGTTGCTGCCAACCCAGTTGATATCTTGACCTATGCAACTTGGAAACTTTCCGGCTTCCCGAAGAACCGGGTTGTTGGTTCAGGTACTTCATTGGATACCGCACGTTTCCGTCAGTCCATTGCTGAAATGGTTAACGTTGATGCACGTTCGGTCCACGCTTACATCATGGGTGAACATGGTGACACTGAATTCCCTGTATGGTCACACGCTAACATCGGTGGCGTTACCATTGCCGAATGGGTTAAAGCACATCCGGAAATCAAGGAAGACAAGCTTGTTAAGATGTTTGAAGACGTTCGTGACGCTGCTTACGAAATCATCAAACTCAAGGGCGCAACCTTCTATGGTATCGCAACTGCTTTGGCACGTATCTCCAAGGCTATCCTGAACGATGAAAATGCTGTTCTGCCACTGTCCGTTTACATGGATGGTCAATATGGCTTGAACGACATCTACATCGGTACCCCAGCTGTGATCAACCGCAATGGTATCCAGAACATTCTGGAAATTCCATTGACCGACCACGAAGAGGAATCCATGCAGAAATCTGCTTCACAATTGAAGAAGGTTCTGACTGATGCCTTCGCGAAGAACGACATCGAAACCCGTCAGTAA' ,'species': 'Lactobacillus casei'}}]


In [3]:
len(output_Riemer2)

2

In [7]:
for i in output_Riemer2:
    for key, subdict in i.items():
        print(key)

1.1.1.244
1.14.18.3
1.2.1.22
1.1.1.244
1.14.18.3
1.1.1.27
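In `output_Riemer2` the entry for EC 1.2.1.22 has no BioBrick (all of its fields are `None`). A small helper such as the one below could list these entries before harmonization. `missing_bricks` is a hypothetical name and not part of the pipeline, and the toy input uses a shortened placeholder sequence.

```python
def missing_bricks(models):
    """Return (model_number, ec_number) pairs whose 'BB_Seq' is None."""
    missing = []
    for number, model in enumerate(models, start=1):
        for ec_number, brick in model.items():
            if brick.get("BB_Seq") is None:
                missing.append((number, ec_number))
    return missing


# Toy input with the same shape as output_Riemer2 (sequence shortened).
toy = [{
    "1.1.1.244": {"BB_name": "BBa_K1619001", "BB_Seq": "ATG",
                  "species": "Bacillus methanolicus"},
    "1.2.1.22": {"BB_name": None, "BB_Seq": None, "species": None},
}]
print(missing_bricks(toy))
# [(1, '1.2.1.22')]
```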


## Running harmonization

In [4]:
out_harm = run_harmonization_per_model(output_Riemer2, 'new_ch4_lac.csv')

DEBUG:root:Bacillus methanolicus
DEBUG:root:ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/724/485/GCA_000724485.1_ASM72448v1/GCA_000724485.1_ASM72448v1_cds_from_genomic.fna.gz


Beginning file download with wget module
 65% [..................................................                          ] 647168 / 983137

DEBUG:root:codonharmonizer Bacillus_methanolicus.cds.fna --write_freqs > Bacillus_methanolicus.freq.csv



INFO:numexpr.utils:NumExpr defaulting to 4 threads.
DEBUG:root:{'AAA': 'AAA', 'AAC': 'AAT', 'AAG': 'AAA', 'AAT': 'AAC', 'ACA': 'ACG', 'ACC': 'ACC', 'ACG': 'ACC', 'ACT': 'ACG', 'AGA': 'CGA', 'AGC': 'AGC', 'AGG': 'AGA', 'AGT': 'AGC', 'ATA': 'ATA', 'ATC': 'ATC', 'ATG': 'ATG', 'ATT': 'ATT', 'CAA': 'CAG', 'CAC': 'CAC', 'CAG': 'CAA', 'CAT': 'CAT', 'CCA': 'CCG', 'CCC': 'CCC', 'CCG': 'CCG', 'CCT': 'CCA', 'CGA': 'CGG', 'CGC': 'CGC', 'CGG': 'CGT', 'CGT': 'CGT', 'CTA': 'TTG', 'CTC': 'TTA', 'CTG': 'CTG', 'CTT': 'TTA', 'GAA': 'GAA', 'GAC': 'GAC', 'GAG': 'GAG', 'GAT': 'GAT', 'GCA': 'GCG', 'GCC': 'GCG', 'GCG': 'GCC', 'GCT': 'GCG', 'GGA': 'GGA', 'GGC': 'GGC', 'GGG': 'GGG', 'GGT': 'GGT', 'GTA': 'GTA', 'GTC': 'GTT', 'GTG': 'GTT', 'GTT': 'GTG', 'TAA': 'TAA', 'TAC': 'TAC', 'TAG': 'TGA', 'TAT': 'TAT', 'TCA': 'AGC', 'TCC': 'TCA', 'TCG': 'TCT', 'TCT': 'TCA', 'TGA': 'TGA', 'TGC': 'TGC', 'TGG': 'TGG', 'TGT': 'TGC', 'TTA': 'TTA', 'TTC': 'TTC', 'TTG': 'CTG', 'TTT': 'TTT'}
DEBUG:root:N==0
DEBUG:root:N==0
DEBUG:ro

There is no restriction site for EcoRI in the harmonized sequence. 
Number of substitution = 0
There is no restriction site for XbaI in the harmonized sequence. 
Number of substitution = 0
There is no restriction site for PstI in the harmonized sequence. 
Number of substitution = 0
There is no restriction site for SpeI in the harmonized sequence. 
Number of substitution = 0
>gene_seq (src:Bacillus methanolicus-tgt:Ecoli) (CHI_0=0.18682553084640247-CHI=0.06501796238204563)
ATGAAAAATACGCAGAGCGCGTTTTACATGCCAAGCGTTAACTTGTTTGGTGCGGGCTCAGTGAACGAGG
TGGGAACGCGGTTAGCGGGTTTAGGTGTTAAAAAAGCGTTATTAGTGACGGATGCGGGTTTACACAGCTT
AGGCTTATCAGAAAAAATTGCGGGTATCATTCGTGAAGCGGGTGTTGAAGTAGCGATTTTTCCGAAAGCG
GAACCGAACCCGACGGATAAAAATGTTGCGGAAGGTTTAGAAGCCTATAATGCGGAAAATTGCGACAGCA
TTGTTACGTTAGGCGGCGGAAGCAGCCATGATGCGGGAAAAGCGATTGCGTTAGTAGCGGCGAATGGTGG
AACGATTCACGATTATGAAGGTGTTGATGTAAGCAAAAAACCGATGGTTCCATTGATTGCCATTAACACG
ACGGCGGGTACGGGCAGCGAATTAACGAAATTCACGATCATCACGGATACGGAACGCAAAGTTAAAATGG
CGATTGTGGATAAACATGTAACGCCA

DEBUG:root:ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/008/325/GCA_000008325.1_ASM832v1/GCA_000008325.1_ASM832v1_cds_from_genomic.fna.gz


Beginning file download with wget module
100% [..........................................................................] 1008137 / 1008137

DEBUG:root:codonharmonizer Methylococcus_capsulatus.cds.fna --write_freqs > Methylococcus_capsulatus.freq.csv
DEBUG:root:{'AAA': 'AAA', 'AAC': 'AAT', 'AAG': 'AAA', 'AAT': 'AAC', 'ACA': 'ACG', 'ACC': 'ACC', 'ACG': 'ACC', 'ACT': 'ACG', 'AGA': 'CGA', 'AGC': 'AGC', 'AGG': 'AGA', 'AGT': 'AGC', 'ATA': 'ATA', 'ATC': 'ATC', 'ATG': 'ATG', 'ATT': 'ATT', 'CAA': 'CAG', 'CAC': 'CAC', 'CAG': 'CAA', 'CAT': 'CAT', 'CCA': 'CCG', 'CCC': 'CCC', 'CCG': 'CCG', 'CCT': 'CCA', 'CGA': 'CGG', 'CGC': 'CGC', 'CGG': 'CGT', 'CGT': 'CGT', 'CTA': 'TTG', 'CTC': 'TTA', 'CTG': 'CTG', 'CTT': 'TTA', 'GAA': 'GAA', 'GAC': 'GAC', 'GAG': 'GAG', 'GAT': 'GAT', 'GCA': 'GCG', 'GCC': 'GCG', 'GCG': 'GCC', 'GCT': 'GCG', 'GGA': 'GGA', 'GGC': 'GGC', 'GGG': 'GGG', 'GGT': 'GGT', 'GTA': 'GTA', 'GTC': 'GTT', 'GTG': 'GTT', 'GTT': 'GTG', 'TAA': 'TAA', 'TAC': 'TAC', 'TAG': 'TGA', 'TAT': 'TAT', 'TCA': 'AGC', 'TCC': 'TCA', 'TCG': 'TCT', 'TCT': 'TCA', 'TGA': 'TGA', 'TGC': 'TGC', 'TGG': 'TGG', 'TGT': 'TGC', 'TTA': 'TTA', 'TTC': 'TTC', 'TTG': 'CT

There is no restriction site for EcoRI in the harmonized sequence. 
Number of substitution = 0
There is no restriction site for XbaI in the harmonized sequence. 
Number of substitution = 0
There is no restriction site for PstI in the harmonized sequence. 
Number of substitution = 0
There is no restriction site for SpeI in the harmonized sequence. 
Number of substitution = 0
>gene_seq (src:Methylococcus capsulatus-tgt:Ecoli) (CHI_0=0.16076274131757531-CHI=0.09779542970346927)
ATGGCGGCGACGACCATTGGTGGTGCGGCGGCCGCCGAAGCCCCGCTGCTGGACAAAAAATGGTTAACCT
TCGCGCTGGCCATTTACACCGTTTTCTACCTGTGGGTTCGTTGGTACGAAGGTGTTTATGGCTGGTCAGC
GGGACTGGACTCTTTCGCCCCGGAGTTCGAGACCTACTGGATGAACTTCCTGTACACCGAGATCGTTCTG
GAGATCGTTACCGCGTCTATCCTGTGGGGCTATTTATGGAAAACCCGCGACCGCAATCTGGCGGCCCTGA
CCCCGCGTGAAGAGCTGCGCCGCAATTTCACCCACCTGGTTTGGCTGGTTGCGTACGCGTGGGCGATCTA
CTGGGGCGCGTCATACTTCACCGAGCAAGACGGCACCTGGCATCAAACCATCGTTCGCGACACCGACTTC
ACCCCGTCTCACATCATCGAGTTCTATCTGAGCTACCCGATCTACATCATCACCGGTTTTGCCGCCTTCA
TCTACGCGAAAACCCGTCTGCCG

DEBUG:root:None
DEBUG:root:Bacillus methanolicus
DEBUG:root:ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/724/485/GCA_000724485.1_ASM72448v1/GCA_000724485.1_ASM72448v1_cds_from_genomic.fna.gz


Beginning file download with wget module
 99% [........................................................................... ] 974848 / 983137

DEBUG:root:codonharmonizer Bacillus_methanolicus.cds.fna --write_freqs > Bacillus_methanolicus.freq.csv



DEBUG:root:{'AAA': 'AAA', 'AAC': 'AAT', 'AAG': 'AAA', 'AAT': 'AAC', 'ACA': 'ACG', 'ACC': 'ACC', 'ACG': 'ACC', 'ACT': 'ACG', 'AGA': 'CGA', 'AGC': 'AGC', 'AGG': 'AGA', 'AGT': 'AGC', 'ATA': 'ATA', 'ATC': 'ATC', 'ATG': 'ATG', 'ATT': 'ATT', 'CAA': 'CAG', 'CAC': 'CAC', 'CAG': 'CAA', 'CAT': 'CAT', 'CCA': 'CCG', 'CCC': 'CCC', 'CCG': 'CCG', 'CCT': 'CCA', 'CGA': 'CGG', 'CGC': 'CGC', 'CGG': 'CGT', 'CGT': 'CGT', 'CTA': 'TTG', 'CTC': 'TTA', 'CTG': 'CTG', 'CTT': 'TTA', 'GAA': 'GAA', 'GAC': 'GAC', 'GAG': 'GAG', 'GAT': 'GAT', 'GCA': 'GCG', 'GCC': 'GCG', 'GCG': 'GCC', 'GCT': 'GCG', 'GGA': 'GGA', 'GGC': 'GGC', 'GGG': 'GGG', 'GGT': 'GGT', 'GTA': 'GTA', 'GTC': 'GTT', 'GTG': 'GTT', 'GTT': 'GTG', 'TAA': 'TAA', 'TAC': 'TAC', 'TAG': 'TGA', 'TAT': 'TAT', 'TCA': 'AGC', 'TCC': 'TCA', 'TCG': 'TCT', 'TCT': 'TCA', 'TGA': 'TGA', 'TGC': 'TGC', 'TGG': 'TGG', 'TGT': 'TGC', 'TTA': 'TTA', 'TTC': 'TTC', 'TTG': 'CTG', 'TTT': 'TTT'}
DEBUG:root:N==0
DEBUG:root:N==0
DEBUG:root:N==0
DEBUG:root:N==0
DEBUG:root:ATGAAAAATACGCAGAG

There is no restriction site for EcoRI in the harmonized sequence. 
Number of substitution = 0
There is no restriction site for XbaI in the harmonized sequence. 
Number of substitution = 0
There is no restriction site for PstI in the harmonized sequence. 
Number of substitution = 0
There is no restriction site for SpeI in the harmonized sequence. 
Number of substitution = 0
>gene_seq (src:Bacillus methanolicus-tgt:Ecoli) (CHI_0=0.18682553084640247-CHI=0.06501796238204563)
ATGAAAAATACGCAGAGCGCGTTTTACATGCCAAGCGTTAACTTGTTTGGTGCGGGCTCAGTGAACGAGG
TGGGAACGCGGTTAGCGGGTTTAGGTGTTAAAAAAGCGTTATTAGTGACGGATGCGGGTTTACACAGCTT
AGGCTTATCAGAAAAAATTGCGGGTATCATTCGTGAAGCGGGTGTTGAAGTAGCGATTTTTCCGAAAGCG
GAACCGAACCCGACGGATAAAAATGTTGCGGAAGGTTTAGAAGCCTATAATGCGGAAAATTGCGACAGCA
TTGTTACGTTAGGCGGCGGAAGCAGCCATGATGCGGGAAAAGCGATTGCGTTAGTAGCGGCGAATGGTGG
AACGATTCACGATTATGAAGGTGTTGATGTAAGCAAAAAACCGATGGTTCCATTGATTGCCATTAACACG
ACGGCGGGTACGGGCAGCGAATTAACGAAATTCACGATCATCACGGATACGGAACGCAAAGTTAAAATGG
CGATTGTGGATAAACATGTAACGCCA

DEBUG:root:ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/008/325/GCA_000008325.1_ASM832v1/GCA_000008325.1_ASM832v1_cds_from_genomic.fna.gz


Beginning file download with wget module
 99% [......................................................................... ]  999424 / 1008137

DEBUG:root:codonharmonizer Methylococcus_capsulatus.cds.fna --write_freqs > Methylococcus_capsulatus.freq.csv



DEBUG:root:{'AAA': 'AAA', 'AAC': 'AAT', 'AAG': 'AAA', 'AAT': 'AAC', 'ACA': 'ACG', 'ACC': 'ACC', 'ACG': 'ACC', 'ACT': 'ACG', 'AGA': 'CGA', 'AGC': 'AGC', 'AGG': 'AGA', 'AGT': 'AGC', 'ATA': 'ATA', 'ATC': 'ATC', 'ATG': 'ATG', 'ATT': 'ATT', 'CAA': 'CAG', 'CAC': 'CAC', 'CAG': 'CAA', 'CAT': 'CAT', 'CCA': 'CCG', 'CCC': 'CCC', 'CCG': 'CCG', 'CCT': 'CCA', 'CGA': 'CGG', 'CGC': 'CGC', 'CGG': 'CGT', 'CGT': 'CGT', 'CTA': 'TTG', 'CTC': 'TTA', 'CTG': 'CTG', 'CTT': 'TTA', 'GAA': 'GAA', 'GAC': 'GAC', 'GAG': 'GAG', 'GAT': 'GAT', 'GCA': 'GCG', 'GCC': 'GCG', 'GCG': 'GCC', 'GCT': 'GCG', 'GGA': 'GGA', 'GGC': 'GGC', 'GGG': 'GGG', 'GGT': 'GGT', 'GTA': 'GTA', 'GTC': 'GTT', 'GTG': 'GTT', 'GTT': 'GTG', 'TAA': 'TAA', 'TAC': 'TAC', 'TAG': 'TGA', 'TAT': 'TAT', 'TCA': 'AGC', 'TCC': 'TCA', 'TCG': 'TCT', 'TCT': 'TCA', 'TGA': 'TGA', 'TGC': 'TGC', 'TGG': 'TGG', 'TGT': 'TGC', 'TTA': 'TTA', 'TTC': 'TTC', 'TTG': 'CTG', 'TTT': 'TTT'}
DEBUG:root:N==0
DEBUG:root:N==0
DEBUG:root:N==0
DEBUG:root:N==0
DEBUG:root:ATGGCGGCGACGACCAT

There is no restriction site for EcoRI in the harmonized sequence. 
Number of substitution = 0
There is no restriction site for XbaI in the harmonized sequence. 
Number of substitution = 0
There is no restriction site for PstI in the harmonized sequence. 
Number of substitution = 0
There is no restriction site for SpeI in the harmonized sequence. 
Number of substitution = 0
>gene_seq (src:Methylococcus capsulatus-tgt:Ecoli) (CHI_0=0.16076274131757531-CHI=0.09779542970346927)
ATGGCGGCGACGACCATTGGTGGTGCGGCGGCCGCCGAAGCCCCGCTGCTGGACAAAAAATGGTTAACCT
TCGCGCTGGCCATTTACACCGTTTTCTACCTGTGGGTTCGTTGGTACGAAGGTGTTTATGGCTGGTCAGC
GGGACTGGACTCTTTCGCCCCGGAGTTCGAGACCTACTGGATGAACTTCCTGTACACCGAGATCGTTCTG
GAGATCGTTACCGCGTCTATCCTGTGGGGCTATTTATGGAAAACCCGCGACCGCAATCTGGCGGCCCTGA
CCCCGCGTGAAGAGCTGCGCCGCAATTTCACCCACCTGGTTTGGCTGGTTGCGTACGCGTGGGCGATCTA
CTGGGGCGCGTCATACTTCACCGAGCAAGACGGCACCTGGCATCAAACCATCGTTCGCGACACCGACTTC
ACCCCGTCTCACATCATCGAGTTCTATCTGAGCTACCCGATCTACATCATCACCGGTTTTGCCGCCTTCA
TCTACGCGAAAACCCGTCTGCCG

DEBUG:root:ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/309/565/GCA_000309565.2_ASM30956v2/GCA_000309565.2_ASM30956v2_cds_from_genomic.fna.gz


Beginning file download with wget module
 82% [..............................................................              ] 729088 / 885118

DEBUG:root:codonharmonizer Lactobacillus_casei.cds.fna --write_freqs > Lactobacillus_casei.freq.csv



DEBUG:root:{'AAA': 'AAA', 'AAC': 'AAT', 'AAG': 'AAA', 'AAT': 'AAC', 'ACA': 'ACG', 'ACC': 'ACC', 'ACG': 'ACC', 'ACT': 'ACG', 'AGA': 'CGA', 'AGC': 'AGC', 'AGG': 'AGA', 'AGT': 'AGC', 'ATA': 'ATA', 'ATC': 'ATC', 'ATG': 'ATG', 'ATT': 'ATT', 'CAA': 'CAG', 'CAC': 'CAC', 'CAG': 'CAA', 'CAT': 'CAT', 'CCA': 'CCG', 'CCC': 'CCC', 'CCG': 'CCG', 'CCT': 'CCA', 'CGA': 'CGG', 'CGC': 'CGC', 'CGG': 'CGT', 'CGT': 'CGT', 'CTA': 'TTG', 'CTC': 'TTA', 'CTG': 'CTG', 'CTT': 'TTA', 'GAA': 'GAA', 'GAC': 'GAC', 'GAG': 'GAG', 'GAT': 'GAT', 'GCA': 'GCG', 'GCC': 'GCG', 'GCG': 'GCC', 'GCT': 'GCG', 'GGA': 'GGA', 'GGC': 'GGC', 'GGG': 'GGG', 'GGT': 'GGT', 'GTA': 'GTA', 'GTC': 'GTT', 'GTG': 'GTT', 'GTT': 'GTG', 'TAA': 'TAA', 'TAC': 'TAC', 'TAG': 'TGA', 'TAT': 'TAT', 'TCA': 'AGC', 'TCC': 'TCA', 'TCG': 'TCT', 'TCT': 'TCA', 'TGA': 'TGA', 'TGC': 'TGC', 'TGG': 'TGG', 'TGT': 'TGC', 'TTA': 'TTA', 'TTC': 'TTC', 'TTG': 'CTG', 'TTT': 'TTT'}
DEBUG:root:N==0
DEBUG:root:552
DEBUG:root:divisible by 3
DEBUG:root:      aa  count      fre

There is a RS GAATTC for the enzyme EcoRI at position 552
There is no restriction site for XbaI in the harmonized sequence. 
Number of substitution = 1
There is a RS CTGCAG for the enzyme PstI at position 87
There is no restriction site for SpeI in the harmonized sequence. 
Number of substitution = 2
>gene_seq (src:Lactobacillus casei-tgt:Ecoli) (CHI_0=0.22107595408498684-CHI=0.07397428618317337)
GTTGCGAGCATTACCGATAAAGATCACCAGAAAGTGATTTTAGTGGGTGACGGCGCGGTGGGTAGCAGCT
ATGCGTATGCGATGGTACTGCAGGGTATTGCGCAGGAAATCGGGATCGTGGACATTTTTAAAGACAAAAC
CAAAGGTGACGCCATTGACTTAAGCAATGCCCTGCCGTTCACCAGCCCGAAAAAAATTTATAGCGCGGAA
TACAGCGATGCGAAAGATGCGGATCTGGTGGTGATCACGGCGGGTGCGCCACAAAAACCGGGCGAAACCC
GCCTGGATCTGGTGAATAAAAATCTGAAAATCCTGAAATCAATTGTGGATCCGATTGTTGATTCAGGCTT
TAATGGTATCTTCCTGGTGGCGGCGAATCCGGTGGATATCCTGACCTATGCGACGTGGAAATTATCAGGC
TTCCCGAAAAATCGTGTGGTGGGTAGCGGTACGAGCCTGGATACCGCGCGTTTCCGTCAATCAATTGCGG
AAATGGTGAATGTGGATGCGCGTTCTGTTCACGCGTACATCATGGGTGAACATGGTGACACGGAATTCCC
AGTATGGAGCCACGCGAATATCGGTGGCGTGA

In [5]:
out_harm

{'Model 1': {'1.1.1.244': '>gene_seq (src:Bacillus methanolicus-tgt:Ecoli) (CHI_0=0.18682553084640247-CHI=0.06501796238204563)\nATGAAAAATACGCAGAGCGCGTTTTACATGCCAAGCGTTAACTTGTTTGGTGCGGGCTCAGTGAACGAGG\nTGGGAACGCGGTTAGCGGGTTTAGGTGTTAAAAAAGCGTTATTAGTGACGGATGCGGGTTTACACAGCTT\nAGGCTTATCAGAAAAAATTGCGGGTATCATTCGTGAAGCGGGTGTTGAAGTAGCGATTTTTCCGAAAGCG\nGAACCGAACCCGACGGATAAAAATGTTGCGGAAGGTTTAGAAGCCTATAATGCGGAAAATTGCGACAGCA\nTTGTTACGTTAGGCGGCGGAAGCAGCCATGATGCGGGAAAAGCGATTGCGTTAGTAGCGGCGAATGGTGG\nAACGATTCACGATTATGAAGGTGTTGATGTAAGCAAAAAACCGATGGTTCCATTGATTGCCATTAACACG\nACGGCGGGTACGGGCAGCGAATTAACGAAATTCACGATCATCACGGATACGGAACGCAAAGTTAAAATGG\nCGATTGTGGATAAACATGTAACGCCAACGTTAAGCATCAACGACCCGGAGTTGATGGTGGGAATGCCACC\nGTCATTAACGGCGGCGACGGGATTAGATGCGTTAACGCATGCCATTGAAGCGTATGTGAGCACGGGTGCG\nACGCCGATTACGGATGCGTTAGCGATTCAAGCCATCAAAATTATTTCAAAATACCTGCCGCGTGCGGTGG\nCGAACGGAAAAGACATTGAAGCGCGTGAACAGATGGCGTTCGCGCAGAGCTTAGCGGGCATGGCGTTCAA\nCAATGCCGGTTTAGGCTATGTGCATGCCATTGCGCACCAGTTAGGAGGATTCTACAATTTCCCACATGGC\nGTGTGCAA
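The DEBUG output of the run prints a codon-to-codon dictionary. One way to read it, sketched below, is as a per-codon substitution table applied to the source sequence. This is an assumption about how the map is used, not the actual codonharmonizer implementation, and `apply_codon_map` is a hypothetical helper. The first seven codons of BBa_K1619001 are consistent with this reading: they map onto the start of the harmonized Bacillus methanolicus sequence.

```python
def apply_codon_map(sequence, codon_map):
    """Replace each codon of `sequence` via `codon_map`; unknown codons are kept."""
    if len(sequence) % 3 != 0:
        raise ValueError("sequence length is not divisible by 3")
    codons = (sequence[i:i + 3] for i in range(0, len(sequence), 3))
    return "".join(codon_map.get(codon, codon) for codon in codons)


# Subset of the map printed in the DEBUG output.
codon_map = {"ATG": "ATG", "AAA": "AAA", "AAC": "AAT", "ACT": "ACG",
             "CAA": "CAG", "AGT": "AGC", "GCA": "GCG"}

# First seven codons of BBa_K1619001 (Bacillus methanolicus).
print(apply_codon_map("ATGAAAAACACTCAAAGTGCA", codon_map))
# ATGAAAAATACGCAGAGCGCG
```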

## Considerations
##### It works: 

The printed output of the *run_harmonization_per_model* function shows that the final sequence no longer contains the two RSs found in the harmonized sequence (EcoRI at position 552 and PstI at position 87 in the Lactobacillus casei gene).
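A check of this kind can also be done independently of the pipeline. The sketch below scans a sequence for the four recognition sites named in the log (EcoRI, XbaI, PstI, SpeI). The helper names are hypothetical. Because all four sites are palindromic, scanning one strand is enough. The demo uses the first two lines of the harmonized Lactobacillus casei sequence printed above, with a shortened header, where the log reports a PstI site at position 87.

```python
RESTRICTION_SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "PstI": "CTGCAG",
    "SpeI": "ACTAGT",
}


def fasta_to_sequence(fasta_text):
    """Drop '>' header lines and join the remaining lines into one string."""
    return "".join(line.strip() for line in fasta_text.splitlines()
                   if not line.startswith(">"))


def find_restriction_sites(sequence, sites=RESTRICTION_SITES):
    """Return {enzyme: [0-based positions]} for every site found in `sequence`."""
    sequence = sequence.upper()
    hits = {}
    for enzyme, motif in sites.items():
        positions = []
        start = sequence.find(motif)
        while start != -1:
            positions.append(start)
            start = sequence.find(motif, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


fasta = (
    ">gene_seq\n"
    "GTTGCGAGCATTACCGATAAAGATCACCAGAAAGTGATTTTAGTGGGTGACGGCGCGGTGGGTAGCAGCT\n"
    "ATGCGTATGCGATGGTACTGCAGGGTATTGCGCAGGAAATCGGGATCGTGGACATTTTTAAAGACAAAAC\n"
)
print(find_restriction_sites(fasta_to_sequence(fasta)))
# {'PstI': [87]}
```

Applied to each FASTA string stored in `out_harm`, an empty dictionary would confirm that the final sequence is free of these sites.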

