# Introduction

This notebook serves as a basic template and walkthrough for building the submission CSV file used to generate Hyak simulation data.

First things first: we need to define what we are working with. For this example I will define a left and right context, then combine multiple parts and insulating sequences into a simulation dictionary.

In [1]:
#Context sequences
fiveflank = "CTTACGATGTTCCAGATTACGCTCCCATGGTGAGCAAGGGCGAGGAGCTGTTCACCGGGGTGGTGCCCATCCTGGTCGAGCTGGACGGCGACGTAAACGGCCACAAGTTCAGCGTGTCCGGCGAGGGCGAGGTGAGTCTATGGGACCCTTGATGTTTTCTTTCCCCTTCTTTTCTATGGTTAAGTTCATGTCATAGGAAGGGGAGAAGTAACAGGGTACAGTTTAGAATGGGAAACAGACGAATGATTGCATCAGTGTGGAAGTCTCAGGATCGTTTTAGTTTCTTTTATTTGCTGTTCATAACAATTggatcc"
threeflank = "ggatccGTTTTCTTTTGTTTAATTCTTGCTTTCTTTTTTTTTCTTCTCCGCAATTTTTACTATTATACTTAATGCCTTAACATTGTGTATAACAAAAGGAAATATCTCTGAGATACATTAAGTAACTTAAAAAAAAACTTTACACAGTCTGCCTAGTACATTACTATTTGGAATATATGTGTGCTTATTTGCATATTCATAATCTCCCTACTTTATTTTCTTTTATTTTTAATTGATACATAATCATTATACATATTTATGGGTTAAAGTGT"
#part sequences
parts = {'ASBV-1': 'GGGACGGGCCAUCAUCUAUCCCUGAAGAGACGAAGGCUUCGGCCAAGUCGAAACGGAAACGUCGGAUAGUCGCCCGUCCC',
         'sTRSV-2': 'GGGCCUGUCACCGGAUGUGCUUUCCGGUCUGAUGAGUCCCUGAAAUGGGACGAAACAGGCCC',
         'Sman': 'GGGCGAAAGCCGGCGCGUCCUGGAUUCCACUGCUUCGGCAGGUACAUCCAGCUGAUGAGUCCCAAAUAGGACGAAACGCGCU',
         'ASBV-3': 'GGGACGGGCCAUCAUCUAUCCCUGAAGAGACGAAGGCUUCGGCCUCGUCGAAACGGAAACGUGGGAUAGUCGCCCGUCCC'}
#insulating sequences: the first two entries of each list are the 5' and 3' insulators
insulatingsequences = {'ASBV-1': ['CUGACCAACCCAUA', 'AAUAGUAACCAAAC', '044', '017'],
                       'ASBV-3': ['UGGGAGAAAUAGUAC', 'UGUGGAACAAACG', '043', '017'],
                       'Sman': ['CGAGAGAACACAUGA', 'AAAAAAAACAA', '030', '007'],
                       'sTRSV-2': ['UGCUAGCGAUGCGC', 'CUGCGUAAACG', '033', '017']}

import pickle
#Now let's load the reference structures
#(the with-block closes the file once the data is loaded)
with open('exampledata/reference_part_structures.p', 'rb') as handle:
    referencedata = pickle.load(handle)
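
Before building anything, it can help to confirm that every part has matching insulating sequences and that all sequences are RNA. The following is a minimal sketch under assumptions: `check_inputs` and `RNA_ALPHABET` are hypothetical names, not part of pyrfold, and only the first two entries of each insulating list are checked because the notebook does not describe the last two.

```python
RNA_ALPHABET = set("ACGU")


def check_inputs(parts, insulatingsequences):
    """Return a list of human-readable problems; an empty list means no problems."""
    problems = []
    # Names present in only one of the two dictionaries
    for name in sorted(set(parts) ^ set(insulatingsequences)):
        problems.append("%s is missing from one of the dictionaries" % name)
    # Names present in both: check the characters of each sequence
    for name in sorted(set(parts) & set(insulatingsequences)):
        five_prime, three_prime = insulatingsequences[name][:2]
        for label, seq in (("part", parts[name]),
                           ("5' insulator", five_prime),
                           ("3' insulator", three_prime)):
            bad = set(seq.upper()) - RNA_ALPHABET
            if bad:
                problems.append("%s %s has non-RNA characters: %s"
                                % (name, label, "".join(sorted(bad))))
    return problems


# One entry copied from the cell above keeps the example short.
example_parts = {
    'sTRSV-2': 'GGGCCUGUCACCGGAUGUGCUUUCCGGUCUGAUGAGUCCCUGAAAUGGGACGAAACAGGCCC',
}
example_insulators = {
    'sTRSV-2': ['UGCUAGCGAUGCGC', 'CUGCGUAAACG', '033', '017'],
}
print(check_inputs(example_parts, example_insulators))
```

In the notebook itself the call would be `check_inputs(parts, insulatingsequences)`.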

These devices are going to be folded multiple times over different windows and at different rates; as a consequence, every submission needs a unique name.

Now we have to submit these files for simulation. Because we are going to run multiple rounds of simulations, we have to submit them in a specific file hierarchy:
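
One way to get a unique name per submission is to encode the part, the window and the rate in the name. This is only a sketch: `make_fold_name` and the `part_wSTART-STOP_rRATE` pattern are hypothetical conventions for illustration, not something pyrfold requires.

```python
def make_fold_name(part, windowstart, windowstop, polrate):
    """Build a name that differs whenever the part, window or rate differs."""
    return "%s_w%d-%d_r%d" % (part, windowstart, windowstop, polrate)


# The same part folded over two windows at two rates gives four distinct names.
names = [make_fold_name('Sman', 1, stop, rate)
         for stop in (60, 108)
         for rate in (20, 30)]
assert len(names) == len(set(names))
print(names)
```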

>EXPERIMENTFOLDER
>>PYRFOLD (THIS IS THE LOWER LEVEL PYRFOLD)

>>submission.py

>>submission_file.csv
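
The layout above can be checked before anything is copied to Hyak. The sketch below assumes the entry names are exactly those listed (`PYRFOLD`, `submission.py`, `submission_file.csv`); `missing_entries` is a hypothetical helper, and the demonstration runs in a temporary directory rather than a real experiment folder.

```python
import os
import tempfile


def missing_entries(experiment_dir):
    """Return the names from the expected layout that are absent from experiment_dir."""
    missing = []
    if not os.path.isdir(os.path.join(experiment_dir, "PYRFOLD")):
        missing.append("PYRFOLD")
    for filename in ("submission.py", "submission_file.csv"):
        if not os.path.isfile(os.path.join(experiment_dir, filename)):
            missing.append(filename)
    return missing


# Demonstration: a folder that holds only an (empty) PYRFOLD directory.
demo_dir = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_dir, "PYRFOLD"))
print(missing_entries(demo_dir))
```

An empty list means the folder matches the layout by name; it does not check the contents of the files.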



#Building the submission.csv file#

We are going to use the **object** FoldingSubData

    pyrfold.FoldingSubData(self, name, sequence, windowstart=0, windowstop=0, partstartstoplist=[], partnamelist=[], referencepart=None, forcedhelixes=[], polrate=30, foldtimeafter=1, experimenttype=2, pseudoknots=0, entanglements=0, numberofsimulations=10, helix_min_free_eng=6.3460741)
    
##Step 1 define a dictionary##

In [5]:
submission_dict = {}

##Step 2 iterate through all components and populate dictionary##

In [6]:
import pyrfold  #needed here if the cells are run top to bottom

for part, partsequence in parts.items():
    #This iterates through the ribozyme part examples that are loaded in earlier
    
    #we can pull the specific insulating sequence
    temp_ins_seq = insulatingsequences[part]
    temp_five_prime_ins = temp_ins_seq[0]
    temp_three_prime_ins = temp_ins_seq[1]
    
    seq_to_fold = temp_five_prime_ins + partsequence + temp_three_prime_ins
    
    #Describe all other important variables
    fold_name = part
    windowstart = 1
    windowstop = len(seq_to_fold)
    polrate = 20
    
    
    #ADD the sequence and all important information to submission dict
    # right now the 'key' name of the submission dict should be the name ascribed to the folding sub object
    temp = pyrfold.FoldingSubData(name=fold_name, sequence=seq_to_fold,
                                  windowstart=windowstart, windowstop=windowstop,
                                  polrate=polrate, numberofsimulations=30)
    submission_dict[fold_name] = temp
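
To see what the loop assembles without needing pyrfold installed, the same concatenation can be previewed with plain tuples. This is a sketch for inspection only: `preview_submissions` is a hypothetical helper, and it mirrors the loop's choice of a window that starts at 1 and stops at the full sequence length.

```python
def preview_submissions(parts, insulatingsequences):
    """Map each part name to (sequence to fold, windowstart, windowstop)."""
    preview = {}
    for part, partsequence in parts.items():
        # First two entries are the 5' and 3' insulating sequences
        five_prime_ins, three_prime_ins = insulatingsequences[part][:2]
        seq_to_fold = five_prime_ins + partsequence + three_prime_ins
        preview[part] = (seq_to_fold, 1, len(seq_to_fold))
    return preview


# One entry copied from the first cell keeps the example short.
example_parts = {
    'sTRSV-2': 'GGGCCUGUCACCGGAUGUGCUUUCCGGUCUGAUGAGUCCCUGAAAUGGGACGAAACAGGCCC',
}
example_insulators = {
    'sTRSV-2': ['UGCUAGCGAUGCGC', 'CUGCGUAAACG', '033', '017'],
}
preview = preview_submissions(example_parts, example_insulators)
for name in sorted(preview):
    seq, start, stop = preview[name]
    print("%s window %d-%d starts with %s" % (name, start, stop, seq[:14]))
```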

In [7]:
submission_dict

{'ASBV-1': <pyrfold.foldingsub.FoldingSubData at 0x7f98a155d650>,
 'ASBV-3': <pyrfold.foldingsub.FoldingSubData at 0x7f98a155d690>,
 'Sman': <pyrfold.foldingsub.FoldingSubData at 0x7f98a155d710>,
 'sTRSV-2': <pyrfold.foldingsub.FoldingSubData at 0x7f98a155d6d0>}

##Step 3 write the csv to file##

In [9]:
pyrfold.pyrfile.filled_in_form(filename = 'test_submissiop', devicenametosubobj=submission_dict)

Now that this is built, we can move everything else into the file structure manually.

After this we can move the folder to Hyak and then start processing the data. **NOTE** one must use the submission_directory.py script since there is an additional directory level.

In [3]:
import pyrfold

In [4]:
pyrfold.FoldingSubData?

In [8]:
pyrfold.pyrfile.filled_in_form?