# Training and testing LSTM50 on the yeast genome 

This short Jupyter notebook takes you through the main steps we carried out for our LSTM50 model on yeast.
The code was originally written in Python 2.7, but has been amended for Python 3.6. We ran it in Google Colab, where a GPU is already set up and ready to run the training and prediction of the model. If you want to run this on your own server/machine, you need to install Keras (we used v 2.1.6) and Tensorflow (we used tensorflow_gpu-1.12.0-cp27-none-linux_x86_64.whl).
It may be easiest to create an environment in which you install these, and then select that environment 
as the kernel for this notebook (making your environment available among the kernels in your Jupyter session takes a few extra installation steps; there is plenty of help online).

The commands below call functions in various Python modules that we created for our purposes. All 
these modules are included in the home GitHub repo of this notebook. So to run the code you must download/clone
the repo contents to a folder on your machine/server.

If you run the notebook in Google Colab you can place all code in a Google drive and then mount the drive. This makes the code available in the Colab session (and that's what we do below). Output is also placed there, so you may need a few GB of free space on your drive!

The data needed here is the yeast genome, which you can download from various public sources, e.g. the UCSC 
genome site (http://www.genome.ucsc.edu/). 

We first mount a Google drive and then set some paths to where code and genome data are placed, and where to place output (you'll need to create these folders in your Google drive):

In [3]:
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

Mounted at /content/drive


In [4]:
# Insert the directory
import sys
sys.path.insert(0,"/content/drive/MyDrive/DNA_proj/development/")
sys.path.insert(0,"/content/drive/MyDrive/DNA_proj/data/")
sys.path.insert(0,"/content/drive/MyDrive/DNA_proj/results/training/")
sys.path.insert(0,"/content/drive/MyDrive/DNA_proj/results/predictions/")
sys.path.insert(0,"/content/drive/MyDrive/DNA_proj/results/GCbias/")
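The folders referenced above have to exist before anything is read from or written to them. A minimal sketch for creating them in one go, assuming the folder layout used in this notebook; the helper name `make_project_dirs` is ours for illustration and is not part of the repo:

```python
import os

def make_project_dirs(base_dir):
    """Create the project folders used in this notebook under base_dir.

    Hypothetical helper (not part of the repo). Returns the list of folder paths.
    """
    sub_dirs = [
        "development",
        "data",
        os.path.join("results", "training"),
        os.path.join("results", "predictions"),
        os.path.join("results", "GCbias"),
    ]
    paths = [os.path.join(base_dir, d) for d in sub_dirs]
    for p in paths:
        # exist_ok=True makes the call safe to repeat
        os.makedirs(p, exist_ok=True)
    return paths

# In Colab, after mounting the drive, one would call e.g.:
# make_project_dirs("/content/drive/MyDrive/DNA_proj")
```

The code modules then go into `development` and the genome file into `data`.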


In [11]:
rootGenome = "/content/drive/MyDrive/DNA_proj/data/"
fileName = "S288C_reference_sequence_R64-1-1_20110203.fsa"
fileGenome = rootGenome + fileName
chromoList = ['R64_chr1', 'R64_chr2', 'R64_chr3', 'R64_chr4', 'R64_chr5', 'R64_chr6', 'R64_chr7', 'R64_chr8','R64_chr9', 'R64_chr10', 'R64_chr11', 'R64_chr12','R64_chr13', 'R64_chr14', 'R64_chr15', 'R64_chr16','R64_chr17' ]
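The chromosome list can also be generated rather than typed out. A small sketch, assuming all 17 names follow the `R64_chr<number>` pattern used above:

```python
# Build the 17 chromosome names R64_chr1 ... R64_chr17 programmatically
chromoList = ['R64_chr' + str(i) for i in range(1, 18)]
print(len(chromoList), chromoList[0], chromoList[-1])
```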


For training and validating the model we split the genome into two truly non-overlapping parts (you can skip this and just use e.g. an 80-20 split for training/validation).

In [4]:
import dnaNet_dataGen as dataGen

In [5]:
rootOutput =rootGenome
ext = '.txt'
chromoFileList = []
for ch in chromoList:
    chromoFileList.append( rootOutput + ch + ext )
fileNamePart1 =  r'/part1.txt'
fileNamePart2 =  r'/part2.txt'
splitNrPositions = [40000, 10000 ]
dataGen.splitChromoStrings(chromoFileList = chromoFileList, splitNrPositions = splitNrPositions,lineLength = 60,  rootOutput = rootOutput, fileNamePart1 = fileNamePart1, fileNamePart2 = fileNamePart2)


doneTheseChromos for part 1:  ['/content/drive/MyDrive/DNA_proj/data/R64_chr1.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr2.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr3.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr4.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr5.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr6.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr7.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr8.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr9.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr10.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr11.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr12.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr13.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr14.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr15.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr16.txt', '/content/drive/MyDrive/DNA_proj/data/R64_chr17.txt']
Number of chromos in vs out:  17 17
doneTheseChromos for par
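
To give an idea of what a non-overlapping split looks like, here is a toy sketch. It assumes that `splitNrPositions = [40000, 10000]` means alternating blocks, with 40000 positions going to part 1 followed by 10000 positions going to part 2; this reading is our assumption, and the function `split_alternating` is illustrative only, not the code in dnaNet_dataGen:

```python
def split_alternating(seq, n1, n2):
    """Split seq into two non-overlapping parts by alternating blocks.

    Blocks of n1 positions go to part 1, the following n2 positions go to
    part 2, and so on. Illustrative only; not the repo's splitChromoStrings.
    """
    part1, part2 = [], []
    i = 0
    while i < len(seq):
        part1.append(seq[i:i + n1])
        part2.append(seq[i + n1:i + n1 + n2])
        i += n1 + n2
    return ''.join(part1), ''.join(part2)

p1, p2 = split_alternating("ABCDEFGHIJ", 3, 2)
print(p1, p2)  # ABCFGH DEIJ
```

No position ends up in both parts, which is the property that matters for a clean validation.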

For defining and training the model we will use a function called allInOneWithDynSampling_ConvLSTMmodel, which is placed in 
our dnaNet_LSTM module. The function is a 'wrapper': it covers the definition of the model and the configuration of the 
training process (and some more). It has a long list of arguments, of which we use only some here. If you want to go deeper into its
details, you are best off reading the code. Once you have come to terms with it, you will also know how to define new 
model types and plug them into this wrapper. For now, let us first set the model details. 

The model architecture that we will use is a 'convolutional-LSTM'. We have coded our version in the function makeConv1DLSTMmodel
(in dnaNet_LSTM). This function is then called from within allInOneWithDynSampling_ConvLSTMmodel with whatever argument values we
have supplied. We want a model with flank size 50, an initial convolutional layer using 256 filters of size 4 (nrFilters and lengthWindows below), two bidirectional LSTM layers where the first is sequence-to-sequence, and a final dense layer with 50 units. So we set (don't worry about the many unexplained arguments):

In [6]:
usedThisModel = 'makeConv1DLSTMmodel'
customFlankSize = 50
#convo:
overlap = 0
pool_b = 0
poolAt = [1, 3] #not in use
maxPooling_b = 0
poolStrides = 1
lengthWindows = [4]
nrFilters = [256] 
filterStride = 1
padding = 'valid'
#lstm layers:
nrOfParallelLSTMstacks = 1 #parallel LSTMs
nrLSTMlayers = 1 #OBS: the run data file will record nrLSTMlayers as this number plus 1 if summarizingLSTMLayer_b == 1
summarizingLSTMLayer_b = 1
LSTMFiltersByLastConvFilters_b = 1
nrLSTMFilters = [-1]  #-1: just placeholder to be recorded in runData file
tryAveraging_b = 0
#Final dense layers:
finalDenseLayers_b = 1
hiddenUnits = [50]
inclFrqModel_b = 0
insertFrqModel_b = 0
#Additional
exonicInfoBinaryFileName  = ''
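
As a quick check of the sizes involved: with `padding = 'valid'`, a 1D convolution with window length k and stride s turns an input of length L into floor((L - k)/s) + 1 positions. The sketch below applies this to the settings above. It assumes that the model input is the two flanks concatenated (2 x 50 = 100 positions); that input layout is our assumption, so check makeConv1DLSTMmodel for the actual shape:

```python
def conv1d_valid_output_length(input_length, window, stride=1):
    """Output length of a 1D convolution with 'valid' padding."""
    return (input_length - window) // stride + 1

customFlankSize = 50
lengthWindows = [4]
filterStride = 1

# Assumption: left and right flank concatenated, centre position left out
input_length = 2 * customFlankSize
out_len = conv1d_valid_output_length(input_length, lengthWindows[0], filterStride)
print(out_len)  # 97
```

Under that assumption the LSTM layers would see a sequence of 97 positions, each with nrFilters[0] = 256 features.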

Next we set the configuration for the training process. First some sizes: here we let the training consist of 20 rounds ('repeats'), each of 10 'epochs', where each epoch consists of 100 steps of loading a batch of 500 samples and updating the model parameters on it (the last four parameters are not in use, so they are just set to placeholder values). Each round thus comprises training on 10 x 100 x 500 = 500000 randomly picked samples; at completion of each round a validation is run on 100000 samples (nrTestSamples): 

In [13]:
#In anger (yeast):
nrOuterLoops = 1
firstIterNr = 0
nrOfRepeats = 20
firstRepeatNr = 0 #if = n > 0: loads in model from repeatNr n-1 
testDataIntervalIdTotrainDataInterval_b = 1
trainTestSplitRatio = 0.8 #not used when we run training in splitExercise mode
nrEpochs = 10 
batchSize = 500
stepsPerEpoch = 100
trainDataIntervalStepSize = 0 
trainDataInterval = [0,15000000]
nrTestSamples = 100000
testDataInterval = [10000000,-12000000]
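
The sizes above can be checked with a few lines of arithmetic. This is only a sketch of the nominal numbers: samples are picked at random, so the same position may be drawn more than once. The number of validation steps mirrors the `nrTestSamples/batchSize` expression that appears in the wrapper's printed output during testing:

```python
nrOfRepeats = 20
nrEpochs = 10
stepsPerEpoch = 100
batchSize = 500
nrTestSamples = 100000

# Training samples seen in one round ('repeat') and in the whole run
samples_per_repeat = nrEpochs * stepsPerEpoch * batchSize
total_training_samples = nrOfRepeats * samples_per_repeat
# Number of batches evaluated in each validation
validation_steps = nrTestSamples // batchSize

print(samples_per_repeat)      # 500000
print(total_training_samples)  # 10000000
print(validation_steps)        # 200
```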


Second, still for the training process, we set some further parameters. The most important are which optimizer to use, the initial learning rate, and whether or not to add the reverse complement of each sample to the batch:


In [14]:
#!!!:
augmentWithRevComplementary_b = 0 
optimizer = 'ADAM' 
learningRate = 0.001 
chromoNameBound = 1000 
labelsCodetype = 0 #1: base pair type prediction
# 
dynSamplesTransformStyle_b = 0 
rootFrq = '' 
file = "" 
frqModelFileName = rootFrq + file 
flankSizeFrqModel = -1 
exclFrqModelFlanks_b = 0

## 
dropout_b = 0 
dropoutVal = 0.0 
momentum = 0.1 #but we use Adam here, so the value here isn't used
onlyOneRandomChromo_b = 0 
avoidChromo = [] 
on_binf_b = 1
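
For readers unfamiliar with the term: the reverse complement of a DNA sequence is the sequence read on the opposite strand, i.e. reversed and with A/T and C/G swapped. The sketch below shows the concept on plain strings only; the wrapper works on encoded samples, and the function here is illustrative, not the repo's implementation:

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T; N kept as N)."""
    complement = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'N': 'N'}
    return ''.join(complement[base] for base in reversed(seq.upper()))

print(reverse_complement("AACG"))  # CGTT
```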

Finally, set a model name and a path for where to place the output:

In [15]:
subStr = '1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00'

modelName = 'modelLSTM_' + subStr
modelDescr = subStr

rootOutput =  r"/content/drive/MyDrive/DNA_proj/results/training/"
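
Judging from the log output of the training run, one model is stored per round under a path of the form rootOutput + modelName + '_bigLoopIter<i>_repeatNr<r>'. A small sketch for building these path stems, e.g. for locating a saved model later; the helper `model_path_stem` is hypothetical, and any file extensions the wrapper adds are not covered:

```python
def model_path_stem(rootOutput, modelName, iterNr, repeatNr):
    """Path stem of the model saved for a given outer iteration and repeat.

    Hypothetical helper; the pattern is read off the wrapper's log output.
    """
    return rootOutput + modelName + '_bigLoopIter' + str(iterNr) + '_repeatNr' + str(repeatNr)

rootOutput = r"/content/drive/MyDrive/DNA_proj/results/training/"
modelName = 'modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00'
print(model_path_stem(rootOutput, modelName, 0, 0))
```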


Before we can call the allInOneWithDynSampling_ConvLSTMmodel wrapper, we must make sure that we have a GPU running for us and of course load the Python module holding the wrapper:

In [10]:
%tensorflow_version 2.x
import tensorflow as tf
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))


Found GPU at: /device:GPU:0


In [5]:
import dnaNet_LSTM_v3_py3 as dnaNet

In [12]:
#This cell is just for the case that we have changed the code and need to reload
#(importlib replaces the deprecated imp module):
import importlib
importlib.reload(dnaNet)

<module 'dnaNet_LSTM_v3_py3' from '/content/drive/MyDrive/DNA_proj/development/dnaNet_LSTM_v3_py3.py'>

So now we are all set to run the wrapper with all the values we have set for its arguments above. The time this training takes depends on the GPU and our access to it (one or two hours is not unrealistic). When running, the wrapper writes files to the /results/training directory, which we set right above as rootOutput (see also top where we mounted the drive); in each round files defining the model as well as some keeping track of the training-results are stored; a 'runData.txt' file records the settings of the run.
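
While running, the wrapper prints lines such as 'Test score: ...' and 'Test accuracy: ...' after each round. If you save that printed output as text, a sketch like the following can collect the validation results per round. The function name and the idea of keeping the output in a string are ours, not part of the repo; the example values are copied from the output of the run shown in this notebook:

```python
import re

def parse_test_results(log_text):
    """Extract (score, accuracy) pairs from the wrapper's printed output."""
    number = r'([0-9.eE+-]+)'
    scores = [float(x) for x in re.findall(r'^Test score:\s*' + number, log_text, flags=re.M)]
    accs = [float(x) for x in re.findall(r'^Test accuracy:\s*' + number, log_text, flags=re.M)]
    return list(zip(scores, accs))

example_log = """Now testing ...
Test score: 1.3357731103897095
Test accuracy: 0.3591499924659729
Epoch 2/10
Test score: 1.3285356760025024
Test accuracy: 0.365119993686676
"""
results = parse_test_results(example_log)
for repeatNr, (score, acc) in enumerate(results):
    print(repeatNr, score, acc)
```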

A note: if you would rather run an 80-20 split version, you don't need to split the genome. When running the wrapper, leave splitExercise_b out (or set it to 0), let fileGenome point to the full genome file, and leave the genomeFileName_forVal argument (fileGenome_forVal below) out (or set it to ' ').

In [13]:
fileGenome = r"/content/drive/MyDrive/DNA_proj/data/part1.txt"
fileGenome_forVal = r"/content/drive/MyDrive/DNA_proj/data/part2.txt"
labelsCodetype = 0 #1: base pair type prediction
dnaNet.allInOneWithDynSampling_ConvLSTMmodel(
    splitExercise_b = 1, genomeFileName_forVal = fileGenome_forVal, rootOutput = rootOutput,
    usedThisModel = usedThisModel, labelsCodetype = labelsCodetype,
    nrOuterLoops = nrOuterLoops, firstIterNr = firstIterNr, nrOfRepeats = nrOfRepeats, firstRepeatNr = firstRepeatNr,
    convLayers_b = 1, overlap = overlap, learningRate = learningRate, momentum = momentum,
    genomeFileName = fileGenome, chromoNameBound = chromoNameBound, trainTestSplitRatio = trainTestSplitRatio,
    customFlankSize = customFlankSize, inclFrqModel_b = inclFrqModel_b, insertFrqModel_b = insertFrqModel_b,
    exclFrqModelFlanks_b = exclFrqModelFlanks_b, frqModelFileName = frqModelFileName, flankSizeFrqModel = flankSizeFrqModel,
    modelName = modelName,
    testDataIntervalIdTotrainDataInterval_b = testDataIntervalIdTotrainDataInterval_b,
    trainDataIntervalStepSize = trainDataIntervalStepSize, trainDataInterval0 = trainDataInterval,
    nrTestSamples = nrTestSamples, testDataInterval = testDataInterval, genSamples_b = 1,
    nrOfParallelLSTMstacks = nrOfParallelLSTMstacks, lengthWindows = lengthWindows, nrLSTMlayers = nrLSTMlayers,
    summarizingLSTMLayer_b = summarizingLSTMLayer_b, LSTMFiltersByLastConvFilters_b = LSTMFiltersByLastConvFilters_b,
    nrLSTMFilters = nrLSTMFilters, finalDenseLayers_b = finalDenseLayers_b, hiddenUnits = hiddenUnits,
    nrFilters = nrFilters, padding = padding, filterStride = filterStride, tryAveraging_b = tryAveraging_b,
    pool_b = pool_b, maxPooling_b = maxPooling_b, poolAt = poolAt, poolStrides = poolStrides,
    optimizer = optimizer, dropoutVal = dropoutVal, dropout_b = dropout_b,
    augmentWithRevComplementary_b = augmentWithRevComplementary_b,
    batchSize = batchSize, nrEpochs = nrEpochs, stepsPerEpoch = stepsPerEpoch, shuffle_b = 0, on_binf_b = on_binf_b)


Now at outer iteration:  0
trainDataInterval  [0, 15000000]
testDataInterval  [0, 15000000]
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 1000
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 1000
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 15000001
Genome data file 1st line:
  >ref|NC_001133| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=I]

Found data for this chromosome: ref|NC_001133| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=I]
1 1 1
Found data for this chromosome: ref|NC_001134| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=II]
Found data for this chromosome: ref|NC_001135| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=III]
Found data for this chromosome: ref|

  super(Adam, self).__init__(name, **kwargs)


Epoch 2/10


  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3357731103897095
Test accuracy: 0.3591499924659729
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr0
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3285356760025024
Test accuracy: 0.365119993686676
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr1
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3267571926116943
Test accuracy: 0.3661800026893616
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr2
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3219184875488281
Test accuracy: 0.3705199956893921
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr3
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3236052989959717
Test accuracy: 0.3694100081920624
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr4
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3183401823043823
Test accuracy: 0.37060999870300293
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr5
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3176774978637695
Test accuracy: 0.3746500015258789
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr6
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3150471448898315
Test accuracy: 0.374210000038147
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr7
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3148387670516968
Test accuracy: 0.3754499852657318
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr8
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3147965669631958
Test accuracy: 0.3738600015640259
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr9
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3127226829528809
Test accuracy: 0.3768799901008606
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr10
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3095237016677856
Test accuracy: 0.37953001260757446
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr11
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3220796585083008
Test accuracy: 0.3729900121688843
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr12
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.3110098838806152
Test accuracy: 0.38047000765800476
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr13
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125


  score, acc = net.evaluate_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps = np.int(float(nrTestSamples)/batchSize))


Test score: 1.30729341506958
Test accuracy: 0.3800300061702728
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr14
Next: compile it ..
Compiled model ...
Epoch 2/10


  super(Adam, self).__init__(name, **kwargs)
  history = net.fit_generator(myGenerator(customFlankSize,batchSize, oneSided_b, inclFrqModel_b, insertFrqModel_b, labelsCodetype, forTrain_b), steps_per_epoch= stepsPerEpoch, epochs=nrEpochs, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_queue_size=2, workers=1, use_multiprocessing=False,  initial_epoch=1)


Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125
Test score: 1.308707356452942
Test accuracy: 0.3793799877166748
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr15
Next: compile it ..
Compiled model ...
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125
Test score: 1.3103021383285522
Test accuracy: 0.37654998898506165
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr16
Next: compile it ..
Compiled model ...
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125
Test score: 1.3069136142730713
Test accuracy: 0.3824099898338318
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr17
Next: compile it ..
Compiled model ...
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125
Test score: 1.3018290996551514
Test accuracy: 0.382750004529953
I've now reloaded the model from the previous iteration (for test-only: for this repeatNr:  /content/drive/MyDrive/DNA_proj/results/training/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr18
Next: compile it ..
Compiled model ...
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
dict_keys(['loss', 'accuracy'])
Now testing ...
Split-exercise. The data for this test/validation are taken from another source than the train data
Train-test split ratio set to: 0.259125
Test score: 1.3017898797988892
Test accuracy: 0.38317999243736267
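
The test accuracies printed after each repeat can be collected from the log text to see how they develop across repeats. The snippet below is a minimal sketch and not part of the notebook's modules: it assumes the log has been saved as a string (here `logText`, a hypothetical name, filled with the six 'Test accuracy' lines printed above for repeats 13 to 18) and extracts the values with a regular expression.

```python
import re

# Hypothetical variable: the "Test accuracy" lines copied from the log above
logText = """
Test accuracy: 0.3800300061702728
Test accuracy: 0.3793799877166748
Test accuracy: 0.37654998898506165
Test accuracy: 0.3824099898338318
Test accuracy: 0.382750004529953
Test accuracy: 0.38317999243736267
"""

def collectTestAccuracies(text):
    """Return the values following 'Test accuracy:' in the order they appear."""
    return [float(m) for m in re.findall(r"Test accuracy:\s*([0-9.]+)", text)]

accList = collectTestAccuracies(logText)
meanAcc = sum(accList) / len(accList)
print("nr of repeats found:", len(accList))
print("min %.4f, max %.4f, mean %.4f" % (min(accList), max(accList), meanAcc))
```

The same function can be applied to the full log of a run to check whether the test accuracy has levelled off before moving on to prediction.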


To examine the training process you can plot training-validation results with this call: 

In [18]:
import dnaNet_plots as plots

bigLoopIterNr = 0
modelFileNameList = [rootOutput + 'modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00']

batchSizeRatioList = [1]
lastRepeatNrList = [nrOfRepeats -1]
epochsPerRepeatList = [10. -1] #you have to look up nrEpochs per repeat (in the log file for the run, runData); subtract 1 since epoch 1 in each repeat is apparently not saved (the training log starts each repeat at 'Epoch 2/10')

modelNameList = ['LSTM4']
fileNameAcc = 'LSTM4_total_trainTest_acc_vs_allEpochs'
fileNameLoss = 'LSTM4_total_trainTest_loss_vs_allEpochs'
plots.collectivePerfPlot(modelFileNameList = modelFileNameList, batchSizeRatioList = batchSizeRatioList, lastRepeatNrList = lastRepeatNrList, epochsPerRepeatList = epochsPerRepeatList, rootOutput = rootOutput, modelNameList = modelNameList, fileNameAcc= fileNameAcc, fileNameLoss = fileNameLoss )
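
The `-1` in `epochsPerRepeatList` can be checked with a little arithmetic. The sketch below assumes (this is not confirmed by the plotting module) that the plot lays the recorded epochs of all repeats end to end. The value `nrOfRepeats = 20` is inferred from the model file name ending in `repeatNr19` in the prediction output, and `initialEpoch = 1` is read off the `fit_generator` call echoed in the training log. The function name is made up for illustration.

```python
def recordedEpochsPerRepeat(nrEpochs, initialEpoch):
    """Nr of epochs actually run when Keras training starts at initial_epoch."""
    return nrEpochs - initialEpoch

nrEpochs = 10      # from the log: 'Epoch 10/10'
initialEpoch = 1   # from the log: fit_generator(..., initial_epoch=1)
nrOfRepeats = 20   # assumption: inferred from 'repeatNr19' being the last repeat

perRepeat = recordedEpochsPerRepeat(nrEpochs, initialEpoch)
totalEpochs = nrOfRepeats * perRepeat
print(perRepeat, totalEpochs)
```

Under these assumptions each repeat contributes 9 recorded epochs, so the x-axis of the plot would span 180 epochs.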


When the training is done (and if we are satisfied with it), we can apply the trained model to a DNA sequence. Here we let the model 
'predict' on the complete yeast genome. For this, and for many of the tasks downstream of it, we use a Python module dedicated to these purposes:   

In [9]:
import dnaNet_stats_py3 as stats

Settings for the prediction (we split the genome into 100 kb segments):

In [16]:
rootGenome = rootGenome #was set above
rootOutput = r"/content/drive/MyDrive/DNA_proj/results/predictions/"
rootModel = r"/content/drive/MyDrive/DNA_proj/results/training/"
lastRepeatNr = nrOfRepeats -1 
modelFileNameNN = "modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr" + str(lastRepeatNr)

augmentWithRevComplementary_b = 0
leftRight_b = 1
customFlankSize = 50
computePredAcc_b = 1
Fourier_b = 0

segmentLength = 1e5

batchSize = 528
windowLength = 1
stepSize = 1
on_binf_b = 1

#start positions
chromosomeOrderList = ['R64_chr1', 'R64_chr2', 'R64_chr3', 'R64_chr4', 'R64_chr5', 'R64_chr6', 'R64_chr7', 'R64_chr8','R64_chr9', 'R64_chr10', 'R64_chr11', 'R64_chr12','R64_chr13', 'R64_chr14', 'R64_chr15', 'R64_chr16']
chromosomeDict = {'R64_chr1':[0,1e8], 'R64_chr2':[0,1e8], 'R64_chr3':[0,1e8], 'R64_chr4':[0,1e8], 'R64_chr5':[0,1e8], 'R64_chr6':[0,1e8], 'R64_chr7':[0,1e8], 'R64_chr8':[0,1e8],'R64_chr9':[0,1e8], 'R64_chr10':[0,1e8], 'R64_chr11':[0,1e8], 'R64_chr12':[0,1e8],'R64_chr13':[0,1e8], 'R64_chr14':[0,1e8], 'R64_chr15':[0,1e8], 'R64_chr16':[0,1e8]}
startAtSegmentDict = {}



Run the predictions -- this may take a while too:

In [31]:
stats.predictOnChromosomes(rootGenome = rootGenome, 
                         chromosomeDict = chromosomeDict,
                         chromosomeOrderList = chromosomeOrderList, 
                         rootOutput = rootOutput,
                         rootModel = rootModel,
                         modelFileName = modelFileNameNN,
                        segmentLength = segmentLength,
                        augmentWithRevComplementary_b = augmentWithRevComplementary_b, #!!!!!
                        startAtSegmentDict = startAtSegmentDict,
                        customFlankSize = customFlankSize,
                        computePredAcc_b = computePredAcc_b, 
                        overlap = 0,
                        leftRight_b = leftRight_b, #use 1 for bi-directional models
                        batchSize = batchSize,
                        windowLength = windowLength,
                        stepSize = stepSize,
                        Fourier_b = Fourier_b,
                        on_binf_b = on_binf_b)
                        

Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 100
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 100
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001133| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=I]

Found data for this chromosome: ref|NC_001133| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=I]
60 60 60
[['ref|NC_001133| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=I]', 230218]]
[]
Length of genome sequence read in:230218
Length of exonic-info sequence read in:230218
ACGTacgt checked 0 tokens
Length genome sequence, ACGT's and W's:230218
Of these 0 are W's
Length genome sequence, only ACGT's:230218
nrSegments:  2
Now at segment  0
Length of encode
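
How the segmenting plays out can be sketched from the numbers in the log. Chromosome I has 230218 bases and the log reports `nrSegments:  2`, which suggests that only complete segments of `segmentLength` are predicted on. The accuracy output further down counts 99950 positions for segment 0, i.e. `segmentLength - customFlankSize`, presumably because the first 50 positions lack a full left flank. The function below encodes these two observations. It is an assumption-based sketch with hypothetical names, not the code in `dnaNet_stats_py3`.

```python
def predictedPositions(chromoLength, segmentLength=100000, flankSize=50):
    """Return (nr of full segments, nr of predicted positions) for a chromosome.

    Assumes that only full segments are used and that the first flankSize
    positions of the chromosome are skipped (both inferred from the log output).
    """
    nrSegments = int(chromoLength) // int(segmentLength)
    nrPositions = max(nrSegments * int(segmentLength) - flankSize, 0)
    return nrSegments, nrPositions

nrSegments, nrPositions = predictedPositions(230218)  # chromosome I
print(nrSegments, nrPositions)
```

For chromosome I this gives 2 segments and 199950 positions, which agrees with the total reported for `R64_chr1` in the accuracy dictionary printed later in the notebook.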

In [17]:
#Yeast:
segmentLength = 100000

averageRevComplementary_b = 0

chromosomeOrderList = ['R64_chr1', 'R64_chr2', 'R64_chr3', 'R64_chr4', 'R64_chr5', 'R64_chr6', 'R64_chr7', 'R64_chr8','R64_chr9', 'R64_chr10', 'R64_chr11', 'R64_chr12','R64_chr13', 'R64_chr14', 'R64_chr15', 'R64_chr16' ]
chromosomeDict = {'R64_chr1':[0,1e8], 'R64_chr2':[0,1e8], 'R64_chr3':[0,1e8], 'R64_chr4':[0,1e8], 'R64_chr5':[0,1e8], 'R64_chr6':[0,1e8], 'R64_chr7':[0,1e8], 'R64_chr8':[0,1e8],'R64_chr9':[0,1e8], 'R64_chr10':[0,1e8], 'R64_chr11':[0,1e8], 'R64_chr12':[0,1e8],'R64_chr13':[0,1e8], 'R64_chr14':[0,1e8], 'R64_chr15':[0,1e8], 'R64_chr16':[0,1e8]}
rootAnnotationFiles = r"/content/drive/MyDrive/DNA_proj/data/S288C_reference_genome_R64-1-1_20110203/"
annotationTypes = ['simpleRepeats']



This call computes accuracy figures per segment and saves the results in dictionaries:  

In [33]:
resultsDictByAnnoSeg, resultsDictByAnno  = stats.getAccuracyChromosomes(chromosomeOrderList = chromosomeOrderList, 
                         rootOutput = rootOutput,
                         modelFileName = modelFileNameNN, 
                         segmentLength = segmentLength,
                         averageRevComplementary_b = averageRevComplementary_b,
                         windowLength = windowLength,
                         stepSize = stepSize, 
                         annotationTypes = annotationTypes,
                         rootAnnotationFiles = rootAnnotationFiles,
                         chromosomeDict = chromosomeDict)

/content/drive/MyDrive/DNA_proj/data/S288C_reference_genome_R64-1-1_20110203/R64_chr1_annotationArray_simpleRepeats
Annotation file /content/drive/MyDrive/DNA_proj/data/S288C_reference_genome_R64-1-1_20110203/R64_chr1_annotationArray_simpleRepeats not found
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr1/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predReturn_R64_chr1_seg100000_segment0_avgRevCompl0_win1_step1
0 (99950,)
99950
99950
Acc at R64_chr1_seg100000_segment0: 0.386830 based on corr 38664 and tot 99950
Acc at R64_chr1_seg100000_segment0 as recorded/done in computeAcc: 0.386830 , based on nr of corr 38664, tot cnt 99950
Acc at R64_chr1_seg100000_segment0 for repeats as recorded/done in computeAcc: 0.000000 , based on nr of corr 0, tot cnt 0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr1/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout

We can then aggregate the results per chromosome and output them as a table (in .txt and LaTeX format) and as a plot:

In [34]:
#Do the aggregation so that it covers all chromosomes/annotations for which results exist:
dictionaryName = 'accuracyChromoByAnnoDictionary'
stats.calculateAggrAccOnChromos(rootOutput =rootOutput , chromosomeDict= chromosomeDict,  dictionaryName = dictionaryName)


In [35]:
#Yeast, w. train test split
rootResults = rootOutput
captionText = 'LSTM50 on R64'

import pickle

#Results per chromo/anno
loadFile = rootResults + 'accuracyByChromoAnnoDictionary' 
with open(loadFile, "rb") as f:
    resultsDictChromo = pickle.load(f)

#Aggr over all chromos
loadFile = rootResults + 'accuracyByAnnoDictionary' 
with open(loadFile, "rb") as f:
    resultsDict = pickle.load(f)

fileName = 'table_chromo_anno.txt'
rowNames = chromosomeOrderList
colNames = ['all']   #the models will appear in columns in this order
#with final aggregation row:
stats.makeTexTable(inputDict = resultsDictChromo , rowColHeading = 'chr/annotation', rowNames = rowNames,
                                  columnNames = colNames, inputDict2 = resultsDict, captionText = captionText, rootOutput = rootOutput, fileName = fileName )


('\\begin{table}[h!]\n  \\begin{center}\n    \\label{tab:table1c}\n    \\begin{tabular}{l | c | c | r} % <-- Alignments: 1st column left, 2nd middle and 3rd right, with vertical lines in between\n      \\textbf{chr/annotation}& \\textbf{all}\\\\\n      \\hline\nR64_chr1 & 0.3852\\\\\nR64_chr2 & 0.386\\\\\nR64_chr3 & 0.3834\\\\\nR64_chr4 & 0.3884\\\\\nR64_chr5 & 0.383\\\\\nR64_chr6 & 0.3842\\\\\nR64_chr7 & 0.387\\\\\nR64_chr8 & 0.3828\\\\\nR64_chr9 & 0.3841\\\\\nR64_chr10 & 0.3843\\\\\nR64_chr11 & 0.3868\\\\\nR64_chr12 & 0.3839\\\\\nR64_chr13 & 0.3871\\\\\nR64_chr14 & 0.3838\\\\\nR64_chr15 & 0.3846\\\\\nR64_chr16 & 0.3857\\\\\nAll & 0.3855\\\\\n    \\end{tabular}\n        \\caption{LSTM50 on R64}\n  \\end{center}\n \\end{table}',
 ['all'])
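
The cell output above is the raw return value of `makeTexTable`: evidently a tuple whose first element is the LaTeX source, shown with escaped newlines. To read the table, print that string, or pull the rows back out into a dictionary. The following is a minimal sketch assuming only this return shape. The helper name `rowsFromTexTable` is made up, and `exampleTex` is a shortened stand-in built from three of the rows shown above.

```python
import re

def rowsFromTexTable(texString):
    """Extract {row name: value} from table lines of the form 'name & value\\\\'."""
    rows = {}
    for line in texString.split("\n"):
        m = re.match(r"\s*(\S+) & ([0-9.]+)\\\\\s*$", line)
        if m:
            rows[m.group(1)] = float(m.group(2))
    return rows

# Shortened stand-in for the first element of the returned tuple
exampleTex = ("\\begin{tabular}{l | c}\n"
              "      \\textbf{chr/annotation}& \\textbf{all}\\\\\n"
              "      \\hline\n"
              "R64_chr1 & 0.3852\\\\\n"
              "R64_chr2 & 0.386\\\\\n"
              "All & 0.3855\\\\\n"
              "    \\end{tabular}")

print(exampleTex)  # the table with real line breaks
rows = rowsFromTexTable(exampleTex)
print(rows)
```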

In [36]:
chromosomeOrderList = chromosomeOrderList[::-1]
saveAtDpi = 300
addAvg_b = 1
avgLevel = 0.3751
resultsDictByChromoModel = stats.collectAccuracyChromosomesSeveralModels(rootOutput = rootOutput, 
                                  rootPredictModelList = [rootOutput], 
                                  modelFileList = [modelFileNameNN],
                                  modelNameList = ['LSTM4'],
                                  chromosomeOrderList = chromosomeOrderList,
                                  plot_b = 1,
                                  addAvg_b = addAvg_b,
                                  avgLevel = avgLevel,
                                  saveAtDpi = saveAtDpi)


{'R64_chr1': {'all': [0.3851912978244561, 77019.0, 199950]}, 'R64_chr2': {'all': [0.38597787361710106, 308763.0, 799950]}, 'R64_chr3': {'all': [0.3834405734289048, 115013.0, 299950]}, 'R64_chr4': {'all': [0.38844094803160106, 582642.0, 1499950]}, 'R64_chr5': {'all': [0.38303230323032306, 191497.0, 499950]}, 'R64_chr6': {'all': [0.3841810452613153, 76817.0, 199950]}, 'R64_chr7': {'all': [0.38700535026751337, 386986.0, 999950]}, 'R64_chr8': {'all': [0.38275827582758276, 191360.0, 499950]}, 'R64_chr9': {'all': [0.3841455181897737, 153639.0, 399950]}, 'R64_chr10': {'all': [0.3843303093078077, 269012.0, 699950]}, 'R64_chr11': {'all': [0.38678223185265437, 232050.0, 599950]}, 'R64_chr12': {'all': [0.38388919445972297, 383870.0, 999950]}, 'R64_chr13': {'all': [0.38711261736763153, 348382.0, 899950]}, 'R64_chr14': {'all': [0.38375312522323024, 268608.0, 699950]}, 'R64_chr15': {'all': [0.38459822991149556, 384579.0, 999950]}, 'R64_chr16': {'all': [0.3856969831657314, 347108.0, 899950]}}
nChromo
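
Each entry in the dictionary printed above appears to hold `[accuracy, nr of correct predictions, nr of positions]`. Under that reading (an assumption: the layout is inferred from the numbers, e.g. 77019/199950 = 0.3852 for R64_chr1), the genome-wide accuracy is the position-weighted average, i.e. total correct over total positions. The sketch below recomputes it from the printed counts. The variable names are chosen for illustration.

```python
# (chromosome, nr correct, nr positions), copied from the dictionary printed above
countsList = [
    ('R64_chr1', 77019, 199950), ('R64_chr2', 308763, 799950),
    ('R64_chr3', 115013, 299950), ('R64_chr4', 582642, 1499950),
    ('R64_chr5', 191497, 499950), ('R64_chr6', 76817, 199950),
    ('R64_chr7', 386986, 999950), ('R64_chr8', 191360, 499950),
    ('R64_chr9', 153639, 399950), ('R64_chr10', 269012, 699950),
    ('R64_chr11', 232050, 599950), ('R64_chr12', 383870, 999950),
    ('R64_chr13', 348382, 899950), ('R64_chr14', 268608, 699950),
    ('R64_chr15', 384579, 999950), ('R64_chr16', 347108, 899950),
]

totalCorrect = sum(corr for _, corr, _ in countsList)
totalPositions = sum(tot for _, _, tot in countsList)
aggrAcc = totalCorrect / totalPositions  # weighted by nr of positions per chromosome

print(totalCorrect, totalPositions, round(aggrAcc, 4))
```

Rounded to four decimals this gives 0.3855, the value in the 'All' row of the table above.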

The Fourier transforms of the model's probability arrays (the probability assigned to the reference base at each position) are computed in a single call to a wrapper. First some settings, then the call (it runs surprisingly fast):  

In [18]:
rootOutput =  r"/content/drive/MyDrive/DNA_proj/results/predictions/"

#We reuse some of the settings from above
modelFileNameNN = modelFileNameNN
chromosomeOrderList = chromosomeOrderList
chromosomeDict = chromosomeDict 

#placeholders
modelFileName_forATorGCbias = ''
rootOutput_forATorGCbias = ''
rootInput_forATorGCbias = ''
forATorGCbias_b = 0 #!

#General settings:
segmentLength = 100000

augmentWithRevComplementary_b = 0  #!

#window length and step size used in generating the avg prediction
windowLength = 1
stepSize = 1
                      
#Param's for Fourier plots:
shuffle_b = 0 #!
randomizeDisqualified_b = 0 #!
randomizingByShuffle_b = 0 #!
fullGC_b = 0 #!

#Which input to use:
inputArrayType = 1 # 1: ref base prob's; 0: pred returns
plotOnlyNorm_b = 1 #default:1 

#Here we only plot the 20-2000 range; extend the list with more starts/stops to cover several ranges
fourierStep = 10
fourierRawPlotFrq = 5
fourierStartList = [20]
fourierStopList = [2000]
fourierWindowLengthList = [100]

ratioQcutoff = 0.9 #0.7 w hg18
dumpFourier_b = 0
dumpFileNamePrefix = ''
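
The `fourierStart`/`fourierStop` values are indices of Fourier coefficients rather than lengths in bases. Going by the log output of the wrapper (frequency 0.020000 at stop 2000 for a transform length of 100000, and 0.020010 for length 99950), coefficient k of a length-N segment sits at frequency k/N cycles per base, i.e. a period of N/k bases. This matches the ordering convention of `numpy.fft.fftfreq`, although it is an assumption here that the wrapper uses NumPy's FFT. A small pure-Python sketch of the mapping (the function names are hypothetical):

```python
def fourierFrequency(k, n):
    """Frequency (cycles per base) of coefficient k in a length-n transform,
    using the same ordering as numpy.fft.fftfreq (negative frqs in the upper half)."""
    return k / n if k < (n + 1) // 2 else (k - n) / n

def periodInBases(k, n):
    """Period (in bases) corresponding to coefficient k, for 0 < k < n/2."""
    return n / k

n = 100000
print(fourierFrequency(2000, n), periodInBases(2000, n))  # stop index
print(fourierFrequency(20, n), periodInBases(20, n))      # start index
print(fourierFrequency(n // 2, n))                        # the Nyquist bin
```

With these settings the 20-2000 range covers periods from 5000 down to 50 bases per 100 kb segment. The last line gives -0.5, which would explain the log line 'Largest (positive) frq: -0.500000': for even N this convention places the Nyquist frequency among the negative frequencies.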

In [21]:
#Call to Fourier-wrapper:
for i in range(len(fourierStartList)):
    fourierStart = fourierStartList[i]
    fourierStop = fourierStopList[i]
    fourierWindowLength = fourierWindowLengthList[i]
    stats.computeFourierChromosomes(chromosomeOrderList = chromosomeOrderList,
                                    rootOutput = rootOutput,
                                    modelFileName = modelFileNameNN,  
                                    segmentLength = segmentLength,
                                    inputArrayType = inputArrayType,
                                    averageRevComplementary_b = augmentWithRevComplementary_b,
                                    ratioQcutoff = ratioQcutoff,
                                    windowLength = windowLength,
                                    stepSize = stepSize,
                                    plotOnlyNorm_b = plotOnlyNorm_b,
                                    fourierWindowLength = fourierWindowLength,
                                    fourierStart = fourierStart,
                                    fourierStop = fourierStop,
                                    fourierStep = fourierStep, 
                                    fourierRawPlotFrq = fourierRawPlotFrq,
                                    shuffle_b = shuffle_b,
                                    randomizeDisqualified_b =randomizeDisqualified_b,
                                    randomizingByShuffle_b = randomizingByShuffle_b,
                                    forATorGCbias_b = forATorGCbias_b, 
                                    rootOutput_forATorGCbias= rootOutput_forATorGCbias,
                                    rootInput_forATorGCbias = rootInput_forATorGCbias,
                                    fullGC_b = fullGC_b,
                                    dumpFourier_b = dumpFourier_b,
                                    dumpFileNamePrefix = dumpFileNamePrefix, 
                                    modelFileName_forATorGCbias = modelFileName_forATorGCbias)


/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr1/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr1_seg100000_segment0_avgRevCompl0
99950
avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 29426.839650 (1.049864)
After stop (2000) the max (min) modulus of Fourier coeff is: 210.230853 (0

  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29697.831064 (1.643515)
After stop (2000) the max (min) modulus of Fourier coeff is: 209.900434 (0.119637)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr2_seg100000_segment5_avgRevCompl0
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: 

avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 29416.182640 (0.740562)
After stop (2000) the max (min) modulus of Fourier coeff is: 166.824541 (0.195305)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr3/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr3_seg100000_segment1_avgRevComp

avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 29052.058680 (0.986998)
After stop (2000) the max (min) modulus of Fourier coeff is: 130.222999 (0.095254)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment1_avgRevComp

ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29404.292610 (1.600772)
After stop (2000) the max (min) modulus of Fourier coeff is: 139.787868 (0.044269)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment5_avgRevCompl0
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: 

ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29265.734605 (1.386578)
After stop (2000) the max (min) modulus of Fourier coeff is: 154.602397 (0.131552)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment10_avgRevCompl0
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq:

ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29396.891083 (0.993020)
After stop (2000) the max (min) modulus of Fourier coeff is: 234.877937 (0.004686)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment15_avgRevCompl0
Directory /content/drive/MyDrive/DNA_proj/results/predictions/R64_chr5/FourierOnRefBaseProb/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr5/modelLSTM_1LayerConv2LayerLstm1L

avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 29102.621265 (1.641079)
After stop (2000) the max (min) modulus of Fourier coeff is: 193.948657 (0.152533)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr5/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr5_seg100000_segment1_avgRevComp

ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29105.743320 (0.795037)
After stop (2000) the max (min) modulus of Fourier coeff is: 179.534320 (0.093396)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr5/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr5_seg100000_segment5_avgRevCompl0
Directory /content/drive/MyDrive/DNA_proj/results/predictions/R64_chr6/FourierOnRefBaseProb/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr6/modelLSTM_1LayerConv2LayerLstm1La

avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 29117.474822 (1.102197)
After stop (2000) the max (min) modulus of Fourier coeff is: 204.773702 (0.089226)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr6/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr6_seg100000_segment1_avgRevComp

avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 29331.874089 (2.260079)
After stop (2000) the max (min) modulus of Fourier coeff is: 181.821248 (0.175409)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr7_seg100000_segment1_avgRevComp

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29524.286480 (2.331624)
After stop (2000) the max (min) modulus of Fourier coeff is: 200.636579 (0.142829)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr7_seg100000_segment5_avgRevCompl0
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: 

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29259.128834 (1.452040)
After stop (2000) the max (min) modulus of Fourier coeff is: 159.207896 (0.135655)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr7_seg100000_segment10_avgRevCompl0
Directory /content/drive/MyDrive/DNA_proj/results/predictions/R64_chr8/FourierOnRefBaseProb/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr8/modelLSTM_1LayerConv2LayerLstm1L

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 29114.367970 (1.281803)
After stop (2000) the max (min) modulus of Fourier coeff is: 182.982966 (0.141092)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr8/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr8_seg100000_segment1_avgRevComp

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29214.955211 (1.177778)
After stop (2000) the max (min) modulus of Fourier coeff is: 163.762453 (0.084975)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr8/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr8_seg100000_segment5_avgRevCompl0
Directory /content/drive/MyDrive/DNA_proj/results/predictions/R64_chr9/FourierOnRefBaseProb/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr9/modelLSTM_1LayerConv2LayerLstm1La

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 29236.555776 (0.627924)
After stop (2000) the max (min) modulus of Fourier coeff is: 189.140804 (0.063725)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr9/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr9_seg100000_segment1_avgRevComp

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 28948.790239 (0.570886)
After stop (2000) the max (min) modulus of Fourier coeff is: 157.065084 (0.205506)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr10_seg100000_segment1_avgRevCo

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29234.369282 (0.912876)
After stop (2000) the max (min) modulus of Fourier coeff is: 173.485094 (0.029691)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr10_seg100000_segment5_avgRevCompl0
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 29020.654668 (1.446970)
After stop (2000) the max (min) modulus of Fourier coeff is: 217.127108 (0.135478)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr11/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr11_seg100000_segment1_avgRevCo

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29379.177113 (0.814187)
After stop (2000) the max (min) modulus of Fourier coeff is: 239.826590 (0.022053)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr11/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr11_seg100000_segment5_avgRevCompl0
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 29137.326646 (0.935978)
After stop (2000) the max (min) modulus of Fourier coeff is: 215.289807 (0.144831)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr12_seg100000_segment1_avgRevCo

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 28668.860124 (1.011173)
After stop (2000) the max (min) modulus of Fourier coeff is: 137.502665 (0.115112)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr12_seg100000_segment5_avgRevCompl0
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 28983.946808 (0.479065)
After stop (2000) the max (min) modulus of Fourier coeff is: 154.375063 (0.176360)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr12_seg100000_segment10_avgRevCompl0
Directory /content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/FourierOnRefBaseProb/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLs

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 29461.290629 (0.123043)
After stop (2000) the max (min) modulus of Fourier coeff is: 227.881130 (0.192511)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr13_seg100000_segment1_avgRevCo

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29242.932006 (1.086159)
After stop (2000) the max (min) modulus of Fourier coeff is: 203.873191 (0.071302)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr13_seg100000_segment5_avgRevCompl0
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 28902.672975 (0.947985)
After stop (2000) the max (min) modulus of Fourier coeff is: 155.659557 (0.107302)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr14_seg100000_segment1_avgRevCo

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29404.368736 (0.725354)
After stop (2000) the max (min) modulus of Fourier coeff is: 166.239475 (0.155845)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr14_seg100000_segment5_avgRevCompl0
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 28942.617415 (1.128687)
After stop (2000) the max (min) modulus of Fourier coeff is: 165.060105 (0.043624)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr15_seg100000_segment1_avgRevCo

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29280.054831 (2.720182)
After stop (2000) the max (min) modulus of Fourier coeff is: 215.674137 (0.116464)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr15_seg100000_segment5_avgRevCompl0
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29139.058846 (1.109937)
After stop (2000) the max (min) modulus of Fourier coeff is: 173.394296 (0.069434)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr15_seg100000_segment10_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr16_seg100000_segment0_avgRevComp

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 28952.290704 (0.650513)
After stop (2000) the max (min) modulus of Fourier coeff is: 191.460033 (0.212649)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr16_seg100000_segment1_avgRevCo

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 29342.472081 (1.194753)
After stop (2000) the max (min) modulus of Fourier coeff is: 135.979735 (0.027423)
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr16_seg100000_segment5_avgRevCompl0
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq

You should now find the output plots in the FourierOnRefBaseProb/20to2000 subfolders of the predictions/R64_chrNN folders.
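
To help read the log above: the printed frequency values are consistent with NumPy's standard FFT frequency layout. The following is a minimal sketch, assuming the frequencies come from `numpy.fft.fftfreq` with unit sample spacing (the code in the repo's modules may do this differently). The names `fourier_summary`, `signal` and `stop` are hypothetical and used only for illustration. The sketch shows why the frequency step is 1/N (about 1.0005e-05 for N = 99950), why the frequency at index 2000 prints as 0.020010, and why the entry at index N/2 is about -0.5 for even N.

```python
import numpy as np

def fourier_summary(signal, stop=2000):
    """Return the FFT frequencies plus (max, min) modulus below and above `stop`.

    Illustration only; not the function used in the repo.
    """
    n = len(signal)
    coeffs = np.fft.fft(signal)
    frqs = np.fft.fftfreq(n)        # unit spacing: one sample per base position
    modulus = np.abs(coeffs)
    low = modulus[:stop]            # includes the zero-frequency coefficient
    high = modulus[stop:n // 2]     # remaining positive frequencies
    return frqs, (low.max(), low.min()), (high.max(), high.min())

n = 99950
frqs = np.fft.fftfreq(n)
print(frqs[1])        # the frequency step, 1/n
print(frqs[2000])     # 2000/n, which rounds to 0.020010
print(frqs[n // 2])   # about -0.5: for even n this index holds the negative Nyquist frequency

# Toy usage on a random 0/1 signal (an assumption, not data from the notebook)
rng = np.random.default_rng(0)
toy_signal = rng.integers(0, 2, size=10000)
toy_frqs, low_stats, high_stats = fourier_summary(toy_signal, stop=200)
```

Note that the zero-frequency coefficient equals the sum of the signal, so for a non-negative signal it dominates the "up to stop" maximum. This is presumably why the maxima in the log are so much larger before the stop than after it.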

To run the Fourier transform on the GC content, you must first create the GC-indicator arrays. This can be done by reusing the code for computing accuracies: with a small 'hack' we arrange it so that the model counts as right at every position that holds a G or C, and as wrong otherwise:

In [27]:
# This cell is only needed if the code has been changed and must be reloaded.
# importlib replaces the deprecated imp module.
import importlib
importlib.reload(stats)

<module 'dnaNet_stats_py3' from '/content/drive/MyDrive/DNA_proj/development/dnaNet_stats_py3.py'>

In [22]:
rootGenome = rootGenome  # no-op: rootGenome is reused as set earlier in the notebook

rootOutput = r"/content/drive/MyDrive/DNA_proj/results/predictions/"
rootModel = r"/content/drive/MyDrive/DNA_proj/results/training/"
lastRepeatNr = nrOfRepeats - 1
modelFileNameNN = "modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr" + str(lastRepeatNr)

rootOutputBias = r"/content/drive/MyDrive/DNA_proj/results/"  

chromosomeOrderList = ['R64_chr1', 'R64_chr2', 'R64_chr3', 'R64_chr4', 'R64_chr5', 'R64_chr6', 'R64_chr7', 'R64_chr8','R64_chr9', 'R64_chr10', 'R64_chr11', 'R64_chr12','R64_chr13', 'R64_chr14', 'R64_chr15', 'R64_chr16']
# same [start, end] range for every chromosome (1e8 is far beyond the length of any yeast chromosome)
chromosomeDict = {chromo: [0, 1e8] for chromo in chromosomeOrderList}
startAtSegmentDict = {}
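
The GC 'hack' used in this part of the notebook amounts to building a 0/1 indicator array over the sequence: 1 where the base is G or C, 0 where it is A or T. A minimal sketch of that idea, independent of the repo's `stats` module (the function name `gc_indicator` and the toy sequence are hypothetical and only for illustration):

```python
import numpy as np

def gc_indicator(sequence):
    """Return a 0/1 array: 1 where the base is G or C, 0 otherwise."""
    return np.array([1 if base in "GC" else 0 for base in sequence.upper()],
                    dtype=np.int8)

toy_sequence = "ACGTTGCAAT"          # made-up sequence, not genome data
indicator = gc_indicator(toy_sequence)
gc_fraction = indicator.mean()       # fraction of positions holding G or C

print(indicator)
print(gc_fraction)
```

The mean of the indicator is the GC fraction. Under the hack this is the number reported in the 'average prediction acc' lines of the run output, as cntCorr divided by cntTot (for example 39135 / 99950, which is about 0.3915).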


The accuracies printed to the screen (as 'average prediction acc') are in this case the GC fractions. The run takes a few minutes:

In [28]:
#set this so that it matches the setting for the computed predArray(s)
segmentLength = 100000
averageRevComplementary_b = 0

#set these as desired
windowLength = 1
stepSize = 1
Fourier_b = 0
on_binf_b = 1
defaultAccuracy = 0.25

# GC bias, little 'hack': G and C are recoded to all ones, A and T to all zeros
forATorGCbias_b = 1  # must be 1 for the GC run
recodeA = [0,0,0,0]
recodeC = [1,1,1,1]
recodeG = [1,1,1,1]
recodeT = [0,0,0,0]
modelFileName_forATorGCbias = "GCbias"
rootOutput_forATorGCbias = rootOutputBias + r"GCbias/"

stats.computeAccuracyOnChromosomes(
    rootGenome = rootGenome,
    chromosomeDict = chromosomeDict,
    chromosomeOrderList = chromosomeOrderList,
    rootOutput = rootOutput,
    rootModel = rootModel,
    modelFileName = modelFileNameNN,
    segmentLength = segmentLength,
    startAtSegmentDict = startAtSegmentDict,
    averageRevComplementary_b = averageRevComplementary_b,  # must match the predArray setting
    windowLength = windowLength,
    stepSize = stepSize,
    Fourier_b = Fourier_b,
    defaultAccuracy = defaultAccuracy,
    on_binf_b = on_binf_b,
    forATorGCbias_b = forATorGCbias_b,
    rootOutput_forATorGCbias = rootOutput_forATorGCbias,
    modelFileName_forATorGCbias = modelFileName_forATorGCbias,
    recodeA = recodeA,
    recodeC = recodeC,
    recodeG = recodeG,
    recodeT = recodeT
)

Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001133| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=I]

Found data for this chromosome: ref|NC_001133| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=I]
60 60 60
[['ref|NC_001133| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=I]', 230218]]
[]
Length of genome sequence read in:230218
Length of exonic-info sequence read in:230218
ACGTacgt checked 0 tokens
Length genome sequence, ACGT's and W's:230218
Of these 0 are W's
Length genome sequence, only ACGT's:230218
I'm doing a forATorGCbias run!
nrSegments:  2
/co

  plt.figure()


cntCorr: 39135 , cntTot: 99950 ; average prediction acc : 0.391546
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.391546
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr1/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr1_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr1/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr1_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 39455 , cntTot: 100000 ; average prediction acc : 0.394550
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.394550
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001134| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=II]

Found data for this chromosome: ref|NC_001134| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=II]
60 60 60
[['ref|NC_001134| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=II]', 813184]]
[]
Length of genome sequence read in:813184
Length of exonic-info sequence rea

  plt.figure()


cntCorr: 38526 , cntTot: 99950 ; average prediction acc : 0.385453
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.385453
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr2_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr2_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 38442 , cntTot: 100000 ; average prediction acc : 0.384420
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.384420
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr2_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr2_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38137 , cntTot: 100000 ; average prediction acc : 0.381370
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.381370
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr2_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr2_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37965 , cntTot: 100000 ; average prediction acc : 0.379650
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.379650
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr2_seg100000_segment4_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr2_seg100000_segment4_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38670 , cntTot: 100000 ; average prediction acc : 0.386700
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.386700
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr2_seg100000_segment5_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr2_seg100000_segment5_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37394 , cntTot: 100000 ; average prediction acc : 0.373940
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.373940
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr2_seg100000_segment6_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr2_seg100000_segment6_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 39083 , cntTot: 100000 ; average prediction acc : 0.390830
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.390830
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr2_seg100000_segment7_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr2/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr2_seg100000_segment7_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38455 , cntTot: 100000 ; average prediction acc : 0.384550
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.384550
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001135| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=III]

Found data for this chromosome: ref|NC_001135| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=III]
60 60 60
[['ref|NC_001135| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=III]', 316620]]
[]
Length of genome sequence read in:316620
Length of exonic-info sequence 

cntCorr: 39102 , cntTot: 99950 ; average prediction acc : 0.391216
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.391216
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr3/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr3_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr3/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr3_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37197 , cntTot: 100000 ; average prediction acc : 0.371970
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.371970
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr3/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr3_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr3/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr3_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 39289 , cntTot: 100000 ; average prediction acc : 0.392890
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.392890
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001136| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=IV]

Found data for this chromosome: ref|NC_001136| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=IV]
60 60 60
[['ref|NC_001136| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=IV]', 1531933]]
[]
Length of genome sequence read in:1531933
Length of exonic-info sequence r

cntCorr: 38375 , cntTot: 99950 ; average prediction acc : 0.383942
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.383942
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38615 , cntTot: 100000 ; average prediction acc : 0.386150
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.386150
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38235 , cntTot: 100000 ; average prediction acc : 0.382350
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.382350
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38186 , cntTot: 100000 ; average prediction acc : 0.381860
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.381860
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment4_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment4_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37698 , cntTot: 100000 ; average prediction acc : 0.376980
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.376980
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment5_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment5_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37472 , cntTot: 100000 ; average prediction acc : 0.374720
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.374720
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment6_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment6_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37508 , cntTot: 100000 ; average prediction acc : 0.375080
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.375080
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment7_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment7_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37631 , cntTot: 100000 ; average prediction acc : 0.376310
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.376310
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment8_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment8_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37892 , cntTot: 100000 ; average prediction acc : 0.378920
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.378920
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment9_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment9_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37629 , cntTot: 100000 ; average prediction acc : 0.376290
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.376290
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment10_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment10_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37375 , cntTot: 100000 ; average prediction acc : 0.373750
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.373750
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment11_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment11_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37339 , cntTot: 100000 ; average prediction acc : 0.373390
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.373390
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment12_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment12_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38006 , cntTot: 100000 ; average prediction acc : 0.380060
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.380060
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment13_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment13_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38110 , cntTot: 100000 ; average prediction acc : 0.381100
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.381100
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr4_seg100000_segment14_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr4/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr4_seg100000_segment14_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38188 , cntTot: 100000 ; average prediction acc : 0.381880
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.381880
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001137| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=V]

Found data for this chromosome: ref|NC_001137| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=V]
60 60 60
[['ref|NC_001137| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=V]', 576874]]
[]
Length of genome sequence read in:576874
Length of exonic-info sequence read i

cntCorr: 39171 , cntTot: 99950 ; average prediction acc : 0.391906
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.391906
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr5/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr5_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr5/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr5_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37870 , cntTot: 100000 ; average prediction acc : 0.378700
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.378700
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr5/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr5_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr5/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr5_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38470 , cntTot: 100000 ; average prediction acc : 0.384700
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.384700
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr5/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr5_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr5/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr5_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38571 , cntTot: 100000 ; average prediction acc : 0.385710
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.385710
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr5/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr5_seg100000_segment4_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr5/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr5_seg100000_segment4_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38311 , cntTot: 100000 ; average prediction acc : 0.383110
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.383110
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001138| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=VI]

Found data for this chromosome: ref|NC_001138| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=VI]
60 60 60
[['ref|NC_001138| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=VI]', 270161]]
[]
Length of genome sequence read in:270161
Length of exonic-info sequence rea

cntCorr: 39447 , cntTot: 99950 ; average prediction acc : 0.394667
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.394667
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr6/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr6_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr6/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr6_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37861 , cntTot: 100000 ; average prediction acc : 0.378610
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.378610
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001139| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=VII]

Found data for this chromosome: ref|NC_001139| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=VII]
60 60 60
[['ref|NC_001139| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=VII]', 1090940]]
[]
Length of genome sequence read in:1090940
Length of exonic-info sequenc

cntCorr: 37696 , cntTot: 99950 ; average prediction acc : 0.377149
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.377149
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr7_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr7_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38097 , cntTot: 100000 ; average prediction acc : 0.380970
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.380970
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr7_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr7_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37976 , cntTot: 100000 ; average prediction acc : 0.379760
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.379760
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr7_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr7_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37900 , cntTot: 100000 ; average prediction acc : 0.379000
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.379000
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr7_seg100000_segment4_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr7_seg100000_segment4_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38389 , cntTot: 100000 ; average prediction acc : 0.383890
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.383890
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr7_seg100000_segment5_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr7_seg100000_segment5_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38153 , cntTot: 100000 ; average prediction acc : 0.381530
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.381530
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr7_seg100000_segment6_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr7_seg100000_segment6_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37736 , cntTot: 100000 ; average prediction acc : 0.377360
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.377360
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr7_seg100000_segment7_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr7_seg100000_segment7_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37740 , cntTot: 100000 ; average prediction acc : 0.377400
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.377400
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr7_seg100000_segment8_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr7_seg100000_segment8_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37764 , cntTot: 100000 ; average prediction acc : 0.377640
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.377640
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr7_seg100000_segment9_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr7/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr7_seg100000_segment9_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38346 , cntTot: 100000 ; average prediction acc : 0.383460
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.383460
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001140| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=VIII]

Found data for this chromosome: ref|NC_001140| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=VIII]
60 60 60
[['ref|NC_001140| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=VIII]', 562643]]
[]
Length of genome sequence read in:562643
Length of exonic-info sequen

cntCorr: 38692 , cntTot: 99950 ; average prediction acc : 0.387114
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.387114
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr8/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr8_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr8/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr8_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38678 , cntTot: 100000 ; average prediction acc : 0.386780
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.386780
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr8/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr8_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr8/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr8_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37969 , cntTot: 100000 ; average prediction acc : 0.379690
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.379690
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr8/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr8_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr8/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr8_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38309 , cntTot: 100000 ; average prediction acc : 0.383090
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.383090
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr8/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr8_seg100000_segment4_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr8/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr8_seg100000_segment4_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38438 , cntTot: 100000 ; average prediction acc : 0.384380
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.384380
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001141| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=IX]

Found data for this chromosome: ref|NC_001141| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=IX]
60 60 60
[['ref|NC_001141| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=IX]', 439888]]
[]
Length of genome sequence read in:439888
Length of exonic-info sequence rea

cntCorr: 39401 , cntTot: 99950 ; average prediction acc : 0.394207
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.394207
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr9/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr9_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr9/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr9_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38234 , cntTot: 100000 ; average prediction acc : 0.382340
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.382340
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr9/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr9_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr9/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr9_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38701 , cntTot: 100000 ; average prediction acc : 0.387010
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.387010
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr9/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr9_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr9/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr9_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 39173 , cntTot: 100000 ; average prediction acc : 0.391730
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.391730
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001142| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=X]

Found data for this chromosome: ref|NC_001142| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=X]
60 60 60
[['ref|NC_001142| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=X]', 745751]]
[]
Length of genome sequence read in:745751
Length of exonic-info sequence read i

cntCorr: 38460 , cntTot: 99950 ; average prediction acc : 0.384792
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.384792
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr10_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr10_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38616 , cntTot: 100000 ; average prediction acc : 0.386160
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.386160
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr10_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr10_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37912 , cntTot: 100000 ; average prediction acc : 0.379120
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.379120
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr10_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr10_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38415 , cntTot: 100000 ; average prediction acc : 0.384150
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.384150
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr10_seg100000_segment4_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr10_seg100000_segment4_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38054 , cntTot: 100000 ; average prediction acc : 0.380540
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.380540
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr10_seg100000_segment5_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr10_seg100000_segment5_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38367 , cntTot: 100000 ; average prediction acc : 0.383670
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.383670
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr10_seg100000_segment6_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr10/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr10_seg100000_segment6_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38429 , cntTot: 100000 ; average prediction acc : 0.384290
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.384290
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001143| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XI]

Found data for this chromosome: ref|NC_001143| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XI]
60 60 60
[['ref|NC_001143| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XI]', 666816]]
[]
Length of genome sequence read in:666816
Length of exonic-info sequence rea

cntCorr: 38333 , cntTot: 99950 ; average prediction acc : 0.383522
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.383522
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr11/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr11_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr11/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr11_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38018 , cntTot: 100000 ; average prediction acc : 0.380180
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.380180
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr11/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr11_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr11/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr11_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38136 , cntTot: 100000 ; average prediction acc : 0.381360
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.381360
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr11/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr11_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr11/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr11_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38240 , cntTot: 100000 ; average prediction acc : 0.382400
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.382400
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr11/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr11_seg100000_segment4_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr11/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr11_seg100000_segment4_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37276 , cntTot: 100000 ; average prediction acc : 0.372760
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.372760
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr11/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr11_seg100000_segment5_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr11/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr11_seg100000_segment5_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38673 , cntTot: 100000 ; average prediction acc : 0.386730
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.386730
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001144| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XII]

Found data for this chromosome: ref|NC_001144| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XII]
60 60 60
[['ref|NC_001144| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XII]', 1078177]]
[]
Length of genome sequence read in:1078177
Length of exonic-info sequenc

cntCorr: 39232 , cntTot: 99950 ; average prediction acc : 0.392516
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.392516
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr12_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr12_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37789 , cntTot: 100000 ; average prediction acc : 0.377890
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.377890
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr12_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr12_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38104 , cntTot: 100000 ; average prediction acc : 0.381040
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.381040
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr12_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr12_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38704 , cntTot: 100000 ; average prediction acc : 0.387040
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.387040
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr12_seg100000_segment4_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr12_seg100000_segment4_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 39938 , cntTot: 100000 ; average prediction acc : 0.399380
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.399380
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr12_seg100000_segment5_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr12_seg100000_segment5_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38734 , cntTot: 100000 ; average prediction acc : 0.387340
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.387340
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr12_seg100000_segment6_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr12_seg100000_segment6_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37776 , cntTot: 100000 ; average prediction acc : 0.377760
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.377760
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr12_seg100000_segment7_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr12_seg100000_segment7_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37685 , cntTot: 100000 ; average prediction acc : 0.376850
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.376850
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr12_seg100000_segment8_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr12_seg100000_segment8_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38739 , cntTot: 100000 ; average prediction acc : 0.387390
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.387390
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr12_seg100000_segment9_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr12/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr12_seg100000_segment9_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37978 , cntTot: 100000 ; average prediction acc : 0.379780
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.379780
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001145| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XIII]

Found data for this chromosome: ref|NC_001145| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XIII]
60 60 60
[['ref|NC_001145| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XIII]', 924431]]
[]
Length of genome sequence read in:924431
Length of exonic-info sequen

cntCorr: 38751 , cntTot: 99950 ; average prediction acc : 0.387704
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.387704
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr13_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr13_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38215 , cntTot: 100000 ; average prediction acc : 0.382150
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.382150
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr13_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr13_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38026 , cntTot: 100000 ; average prediction acc : 0.380260
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.380260
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr13_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr13_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38352 , cntTot: 100000 ; average prediction acc : 0.383520
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.383520
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr13_seg100000_segment4_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr13_seg100000_segment4_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37992 , cntTot: 100000 ; average prediction acc : 0.379920
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.379920
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr13_seg100000_segment5_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr13_seg100000_segment5_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37532 , cntTot: 100000 ; average prediction acc : 0.375320
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.375320
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr13_seg100000_segment6_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr13_seg100000_segment6_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38263 , cntTot: 100000 ; average prediction acc : 0.382630
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.382630
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr13_seg100000_segment7_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr13_seg100000_segment7_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 37848 , cntTot: 100000 ; average prediction acc : 0.378480
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.378480
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr13_seg100000_segment8_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr13/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr13_seg100000_segment8_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38462 , cntTot: 100000 ; average prediction acc : 0.384620
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.384620
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001146| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XIV]

Found data for this chromosome: ref|NC_001146| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XIV]
60 60 60
[['ref|NC_001146| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XIV]', 784333]]
[]
Length of genome sequence read in:784333
Length of exonic-info sequence 

cntCorr: 39249 , cntTot: 99950 ; average prediction acc : 0.392686
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.392686
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr14_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr14_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38842 , cntTot: 100000 ; average prediction acc : 0.388420
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.388420
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr14_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr14_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


cntCorr: 38162 , cntTot: 100000 ; average prediction acc : 0.381620
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.381620
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr14_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr14_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 38393 , cntTot: 100000 ; average prediction acc : 0.383930
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.383930
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr14_seg100000_segment4_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr14_seg100000_segment4_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 38514 , cntTot: 100000 ; average prediction acc : 0.385140
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.385140
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr14_seg100000_segment5_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr14_seg100000_segment5_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 38288 , cntTot: 100000 ; average prediction acc : 0.382880
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.382880
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr14_seg100000_segment6_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr14/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr14_seg100000_segment6_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 39000 , cntTot: 100000 ; average prediction acc : 0.390000
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.390000
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001147| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XV]

Found data for this chromosome: ref|NC_001147| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XV]
60 60 60
[['ref|NC_001147| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XV]', 1091291]]
[]
Length of genome sequence read in:1091291
Length of exonic-info sequence r

  plt.figure()


cntCorr: 38265 , cntTot: 99950 ; average prediction acc : 0.382841
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.382841
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr15_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr15_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 37846 , cntTot: 100000 ; average prediction acc : 0.378460
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.378460
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr15_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr15_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 38721 , cntTot: 100000 ; average prediction acc : 0.387210
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.387210
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr15_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr15_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 37907 , cntTot: 100000 ; average prediction acc : 0.379070
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.379070
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr15_seg100000_segment4_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr15_seg100000_segment4_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 38306 , cntTot: 100000 ; average prediction acc : 0.383060
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.383060
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr15_seg100000_segment5_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr15_seg100000_segment5_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 38445 , cntTot: 100000 ; average prediction acc : 0.384450
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.384450
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr15_seg100000_segment6_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr15_seg100000_segment6_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 37521 , cntTot: 100000 ; average prediction acc : 0.375210
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.375210
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr15_seg100000_segment7_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr15_seg100000_segment7_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 37838 , cntTot: 100000 ; average prediction acc : 0.378380
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.378380
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr15_seg100000_segment8_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr15_seg100000_segment8_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 37807 , cntTot: 100000 ; average prediction acc : 0.378070
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.378070
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr15_seg100000_segment9_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr15/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr15_seg100000_segment9_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 38331 , cntTot: 100000 ; average prediction acc : 0.383310
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.383310
No repeat sections were recorded in the genome data.
Reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
Fast reading in genome data ... 
Only considering data following fasta header lines (: chromo names 
 for eucaryots) of length < 200
OBS: no file containing exonic info was provided, so exonic status is set to 0 from 0 - 100000001
Genome data file 1st line:
  >ref|NC_001148| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XVI]

Found data for this chromosome: ref|NC_001148| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XVI]
60 60 60
[['ref|NC_001148| [org=Saccharomyces cerevisiae] [strain=S288C] [moltype=genomic] [chromosome=XVI]', 948066]]
[]
Length of genome sequence read in:948066
Length of exonic-info sequence 

  plt.figure()


cntCorr: 38689 , cntTot: 99950 ; average prediction acc : 0.387084
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.387084
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr16_seg100000_segment1_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr16_seg100000_segment1_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 38260 , cntTot: 100000 ; average prediction acc : 0.382600
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.382600
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr16_seg100000_segment2_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr16_seg100000_segment2_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 37720 , cntTot: 100000 ; average prediction acc : 0.377200
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.377200
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr16_seg100000_segment3_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr16_seg100000_segment3_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 38117 , cntTot: 100000 ; average prediction acc : 0.381170
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.381170
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr16_seg100000_segment4_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr16_seg100000_segment4_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 37994 , cntTot: 100000 ; average prediction acc : 0.379940
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.379940
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr16_seg100000_segment5_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr16_seg100000_segment5_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 37816 , cntTot: 100000 ; average prediction acc : 0.378160
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.378160
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr16_seg100000_segment6_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr16_seg100000_segment6_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 38080 , cntTot: 100000 ; average prediction acc : 0.380800
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.380800
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr16_seg100000_segment7_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr16_seg100000_segment7_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 37524 , cntTot: 100000 ; average prediction acc : 0.375240
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.375240
No repeat sections were recorded in the genome data.
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_labelArray_R64_chr16_seg100000_segment8_avgRevCompl0
/content/drive/MyDrive/DNA_proj/results/predictions/R64_chr16/modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr19_predArray_R64_chr16_seg100000_segment8_avgRevCompl0
lSamples  100000 , nrSteps  100000


  plt.figure()


cntCorr: 38126 , cntTot: 100000 ; average prediction acc : 0.381260
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.381260
No repeat sections were recorded in the genome data.
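
The per-segment accuracies printed above can be pooled into a single number per chromosome or per run. The sketch below is not part of the repo's modules: it assumes you have captured the printed output as a string and that each summary line has the `cntCorr: ... , cntTot: ... ; average prediction acc : ...` form seen above. The name `pooledAccuracy` is ours, chosen only for illustration.

```python
import re

# Matches the per-segment summary lines printed by the prediction run, e.g.
# "cntCorr: 38126 , cntTot: 100000 ; average prediction acc : 0.381260"
# (the "corrDisq: ..." lines do not contain "cntCorr:" and are skipped)
_ACC_LINE = re.compile(r"cntCorr:\s*(\d+)\s*,\s*cntTot:\s*(\d+)\s*;")

def pooledAccuracy(logText):
    """Return (total correct, total positions, pooled accuracy) over all segments in logText."""
    cntCorr = 0
    cntTot = 0
    for match in _ACC_LINE.finditer(logText):
        cntCorr += int(match.group(1))
        cntTot += int(match.group(2))
    if cntTot == 0:
        raise ValueError("no 'cntCorr: ... , cntTot: ...' lines found")
    return cntCorr, cntTot, cntCorr / cntTot

# Two summary lines copied from the chromosome 16 output above
exampleLog = """cntCorr: 37524 , cntTot: 100000 ; average prediction acc : 0.375240
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.375240
cntCorr: 38126 , cntTot: 100000 ; average prediction acc : 0.381260
corrDisq: 0 , cntDisqTot: 0 ; average prediction acc no disq: 0.381260"""
print(pooledAccuracy(exampleLog))
```

Pooling the counts rather than averaging the per-segment accuracies weights each segment by its number of positions, which matters slightly because the first segment of a chromosome is shorter (99950 positions in the output above).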


Then run the Fourier transforms again with the new settings (this is even faster than above):

In [29]:
#GC bias
#To get the qual arrays from the model-prediction run we need:
rootOutput = r"/content/drive/MyDrive/DNA_proj/results/predictions/"
lastRepeatNr = nrOfRepeats - 1
modelFileNameNN = "modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256_stride1_overlap0_dropout00_bigLoopIter0_repeatNr" + str(lastRepeatNr)
#To get the predReturns for the bias:
modelFileName_forATorGCbias = 'GCbias'
rootOutput_forATorGCbias = r"/content/drive/MyDrive/DNA_proj/results/GCbias/"
rootInput_forATorGCbias = rootOutput_forATorGCbias
forATorGCbias_b = 1 #!
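
If you want to locate the per-segment arrays yourself, the file names follow the pattern seen in the prediction output above. The helper below is a hypothetical sketch (it is not a function from the repo), and the example values are copied from that output. It only builds the path string; how the arrays are stored on disk is not shown here.

```python
def segmentArrayPath(rootOutput, chromosome, modelFileName, arrayType,
                     segmentLength, segmentNr, avgRevCompl_b):
    """Build a per-segment path following the pattern in the prediction output.

    Hypothetical helper. arrayType is 'predArray' or 'labelArray';
    rootOutput is assumed to end with a '/'.
    """
    fileName = (modelFileName + "_" + arrayType + "_" + chromosome
                + "_seg" + str(segmentLength)
                + "_segment" + str(segmentNr)
                + "_avgRevCompl" + str(avgRevCompl_b))
    return rootOutput + chromosome + "/" + fileName

# Example values taken from the output above (repeatNr19, chromosome 16, segment 8)
exampleRoot = r"/content/drive/MyDrive/DNA_proj/results/predictions/"
exampleModel = ("modelLSTM_1LayerConv2LayerLstm1LayerDense50_flanks50_win4_filters256"
                "_stride1_overlap0_dropout00_bigLoopIter0_repeatNr" + str(19))
examplePath = segmentArrayPath(exampleRoot, "R64_chr16", exampleModel,
                               "predArray", 100000, 8, 0)
print(examplePath)
```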



These settings are unchanged, except inputArrayType, which must be set to 0:

In [30]:

#We reuse some of the settings from above (these self-assignments only document which names must already be defined)
modelFileNameNN = modelFileNameNN
chromosomeOrderList = chromosomeOrderList
chromosomeDict = chromosomeDict 

#General settings:
segmentLength = 100000

augmentWithRevComplementary_b = 0  #!

#window lgth and stepsize used in generating the avg prediction
windowLength = 1
stepSize = 1
                      
#Param's for Fourier plots:
shuffle_b = 0 #!
randomizeDisqualified_b = 0 #!
randomizingByShuffle_b = 0 #!
fullGC_b = 0 #!

#Which input to use:
inputArrayType = 0 # 1: ref base prob's; 0: pred returns (GCbias)
plotOnlyNorm_b = 1 #default:1 

#Here we only plot the 20-2000 range; extend the list with more starts/stops to cover several ranges
fourierStep = 10
fourierRawPlotFrq = 5
fourierStartList = [20]
fourierStopList = [2000]
fourierWindowLengthList = [100]

ratioQcutoff = 0.9 #0.7 w hg18
dumpFourier_b = 0
dumpFileNamePrefix = ''
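
To relate `fourierStartList` and `fourierStopList` to the frequencies printed in the output below: those values are consistent with NumPy's `fftfreq` convention for a segment with one sample per base, where index k corresponds to frequency k/n and period n/k. The sketch assumes that convention and that start and stop are indices into the frequency array (the output line `Up to stop (2000) with frq 0.020000` suggests this). It does not reproduce what `stats.computeFourierChromosomes` does internally.

```python
import numpy as np

n = 100000  # segment length (the first segment of a chromosome has 99950 positions)
frqs = np.fft.fftfreq(n)  # unit sample spacing: one position per base

exampleStart, exampleStop = 20, 2000  # same values as fourierStartList/fourierStopList

# Frequency and period (in bases) at the two ends of the plotted range
for k in (exampleStart, exampleStop):
    print(k, frqs[k], n / k)

# For even n the entry at index n//2 is the Nyquist frequency, stored with a negative sign
print(frqs[n // 2])
```

Under these assumptions the 20-2000 range covers periods from 5000 bases down to 50 bases. The negative sign at index n//2 may be why the output reports `Largest (positive) frq: -0.500000`.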

In [31]:
#Then call the Fourier wrapper, once per (start, stop, window length) triple:
for fourierStart, fourierStop, fourierWindowLength in zip(fourierStartList,
                                                          fourierStopList,
                                                          fourierWindowLengthList):
    stats.computeFourierChromosomes(chromosomeOrderList = chromosomeOrderList,
                                    rootOutput = rootOutput,
                                    modelFileName = modelFileNameNN,  
                                    segmentLength = segmentLength,
                                    inputArrayType = inputArrayType,
                                    averageRevComplementary_b = augmentWithRevComplementary_b,
                                    ratioQcutoff = ratioQcutoff,
                                    windowLength = windowLength,
                                    stepSize = stepSize,
                                    plotOnlyNorm_b = plotOnlyNorm_b,
                                    fourierWindowLength = fourierWindowLength,
                                    fourierStart = fourierStart,
                                    fourierStop = fourierStop,
                                    fourierStep = fourierStep, 
                                    fourierRawPlotFrq = fourierRawPlotFrq,
                                    shuffle_b = shuffle_b,
                                    randomizeDisqualified_b =randomizeDisqualified_b,
                                    randomizingByShuffle_b = randomizingByShuffle_b,
                                    forATorGCbias_b = forATorGCbias_b, 
                                    rootOutput_forATorGCbias= rootOutput_forATorGCbias,
                                    rootInput_forATorGCbias = rootInput_forATorGCbias,
                                    fullGC_b = fullGC_b,
                                    dumpFourier_b = dumpFourier_b,
                                    dumpFileNamePrefix = dumpFileNamePrefix, 
                                    modelFileName_forATorGCbias = modelFileName_forATorGCbias)


Directory /content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr1/FourierOnPredReturn/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr1/GCbias_predReturn_R64_chr1_seg100000_segment0_avgRevCompl0_win1_step1
0 (99950,)
99950
avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 39135.000000 (3.855938)
After stop (2000) the max (min) mod

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr1/GCbias_predReturn_R64_chr1_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 39455.000000 (3.488686)
After stop (2000) the max (min) modulus of Fourier coeff is: 812.552551 (0.740524)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr1/GCbias_predReturn_R64_chr1_seg100000_segment2_avgRevCompl0_win1_step1
Directory /content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr2/FourierOnPredReturn/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr2/GCbia

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr2/GCbias_predReturn_R64_chr2_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38442.000000 (1.694917)
After stop (2000) the max (min) modulus of Fourier coeff is: 899.119202 (0.852205)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr2/GCbias_predReturn_R64_chr2_seg100000_segment2_avgRevCompl0_win1_step1
2 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38670.000000 (5.712512)
After stop (2000) the max (min) modulus of Fourier coeff is: 947.602356 (0.590547)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr2/GCbias_predReturn_R64_chr2_seg100000_segment5_avgRevCompl0_win1_step1
5 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr3/GCbias_predReturn_R64_chr3_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 37197.000000 (3.212271)
After stop (2000) the max (min) modulus of Fourier coeff is: 598.906738 (0.452759)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr3/GCbias_predReturn_R64_chr3_seg100000_segment2_avgRevCompl0_win1_step1
2 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr4/GCbias_predReturn_R64_chr4_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38615.000000 (1.860952)
After stop (2000) the max (min) modulus of Fourier coeff is: 854.235840 (0.686210)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr4/GCbias_predReturn_R64_chr4_seg100000_segment2_avgRevCompl0_win1_step1
2 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 37698.000000 (4.765543)
After stop (2000) the max (min) modulus of Fourier coeff is: 706.232666 (0.813331)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr4/GCbias_predReturn_R64_chr4_seg100000_segment5_avgRevCompl0_win1_step1
5 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 37629.000000 (1.740185)
After stop (2000) the max (min) modulus of Fourier coeff is: 888.740234 (0.529046)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr4/GCbias_predReturn_R64_chr4_seg100000_segment10_avgRevCompl0_win1_step1
10 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-0

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38188.000000 (3.895235)
After stop (2000) the max (min) modulus of Fourier coeff is: 741.861450 (0.847426)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr4/GCbias_predReturn_R64_chr4_seg100000_segment15_avgRevCompl0_win1_step1
Directory /content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr5/FourierOnPredReturn/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr5/GCbias_predReturn_R64_chr5_seg100000_segment0_avgRevCompl0_win1_step1
0 (99950,)
99950
avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght: 

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig1, ax1 = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig2, ax2 = plt.subplots()
  fig3, ax3 = plt.subplots()


/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr5/GCbias_predReturn_R64_chr5_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 37870.000000 (1.001016)
After stop (2000) the max (min) modulus of Fourier coeff is: 697.037964 (1.302505)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr5/GCbias_predReturn_R64_chr5_seg100000_segment2_avgRevCompl0_win1_step1
2 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.

  fig0, ax0 = plt.subplots()
  fig1_loc, ax1_loc = plt.subplots()
  fig2_loc, ax2_loc = plt.subplots()
  fig3, ax3 = plt.subplots()


ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38311.000000 (3.292802)
After stop (2000) the max (min) modulus of Fourier coeff is: 778.410522 (0.149960)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr5/GCbias_predReturn_R64_chr5_seg100000_segment5_avgRevCompl0_win1_step1
Directory /content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr6/FourierOnPredReturn/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr6/GCbias_predReturn_R64_chr6_seg100000_segment0_avgRevCompl0_win1_step1
0 (99950,)
99950
avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  



/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr6/GCbias_predReturn_R64_chr6_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 37861.000000 (4.244049)
After stop (2000) the max (min) modulus of Fourier coeff is: 803.977234 (0.678442)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr6/GCbias_predReturn_R64_chr6_seg100000_segment2_avgRevCompl0_win1_step1
Directory /content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr7/FourierOnPredReturn/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr7/GCbia



avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 37696.000000 (2.964901)
After stop (2000) the max (min) modulus of Fourier coeff is: 887.371582 (0.774630)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr7/GCbias_predReturn_R64_chr7_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-



ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38389.000000 (1.924333)
After stop (2000) the max (min) modulus of Fourier coeff is: 728.271545 (0.542412)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr7/GCbias_predReturn_R64_chr7_seg100000_segment5_avgRevCompl0_win1_step1
5 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 



ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38346.000000 (2.088079)
After stop (2000) the max (min) modulus of Fourier coeff is: 854.756409 (0.676943)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr7/GCbias_predReturn_R64_chr7_seg100000_segment10_avgRevCompl0_win1_step1
Directory /content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr8/FourierOnPredReturn/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr8/GCbias_predReturn_R64_chr8_seg100000_segment0_avgRevCompl0_win1_step1
0 (99950,)
99950




avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 38692.000000 (2.146831)
After stop (2000) the max (min) modulus of Fourier coeff is: 826.818970 (0.454288)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr8/GCbias_predReturn_R64_chr8_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-



/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr8/GCbias_predReturn_R64_chr8_seg100000_segment5_avgRevCompl0_win1_step1
Directory /content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr9/FourierOnPredReturn/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr9/GCbias_predReturn_R64_chr9_seg100000_segment0_avgRevCompl0_win1_step1
0 (99950,)
99950
avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2



/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr9/GCbias_predReturn_R64_chr9_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38234.000000 (3.469328)
After stop (2000) the max (min) modulus of Fourier coeff is: 755.936829 (0.648541)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr9/GCbias_predReturn_R64_chr9_seg100000_segment2_avgRevCompl0_win1_step1
2 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.



/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr10/GCbias_predReturn_R64_chr10_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38616.000000 (1.772816)
After stop (2000) the max (min) modulus of Fourier coeff is: 725.846252 (0.931791)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr10/GCbias_predReturn_R64_chr10_seg100000_segment2_avgRevCompl0_win1_step1
2 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-0



ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38054.000000 (3.885078)
After stop (2000) the max (min) modulus of Fourier coeff is: 910.987732 (0.826151)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr10/GCbias_predReturn_R64_chr10_seg100000_segment5_avgRevCompl0_win1_step1
5 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-0



/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr11/GCbias_predReturn_R64_chr11_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38018.000000 (2.839303)
After stop (2000) the max (min) modulus of Fourier coeff is: 795.839050 (1.062823)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr11/GCbias_predReturn_R64_chr11_seg100000_segment2_avgRevCompl0_win1_step1
2 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-0



ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 37276.000000 (7.150682)
After stop (2000) the max (min) modulus of Fourier coeff is: 774.341125 (0.835015)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr11/GCbias_predReturn_R64_chr11_seg100000_segment5_avgRevCompl0_win1_step1
5 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-0



/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr12/GCbias_predReturn_R64_chr12_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 37789.000000 (2.669303)
After stop (2000) the max (min) modulus of Fourier coeff is: 835.425293 (0.970405)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr12/GCbias_predReturn_R64_chr12_seg100000_segment2_avgRevCompl0_win1_step1
2 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-0



ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 39938.000000 (6.128075)
After stop (2000) the max (min) modulus of Fourier coeff is: 594.277039 (0.465773)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr12/GCbias_predReturn_R64_chr12_seg100000_segment5_avgRevCompl0_win1_step1
5 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-0



ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 37978.000000 (3.925025)
After stop (2000) the max (min) modulus of Fourier coeff is: 1114.926636 (0.427163)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr12/GCbias_predReturn_R64_chr12_seg100000_segment10_avgRevCompl0_win1_step1
Directory /content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr13/FourierOnPredReturn/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr13/GCbias_predReturn_R64_chr13_seg100000_segment0_avgRevCompl0_win1_step1
0 (99950,)
99950




avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 38751.000000 (2.401865)
After stop (2000) the max (min) modulus of Fourier coeff is: 819.407166 (0.658489)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr13/GCbias_predReturn_R64_chr13_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.



/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr13/GCbias_predReturn_R64_chr13_seg100000_segment5_avgRevCompl0_win1_step1
5 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 37532.000000 (3.961549)
After stop (2000) the max (min) modulus of Fourier coeff is: 669.964661 (0.083615)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr13/GCbias_predReturn_R64_chr13_seg100000_segment6_avgRevCompl0_win1_step1
6 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-0



Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 39249.000000 (1.739794)
After stop (2000) the max (min) modulus of Fourier coeff is: 963.375427 (0.950406)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr14/GCbias_predReturn_R64_chr14_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38842.000000 (1.533661)
After stop (2000) the max (min) modulus of Fourier coeff is: 1372.126709 (0.280411)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr14/GCbias_predReturn_R64_chr14_seg100000_segment2_avgRevCompl0_win1_step1
2 (



/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr14/GCbias_predReturn_R64_chr14_seg100000_segment5_avgRevCompl0_win1_step1
5 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38288.000000 (2.974340)
After stop (2000) the max (min) modulus of Fourier coeff is: 741.596252 (0.499196)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr14/GCbias_predReturn_R64_chr14_seg100000_segment6_avgRevCompl0_win1_step1
6 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-0



/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr15/GCbias_predReturn_R64_chr15_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 37846.000000 (4.855501)
After stop (2000) the max (min) modulus of Fourier coeff is: 724.919922 (0.499967)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr15/GCbias_predReturn_R64_chr15_seg100000_segment2_avgRevCompl0_win1_step1
2 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-0



ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38306.000000 (1.972946)
After stop (2000) the max (min) modulus of Fourier coeff is: 752.990601 (0.386541)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr15/GCbias_predReturn_R64_chr15_seg100000_segment5_avgRevCompl0_win1_step1
5 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-0



ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 38331.000000 (0.763626)
After stop (2000) the max (min) modulus of Fourier coeff is: 1028.242310 (0.629951)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr15/GCbias_predReturn_R64_chr15_seg100000_segment10_avgRevCompl0_win1_step1
Directory /content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr16/FourierOnPredReturn/20to2000/ created. Output will be placed there.
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr16/GCbias_predReturn_R64_chr16_seg100000_segment0_avgRevCompl0_win1_step1
0 (99950,)
99950




avgPredArray size:  99950
ratioQ:  1.0
Fourier transf lenght:  99950
frqs  [ 0.00000000e+00  1.00050025e-05  2.00100050e-05 ... -3.00150075e-05
 -2.00100050e-05 -1.00050025e-05]
First 10 (positive) frqs:  [0.00000000e+00 1.00050025e-05 2.00100050e-05 3.00150075e-05
 4.00200100e-05 5.00250125e-05 6.00300150e-05 7.00350175e-05
 8.00400200e-05 9.00450225e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.00050025e-05 -2.00100050e-05 -3.00150075e-05 -4.00200100e-05
 -5.00250125e-05 -6.00300150e-05 -7.00350175e-05 -8.00400200e-05
 -9.00450225e-05 -1.00050025e-04]
Up to stop (2000) with frq 0.020010, the max (min) modulus of Fourier coeff is: 38689.003906 (1.614296)
After stop (2000) the max (min) modulus of Fourier coeff is: 723.897400 (0.540813)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr16/GCbias_predReturn_R64_chr16_seg100000_segment1_avgRevCompl0_win1_step1
1 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.



/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr16/GCbias_predReturn_R64_chr16_seg100000_segment5_avgRevCompl0_win1_step1
5 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-05 6.e-05 7.e-05 8.e-05 9.e-05]
Largest (positive) frq: -0.500000
First 10 (negative) frqs:  [-1.e-05 -2.e-05 -3.e-05 -4.e-05 -5.e-05 -6.e-05 -7.e-05 -8.e-05 -9.e-05
 -1.e-04]
Up to stop (2000) with frq 0.020000, the max (min) modulus of Fourier coeff is: 37816.000000 (2.446056)
After stop (2000) the max (min) modulus of Fourier coeff is: 1092.704956 (0.352133)
/content/drive/MyDrive/DNA_proj/results/GCbias/R64_chr16/GCbias_predReturn_R64_chr16_seg100000_segment6_avgRevCompl0_win1_step1
6 (100000,)
100000
ratioQ:  1.0
Fourier transf lenght:  100000
frqs  [ 0.e+00  1.e-05  2.e-05 ... -3.e-05 -2.e-05 -1.e-05]
First 10 (positive) frqs:  [0.e+00 1.e-05 2.e-05 3.e-05 4.e-05 5.e-

The output plots should now be in the FourierOnPredReturn/20to2000 subfolder of each results/GCbias/R64_chrNN folder.
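To help read the log above, here is a minimal sketch of how the printed frequencies and the "up to stop"/"after stop" moduli can be reproduced with NumPy. It is not the code in our modules: the function name `band_modulus_summary`, the restriction to the non-negative half of the spectrum, and the exact slicing at `stop` are assumptions made for illustration. The sketch also shows why the log reports -0.5 as the "largest (positive)" frequency: with an even transform length, `np.fft.fftfreq` places the Nyquist frequency at index `n // 2` with a negative sign.

```python
import numpy as np


def band_modulus_summary(signal, stop=2000):
    """Hypothetical helper: split the Fourier moduli of `signal` at index `stop`.

    Returns the frequency at index `stop` plus the (max, min) modulus
    below `stop` and the (max, min) modulus from `stop` up to n // 2.
    """
    n = len(signal)
    frqs = np.fft.fftfreq(n)              # cycles per position, spacing 1/n
    modulus = np.abs(np.fft.fft(signal))  # modulus of each Fourier coefficient
    half = n // 2                         # assumption: use the non-negative half only
    low = modulus[:stop]
    high = modulus[stop:half]
    return frqs[stop], (low.max(), low.min()), (high.max(), high.min())


# Frequency grids for the two segment lengths seen in the log.
for n in (100000, 99950):
    frqs = np.fft.fftfreq(n)
    # frqs[1] is the spacing 1/n, frqs[2000] is the frequency at the stop
    # index, and frqs[n // 2] is -0.5 because n is even.
    print(n, frqs[1], frqs[2000], frqs[n // 2])

# Synthetic check: a pure cosine with 100 cycles over 1000 positions
# puts all its weight at index 100, i.e. below stop=200.
n_demo = 1000
t = np.arange(n_demo)
demo_signal = np.cos(2 * np.pi * 100 * t / n_demo)
frq_at_stop, low_band, high_band = band_modulus_summary(demo_signal, stop=200)
print(frq_at_stop, low_band[0], high_band[0])
```

A segment of length 100000 gives a frequency of 0.02 at index 2000, while the shorter first segment (99950) gives about 0.02001, which matches the two values that appear in the "Up to stop (2000) with frq ..." lines.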