<a href="https://colab.research.google.com/github/glenamac/TensorFlowHurstExponent/blob/main/TensorFlowHurstExponent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Training a Neural Network to Recognize Hurst Exponents from Fractional Brownian Motion



## Imports and Setup

In [17]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [18]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tensorflow as tf


## Load data from Google Drive

In [None]:
input=np.array(pd.read_csv("drive/My Drive/input3.dat",header=None))

## Reshape Data

The raw data are loaded from single-column, plain ASCII files. This layout was chosen because pandas handles files with many rows well, but only when the number of columns is small.

NOTE: The constant 307201 is not a magic number. It is 640*480 + 1: the number of pixels in each .png image, plus one column for the label.

One reason to reshape the data, though perhaps not a great one, is that the label for each light curve is stored as a single integer at the end of its 307201-value record. The first label is at position 307201, the second at 614402, and so on. Reshaping the data into a rectangle of width 307201 puts all the labels in the last column, where they are trivial to slice out. The pixel data, with the labels removed, are flattened once again before being passed to the internal layers of the model. See below.
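
The reshaping idea can be sketched on a toy array. The sizes and values here are invented for illustration (2x2 "images" instead of 640x480), and the names `toy_width`, `flat`, and `toy_rect` are hypothetical, not part of the notebook's data.

```python
import numpy as np

# Toy stand-in for the real file: 3 records, each 2*2 pixels followed by 1 label.
# In the notebook the record width is 640*480 + 1; here it is 2*2 + 1 = 5.
toy_width = 2 * 2 + 1
flat = np.array([1, 2, 3, 4, 7,       # pixels 1-4, label 7
                 5, 6, 7, 8, 2,       # pixels 5-8, label 2
                 9, 10, 11, 12, 5])   # pixels 9-12, label 5
flat = flat.reshape(-1, 1)            # single column, like the loaded file

toy_rows = flat.shape[0] // toy_width
toy_rect = np.reshape(flat, [toy_rows, toy_width])

toy_labels = toy_rect[:, -1]          # last column holds the labels
toy_pixels = toy_rect[:, :-1]         # everything else is pixel data
print(toy_labels)                     # [7 2 5]
```

Once the data are rectangular, the labels come out with a single column slice instead of a strided walk through the flat array.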

In [None]:
num_pixels_and_label = 640*480+1

In [None]:
numrows = input.shape[0] // num_pixels_and_label  # integer division: one row per light curve

After the reshape in the next cell, each of the numrows rows of reshaped_input represents one light curve.

In [None]:
reshaped_input=np.reshape(input,[numrows,num_pixels_and_label])


### Recast Data as Integer Type

In [None]:
data = np.array(reshaped_input,dtype="int")


## Shuffle The Data

Shuffle the rows so that the model does not see many light curves of the same type grouped together.
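
A small sketch, on made-up data, of why shuffling the combined array is safe: `np.random.shuffle` reorders a 2-D array along its first axis only, so each row keeps its own label in the last column. The toy array and the fixed seed are assumptions for illustration.

```python
import numpy as np

np.random.seed(0)  # fixed seed only so this sketch is repeatable

# Each toy row is three "pixels" followed by a label equal to 10 * pixel value.
toy = np.array([[1, 1, 1, 10],
                [2, 2, 2, 20],
                [3, 3, 3, 30],
                [4, 4, 4, 40]])

np.random.shuffle(toy)  # in place; permutes whole rows, never values within a row

# Every row still carries the label that belongs to its pixels.
rows_intact = all(row[-1] == 10 * row[0] for row in toy)
print(rows_intact)
```

This is why the notebook shuffles before slicing off the label column: shuffling pixels and labels separately would break the pairing.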

In [None]:
m, n = data.shape
np.random.shuffle(data) # shuffle before splitting into dev and training sets
data.shape

Slice out the last column, where the labels are stored, and assign it to *train_labels*.

In [None]:
train_labels=data[:,-1]
#train_labels

Assign the remainder of the data (the actual pixel data) to *train_images*, reshaped into 480 x 640 images, and confirm the shape.
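
The order of the last two dimensions matters. The following sketch assumes the pixels were written out row by row (an assumption about how the files were produced, not something the notebook states), in which case an image that is 640 wide and 480 high reshapes to `[480, 640]`: height first, then width. The toy sizes and names are hypothetical.

```python
import numpy as np

toy_height, toy_width = 2, 3  # stand-ins for 480 and 640

# Two flattened toy images, six pixels each, stored row by row.
pixel_rows = np.array([[0, 1, 2, 3, 4, 5],
                       [10, 11, 12, 13, 14, 15]])

toy_images = np.reshape(pixel_rows, [pixel_rows.shape[0], toy_height, toy_width])
print(toy_images.shape)  # (2, 2, 3)
print(toy_images[1])     # first image row is 10 11 12, second is 13 14 15
```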

In [None]:
train_images=np.reshape(data[:,0:n-1],[numrows,480,640])
(j,k,l)=train_images.shape
j,k,l

## Print an example light curve as a quick sanity check

In [None]:
Index=0
plt.imshow(train_images[Index,:,:])
print("Hurst exponent = %0.2f\n" % (data[Index,-1]/10.))

Clean up objects in case memory starts getting scarce.

In [None]:
#del(input)
#del(dev)
#del(reshaped_input)
#del(data)

## Create and compile model



In [None]:
model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(480, 640)),  # 480x640 image -> 307200-vector
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10),  # one raw score (logit) per label class
])

In [None]:
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
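
As a rough illustration of what this loss measures, here is a NumPy sketch of sparse categorical cross-entropy computed from logits: a softmax over the 10 raw scores, then the negative log-probability of the true integer label, averaged over the batch. The function name `sparse_ce_from_logits` is hypothetical and this is not the Keras implementation, only the same formula written out.

```python
import numpy as np

def sparse_ce_from_logits(logits, labels):
    """Mean cross-entropy for integer labels and unnormalized scores."""
    # Subtract the row maximum for numerical stability; softmax is unchanged.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Pick out the log-probability assigned to each true label.
    return -log_probs[np.arange(len(labels)), labels].mean()

# A model that outputs equal scores for all 10 classes is guessing uniformly.
logits = np.zeros((2, 10))
labels = np.array([3, 7])
loss = sparse_ce_from_logits(logits, labels)
print(loss, np.log(10))  # both are ln(10), about 2.3026
```

The value ln(10), about 2.30, is a handy baseline when reading the training output: a loss near it means the model is doing no better than uniform guessing over the 10 classes.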

## Train the model



In [None]:
model.fit(train_images,train_labels,epochs=10)

Show the model summary: the layers, their output shapes, and the number of parameters.
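
The parameter counts can be checked by hand from the layer sizes defined above. This sketch only does the arithmetic (weights plus biases for each dense layer). The flatten layer has no parameters.

```python
n_inputs = 480 * 640                  # 307200 values after flattening

dense_1 = n_inputs * 128 + 128        # weights + biases of Dense(128)
dense_2 = 128 * 10 + 10               # weights + biases of Dense(10)
total = dense_1 + dense_2

print(dense_1, dense_2, total)        # 39321728 1290 39323018
```

Nearly all of the roughly 39.3 million parameters sit in the first dense layer, because every one of the 307200 pixels connects to each of its 128 units.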

In [None]:
model.summary()
