-----Down and Dirty Demo----

So you don't have hours to spend on training up a neural network?
Not to worry, step into our super quick down and dirty edition!
If you followed the steps in the README about what you need loaded before you get started, this will take you a little under 10 minutes to complete. (Who doesn't have 10 minutes for this kind of awesomeness?)

Alternatively, here is our larger [demo](https://colab.research.google.com/drive/1te5e-wfNxUqyPRLCPVMau8Nf3AMlprOB?usp=sharing)

And a much, much smaller [dataset](https://drive.google.com/drive/folders/10ldZfWHfqrv20AnGLO9QQdpr2S9493Tz?usp=sharing) you can use in it!


Quick check list to make sure you are ready to go.
1.  Jupyter Notebook      https://jupyter.org/install
2.  Python                https://www.python.org/downloads/
3.  Python dependencies (run these at your command line)
        $ pip install --upgrade tensorflow
        $ pip install numpy scipy
        $ pip install scikit-learn
        $ pip install pillow
        $ pip install h5py
        $ pip install keras
4.  Dataset to test with  https://drive.google.com/drive/folders/156QEK8JuWnFqsHnptc--pRW0lYam8fis?usp=sharing
5.  Info for saved model  https://drive.google.com/file/d/1LyMNrvk72AVxLWGEQP-B4DAm9wkoc93b/view?usp=sharing

6.  Alternatively, here is our larger demo https://colab.research.google.com/drive/1te5e-wfNxUqyPRLCPVMau8Nf3AMlprOB?usp=sharing and a smaller dataset to use with it as well:
    https://drive.google.com/drive/folders/10ldZfWHfqrv20AnGLO9QQdpr2S9493Tz?usp=sharing

If this all checks out, you are good to go!

Start by loading all the necessary modules.
TensorFlow and Keras will do all the heavy lifting for us, putting all the proper components in place to build a neural network.

In [None]:
# import all necessary modules
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
from IPython.display import display
import matplotlib.pyplot as plt
%matplotlib inline

Next, we need to set up the variables used throughout the model.

We processed the images as 256 x 256 pixels.

Be sure to use 3 for the rgb (red, green, blue) setting to indicate that the images are in color.

Next up, set your batch size (how many images you want to process at a time) and the number of epochs (how many times you want to run through the whole training set).

In [None]:
# set the variables to work with throughout model
img_height = 256
img_width = 256
rgb = 3
batch_size = 10  
epochs = 50
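
Curious how those numbers fit together? Here is a small sketch (plain Python, no TensorFlow needed) of the block of data the network sees on every step. The 4-bytes-per-value figure is an assumption that the pixels are held as 32-bit floats.

```python
# variables from the cell above
img_height = 256
img_width = 256
rgb = 3
batch_size = 10

# one batch is a 4-D block: (images, height, width, colour channels)
batch_shape = (batch_size, img_height, img_width, rgb)

# count how many individual numbers are in one batch
values_per_batch = 1
for dim in batch_shape:
    values_per_batch *= dim

# assumption: each pixel value is stored as a 32-bit float (4 bytes)
bytes_per_batch = values_per_batch * 4

print("Batch shape:", batch_shape)
print("Values per batch:", values_per_batch)
print("Approximate size:", bytes_per_batch / 2**20, "MiB")
```

A bigger batch size means more images in memory at once, so this is the first number to turn down if your machine starts to struggle.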

We are going to hit the fast-forward button now.

The next set of steps covers additional pieces of the network. We already did our thing so that you don't have to, but it is important to look these over so you can understand how this works.

---DATA---

This whole neural net thing doesn't work without data. Sometimes it doesn't work well without enough data (kind of the point of our whole project).  The following cell (for visual purposes only) shows how we brought in our original data and our custom data.  Prepare to be impressed.

If you just want to get to the good stuff, you can skip over this one.

In [None]:
# We needed to create 4 sets of data to properly train and test the models for accuracy.  This is how we did it
"""

# set-up training data from the original dataset only
ods_train = tf.keras.preprocessing.image_dataset_from_directory(
    '**local path to directory**', 
    labels = 'inferred',
    label_mode = 'categorical',
    batch_size = batch_size,
    image_size = (img_height, img_width),
    shuffle = True,
    seed = 123,
    validation_split = 0.3,
    subset = 'training',
)

# set-up training data using original plus the augmented data as well
plusAugmented_train = tf.keras.preprocessing.image_dataset_from_directory(
    '**local path to directory**',
    labels = 'inferred',
    label_mode = 'categorical',
    batch_size = batch_size,
    image_size = (img_height, img_width),
    shuffle = True,
    seed = 123,
    validation_split = 0.3,
    subset = 'training',
)

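# now create a batch to test the plusAugmented model with unseen original images
# note: if this reads the same directory as ods_train, Keras needs the same seed in both
# calls; with different seeds the training and validation subsets can overlap
ods_test = tf.keras.preprocessing.image_dataset_from_directory(
    '**local path to directory**', 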
    labels = 'inferred',
    label_mode = 'categorical',
    batch_size = batch_size,
    image_size = (img_height, img_width),
    shuffle = True,
    seed = 456,
    validation_split = 0.3,
    subset = "validation",
)

# also need a batch to test with from the original plus augmented dataset
plusAugmented_test = tf.keras.preprocessing.image_dataset_from_directory(
    '**local path to directory**',
    labels = 'inferred',
    label_mode = 'categorical',
    batch_size = batch_size,
    image_size = (img_height, img_width),
    shuffle = True,
    seed = 456,
    validation_split = 0.3,
    subset = "validation",
)
"""
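
To see what validation_split and batch_size do to the numbers, here is a rough sketch in plain Python. The image count of 3000 is made up for illustration, so swap in your own, and the rounding shown is our assumption about how the split is computed.

```python
import math

total_images = 3000        # hypothetical count, not taken from the dataset
validation_split = 0.3     # same value as in the cell above
batch_size = 10

# assumption: the validation subset gets the split fraction (rounded down),
# and the training subset gets everything that is left
validation_images = int(validation_split * total_images)
training_images = total_images - validation_images

# one step per batch; a final, partly filled batch still counts as a step
training_steps = math.ceil(training_images / batch_size)
validation_steps = math.ceil(validation_images / batch_size)

print("Training images:", training_images, "in", training_steps, "steps per epoch")
print("Validation images:", validation_images, "in", validation_steps, "steps")
```

The step count is the number you see on the left of the progress bar when a model trains or evaluates.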

We have model variables in place and datasets ready to go at this point.
Let's build a model!

There is a lot of stuff going on here.  The short version of what is happening is that we are using the spectrogram images (which are pictures of sound across time) to train a neural network that can identify the spoken digits 0-9.  Crazy, right?  Each one of the layers below serves a specific purpose towards our end goal.  We won't bog you down with details, but if you want to know more I highly advise taking Dr. Phillip's Neural Networks course at Middle Tennessee State University.  Mind blowing.
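
Not sure what a "picture of sound across time" looks like in numbers? Here is a minimal sketch in plain Python. It is only an illustration, not the pipeline used to build the dataset images: the `spectrogram` helper, the 100 Hz test tone, and the 800 Hz sample rate are all made up for this example.

```python
import cmath
import math

def spectrogram(signal, frame_size, hop):
    """Slice the signal into frames and return one magnitude spectrum per frame."""
    frames = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        spectrum = []
        # naive discrete Fourier transform, keeping the non-negative frequencies
        for k in range(frame_size // 2 + 1):
            total = sum(
                sample * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, sample in enumerate(frame)
            )
            spectrum.append(abs(total))
        frames.append(spectrum)
    return frames

sample_rate = 800   # hypothetical samples per second
tone_hz = 100       # hypothetical pure tone standing in for a voice
signal = [math.sin(2 * math.pi * tone_hz * t / sample_rate) for t in range(256)]

spec = spectrogram(signal, frame_size=64, hop=32)

# each row is a moment in time, each column a frequency band
peak_bins = [max(range(len(row)), key=row.__getitem__) for row in spec]
hz_per_bin = sample_rate / 64
print("Time slices:", len(spec))
print("Loudest frequency in each slice:", [b * hz_per_bin for b in peak_bins])
```

Draw each row as a strip of pixels, stack the strips side by side, and you have an image that a convolutional network can read.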

In [None]:
#  build up model
model = keras.models.Sequential()
model.add(keras.layers.Convolution2D(16, kernel_size = (5, 5), input_shape=(img_height, img_width, rgb ), activation='relu'))
model.add(keras.layers.Convolution2D(16, kernel_size = (3, 3), activation='relu'))
model.add(keras.layers.BatchNormalization())
model.add(keras.layers.MaxPooling2D(pool_size=(2, 2)))
model.add(keras.layers.Dropout(0.2))
# have to make sure we pass the correct shape to LSTM
# use reshape to convert from 2D to 3D!
model.add(keras.layers.Reshape((125, -1)))
# using TimeDistributed layer
# this makes sure the dense layer keeps track of the 
# temporal aspects of the data
model.add(keras.layers.TimeDistributed(layers.Dense(50)))
model.add(keras.layers.LSTM(50, return_sequences=False))
model.add(keras.layers.Dense(10,activation = 'softmax'))

Now we have a model.  It would be appropriate to high-five yourself at this point.

So what in the world does that mean?  Well, the next cell will give us a good visual of what a model is actually made of.  Go ahead and run this, and then we will explain a bit.

In [None]:
# display model layers w/ input and output from layers
tf.keras.utils.plot_model(model, show_shapes=True, expand_nested=True)

The display shows each layer's input and output shapes within the model.  This will give you an idea of how each layer connects moving downstream.  Notice that the input of a layer has to match the output of the previous layer.  At the end, you will see the final output is 10 of something.  That, my friends, is the final classification of the images coming through: one output for each spoken digit, 0-9.  The network is going to let us know which digit it thinks each image represents.  How cool is that?!?
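
If the plot will not render on your machine, the same shapes can be worked out by hand. A quick sketch in plain Python, assuming the Keras defaults of no padding and a stride of 1 for the convolution layers; the helper names are ours.

```python
img_height, img_width, rgb = 256, 256, 3

def conv_valid(size, kernel):
    # a convolution without padding trims (kernel - 1) pixels off each side length
    return size - kernel + 1

def pool(size, window):
    # max pooling with a 2x2 window halves each side length
    return size // window

shape = (img_height, img_width, rgb)

# Convolution2D(16, kernel_size=(5, 5))
after_conv1 = (conv_valid(shape[0], 5), conv_valid(shape[1], 5), 16)

# Convolution2D(16, kernel_size=(3, 3))
after_conv2 = (conv_valid(after_conv1[0], 3), conv_valid(after_conv1[1], 3), 16)

# BatchNormalization and Dropout leave the shape alone
# MaxPooling2D(pool_size=(2, 2))
after_pool = (pool(after_conv2[0], 2), pool(after_conv2[1], 2), after_conv2[2])

# Reshape((125, -1)): keep 125 rows, flatten everything else into each row
values = after_pool[0] * after_pool[1] * after_pool[2]
after_reshape = (125, values // 125)

print(after_conv1, after_conv2, after_pool, after_reshape)
```

This is where the 125 in the Reshape layer comes from: it is the image height after two convolutions and one pooling step, so change the image size or the kernels and that number has to change with them.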

Okay, so we are almost to the good part.  Next up, we compile our model.  This tells the model which optimizer, loss function, and accuracy metric to use when it does the hard part of learning from all the data.

In [None]:
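# compile model
# the last Dense layer already applies softmax, so the loss receives probabilities (from_logits=False)
model.compile(optimizer = keras.optimizers.Adam(),
             loss = [ keras.losses.CategoricalCrossentropy(from_logits=False)],
             metrics = [ keras.metrics.CategoricalAccuracy()])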
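
What do the loss and the metric actually measure? Here is a rough sketch in plain Python of the ideas behind softmax, categorical cross-entropy, and categorical accuracy. The raw scores are made up, and this is a hand-rolled illustration rather than the Keras implementation.

```python
import math

def softmax(scores):
    # turn raw scores into probabilities that add up to 1
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot, probabilities):
    # penalty is the negative log of the probability given to the true class
    return -sum(y * math.log(p) for y, p in zip(one_hot, probabilities) if y > 0)

# made-up raw scores for the digits 0-9; the network leans towards "7"
scores = [0.0] * 10
scores[7] = 4.0
probabilities = softmax(scores)

label_seven = [1 if digit == 7 else 0 for digit in range(10)]
label_two = [1 if digit == 2 else 0 for digit in range(10)]

predicted_digit = probabilities.index(max(probabilities))
loss_if_seven = categorical_crossentropy(label_seven, probabilities)
loss_if_two = categorical_crossentropy(label_two, probabilities)

print("Predicted digit:", predicted_digit)
print("Loss when the answer really is 7:", loss_if_seven)
print("Loss when the answer really is 2:", loss_if_two)
```

A confident correct guess costs very little and a confident wrong guess costs a lot, which is what pushes the network in the right direction. Accuracy just counts how often the predicted digit matches the label.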

We saw a visual of the model and its layers. Let's take a quick look now that it is compiled and see what is really going to be moving through this behemoth.

In [None]:
# show a summary of layers and the parameters
model.summary()

This gives you a bit more info on what is about to go down.  It displays the total number of parameters that we will be using.  If you are finding this intriguing, after completing this demo I suggest you take a look at the long demo for more details about how you can tweak those numbers.
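
Want to know where those parameter totals come from? The counts can be worked out by hand. A rough sketch in plain Python, assuming the layer settings from the model cell above and the usual Keras defaults (bias terms switched on, four values per channel in BatchNormalization); the summary should land on the same total if those assumptions hold.

```python
def conv_params(kernel_h, kernel_w, channels_in, filters):
    # every filter has one weight per kernel cell per input channel, plus a bias
    return (kernel_h * kernel_w * channels_in + 1) * filters

def dense_params(inputs, units):
    # one weight per input per unit, plus one bias per unit
    return inputs * units + units

def lstm_params(inputs, units):
    # four gates, each with input weights, recurrent weights, and a bias
    return 4 * ((inputs + units) * units + units)

layer_params = {
    "conv 5x5": conv_params(5, 5, 3, 16),
    "conv 3x3": conv_params(3, 3, 16, 16),
    # scale, shift, moving mean, and moving variance for each of 16 channels
    "batch norm": 4 * 16,
    # applied to each of the 125 time steps, each holding 2000 values
    "time distributed dense": dense_params(2000, 50),
    "lstm": lstm_params(50, 50),
    "output dense": dense_params(50, 10),
}

total_params = sum(layer_params.values())

for name, count in layer_params.items():
    print(f"{name:>24}: {count}")
print(f"{'total':>24}: {total_params}")
```

Most of the parameters sit in the dense layer right after the Reshape, so that is the place to look first if you want a smaller model.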

Almost there!!

So for the next step, we again have done the heavy lifting for you (and you are welcome).
This is where the actual training would occur.  This generally is very time consuming.  The code we used is below.

"""
# training the original dataset
ods_history = model.fit(ods_train,
                    batch_size = batch_size,
                    epochs = epochs,
                    verbose = 1)
"""

Once the training is done, you need to evaluate it for accuracy.  Yep, we did that as well.  Take a look below to see the code and our results.

"""
# now test the model against the reserved unseen images from the original dataset
ods_t = model.evaluate(ods_test, verbose = 1)
print("Using only original test samples:")
print("Accuracy: ", ods_t[1] * 100)
print("Loss: ", ods_t[0])
"""

90/90 [==============================] - 5s 44ms/step - loss: 0.9950 - categorical_accuracy: 0.7089
Using only original test samples:
Accuracy:  70.88888883590698
Loss:  0.9950082898139954

So when we ran this our accuracy was about 70.9%.  Not terrible, but I think we can do better.

Now we would need to train up the augmented set of data in a model to have something to compare to.  And if you were wondering how long this took, just know that I remodeled my bathroom before it was done. So yes, we saved our results to share with you.

"""
# next, we need to train with the original plus the augmented data on the exact same architecture
plusAugmented_history = model.fit(plusAugmented_train,
                    batch_size = batch_size,
                    epochs = epochs,
                    verbose = 1)
"""

How did it go?

"""
# now let's test with unseen images from the augmented plus original dataset
plusAugmented_history_test = model.evaluate(plusAugmented_test, verbose = 1)
print("Using the augmented plus original test samples:")
print("Accuracy: ", plusAugmented_history_test[1] * 100)
print("Loss: ", plusAugmented_history_test[0])
"""

987/987 [==============================] - 46s 46ms/step - loss: 0.1274 - categorical_accuracy: 0.9627
Using the augmented plus original test samples:
Accuracy:  96.271151304245
Loss:  0.12736131250858307

Not too shabby, huh?
This model validated at 96% accuracy.

Still not done though.  We have one last piece of this mind-bending puzzle left.  We have to test the last model against truly unseen images of unaltered sounds.  Use this next piece of code to load all of our saved info from this model so you can see for yourself what the results are.

Don't forget to use your local file path to access it!

In [None]:
model = keras.models.load_model("  **local file path**  ")

--Quick note--  this is what we got from this model when we ran it.

11/11 [==============================] - 2s 17ms/step - loss: 0.0760 - categorical_accuracy: 0.9706
Using completely unseen originals only:
Accuracy:  97.0588207244873
Loss:  0.0760301724076271

Comparing the results from the 1st model using zero augmentation (about 71% accuracy) to this model using the original plus a ton of augmented data (97% accuracy), we can see an improvement of roughly 26 percentage points in accuracy.
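
To put exact numbers on that gap, here is a quick sketch using the two accuracy figures printed in the results above. It reports both the difference in percentage points and the relative gain; the variable names are ours for illustration.

```python
# accuracy figures copied from the printed results above
baseline_accuracy = 70.88888883590698    # trained on original data only
augmented_accuracy = 97.0588207244873    # trained on original plus augmented data

# difference in percentage points
point_gain = augmented_accuracy - baseline_accuracy

# gain relative to where the first model started
relative_gain = point_gain / baseline_accuracy * 100

print(f"Gain: {point_gain:.1f} percentage points")
print(f"Relative improvement: {relative_gain:.1f}%")
```

Keep in mind the two figures come from different test sets, so treat the comparison as a strong hint rather than a precise measurement.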

"Holy Statistics Batman!!"  It works.

Last step..... Let's Goooooooo!

Plug in your local file path for the unseen_test image set you downloaded earlier and watch this techno wizardry happen right before your very eyes!!

In [None]:
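# NOW, let's test the plusAugmented model with super secret unseen original images
# assumption: the unseen_test folder is loaded like the datasets above (one sub-folder per digit)
unseen_test = tf.keras.preprocessing.image_dataset_from_directory(
    '**local path to unseen_test directory**',
    label_mode = 'categorical',
    batch_size = batch_size, image_size = (img_height, img_width))
unseen_test_history = model.evaluate(unseen_test, verbose = 1)
print("Using completely unseen originals only:")
print("Accuracy: ", unseen_test_history[1] * 100)
print("Loss: ", unseen_test_history[0])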

AAAANNNDDDD done... Hope you enjoyed this.  If this sparked an interest for you (and we all know it did), check out the longer demo that gives a full walkthrough of all our code and more information on how to change things up a bit.  Thanks for spending some time with us!!!