# Assignment 7 - TensorFlow
## Alexander Mervar - 3.9.2022

In [6]:
# What version of Python do you have?
import sys

import tensorflow.keras
import pandas as pd
import sklearn as sk
import tensorflow as tf

print(f"Tensor Flow Version: {tf.__version__}")
print(f"Keras Version: {tensorflow.keras.__version__}")
print()
print(f"Python {sys.version}")
print(f"Pandas {pd.__version__}")
print(f"Scikit-Learn {sk.__version__}")
gpu = len(tf.config.list_physical_devices('GPU'))>0
print("GPU is", "available" if gpu else "NOT AVAILABLE")

Tensor Flow Version: 2.4.0
Keras Version: 2.4.0

Python 3.8.12 | packaged by conda-forge | (default, Jan 30 2022, 23:33:09) 
[Clang 11.1.0 ]
Pandas 1.4.1
Scikit-Learn 1.0.2
GPU is NOT AVAILABLE


### Exercise 1
**Using the tensorflow Keras API, build and train a deep network that classifies the MNIST-fashion dataset as accurately as possible. Submit your code to Canvas (10 pts)**

In [7]:
import tensorflow as tf
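# Note: this loads the MNIST handwritten-digit dataset; Fashion-MNIST is tf.keras.datasets.fashion_mnist
mnist = tf.keras.datasets.mnist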

(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

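model = tf.keras.models.Sequential([
  # Flatten each 28x28 image into a 784-element vector
  tf.keras.layers.Flatten(),
  # Fully connected hidden layer with 512 units
  tf.keras.layers.Dense(512, activation='relu'),
  # Randomly drop 20% of the hidden activations during training to reduce overfitting
  tf.keras.layers.Dropout(0.2),
  # Output layer: one probability per class
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
# Returns [test loss, test accuracy]
model.evaluate(x_test, y_test)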

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[0.0666656568646431, 0.9789999723434448]

### Exercise 2
**Try to optimize your code to achieve a higher percentage accuracy by manipulating parameters such as dropout, stride, and the numbers of units in each layer. Describe in a short essay what network structures and design choices lead to better performance?  Why do you think that is? (10 pts)**

In [13]:
(x_train, y_train),(x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0

model = tf.keras.Sequential([
    # Convolutional 2D layer to extract local visual features (64 filters, one feature map each)
    tf.keras.layers.Conv2D(filters=64, kernel_size=(2, 2), strides=(1, 1), padding='same', activation='relu', input_shape=(28, 28, 1)),
    # Max pooling layer to reduce the size of the feature maps and account for shifted visual features between images
    tf.keras.layers.MaxPooling2D(pool_size=(2, 2)),
    # Randomly drop 30% of the pooled activations during training to reduce overfitting
    tf.keras.layers.Dropout(0.3),
    # Flatten the feature maps to a 1D vector
    tf.keras.layers.Flatten(),
    # Add a fully connected layer with 256 output nodes
    tf.keras.layers.Dense(256, activation='relu'),
    # Randomly drop 50% of the dense layer's activations during training to reduce overfitting
    tf.keras.layers.Dropout(0.5),
    # Add a fully connected layer with 10 output nodes (the final guess)
    tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Reshape for the CNN, which expects 28x28 images with an explicit channel dimension (28, 28, 1)
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)

model.fit(x_train, y_train, epochs=10, batch_size=30)

model.evaluate(x_test, y_test, batch_size=30)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


[0.04672878608107567, 0.9866999983787537]
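To see how the data changes shape as it moves through the layers above, the shapes and parameter counts can be worked out by hand. The sketch below is plain Python with no TensorFlow dependency. The helper names (`conv2d_same`, `max_pool`, `dense`) are made up for illustration and are not Keras functions. It assumes the settings used in the cell above: stride 1 with `padding='same'` for the convolution, and the Keras default of a pooling stride equal to the pool size.

```python
# Hand-computed shapes and parameter counts for the Exercise 2 layers.
# All helper names are hypothetical; none of them are part of Keras.

def conv2d_same(shape, filters, kernel):
    """Shape and parameter count for a stride-1, padding='same' convolution."""
    height, width, channels = shape
    # One kernel per (input channel, filter) pair, plus one bias per filter.
    params = kernel[0] * kernel[1] * channels * filters + filters
    return (height, width, filters), params

def max_pool(shape, pool):
    """Shape after max pooling with stride equal to the pool size."""
    height, width, channels = shape
    return (height // pool[0], width // pool[1], channels)

def dense(n_inputs, units):
    """Output size and parameter count (weights + biases) for a dense layer."""
    return units, n_inputs * units + units

conv_shape, conv_params = conv2d_same((28, 28, 1), filters=64, kernel=(2, 2))
pooled_shape = max_pool(conv_shape, (2, 2))
flat = pooled_shape[0] * pooled_shape[1] * pooled_shape[2]
hidden_units, hidden_params = dense(flat, 256)
output_units, output_params = dense(hidden_units, 10)
total_params = conv_params + hidden_params + output_params

print("after Conv2D:      ", conv_shape, "params:", conv_params)
print("after MaxPooling2D:", pooled_shape)
print("after Flatten:     ", flat)
print("Dense(256) params: ", hidden_params)
print("Dense(10) params:  ", output_params)
print("total params:      ", total_params)
```

Dropout and flatten layers add no parameters, so they are left out of the totals. These figures can be checked against `model.summary()`; nearly all of the weights sit in the 256-unit dense layer rather than in the convolution.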

### Short Essay
The first model is very similar to the neural networks that we built for previous assignments. It has one hidden layer of 512 units that connects to 10 output nodes, one per class, which signal the model's guess. For such a simple model, it is actually quite accurate: it reaches about 97.9% test accuracy after 5 epochs.

By adding a convolutional layer and a max pooling layer, the input image is reduced to its most important features before the model makes its guess. The convolutional layer acts as a set of visual filters. Because it uses `padding='same'` with a stride of 1, it does not change the image's height or width. Instead, each of its 64 filters produces a 28x28 feature map that makes important local patterns easier for the rest of the network to see. The max pooling layer then takes these feature maps and halves their height and width, which reduces the number of values the later layers have to process. It also helps with overfitting and makes the network less sensitive to small shifts in where a feature appears. After that, we flatten the feature maps into a one-dimensional array and pass it through a network similar to the one from the first exercise. I keep dropout in the process (0.3 after pooling and 0.5 after the dense layer) so that overfitting is handled appropriately.

As you can see, test accuracy is higher in the second model with these additions: about 98.7% versus 97.9%. The second model was also trained for 10 epochs instead of 5, so part of the gain may come from longer training. Still, I think these design choices make it easier for the network to pick up on what is most important, rather than training on the raw pixels directly. The filters extract the most relevant local information, max pooling keeps only the strongest responses, and a standard fully connected network then makes the classification.
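The point about max pooling shrinking the feature map while tolerating small shifts can be illustrated on a toy example. This is a minimal pure-Python sketch, not the Keras implementation: `max_pool_2x2` is a hypothetical helper, and the 4x4 grids are invented values standing in for a feature map.

```python
def max_pool_2x2(image):
    """2x2 max pooling with stride 2 on a list-of-lists grid with even dimensions."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]

# Two strong responses on an otherwise empty 4x4 feature map.
original = [
    [9, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 5],
    [0, 0, 0, 0],
]

# The same responses, each moved by one pixel within its 2x2 window.
shifted = [
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 5],
]

pooled_original = max_pool_2x2(original)
pooled_shifted = max_pool_2x2(shifted)

print(pooled_original)                     # 2x2 grid instead of 4x4
print(pooled_original == pooled_shifted)   # the one-pixel shifts are absorbed
```

Both grids pool to the same 2x2 result, so the dense layers see identical input. The tolerance is limited: a shift that moves a response across a window boundary does change the pooled output.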