<a href="https://colab.research.google.com/github/shuvechchha/Neural-Network-Project/blob/main/Project_Two.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project Two

This project involves observing the progress of several different network architectures learning an image classification task.

Each network requires the cells before Part One to have been run, but it does not require the earlier networks to have been run. You can therefore break up your work and re-run the last network without re-running all the others.

Set the runtime type to GPU under the Runtime menu to speed up the computations. Note that this may restrict you to having one notebook open at a time.

See a list of TensorFlow datasets at: https://knowyourdata-tfds.withgoogle.com/

See how to load them:
https://www.tensorflow.org/datasets/overview

See data augmentation discussion: https://www.tensorflow.org/tutorials/images/data_augmentation

In [None]:
# TensorFlow
import tensorflow as tf

# Datasets
import tensorflow_datasets as tfds

In [None]:
# Load in the tf_flowers dataset
# Separate into training, testing and validation sets

train_split = 'train[:80%]'
validation_split = 'train[80%:90%]'
test_split = 'train[90%:]'
(ds_train, ds_validation, ds_test), ds_info = tfds.load(
    'tf_flowers',
    split=[train_split, validation_split, test_split],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)

print(ds_info)

Downloading and preparing dataset tf_flowers/3.0.1 (download: 218.21 MiB, generated: 221.83 MiB, total: 440.05 MiB) to /root/tensorflow_datasets/tf_flowers/3.0.1...


local data directory. If you'd instead prefer to read directly from our public
GCS bucket (recommended if you're running on GCP), you can instead pass
`try_gcs=True` to `tfds.load` or set `data_dir=gs://tfds-data/datasets`.



Dataset tf_flowers downloaded and prepared to /root/tensorflow_datasets/tf_flowers/3.0.1. Subsequent calls will reuse this data.
tfds.core.DatasetInfo(
    name='tf_flowers',
    version=3.0.1,
    description='A large set of images of flowers',
    homepage='https://www.tensorflow.org/tutorials/load_data/images',
    features=FeaturesDict({
        'image': Image(shape=(None, None, 3), dtype=tf.uint8),
        'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=5),
    }),
    total_num_examples=3670,
    splits={
        'train': 3670,
    },
    supervised_keys=('image', 'label'),
    citation="""@ONLINE {tfflowers,
    author = "The TensorFlow Team",
    title = "Flowers",
    month = "jan",
    year = "2019",
    url = "http://download.tensorflow.org/example_images/flower_photos.tgz" }""",
    redistribution_info=,
)
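
The dataset ships with a single `train` split of 3,670 examples, so the percentage slices used above determine how many images each subset receives. A minimal sketch in plain Python of that arithmetic follows; the variable names are illustrative and not part of the notebook. Both slice boundaries fall on whole numbers here, so the rounding mode does not matter.

```python
# Total number of examples reported by ds_info for the single 'train' split
total_examples = 3670

# Slice boundaries for 'train[:80%]', 'train[80%:90%]' and 'train[90%:]'
train_end = total_examples * 80 // 100        # index where the training slice ends
validation_end = total_examples * 90 // 100   # index where the validation slice ends

split_sizes = {
    "train": train_end,
    "validation": validation_end - train_end,
    "test": total_examples - validation_end,
}
print(split_sizes)
```

This gives 2,936 training images and 367 images each for validation and testing.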



In [None]:
# Set some network parameters

# Number of classes in the dataset
num_classes = ds_info.features['label'].num_classes
# Size of each image dimension (width and height)
img_size = 80
# Size of a batch of images
batch_size = 128

# Function to apply to each image to prepare it for training and inference
def process_img(image, label):
  # Resize the images to img_size x img_size (this stretches non-square images)
  # Alternative that pads instead, to preserve the aspect ratio:
  #image = tf.image.resize_with_pad(image, img_size, img_size)
  image = tf.image.resize(image, (img_size, img_size))
  # Change from ints in range 0-255 to floats in range 0-1
  image = tf.cast(image, tf.float32) / 255.
  # Change label from integer to one-hot vector
  label = tf.one_hot(label, num_classes)
  return image, label

# Setup data pipeline
# Training images should be shuffled
train = ds_train.map(
    process_img,
    num_parallel_calls=tf.data.AUTOTUNE
    ).cache().shuffle(ds_info.splits[train_split].num_examples).batch(batch_size).prefetch(tf.data.AUTOTUNE)

# Validation data for tuning hyperparameters
validation = ds_validation.map(
    process_img, 
    num_parallel_calls=tf.data.AUTOTUNE
    ).cache().batch(batch_size).prefetch(tf.data.AUTOTUNE)

# Test data for final evaluation
test = ds_test.map(
    process_img, 
    num_parallel_calls=tf.data.AUTOTUNE
    ).cache().batch(batch_size).prefetch(tf.data.AUTOTUNE)
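
To make the preprocessing concrete, here is a small plain-Python sketch of what `process_img` does to one pixel value and one label, and of how many batches one training epoch contains. The helper names `scale_pixel` and `one_hot` are hypothetical stand-ins for the TensorFlow operations, and the training-set size of 2,936 assumes the 80% slice of 3,670 examples.

```python
import math

num_classes = 5        # from ds_info: ClassLabel(num_classes=5)
batch_size = 128       # same value as in the notebook
train_examples = 2936  # assumed: 80% of the 3,670 examples

def scale_pixel(value):
    """Map an integer pixel value in 0-255 to a float in 0-1."""
    return value / 255.0

def one_hot(label, depth):
    """Turn an integer class label into a one-hot list of length depth."""
    return [1.0 if index == label else 0.0 for index in range(depth)]

# Number of batches per epoch; the last batch is smaller than batch_size
steps_per_epoch = math.ceil(train_examples / batch_size)
last_batch_size = train_examples - (steps_per_epoch - 1) * batch_size

print(scale_pixel(255), one_hot(2, num_classes), steps_per_epoch, last_batch_size)
```

The one-hot labels are what allow the models below to use `CategoricalCrossentropy` and `CategoricalAccuracy`.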

## Part One - Basic Two-Layer Network

Build the network and then train the model using the cells below.

In [None]:
# Basic network with two hidden dense layers and a softmax output layer
two_layer_model = tf.keras.Sequential([
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(192, activation='relu'),
  tf.keras.layers.Dense(64, activation='relu'),
  tf.keras.layers.Dense(num_classes, activation='softmax'),
])
two_layer_model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=[tf.keras.metrics.CategoricalAccuracy()])

In [None]:
# Train the model
# Watch the training and validation accuracy over time
# Training accuracy notably higher than validation accuracy suggests overfitting
epochs = 10
history = two_layer_model.fit(
  train,
  validation_data=validation,
  epochs=epochs
)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10
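
One way to check for overfitting systematically is to compare the per-epoch training and validation accuracies. The sketch below is a hypothetical helper, not part of the notebook: it takes a dictionary shaped like `history.history` and lists the epochs where the gap exceeds a threshold. The key names assume the default metric name produced by `tf.keras.metrics.CategoricalAccuracy()`, the 0.05 threshold is an arbitrary choice, and the example numbers are made up purely for illustration.

```python
def overfit_epochs(history_dict, threshold=0.05,
                   train_key="categorical_accuracy",
                   val_key="val_categorical_accuracy"):
    """Return the 1-based epochs where training accuracy exceeds
    validation accuracy by more than threshold."""
    flagged = []
    pairs = zip(history_dict[train_key], history_dict[val_key])
    for epoch, (train_acc, val_acc) in enumerate(pairs, start=1):
        if train_acc - val_acc > threshold:
            flagged.append(epoch)
    return flagged

# Made-up numbers for illustration only, NOT results from this notebook
example_history = {
    "categorical_accuracy": [0.30, 0.40, 0.50, 0.60],
    "val_categorical_accuracy": [0.31, 0.38, 0.42, 0.45],
}
print(overfit_epochs(example_history))
```

After training, the same helper could be applied as `overfit_epochs(history.history)`.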


In [None]:
# How big is this model?
two_layer_model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 flatten (Flatten)           (None, 19200)             0         
                                                                 
 dense (Dense)               (None, 192)               3686592   
                                                                 
 dense_1 (Dense)             (None, 64)                12352     
                                                                 
 dense_2 (Dense)             (None, 5)                 325       
                                                                 
Total params: 3,699,269
Trainable params: 3,699,269
Non-trainable params: 0
_________________________________________________________________
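
The parameter counts in the summary can be reproduced by hand: a dense layer has one weight per input-output pair plus one bias per output unit. A short sketch of that arithmetic, using a hypothetical helper `dense_params`:

```python
def dense_params(inputs, units):
    """Weights (inputs * units) plus one bias per unit."""
    return inputs * units + units

img_size = 80
flattened = img_size * img_size * 3   # 80 x 80 pixels, 3 color channels

layer_params = [
    dense_params(flattened, 192),  # first hidden layer
    dense_params(192, 64),         # second hidden layer
    dense_params(64, 5),           # softmax output over 5 classes
]
total_params = sum(layer_params)
print(flattened, layer_params, total_params)
```

Almost all of the weights sit in the first dense layer, because every one of the 19,200 input values is connected to every one of its 192 units.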


### Questions
1. How did the network do? What were the final training and validation accuracies?
2. Is there evidence of overfitting?
3. Why do you think the network had trouble? What are the limitations of this architecture?

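**1. The network did not do well. The final training accuracy was 0.3944 and the final validation accuracy was 0.3052.**

**2. Yes. Training accuracy was notably higher than validation accuracy in about five of the epochs, and the final gap was about 9 percentage points (0.3944 vs. 0.3052).**

**3. I think the network had trouble because flattening the image discards its spatial structure, and the overfitting caused the model to learn noise. A limitation of this architecture is that it is prone to overfitting: the first dense layer alone has 3,686,592 weights, far more than the roughly 2,900 training images.**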

## Part Two - Basic Convolutional Network

Build the network and then train the model using the cells below.

In [None]:
# A basic convolutional neural network with two sets of convolution/pooling layers
# Notice that typically only one hidden dense layer is used
basic_conv_model = tf.keras.Sequential([
  tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(24, 3, padding='same', activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(192, activation='relu'),
  tf.keras.layers.Dense(num_classes, activation='softmax'),
])
basic_conv_model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=[tf.keras.metrics.CategoricalAccuracy()])

In [None]:
# Train the model
# Watch the training and validation accuracy over time
# Training accuracy notably higher than validation accuracy suggests overfitting
epochs = 10
history = basic_conv_model.fit(
  train,
  validation_data=validation,
  epochs=epochs
)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [None]:
# How big is this model?
basic_conv_model.summary()

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 80, 80, 16)        448       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 40, 40, 16)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 40, 40, 24)        3480      
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 20, 20, 24)       0         
 2D)                                                             
                                                                 
 flatten_1 (Flatten)         (None, 9600)              0         
                                                                 
 dense_3 (Dense)             (None, 192)              
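
The convolutional layer sizes in the summary can be checked the same way. A convolution has one weight per kernel position, input channel and filter, plus one bias per filter, and each default `MaxPooling2D` halves the width and height. In the sketch below, `conv2d_params` is a hypothetical helper, and the count for the hidden dense layer is computed from the formula rather than read from the truncated summary above.

```python
def conv2d_params(kernel_size, in_channels, filters):
    """Weights per filter (kernel_size^2 * in_channels) times filters, plus biases."""
    return kernel_size * kernel_size * in_channels * filters + filters

conv1_params = conv2d_params(3, 3, 16)    # 3x3 kernels on RGB input, 16 filters
conv2_params = conv2d_params(3, 16, 24)   # 3x3 kernels on 16 channels, 24 filters

side = 80 // 2 // 2                # two 2x2 poolings: 80 -> 40 -> 20
flattened = side * side * 24       # values fed into the dense layer

# Computed from the dense-layer formula, not taken from the summary output
dense_hidden_params = flattened * 192 + 192

print(conv1_params, conv2_params, flattened, dense_hidden_params)
```

The two convolutional layers together need fewer than 4,000 weights, and the pooling halves the flattened input to the dense layer from 19,200 to 9,600 values.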

### Questions
1. How did the network do? What were the final training and validation accuracies?
2. Is there evidence of overfitting?
3. What do convolutional networks do to improve learning?

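**1. The model did fairly well. The final training accuracy was 0.7997 and the final validation accuracy was 0.6431.**

**2. Yes. Training accuracy was notably higher than validation accuracy in about seven of the epochs, and the final gap was about 16 percentage points (0.7997 vs. 0.6431).**

**3. Convolutional layers learn small filters that are applied across the whole image, so they capture local spatial patterns with far fewer weights than a dense layer, and pooling layers shrink the feature maps. Convolutional networks can also be combined with data augmentation, which applies random transformations to the inputs, and with regularization such as dropout, which randomly sets activations to zero, pushing the network to develop redundancy and therefore improving learning.**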
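
To illustrate the weight sharing in a convolutional layer, here is a minimal NumPy sketch of a single 3x3 filter slid over a small grayscale image with zero padding (the `padding='same'` behavior). The function `conv2d_same`, the toy image and the edge-detecting kernel are all illustrative assumptions; a real `Conv2D` layer learns its kernels, handles multiple channels and adds a bias.

```python
import numpy as np

def conv2d_same(image, kernel):
    """Slide one square kernel over a 2-D image with zero padding,
    so the output has the same height and width as the input."""
    size = kernel.shape[0]
    pad = size // 2
    padded = np.pad(image, pad)  # zero padding on all four sides
    out = np.zeros(image.shape, dtype=float)
    for row in range(image.shape[0]):
        for col in range(image.shape[1]):
            patch = padded[row:row + size, col:col + size]
            out[row, col] = np.sum(patch * kernel)  # same 9 weights at every position
    return out

# Toy 5x5 image: dark on the left, bright on the right (a vertical edge)
image = np.zeros((5, 5))
image[:, 2:] = 1.0

# Hand-written kernel that responds to left-to-right brightness increases
kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])

response = conv2d_same(image, kernel)
print(response)
```

The nine kernel weights are reused at all 25 positions, which is why a convolutional layer needs so few parameters compared with a dense layer on the same input.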

## Part Three - Augmented Convolutional Neural Network

Build the network and then train the model using the cells below.

In [None]:
# A convolutional neural network with some features to combat overfitting
# Data augmentation spices up the input by applying random transformations
# Dropout nulls out neurons randomly forcing the network to develop
# redundancy and thus fit less to noise
aug_conv_model = tf.keras.Sequential([
  tf.keras.layers.RandomFlip("horizontal_and_vertical"),
  tf.keras.layers.RandomRotation(0.2),
  tf.keras.layers.Conv2D(16, 3, padding='same', activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Conv2D(24, 3, padding='same', activation='relu'),
  tf.keras.layers.MaxPooling2D(),
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(192, activation='relu'),
  tf.keras.layers.Dropout(0.25),
  tf.keras.layers.Dense(num_classes, activation='softmax'),
])
aug_conv_model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=[tf.keras.metrics.CategoricalAccuracy()])
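
The two anti-overfitting ingredients in this model can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the Keras implementation: `random_horizontal_flip` covers only the horizontal case of `RandomFlip`, and `dropout` uses the common "inverted dropout" formulation, in which the surviving activations are scaled by `1 / (1 - rate)`. In Keras, both kinds of layer are active only during training.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

def random_horizontal_flip(image, rng):
    """Flip the image left-to-right with probability 0.5."""
    return image[:, ::-1] if rng.random() < 0.5 else image

def dropout(activations, rate, rng):
    """Inverted dropout: zero roughly a fraction `rate` of the units
    and rescale the survivors so the expected sum is unchanged."""
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

image = np.arange(6.0).reshape(2, 3)
augmented = random_horizontal_flip(image, rng)

activations = np.ones(8)
dropped = dropout(activations, 0.25, rng)
print(augmented)
print(dropped)
```

Every entry of `dropped` is either 0 or 1/0.75, so each training step sees a slightly different network.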

In [None]:
# Train the model
# Watch the training and validation accuracy over time
# Training accuracy notably higher than validation accuracy suggests overfitting
epochs = 10
history = aug_conv_model.fit(
  train,
  validation_data=validation,
  epochs=epochs
)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


In [None]:
# Evaluate its performance on the test set
print("Testing Results:")
aug_conv_model.evaluate(test)

Testing Results:


[0.9145666360855103, 0.6212534308433533]
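
The list returned by `evaluate` holds the loss followed by the compiled metrics. A small sketch of labeling the two values printed above follows; the order `["loss", "categorical_accuracy"]` is an assumption based on how the model was compiled. In the notebook itself, `aug_conv_model.evaluate(test, return_dict=True)` should return the labeled values directly.

```python
# Assumed order: the loss first, then the single compiled metric
metric_names = ["loss", "categorical_accuracy"]

# Values copied from the evaluation output above
values = [0.9145666360855103, 0.6212534308433533]

results = dict(zip(metric_names, values))
print(f"test loss: {results['loss']:.4f}, "
      f"test accuracy: {results['categorical_accuracy']:.4f}")
```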

### Questions
1. How did the network do? What were the final training and validation accuracies?
2. Is there evidence of overfitting?
3. Is the overfitting reduced compared to the previous network?
4. How did the network do on the test data that it hadn't seen before? Was the accuracy comparable?


**1. The network did reasonably well. The final training accuracy was 0.6117 and the final validation accuracy was 0.5749.**

**2. There is little evidence of overfitting: the final gap between training and validation accuracy is under 4 percentage points.**

**3. Yes, the overfitting is reduced compared to the previous network.**

**4. The network performed well on the test data, with an accuracy of 0.6213. Yes, the accuracy was comparable to the final training and validation accuracies.**

## Part Four - Network with Pre-trained Feature Extractors

Build the network and then train the model using the cells below.

In [None]:
# Load in the pre-trained model
# Note that include_top=False means we are including just
# the feature extractors and not the classification layers
# Note: the ImageNet weights expect inputs scaled to [-1, 1]
# (see tf.keras.applications.inception_v3.preprocess_input),
# whereas process_img scales the images to [0, 1]
feature_extractor = tf.keras.applications.inception_v3.InceptionV3(
    input_shape=(img_size, img_size, 3), include_top=False)

# Make sure we don't try to train these massive layers
feature_extractor.trainable = False

# Build the model as before, with the frozen feature extractor as the first layer
pre_trained_model = tf.keras.Sequential([
  feature_extractor,
  tf.keras.layers.Flatten(),
  tf.keras.layers.Dense(192, activation='relu'),
  tf.keras.layers.Dense(num_classes, activation='softmax'),
])
pre_trained_model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=0.001),
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=[tf.keras.metrics.CategoricalAccuracy()])



In [None]:
# Train the model
# Watch the training and validation accuracy over time
# Training accuracy notably higher than validation accuracy suggests overfitting
epochs = 5
history = pre_trained_model.fit(
  train,
  validation_data=validation,
  epochs=epochs
)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [None]:
pre_trained_model.summary()

Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 inception_v3 (Functional)   (None, 1, 1, 2048)        21802784  
                                                                 
 flatten_5 (Flatten)         (None, 2048)              0         
                                                                 
 dense_11 (Dense)            (None, 192)               393408    
                                                                 
 dense_12 (Dense)            (None, 5)                 965       
                                                                 
Total params: 22,197,157
Trainable params: 394,373
Non-trainable params: 21,802,784
_________________________________________________________________
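
The split between trainable and frozen parameters in this summary can be verified with the same dense-layer arithmetic. The frozen count is copied from the `inception_v3` row of the summary; the variable names are illustrative.

```python
# Classification head added on top of the frozen feature extractor
head_hidden = 2048 * 192 + 192   # Dense(192) on the 2048 extracted features
head_output = 192 * 5 + 5        # Dense(5) softmax output

trainable = head_hidden + head_output
frozen = 21802784                # inception_v3 row of the summary
total = trainable + frozen

trainable_fraction = trainable / total
print(head_hidden, head_output, trainable, total, f"{trainable_fraction:.1%}")
```

Only about 1.8% of the weights are updated during training, which is why the model trains quickly despite its size.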


### Questions
1. How did the network do? What were the final training and validation accuracies?
2. Is there evidence of overfitting?
3. Why might you include pre-trained layers? Are they improving performance?
4. How does the number of parameters (weights) compare to the previous networks (Hint: look at the summaries).


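**1. The network performed well. The final training accuracy was 0.8416 and the final validation accuracy was 0.6757.**

**2. Yes. Training accuracy was notably higher than validation accuracy in around four of the epochs, and the final gap was about 17 percentage points (0.8416 vs. 0.6757).**

**3. I would include pre-trained layers because they reuse features already learned on a much larger dataset, which makes training more efficient. They are definitely improving performance: the validation accuracy of 0.6757 after only 5 epochs is the highest of the four networks.**

**4. The total number of parameters in this network (22,197,157) is much higher than in the previous networks (for example, 3,699,269 in the two-layer network), but only 394,373 of them are trainable.**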
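
As a closing comparison, the final accuracies reported in the answers above can be placed side by side. This sketch only restates those reported numbers. The gap between training and validation accuracy is used as a rough indicator of overfitting, which is an interpretive choice rather than a formal test.

```python
# (final training accuracy, final validation accuracy) as reported above
final_accuracies = {
    "two-layer": (0.3944, 0.3052),
    "basic conv": (0.7997, 0.6431),
    "augmented conv": (0.6117, 0.5749),
    "pre-trained": (0.8416, 0.6757),
}

# Training accuracy minus validation accuracy for each network
gaps = {name: round(train_acc - val_acc, 4)
        for name, (train_acc, val_acc) in final_accuracies.items()}

smallest_gap = min(gaps, key=gaps.get)
best_validation = max(final_accuracies, key=lambda name: final_accuracies[name][1])

for name, gap in gaps.items():
    print(f"{name:15s} gap = {gap:.4f}")
print("smallest gap:", smallest_gap)
print("best validation accuracy:", best_validation)
```

The augmented network has the smallest gap, while the pre-trained network reaches the highest validation accuracy.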