# Inception V3

Inception V3 is a deep convolutional neural network (CNN) architecture that was developed by Google researchers as an evolution of the original Inception model. It was introduced in 2015 and has since become a widely used and influential model in the field of computer vision.

The main goal of Inception V3 is to improve the efficiency and performance of CNNs by addressing some of the limitations of previous architectures. One of the key ideas behind Inception V3 is the use of "Inception modules" that allow for multiple filter sizes to be processed in parallel at each layer of the network. This enables the model to capture features at different scales and resolutions, leading to better representation of complex visual patterns.

Here are some notable features and design principles of Inception V3:

1. Inception modules: The core building blocks of Inception V3 are the Inception modules, which run several convolutional branches with different receptive fields (1x1, 3x3, and stacked or asymmetric convolutions that stand in for larger filters such as 5x5 and 7x7) in parallel with a pooling branch. Combining these branches lets the model capture information at different spatial scales.
2. Dimensionality reduction: In order to reduce the computational cost and the number of parameters, Inception V3 incorporates 1x1 convolutions as a way to reduce the dimensions of the input feature maps before applying larger convolutions. This helps to maintain the richness of information while reducing the computational burden.
3. Auxiliary classifiers: Inception V3 keeps the auxiliary classifier idea from the original Inception network (GoogLeNet), attaching a side classifier to an intermediate layer. It was originally meant to provide additional gradients during training to combat the vanishing gradient problem; in Inception V3 it is understood mainly as a regularizer that helps improve generalization.
4. Factorization: Inception V3 factorizes large convolutions into cheaper ones. A 5x5 convolution is replaced by two stacked 3x3 convolutions, and an nxn convolution (for example 7x7) is replaced by a 1xn convolution followed by an nx1 convolution. This covers the same receptive field with fewer parameters and less computation.
5. Pre-training: Inception V3 is typically pre-trained on large-scale image classification datasets such as ImageNet. The pre-training phase involves training the network on a large collection of labeled images to learn general visual representations. This pre-trained model can then be fine-tuned on specific tasks or datasets.
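
The savings claimed for factorization and 1x1 dimensionality reduction in the list above can be checked with plain arithmetic. The sketch below counts convolution weights while ignoring biases. The channel widths (64, 192, 16, 32) are illustrative assumptions chosen for the example, not values read from the network.

```python
def conv_weights(kernel_h, kernel_w, in_channels, out_channels):
    """Number of weights in a 2D convolution, ignoring biases."""
    return kernel_h * kernel_w * in_channels * out_channels

# Illustrative channel width (an assumption, not taken from the network).
C = 64

# Factorization: one 5x5 convolution vs. two stacked 3x3 convolutions.
single_5x5 = conv_weights(5, 5, C, C)
two_3x3 = 2 * conv_weights(3, 3, C, C)

# Asymmetric factorization: one 7x7 convolution vs. a 1x7 followed by a 7x1.
single_7x7 = conv_weights(7, 7, C, C)
asymmetric_7 = conv_weights(1, 7, C, C) + conv_weights(7, 1, C, C)

# Dimensionality reduction: 5x5 applied directly to 192 channels vs.
# a 1x1 bottleneck down to 16 channels followed by the same 5x5.
direct = conv_weights(5, 5, 192, 32)
reduced = conv_weights(1, 1, 192, 16) + conv_weights(5, 5, 16, 32)

print(f"5x5 vs two 3x3:      {single_5x5} -> {two_3x3}")
print(f"7x7 vs 1x7 + 7x1:    {single_7x7} -> {asymmetric_7}")
print(f"5x5 direct vs 1x1+5x5: {direct} -> {reduced}")
```

In each pair the second number is smaller, which is the whole argument for these design choices: the receptive field stays the same while the weight count drops.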

Overall, Inception V3 is known for its strong performance on a variety of computer vision tasks, including image classification, object detection, and image segmentation. It strikes a balance between accuracy and computational efficiency, making it a popular choice for many real-world applications.

The rest of this section walks through the Inception V3 architecture stage by stage, from the input image to the class probabilities:

1. Input Layer: The input to the network is an RGB image typically of size 299x299 pixels.
2. Stem: The stem module serves as the initial building block of the network. It consists of several convolutional layers with different filter sizes, max pooling, and batch normalization operations. The purpose of the stem is to extract low-level features from the input image.
3. Inception Modules: The core component of Inception V3 is the Inception module. These modules are stacked on top of each other to form the bulk of the network. Each module feeds the same input to several parallel branches and merges their outputs:
    a. 1x1 Convolution: One branch is a single 1x1 convolution, which mixes channels without looking at neighbouring pixels.
    b. 3x3 Convolution: One branch applies a 1x1 bottleneck followed by a 3x3 convolution, giving a medium-sized receptive field.
    c. Larger Convolutions: One branch covers a larger receptive field. In Inception V3 this is usually realized as two stacked 3x3 convolutions, or as 1x7 and 7x1 pairs in the deeper modules, rather than as a literal 5x5 or 7x7 kernel.
    d. Pooling: One branch applies pooling with a stride of 1, so the spatial size is preserved, followed by a 1x1 convolution. The code below uses average pooling here.
    e. 1x1 Convolution (Dimension Reduction): The 1x1 convolutions that open the larger branches reduce the number of channels before the more expensive convolutions are applied.
    f. Concatenation: The outputs from all branches are concatenated along the channel dimension to create the module's output.
4. Reduction Modules: Reduction modules halve the spatial resolution, using stride-2 convolutions and stride-2 pooling in parallel, while increasing the number of channels. They sit between groups of Inception modules and keep the computational cost in check.
5. Auxiliary Classifiers: Inception V3 attaches an auxiliary classifier to an intermediate layer. This additional branch performs the same classification task during training. It was originally intended to combat the vanishing gradient problem and is now understood mainly as a regularizer that improves generalization.
6. Classification Head: After the last Inception modules, the feature maps are reduced by global average pooling and passed to a fully connected layer with a softmax activation, which produces the class probabilities. The implementation below adds one extra 2048-unit dense layer and dropout before the softmax layer.

Overall, the Inception V3 architecture leverages the power of parallel convolutions with different filter sizes, dimensionality reduction, auxiliary classifiers, and factorization to efficiently capture and represent features at various scales. It strikes a balance between model size, computational complexity, and accuracy, making it a popular choice for image classification and related tasks.

## Defining the Inception V3 Architecture

### Importing Required Dependencies

In [42]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import concatenate
import time
import matplotlib.pyplot as plt

### Defining Inception Blocks

#### Defining a Convolution with Batch Normalisation Block

In [43]:
def conv_with_Batch_Normalisation(prev_layer, nbr_kernels, filter_Size, strides =(1,1), padding = 'same'):
    # Conv2D -> BatchNormalization over the channel axis (channels-last) -> ReLU
    x = tf.keras.layers.Conv2D(filters=nbr_kernels, kernel_size = filter_Size, strides=strides , padding=padding)(prev_layer)
    x = tf.keras.layers.BatchNormalization(axis=3)(x)
    x = tf.keras.layers.Activation(activation='relu')(x)
    return x

#### Defining a Stem Block

In [44]:
def StemBlock(prev_layer):
    x = conv_with_Batch_Normalisation(prev_layer, nbr_kernels = 32, filter_Size=(3,3) , strides=(2,2))
    x = conv_with_Batch_Normalisation(x, nbr_kernels = 32, filter_Size=(3,3))
    x = conv_with_Batch_Normalisation(x, nbr_kernels = 64, filter_Size=(3,3))
    x = tf.keras.layers.MaxPool2D(pool_size=(3,3) , strides=(2,2)) (x)
    x = conv_with_Batch_Normalisation(x, nbr_kernels = 80, filter_Size=(1,1))
    x = conv_with_Batch_Normalisation(x, nbr_kernels = 192, filter_Size=(3,3))
    x = tf.keras.layers.MaxPool2D(pool_size=(3,3) , strides=(2,2)) (x)
    return x  
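
To see what the stem does to the image size, the spatial dimension can be traced by hand. The sketch below assumes the usual Keras rules: `'same'` padding gives `ceil(size / stride)`, `'valid'` padding gives `(size - kernel) // stride + 1`, and `MaxPool2D` defaults to `'valid'`. The helper name `conv_output_size` is hypothetical and exists only for this illustration.

```python
import math

def conv_output_size(size, kernel, stride=1, padding="same"):
    """Output size along one spatial axis for a convolution or pooling layer."""
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding}")

# (kernel, stride, padding) for each layer of StemBlock, in order.
stem_layers = [
    (3, 2, "same"),   # conv, 32 filters, stride 2
    (3, 1, "same"),   # conv, 32 filters
    (3, 1, "same"),   # conv, 64 filters
    (3, 2, "valid"),  # max pooling
    (1, 1, "same"),   # conv, 80 filters
    (3, 1, "same"),   # conv, 192 filters
    (3, 2, "valid"),  # max pooling
]

size = 299
sizes = [size]
for kernel, stride, padding in stem_layers:
    size = conv_output_size(size, kernel, stride, padding)
    sizes.append(size)

print(sizes)
```

Under these assumptions a 299x299 input leaves the stem as a 36x36 grid with 192 channels.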

#### Defining Inception Block A

In [45]:
def InceptionBlock_A(prev_layer, nbr_kernels):
    branch1 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels = 64, filter_Size = (1,1))
    branch1 = conv_with_Batch_Normalisation(branch1, nbr_kernels=96, filter_Size=(3,3))
    branch1 = conv_with_Batch_Normalisation(branch1, nbr_kernels=96, filter_Size=(3,3))
    branch2 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels=48, filter_Size=(1,1))
    branch2 = conv_with_Batch_Normalisation(branch2, nbr_kernels=64, filter_Size=(3,3))
    branch3 = tf.keras.layers.AveragePooling2D(pool_size=(3,3) , strides=(1,1) , padding='same') (prev_layer)
    branch3 = conv_with_Batch_Normalisation(branch3, nbr_kernels = nbr_kernels, filter_Size = (1,1))
    branch4 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels=64, filter_Size=(1,1))
    output = concatenate([branch1 , branch2 , branch3 , branch4], axis=3)
    
    return output

#### Defining Inception Block B

In [46]:
def InceptionBlock_B(prev_layer, nbr_kernels):
    branch1 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels = nbr_kernels, filter_Size = (1,1))
    branch1 = conv_with_Batch_Normalisation(branch1, nbr_kernels = nbr_kernels, filter_Size = (7,1))
    branch1 = conv_with_Batch_Normalisation(branch1, nbr_kernels = nbr_kernels, filter_Size = (1,7))
    branch1 = conv_with_Batch_Normalisation(branch1, nbr_kernels = nbr_kernels, filter_Size = (7,1))    
    branch1 = conv_with_Batch_Normalisation(branch1, nbr_kernels = 192, filter_Size = (1,7))
    branch2 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels = nbr_kernels, filter_Size = (1,1))
    branch2 = conv_with_Batch_Normalisation(branch2, nbr_kernels = nbr_kernels, filter_Size = (1,7))
    branch2 = conv_with_Batch_Normalisation(branch2, nbr_kernels = 192, filter_Size = (7,1))
    branch3 = tf.keras.layers.AveragePooling2D(pool_size=(3,3) , strides=(1,1) , padding ='same') (prev_layer)
    branch3 = conv_with_Batch_Normalisation(branch3, nbr_kernels = 192, filter_Size = (1,1))
    branch4 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels = 192, filter_Size = (1,1))
    output = concatenate([branch1 , branch2 , branch3 , branch4], axis = 3)
    return output  

#### Defining Inception Block C

In [47]:
def InceptionBlock_C(prev_layer):
    branch1 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels = 448, filter_Size = (1,1))
    branch1 = conv_with_Batch_Normalisation(branch1, nbr_kernels = 384, filter_Size = (3,3))
    branch1_1 = conv_with_Batch_Normalisation(branch1, nbr_kernels = 384, filter_Size = (1,3))    
    branch1_2 = conv_with_Batch_Normalisation(branch1, nbr_kernels = 384, filter_Size = (3,1))
    branch1 = concatenate([branch1_1 , branch1_2], axis = 3)
    branch2 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels = 384, filter_Size = (1,1))
    branch2_1 = conv_with_Batch_Normalisation(branch2, nbr_kernels = 384, filter_Size = (1,3))
    branch2_2 = conv_with_Batch_Normalisation(branch2, nbr_kernels = 384, filter_Size = (3,1))
    branch2 = concatenate([branch2_1 , branch2_2], axis = 3)
    branch3 = tf.keras.layers.AveragePooling2D(pool_size=(3,3) , strides=(1,1) , padding='same')(prev_layer)
    branch3 = conv_with_Batch_Normalisation(branch3, nbr_kernels = 192, filter_Size = (1,1))
    branch4 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels = 320, filter_Size = (1,1))
    output = concatenate([branch1 , branch2 , branch3 , branch4], axis = 3)
    return output

#### Defining Reduction Block A

In [48]:
def ReductionBlock_A(prev_layer):
    branch1 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels = 64, filter_Size = (1,1))
    branch1 = conv_with_Batch_Normalisation(branch1, nbr_kernels = 96, filter_Size = (3,3))
    branch1 = conv_with_Batch_Normalisation(branch1, nbr_kernels = 96, filter_Size = (3,3) , strides=(2,2) ) #, padding='valid'
    branch2 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels = 384, filter_Size=(3,3) , strides=(2,2) )
    branch3 = tf.keras.layers.MaxPool2D(pool_size=(3,3) , strides=(2,2) , padding='same')(prev_layer)
    output = concatenate([branch1 , branch2 , branch3], axis = 3)
    return output

#### Defining Reduction Block B

In [49]:
def ReductionBlock_B(prev_layer):
    branch1 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels = 192, filter_Size = (1,1))
    branch1 = conv_with_Batch_Normalisation(branch1, nbr_kernels = 192, filter_Size = (1,7))
    branch1 = conv_with_Batch_Normalisation(branch1, nbr_kernels = 192, filter_Size = (7,1))
    branch1 = conv_with_Batch_Normalisation(branch1, nbr_kernels = 192, filter_Size = (3,3) , strides=(2,2) , padding = 'valid')
    branch2 = conv_with_Batch_Normalisation(prev_layer, nbr_kernels = 192, filter_Size = (1,1) )
    branch2 = conv_with_Batch_Normalisation(branch2, nbr_kernels = 320, filter_Size = (3,3) , strides=(2,2) , padding='valid' )
    branch3 = tf.keras.layers.MaxPool2D(pool_size=(3,3) , strides=(2,2) )(prev_layer)
    output = concatenate([branch1 , branch2 , branch3], axis = 3)
    return output

#### Defining Auxiliary Classifier

In [56]:
def auxiliary_classifier(prev_Layer):
    x = tf.keras.layers.AveragePooling2D(pool_size=(5,5) , strides=(3,3)) (prev_Layer)
    x = conv_with_Batch_Normalisation(x, nbr_kernels = 128, filter_Size = (1,1))
    x = tf.keras.layers.Flatten()(x)
    x = tf.keras.layers.Dense(units = 768, activation='relu') (x)
    x = tf.keras.layers.Dropout(rate = 0.2) (x)
    x = tf.keras.layers.Dense(units = 34, activation='softmax') (x)
    return x
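
Because the auxiliary head produces its own prediction, training minimizes a combination of two losses. The sketch below shows that combination for a single example in plain Python. The weight of 0.3 is an assumption borrowed from the original GoogLeNet setup, and the probability vectors are made-up values for illustration.

```python
import math

def cross_entropy(probs, true_index):
    """Categorical cross-entropy for one example with a one-hot label."""
    return -math.log(probs[true_index])

def combined_loss(main_probs, aux_probs, true_index, aux_weight=0.3):
    """Main-head loss plus a down-weighted auxiliary-head loss."""
    main = cross_entropy(main_probs, true_index)
    aux = cross_entropy(aux_probs, true_index)
    return main + aux_weight * aux

# Made-up softmax outputs of the two heads for a 3-class problem.
main_probs = [0.7, 0.2, 0.1]
aux_probs = [0.5, 0.3, 0.2]

loss = combined_loss(main_probs, aux_probs, true_index=0)
print(f"combined loss: {loss:.4f}")
```

In Keras the same weighting can be requested through the `loss_weights` argument of `model.compile`. Without it, both heads contribute with weight 1.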

#### Defining the Inception V3 Architecture

In [57]:
def InceptionV3():
    input_layer = tf.keras.layers.Input(shape=(299,299,3))
    x = StemBlock(input_layer)
    x = InceptionBlock_A(prev_layer = x, nbr_kernels = 32)
    x = InceptionBlock_A(prev_layer = x, nbr_kernels = 64)
    x = InceptionBlock_A(prev_layer = x, nbr_kernels = 64)
    x = ReductionBlock_A(prev_layer = x)
    x = InceptionBlock_B(prev_layer = x, nbr_kernels = 128)
    x = InceptionBlock_B(prev_layer = x, nbr_kernels = 160)
    x = InceptionBlock_B(prev_layer = x, nbr_kernels = 160)
    x = InceptionBlock_B(prev_layer = x, nbr_kernels = 192)
    Aux = auxiliary_classifier(prev_Layer = x)
    x = ReductionBlock_B(prev_layer = x)
    x = InceptionBlock_C(prev_layer = x)
    x = InceptionBlock_C(prev_layer = x)
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    x = tf.keras.layers.Dense(units=2048, activation='relu') (x)
    x = tf.keras.layers.Dropout(rate = 0.2) (x)
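    # NOTE: this head has 1000 classes but auxiliary_classifier() ends in 34 units;
    # both heads must match the number of classes in the training labels.
    x = tf.keras.layers.Dense(units=1000, activation='softmax') (x)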
    model = tf.keras.models.Model(inputs = input_layer , outputs = [x , Aux] , name = 'Inception-V3')
    return model
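
Since every block ends in a channel-wise concatenation, the number of channels after each stage is simply the sum of its branch widths. The sketch below follows the channel count through `InceptionV3()` using the filter counts from the block definitions above. The helper names are hypothetical and serve only this illustration.

```python
def block_a_channels(pool_proj):
    # branch1 (96) + branch2 (64) + pooled branch (pool_proj) + 1x1 branch (64)
    return 96 + 64 + pool_proj + 64

def reduction_a_channels(in_channels):
    # branch1 (96) + branch2 (384) + max pooling, which keeps the input channels
    return 96 + 384 + in_channels

def block_b_channels():
    # all four branches end in 192 filters
    return 4 * 192

def reduction_b_channels(in_channels):
    # branch1 (192) + branch2 (320) + max pooling, which keeps the input channels
    return 192 + 320 + in_channels

def block_c_channels():
    # branch1 and branch2 each concatenate two 384-filter outputs
    return 2 * 384 + 2 * 384 + 192 + 320

channels = 192                      # output of the stem
trace = [("stem", channels)]
for pool_proj in (32, 64, 64):      # three Inception A blocks
    channels = block_a_channels(pool_proj)
    trace.append(("A", channels))
channels = reduction_a_channels(channels)
trace.append(("reduction A", channels))
for _ in range(4):                  # four Inception B blocks
    channels = block_b_channels()
    trace.append(("B", channels))
channels = reduction_b_channels(channels)
trace.append(("reduction B", channels))
for _ in range(2):                  # two Inception C blocks
    channels = block_c_channels()
    trace.append(("C", channels))

for name, value in trace:
    print(f"{name:12s} {value}")
```

The final count is the length of the feature vector that global average pooling hands to the dense layers.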

### Defining the Inception Model

#### Defining the Instance of the Inception Model

In [58]:
model = InceptionV3()
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

Model: "Inception-V3"
__________________________________________________________________________________________________
 Layer (type)                   Output Shape         Param #     Connected to                     
 input_12 (InputLayer)          [(None, 299, 299, 3  0           []                               
                                )]                                                                
                                                                                                  
 conv2d_729 (Conv2D)            (None, 150, 150, 32  896         ['input_12[0][0]']               
                                )                                                                 
                                                                                                  
 batch_normalization_729 (Batch  (None, 150, 150, 32  128        ['conv2d_729[0][0]']             
 Normalization)                 )                                                      
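
The parameter counts in the first rows of the summary can be reproduced by hand. This sketch assumes the Keras defaults used above: `Conv2D` has a bias term, and `BatchNormalization` stores four values per channel (scale, offset, moving mean, moving variance). The helper names are hypothetical.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels, use_bias=True):
    """Weights of a Conv2D layer, plus one bias per output channel if enabled."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if use_bias else 0)

def batchnorm_params(channels):
    """gamma, beta, moving mean and moving variance for each channel."""
    return 4 * channels

# First stem layer: 3x3 convolution from 3 RGB channels to 32 filters.
first_conv = conv2d_params(3, 3, 3, 32)
first_bn = batchnorm_params(32)

print(first_conv, first_bn)
```

Both values agree with the `Param #` column shown for the first convolution and batch normalization layers.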

#### Generating Graph of the Inception Model

In [55]:
tf.keras.utils.plot_model(model, to_file="inceptionv3.pdf", show_shapes=True)

## Data Processing and Model Training

In [None]:
data_dir = "/kaggle/input/wafer-dataset-new/Dataset"

### Training and Validation Data Generators

In [None]:
# image_size must match the model's input shape of (299, 299, 3)
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split = 0.2,
    subset="training",
    label_mode='categorical',
    seed=123,
    image_size=(299, 299),
    batch_size=32
)

val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    validation_split=0.2,
    subset="validation",
    label_mode='categorical',
    seed=123,
    image_size=(299, 299),
    batch_size=32
)

In [None]:
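# Rescale pixel values from [0, 255] to [0, 1]
normalization_layer = tf.keras.layers.Rescaling(1./255)
# The model has two outputs (main and auxiliary head), so pass the label to both
norm_train_ds = train_ds.map(lambda x, y: (normalization_layer(x), (y, y)))
norm_val_ds = val_ds.map(lambda x, y: (normalization_layer(x), (y, y)))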

In [None]:
AUTOTUNE = tf.data.AUTOTUNE
norm_train_ds = norm_train_ds.cache().prefetch(buffer_size=AUTOTUNE)
norm_val_ds = norm_val_ds.cache().prefetch(buffer_size=AUTOTUNE)

In [None]:
callbacks = [
    tf.keras.callbacks.EarlyStopping(
        monitor="val_loss",
        min_delta=1e-2,
        patience=10,
        verbose=1
    )
]
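
The effect of `min_delta` and `patience` is easiest to see on a made-up loss curve. The following is a simplified sketch of the stopping rule, not the Keras implementation: an epoch counts as an improvement only if the validation loss drops below the best value by more than `min_delta`, and training stops after `patience` epochs in a row without one. The function name and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, min_delta=1e-2, patience=10):
    """Return the 0-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Large enough improvement: remember it and reset the counter.
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Loss improves for three epochs, then plateaus just above the threshold.
losses = [1.0, 0.8, 0.7] + [0.695] * 12
stopped = early_stopping_epoch(losses)
print(stopped)
```

With `min_delta=1e-2`, a drop from 0.7 to 0.695 does not count as progress, so the plateau uses up the patience budget even though the loss is technically still decreasing.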

In [None]:
start = time.time()
with tf.device('/gpu:0'):
    # fit() returns a History object, not a model
    model_inception = model.fit(
        norm_train_ds,
        validation_data=norm_val_ds,
        epochs=40,
        callbacks=callbacks,  # apply the early-stopping callback defined above
    )
stop = time.time()
print(f'Training on GPU took: {(stop-start)/60:.2f} minutes')

In [None]:
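# model_inception is the History object returned by fit();
# its .history attribute is a dict of per-epoch metric lists
history = model_inception.history
print(history.keys())

# The model has two outputs, so Keras may prefix metric names with the output
# layer name; pick the first key ending in 'accuracy' instead of hard-coding it.
acc_key = next(k for k in history if k.endswith('accuracy') and not k.startswith('val_'))

plt.figure(figsize=(12, 10))

# summarize history for accuracy
plt.subplot(2, 1, 1)
plt.plot(history[acc_key])
plt.plot(history['val_' + acc_key])
plt.title('model accuracy')
plt.ylabel('accuracy')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')

# summarize history for loss (total loss over both heads)
plt.subplot(2, 1, 2)
plt.plot(history['loss'])
plt.plot(history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'val'], loc='upper left')
plt.show()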

In [None]:
# evaluate() is a method of the model, not of the History object returned by fit()
model.evaluate(norm_val_ds)