![header.jpg](attachment:dd64fa12-6140-4b8f-91e9-649b551e8926.jpg)

# <p style="font-size: 36px;background-color:#6A0DAD; color:white; padding:10px 20px; width:95%; border-radius:15px; text-align:center;margin: 0 auto; box-shadow:0 4px 8px rgba(0, 0, 0, 0.1);font-weight: bold;">Introduction</p>

**EfficientNet** is a family of convolutional neural networks (CNNs) designed for image classification and other visual recognition tasks. It was introduced by researchers at Google AI in 2019 as a solution to improve the accuracy and efficiency of deep learning models. The key motivation behind EfficientNet is to achieve a better trade-off between accuracy and computational efficiency, making it suitable for a wide range of devices and applications, from mobile devices to large-scale cloud computing systems.

# <p style="font-size: 36px;background-color:#6A0DAD; color:white; padding:10px 20px; width:95%; border-radius:15px; text-align:center;margin: 0 auto; box-shadow:0 4px 8px rgba(0, 0, 0, 0.1);font-weight: bold;">Key Concepts Behind EfficientNet</p>

![Scaling Figure](https://production-media.paperswithcode.com/methods/Screen_Shot_2020-06-06_at_10.45.54_PM.png)

1. **Compound Scaling**:
   - The main idea behind EfficientNet is to scale up the model efficiently using a technique called **compound scaling**. Instead of simply increasing the depth (number of layers) of a model, EfficientNet scales three dimensions:
     - **Depth**: The number of layers in the network.
     - **Width**: The number of channels in each layer.
     - **Resolution**: The size of the input image.
   - Traditional scaling methods tend to focus on just one of these dimensions (e.g., only increasing depth). EfficientNet instead scales all three together using a single compound coefficient, which gives better accuracy for a given computational budget than scaling any one dimension alone.

2. **Baseline Model (EfficientNet-B0)**:
   - EfficientNet starts with a baseline model, EfficientNet-B0, which is designed to be both accurate and efficient. It is built from **Mobile Inverted Bottleneck Convolution (MBConv)** blocks and uses the **Swish activation function**.
   - MBConv builds on depthwise separable convolutions, which are computationally cheaper than standard convolutions.

3. **Swish Activation Function**:
   - EfficientNet uses the **Swish activation function**, a self-gated activation defined as `x * sigmoid(x)`, as an alternative to the commonly used ReLU. Swish has been shown to outperform ReLU in many deep learning tasks.

4. **Search for Optimal Architecture**:
   - To design EfficientNet-B0, the authors performed a neural architecture search (NAS) to find the architecture of the base model. NAS is an automated search over candidate architectures; here the search objective balanced accuracy against computational cost (FLOPs).

5. **Scalability and Variants**:
   - Once EfficientNet-B0 was established, the researchers applied compound scaling to create a family of models (B0 through B7). Each model in the family is scaled in depth, width, and resolution, with larger models like EfficientNet-B7 being more accurate but also more computationally expensive.
   - At the time of publication, this scaling allowed EfficientNet to reach state-of-the-art ImageNet accuracy while using far fewer parameters and FLOPs than earlier architectures such as ResNet or DenseNet.
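
The compound scaling rule from item 1 can be made concrete with a short sketch. In the EfficientNet paper, a single coefficient φ controls all three dimensions: depth grows as α^φ, width as β^φ, and resolution as γ^φ, with α = 1.2, β = 1.1, γ = 1.15 chosen on B0 under the constraint α·β²·γ² ≈ 2. The function names below are illustrative and not part of any library, and the released B1 to B7 models use their own rounded multipliers, so the printed values are for intuition only.

```python
# Compound scaling sketch: one coefficient phi scales depth, width and resolution together.
# ALPHA, BETA, GAMMA are the values reported in the EfficientNet paper for the B0 baseline.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # bases for depth, width, resolution


def compound_scale(phi, base_resolution=224):
    """Return (depth_multiplier, width_multiplier, resolution) for coefficient phi."""
    depth_mult = ALPHA ** phi
    width_mult = BETA ** phi
    resolution = int(round(base_resolution * GAMMA ** phi))
    return depth_mult, width_mult, resolution


def flops_growth(phi):
    """Approximate FLOPs multiplier: linear in depth, quadratic in width and resolution."""
    return (ALPHA * BETA ** 2 * GAMMA ** 2) ** phi


for phi in range(4):
    d, w, r = compound_scale(phi)
    print(f"phi={phi}: depth x{d:.2f}, width x{w:.2f}, "
          f"resolution {r}px, FLOPs x{flops_growth(phi):.2f}")
```

With these constants, each unit increase in φ multiplies the FLOPs by about 1.92, which is the "roughly doubles" behaviour the constraint is meant to enforce.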

# <p style="font-size: 36px;background-color:#6A0DAD; color:white; padding:10px 20px; width:95%; border-radius:15px; text-align:center;margin: 0 auto; box-shadow:0 4px 8px rgba(0, 0, 0, 0.1);font-weight: bold;">Advantages of EfficientNet</p>

![comparison.jpg](attachment:680b66de-c889-4cec-81e8-0e4cde971748.jpg)

1. **Better Accuracy and Efficiency**:
   - EfficientNet models achieve higher accuracy than other models of similar size. They outperform many traditional architectures, such as ResNet, Inception, and DenseNet, on the ImageNet benchmark while using fewer parameters and less computation.

2. **Efficient Resource Usage**:
   - By balancing the three scaling dimensions (depth, width, and resolution), EfficientNet is able to maintain high accuracy while reducing computational costs. This makes it well-suited for deployment on mobile and edge devices with limited resources.

3. **Wide Applicability**:
   - EfficientNet can be used not only for image classification but also for other tasks such as object detection, segmentation, and transfer learning. Its efficiency makes it a good choice for applications where computational resources are limited, such as mobile apps, autonomous systems, and cloud-based AI services.

# <p style="font-size: 36px;background-color:#6A0DAD; color:white; padding:10px 20px; width:95%; border-radius:15px; text-align:center;margin: 0 auto; box-shadow:0 4px 8px rgba(0, 0, 0, 0.1);font-weight: bold;">EfficientNet Variants</p>

- **EfficientNet-B0** to **EfficientNet-B7**: These are the base models that range in size and computational complexity. B0 is the smallest, while B7 is the largest and most accurate.
- **EfficientNet-Lite**: This is a lightweight variant designed specifically for resource-constrained environments like mobile devices, where computational resources are limited.

# <p style="font-size: 36px;background-color:#6A0DAD; color:white; padding:10px 20px; width:95%; border-radius:15px; text-align:center;margin: 0 auto; box-shadow:0 4px 8px rgba(0, 0, 0, 0.1);font-weight: bold;">EfficientNet Architecture Overview</p>

![image](https://viso.ai/wp-content/smush-webp/2024/03/EfficientNet-Architecture-diagram-1060x524.png.webp)

EfficientNet starts with a baseline model called **EfficientNet-B0**, which was designed using **neural architecture search (NAS)**. NAS is an automated search over candidate architectures; here the search objective balanced accuracy against computational cost (FLOPs). EfficientNet-B0 serves as the foundation for all other models in the EfficientNet family (B1 through B7).

Some key architectural features of EfficientNet-B0 include:

![image](https://www.researchgate.net/publication/354893361/figure/fig3/AS:1073027328520192@1632841059375/llustration-of-the-MBConv-blocks-in-the-EfficientNet-CNN-model-The-figure-shows-two.png)

- **MBConv Blocks** (Mobile Inverted Bottleneck Convolutions):
  - EfficientNet is built from **MBConv** blocks, the inverted bottleneck block introduced with MobileNetV2, which builds on depthwise separable convolutions.

  - **Depthwise separable convolutions** are more computationally efficient than traditional convolutions, as they separate the convolution operation into two parts: a depthwise convolution (filtering individual channels) and a pointwise convolution (combining them).

  - **Inverted Bottleneck** design: The input to each block has fewer channels, then expands into a larger number of channels, and then contracts back to a smaller number of output channels. This design improves the network’s capacity without significantly increasing computational complexity.

- **Squeeze-and-Excitation**:
  - In the EfficientNet-B0 architecture, each block has a **Squeeze-and-Excitation (SE)** module. The SE module is used to recalibrate channel-wise feature responses by learning dynamic channel-wise importance. This helps the model learn which features are most important.

- **Swish Activation Function**:
  - EfficientNet uses the **Swish** activation function instead of the more commonly used ReLU. Swish has been shown to outperform ReLU in many deep learning applications, as it allows for smoother gradients and less information loss in deeper networks.
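
The building blocks listed above can be illustrated with a small NumPy sketch. It counts the weights in a standard convolution versus a depthwise separable one, defines Swish with β = 1, and applies a squeeze-and-excitation gate to a feature map. All function names are illustrative, biases are omitted, and the random matrices are placeholders for weights that would normally be learned.

```python
import numpy as np


def standard_conv_weights(k, c_in, c_out):
    """Number of weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_weights(k, c_in, c_out):
    """One k x k filter per input channel, followed by a 1x1 pointwise convolution."""
    return k * k * c_in + c_in * c_out


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def swish(x):
    """Swish with beta = 1 (also known as SiLU): x * sigmoid(x)."""
    return x * sigmoid(x)


def squeeze_excite(feature_map, w_reduce, w_expand):
    """Rescale each channel of an (H, W, C) feature map by a gate in (0, 1)."""
    squeezed = feature_map.mean(axis=(0, 1))  # squeeze: global average per channel, shape (C,)
    hidden = swish(squeezed @ w_reduce)       # bottleneck, shape (C // r,)
    gates = sigmoid(hidden @ w_expand)        # excitation: one gate per channel, shape (C,)
    return feature_map * gates                # gates broadcast over H and W


# A 3x3 convolution mapping 32 channels to 64 channels
print(standard_conv_weights(3, 32, 64))        # 18432
print(depthwise_separable_weights(3, 32, 64))  # 2336

rng = np.random.default_rng(0)
fmap = rng.normal(size=(7, 7, 16))
out = squeeze_excite(fmap, rng.normal(size=(16, 4)), rng.normal(size=(4, 16)))
print(out.shape)                               # (7, 7, 16)
```

In this example the depthwise separable layer needs roughly one eighth of the weights of the standard convolution, which is where most of the savings in MBConv come from.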

![image](https://miro.medium.com/v2/resize:fit:363/1*gBdEHDWWQhm9sjH3m7mBMg.png)

- **Efficient Layer Arrangement**:
  - The EfficientNet architecture uses a sequence of **convolutional layers** followed by **batch normalization** and **activation functions** (specifically Swish) in a structured manner.
  - The model also uses **global average pooling** followed by a fully connected layer with softmax for classification tasks.
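
As a rough illustration of the classification head described above, the sketch below pools a feature map and turns it into class probabilities with NumPy. The (7, 7, 1280) shape corresponds to the EfficientNet-B0 backbone output for a 224x224 input; the random features and dense weights are placeholders, and the 10 classes are an assumption chosen to match the dataset used later in this notebook.

```python
import numpy as np


def global_average_pool(feature_map):
    """Average an (H, W, C) feature map over its spatial axes, giving shape (C,)."""
    return feature_map.mean(axis=(0, 1))


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
features = rng.normal(size=(7, 7, 1280))      # stand-in for the backbone output
weights = rng.normal(size=(1280, 10)) * 0.01  # placeholder dense weights for 10 classes
bias = np.zeros(10)

pooled = global_average_pool(features)        # shape (1280,)
probs = softmax(pooled @ weights + bias)      # shape (10,), sums to 1
print(pooled.shape, probs.shape)
```

Because pooling already yields a flat vector per image, no separate flattening step is needed before the dense layer.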

# <p style="font-size: 36px;background-color:#6A0DAD; color:white; padding:10px 20px; width:95%; border-radius:15px; text-align:center;margin: 0 auto; box-shadow:0 4px 8px rgba(0, 0, 0, 0.1);font-weight: bold;">Application of EfficientNet-B0</p>

In this section, I will demonstrate how EfficientNet-B0 can be applied to a multiclass classification task.

## <p style="font-size: 24px;background-color:#6A0DAD; color:white; padding:10px 20px; width:95%; border-radius:15px; text-align:center;margin: 0 auto; box-shadow:0 4px 8px rgba(0, 0, 0, 0.1);font-weight: bold;">Dataset Overview</p><br>

The dataset contains images collected from Google searches, processed through custom tools for organization and quality control. It includes 10 classes, with a total of 2,339 training images, 50 test images, and 50 validation images, all resized to 224x224x3 JPG format. The images are carefully cropped and processed to ensure high quality for classification tasks.

- **Total classes**: 10  
- **Training images**: 2,339  
- **Test images**: 50  
- **Validation images**: 50  
- **Image size**: 224x224x3 JPG  
- **Processing**: Cropped, cleaned, and resized for classification
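
Given these split sizes and the batch size of 32 used by the loading code in this notebook, a quick calculation shows how many batches each split produces and how large the final batch is. The helper name `batch_layout` is illustrative only.

```python
import math

BATCH_SIZE = 32  # same value as the batch_size passed to image_dataset_from_directory
split_sizes = {"train": 2339, "valid": 50, "test": 50}


def batch_layout(num_images, batch_size=BATCH_SIZE):
    """Return (number of batches, size of the final batch)."""
    num_batches = math.ceil(num_images / batch_size)
    last_batch = num_images - (num_batches - 1) * batch_size
    return num_batches, last_batch


for name, n in split_sizes.items():
    batches, last = batch_layout(n)
    print(f"{name}: {batches} batches, last batch has {last} images")
```

The first batch of every split holds a full 32 images, which is what the 8x4 plotting grids later in the notebook rely on.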

In [None]:
import pandas as pd
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers,models
from tensorflow.keras.applications import EfficientNetB0
import matplotlib.pyplot as plt

import os

import warnings
warnings.filterwarnings('ignore')
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

tf.config.list_physical_devices('GPU')

In [None]:
train_set = keras.preprocessing.image_dataset_from_directory(
    '/kaggle/input/cats-in-the-wild-image-classification/train',
    seed=123,
    interpolation='nearest',
    batch_size=32,
    image_size=(224,224)
)

val_set = keras.preprocessing.image_dataset_from_directory(
    '/kaggle/input/cats-in-the-wild-image-classification/valid',
    seed=123,
    interpolation='nearest',
    batch_size=32,
    image_size=(224,224)
)

test_set = keras.preprocessing.image_dataset_from_directory(
    '/kaggle/input/cats-in-the-wild-image-classification/test',
    seed=123,
    interpolation='nearest',
    batch_size=32,
    image_size=(224,224)
)


In [None]:
class_names = train_set.class_names
print(class_names)

In [None]:
# Show the first training batch (32 images) in an 8x4 grid
_, ax = plt.subplots(8, 4, figsize=(8, 16))
ax = ax.flatten()
imgs, clss = next(iter(train_set.take(1)))
for i in range(len(imgs)):
    ax[i].imshow(imgs[i].numpy().astype('uint8'))  # pixel values are in [0, 255]
    actual_class = int(clss[i])  # labels are integer class indices (default label_mode='int')
    ax[i].set_title(class_names[actual_class])
    ax[i].set_xticks([])
    ax[i].set_yticks([])
plt.tight_layout()
plt.show()

In [None]:
input_shape = (224, 224, 3)
num_classes = len(class_names)

# Keras EfficientNet models include input preprocessing, so raw [0, 255] pixels are passed in directly
image_input = layers.Input(shape=input_shape)
effnet = EfficientNetB0(weights='imagenet', include_top=False, input_shape=input_shape)
x = effnet(image_input)
x = layers.GlobalAveragePooling2D()(x)  # (7, 7, 1280) -> (1280,); already flat, so no Flatten is needed
x = layers.Dense(1024, activation='relu')(x)
x = layers.Dense(num_classes, activation='softmax')(x)
model = models.Model(inputs=image_input, outputs=x)
model.summary()

In [None]:
model.compile(
    optimizer=tf.optimizers.Adam(epsilon=0.0001),
    loss='sparse_categorical_crossentropy',
    metrics=[tf.metrics.SparseCategoricalAccuracy(name='accuracy')]
)

chkpnt_loss = tf.keras.callbacks.ModelCheckpoint(
    'best_model_loss.keras',    # Path to save the model
    monitor='val_loss',         # Metric to monitor
    verbose=1,                  # Print a message each time the model is saved
    save_best_only=True,        # Keep only the best model seen so far
    mode='min',                 # "Best" means the lowest val_loss
    save_weights_only=False,    # Save the entire model (not just weights)
)

early_stopping = tf.keras.callbacks.EarlyStopping(monitor='val_loss', patience=15, restore_best_weights=True)

history = model.fit(train_set,
                    validation_data=val_set,
                    epochs=64,
                    callbacks=[chkpnt_loss,early_stopping])

In [None]:
plt.figure(figsize=(8,3))

plt.subplot(121)
hist_df = pd.DataFrame(history.history)
hist_df.loc[:,['loss','val_loss']].plot(ax=plt.gca())

plt.subplot(122)
hist_df.loc[:,['accuracy','val_accuracy']].plot(ax=plt.gca())

plt.tight_layout()
plt.show()

In [None]:
loss,accuracy = model.evaluate(test_set)

print('Loss:',loss)
print('Accuracy:',accuracy)

In [None]:
imgs, clss = next(iter(test_set.take(1)))
pred_prob = model.predict(imgs, verbose=0)  # shape: (batch, num_classes)
pred_cls = np.argmax(pred_prob, axis=1)

# Create subplots
fig, ax = plt.subplots(8, 4, figsize=(8, 16))
ax = ax.flatten()

# Loop through each image and display it with predicted and actual class
for i, img in enumerate(imgs):
    ax[i].imshow(img.numpy().astype('uint8'))  # pixel values are in [0, 255]
    ax[i].set_xticks([])
    ax[i].set_yticks([])

    # Labels are integer class indices (the default label_mode='int')
    actual_class = int(clss[i])

    ax[i].set_title(f'Actual: {class_names[actual_class]}\n'
                    f'Pred: {class_names[pred_cls[i]]} ({round(np.max(pred_prob[i])*100, 2)}%)', fontsize=9)

plt.tight_layout()
plt.show()

# <p style="font-size: 36px;background-color:#6A0DAD; color:white; padding:10px 20px; width:95%; border-radius:15px; text-align:center;margin: 0 auto; box-shadow:0 4px 8px rgba(0, 0, 0, 0.1);font-weight: bold;">Thank You!</p>
<br>
Thank you for taking the time to explore this notebook! I hope you found the content helpful and informative. If you have any suggestions for improvements or feedback, please feel free to leave a comment. I would love to hear your thoughts and learn from your insights!

If you found this notebook valuable, I would really appreciate it if you could upvote it. Your support motivates me to continue sharing more useful content with the Kaggle community.

Happy learning and coding!