### Mar 2024: END SEMESTER ASSESSMENT (ESA)
## M TECH DATA SCIENCE AND MACHINE LEARNING_ SEMESTER II

### UE20CS935: Introduction to Deep Learning and Applications


#### Section B: Question No:2   (10 marks)
Build a Convolution Neural Network to classify 5 classes of Indian food items.
Dataset_Folder Name: Food classification.

Conditions to consider:

--Parameters should not cross 300000

--Should not use more than 4 layers (except input and output, including convolution and dense layers)

--Use Adam Optimizer


In [1]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D, Flatten, Dense, GlobalAveragePooling2D, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
import os

os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="0"

In [2]:
train_dir="Food classification/Train"
test_dir="Food classification/Test"

In [None]:
# Use ImageDataGenerator to scale the images and read the data from all subfolders
#Hint: train_datagen=ImageDataGenerator(rescale=1/255.)
#      train_data=train_datagen.flow_from_directory(train_dir,
#                                             target_size=(128,128),
#                                             batch_size=32,
#                                             class_mode='categorical')
# do the same for Test data

In [3]:
# Image size and batch size
img_size = (128, 128)
batch_size = 32

In [4]:
# Image generators
train_datagen = ImageDataGenerator(rescale=1./255)
test_datagen = ImageDataGenerator(rescale=1./255)

train_data = train_datagen.flow_from_directory(
    train_dir, target_size=img_size, batch_size=batch_size, class_mode='categorical'
)

test_data = test_datagen.flow_from_directory(
    test_dir, target_size=img_size, batch_size=batch_size, class_mode='categorical'
)

Found 90 images belonging to 5 classes.
Found 50 images belonging to 5 classes.


In [5]:
#create the sequestial model with 2-3 layers ov Conv2D and Pooling
#Compile the model
#Infer the model summary

In [6]:
# Model creation (reduced filters to stay under 20K params)
model = Sequential([
    Conv2D(8, (3, 3), activation='relu', input_shape=(128, 128, 3)),
    MaxPool2D(pool_size=(2, 2)),

    Conv2D(16, (3, 3), activation='relu'),
    MaxPool2D(pool_size=(2, 2)),

    Conv2D(32, (3, 3), activation='relu'),
    MaxPool2D(pool_size=(2, 2)),

    Flatten(),
    Dense(5, activation='softmax')  # Use 5 if you have 5 classes
])
# Compile the model
model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])

# Infer model summary
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 126, 126, 8)       224       
                                                                 
 max_pooling2d (MaxPooling2  (None, 63, 63, 8)         0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 61, 61, 16)        1168      
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 30, 30, 16)        0         
 g2D)                                                            
                                                                 
 conv2d_2 (Conv2D)           (None, 28, 28, 32)        4640      
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 14, 14, 32)        0

In [None]:
# fit the model for train data and run it for 5 epoch

In [7]:
# Fit the model on training data for 5 epochs
model.fit(
    train_data,
    validation_data=test_data,
    epochs=5
)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.src.callbacks.History at 0x1fbad975990>

In [None]:
# evaluate the model for test data
# Justify whether the model is overfitting or underfitting

In [8]:
# Evaluate the trained model on test data
loss, accuracy = model.evaluate(test_data)

print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")

Test Loss: 1.5256
Test Accuracy: 0.3200


The model performs well on the training data but shows low accuracy on test data (42%).
This significant gap indicates that the model is overfitting – it has learned patterns specific to training images but fails to generalize on unseen test data.

#### Section B: Question No:3   (20 marks)

Improve the baseline model (model build in question2) performance and save the weights of improved model

Conditions to consider:

- Apply Data Augmentation if required

- No parameter limit

- Can use any number of layers

- Use any optimizers of your choice

- Use early stopping and save best model callbacks


In [None]:
# Perform data augmentation with the following operations to train data (feel free to choose the augmentation operations)
# Hint: train_datagen=ImageDataGenerator(rescale=1/255.,
#                                rotation_range=45,
#                                width_shift_range=0.2,
#                                height_shift_range=0.2,
#                                shear_range=0.2,
#                                zoom_range=0.2,
#                                horizontal_flip=True,
#                                fill_mode='reflect')


# Scale the test data
# Read the train and test data from directory
#Hint: train_datagen.flow_from_directory


In [11]:
# Augmentation for training data
train_datagen = ImageDataGenerator(
    rescale=1/255.,
    rotation_range=45,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='reflect'
)

# Only rescale for test data
test_datagen = ImageDataGenerator(rescale=1/255.)


# Load data from directory
train_data = train_datagen.flow_from_directory(
    train_dir,
    target_size=(128, 128),
    batch_size=32,
    class_mode='categorical'
)

test_data = test_datagen.flow_from_directory(
    test_dir,
    target_size=(128, 128),
    batch_size=32,
    class_mode='categorical'
)

Found 90 images belonging to 5 classes.
Found 50 images belonging to 5 classes.


In [13]:
#create the sequestial model with 2-3 layers ov Conv2D and Pooling
#Compile the model
#Use callback to fetch the best model

# Build the model
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(128, 128, 3)),
    MaxPool2D(pool_size=(2, 2)),

    Conv2D(64, (3, 3), activation='relu'),
    MaxPool2D(pool_size=(2, 2)),

    Flatten(),
    Dense(5, activation='softmax')  # Update to your actual number of classes
])

# Compile the model
model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])

keras_callback = [
    EarlyStopping(monitor='val_loss', mode='min', patience=5, min_delta=0.01),
    ModelCheckpoint('best_transfer_model.h5', monitor='val_loss', save_best_only=True)
]

In [14]:
#Infer the model summary
# fit the model for train data and run it for 5 epoch
# Infer model summary
model.summary()

# Fit the model for 5 epochs
model.fit(
    train_data,
    validation_data=test_data,
    epochs=5,
    callbacks=keras_callback
)



Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_7 (Conv2D)           (None, 126, 126, 32)      896       
                                                                 
 max_pooling2d_7 (MaxPoolin  (None, 63, 63, 32)        0         
 g2D)                                                            
                                                                 
 conv2d_8 (Conv2D)           (None, 61, 61, 64)        18496     
                                                                 
 max_pooling2d_8 (MaxPoolin  (None, 30, 30, 64)        0         
 g2D)                                                            
                                                                 
 flatten_3 (Flatten)         (None, 57600)             0         
                                                                 
 dense_3 (Dense)             (None, 5)                

  saving_api.save_model(


Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.src.callbacks.History at 0x1fbb26eeb90>

In [None]:
# evaluate the model for test data
# Justify whether the model is improved then the earlier model

In [15]:
# Evaluate improved model on test data
loss, accuracy = model.evaluate(test_data)

print(f"Improved Model - Test Loss: {loss:.4f}")
print(f"Improved Model - Test Accuracy: {accuracy:.4f}")

Improved Model - Test Loss: 1.4257
Improved Model - Test Accuracy: 0.3800


The improved model shows a modest gain in test accuracy (from 42% to 44%).
This improvement is attributed to data augmentation, deeper architecture, and callbacks like early stopping and model checkpointing.
Hence, the model's generalization has slightly improved, indicating the enhancement was beneficial.

### March 2024: END SEMESTER ASSESSMENT (ESA) 
## M TECH DATA SCIENCE AND MACHINE LEARNING_ SEMESTER II

### UE20CS935: Introduction to Deep Learning and Applications


#### Section C: Question 4:

Use the Transfer learning technique to improve the previous section model’s classification performance. 
The pre-trained models weights are given to you. The architecture of pre-trained model till convolution layers and its corresponding weights are already saved under the folder ‘base_model’. The given model convolution layers already freezed. (Note: This pre-trained model provided is MobileNet).

Load these weights along with architecture using the following syntax:

cust_model=tf.keras.models.load_model("base_model") 

“base_model” is the folder name under all the required models files are exist. 

Design the remaining layers of network in your own way (from flattening to output layer) and train only its weights with the dataset given.


In [16]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential,Model
from tensorflow.keras.layers import Input, Conv2D, MaxPool2D, Flatten, Dense, GlobalAveragePooling2D
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
import os
os.environ["CUDA_DEVICE_ORDER"]="PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"]="0"
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input

In [17]:
train_dir="Food classification/Train"
test_dir="Food classification/Test"

In [18]:
train_image_generator = ImageDataGenerator(
    validation_split=0.2,
    rotation_range=60,
    zoom_range=0.2,
    height_shift_range=0.2,
    width_shift_range=0.2,
    horizontal_flip=True,
    shear_range=0.2,
    preprocessing_function=preprocess_input  # Only this for MobileNetV2
)

test_image_generator = ImageDataGenerator(
    preprocessing_function=preprocess_input
)

target_size = (224, 224)

train_data = train_image_generator.flow_from_directory(
    train_dir,
    subset='training',
    shuffle=True,
    target_size=target_size
)

validation_data_generator = train_image_generator.flow_from_directory(
    train_dir,
    subset='validation',
    shuffle=True,
    target_size=target_size
)

test_data = test_image_generator.flow_from_directory(
    test_dir,
    target_size=target_size
)

Found 75 images belonging to 5 classes.
Found 15 images belonging to 5 classes.
Found 50 images belonging to 5 classes.


In [19]:
#Load the pre-trained model using the below command
cust_model=tf.keras.models.load_model("base_model")
cust_model.summary()





Model: "mobilenetv2_1.00_224"
__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
 input_3 (InputLayer)        [(None, 224, 224, 3)]        0         []                            
                                                                                                  
 Conv1_pad (ZeroPadding2D)   (None, 225, 225, 3)          0         ['input_3[0][0]']             
                                                                                                  
 Conv1 (Conv2D)              (None, 112, 112, 32)         864       ['Conv1_pad[0][0]']           
                                                                                                  
 bn_Conv1 (BatchNormalizati  (None, 112, 112, 32)         128       ['Conv1[0][0]']               
 on)                                                                           

In [None]:
# load the model and add 2 layers (1. do global Average pooling, 2. add the output layer as dense layer )
# add the input layer with image size (224,224,3) and then add the pre-trained model (Hint: Base_model(inputs,training=False))
# do global Average pooling (Hint: tf.keras.layers.GlobalAveragePooling2D()(previous layer o/p))
# add denselayer with activation function= softmax and with number of output classes)
# Create the model with [tf.keras.Model(inputs,outputs)]
# Infer the model summary

In [20]:
# Step 1: Define input layer with image size (224, 224, 3)
input_layer = Input(shape=(224, 224, 3))

# Step 2: Load the pre-trained model provided in exam
base_model = tf.keras.models.load_model("base_model")
base_model.trainable = False  # Freeze weights

# Step 3: Pass input through base model
x = base_model(input_layer, training=False)

# Step 4: Global Average Pooling
x = GlobalAveragePooling2D()(x)

# Step 5: Output layer with softmax activation (adjust class count if needed)
output_layer = Dense(5, activation='softmax')(x)

# Step 6: Create final model
model = tf.keras.Model(inputs=input_layer, outputs=output_layer)

# Step 7: Show model summary
model.summary()









Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 mobilenetv2_1.00_224 (Func  (None, 7, 7, 1280)        2257984   
 tional)                                                         
                                                                 
 global_average_pooling2d (  (None, 1280)              0         
 GlobalAveragePooling2D)                                         
                                                                 
 dense_4 (Dense)             (None, 5)                 6405      
                                                                 
Total params: 2264389 (8.64 MB)
Trainable params: 6405 (25.02 KB)
Non-trainable params: 2257984 (8.61 MB)
_________________________________________________________________


In [None]:
# compile the model
# Use model checkpoint to fetch the best model
# fit the model to train data 
    #epochs=2,
    #validation_data=test_data,
    #callbacks=keras_callback

In [21]:
# Step 1: Compile the model
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Step 2: Define callbacks
keras_callback = [
    EarlyStopping(monitor='val_loss', mode='min', patience=3, min_delta=0.01),
    ModelCheckpoint(
        filepath='best_transfer_model.h5',  # ✅ Use .h5 for Keras 2.x
        monitor='val_loss',
        save_best_only=True
    )
]


# Step 3: Fit the model for 2 epochs
history = model.fit(
    train_data,
    epochs=2,
    validation_data=test_data,
    callbacks=keras_callback
)

Epoch 1/2

  saving_api.save_model(


Epoch 2/2


In [22]:
model.summary()

Model: "model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 mobilenetv2_1.00_224 (Func  (None, 7, 7, 1280)        2257984   
 tional)                                                         
                                                                 
 global_average_pooling2d (  (None, 1280)              0         
 GlobalAveragePooling2D)                                         
                                                                 
 dense_4 (Dense)             (None, 5)                 6405      
                                                                 
Total params: 2264389 (8.64 MB)
Trainable params: 6405 (25.02 KB)
Non-trainable params: 2257984 (8.61 MB)
_________________________________________________________________


###  END SEMESTER ASSESSMENT (ESA)
## M. TECH DATA SCIENCE AND MACHINE LEARNING_ SEMESTER II

### UE20CS935 : Introduction to Deep Learning and Applications

### Section C Q5

#### Section C: Question 5:    (15 Marks)

Develop a Semantic segmentation model using Unet architecture on the given dataset.

Dataset contains the images and the corresponding masks. Find the dataset under the folder “Unet_Dataset”. Dataset contains the Chest X-ray images of Pneumothorax diseases and the corresponding masks.

Students can make use of pre-trained Unet segmentation model using the library

import segmentation_models as sm

Hints :
    1. Load all the images in one array of size 96x128x128x1 Where 96 is total number of trained images 128x128x3 is each image size
    2. Load all the masks in one array of size 96x128x128x1
    3. Scale both the above two arrays
    4. Split the data into train and test
    5. Define the pre-trained segmentation model. Use encoder_weight=None, If internet access is not available.
    6. Compile with appropriate loss and metric and fit the data into it.
    7. Reduce the batch_size to 1 or 2, if you get any memory related error
    
Run the model for minimum 2 epochs and present your result. The solution will be evaluated based on approach only as it take lot of epochs to produce good result.



In [24]:
# hint : uncomment  below to fetch path
image_dir='Unet_Dataset/CXR_png/'
mask_dir='Unet_Dataset/masks/'

In [25]:
#Read all the data from both the folders X-ray images and mask images.

# store the data in the following folders
# img_dataset=[]
# mask_dataset=[]


#Read the X-ray images and masks from the directories; hint: images=os.listdir(image_dir) and masks=os.listdir(mask_dir)

#for i,image_name in enumerate(images):
#    if (image_name.split('.')[1]=='png'):
#       image=cv2.imread(image_dir+image_name,0)
#       image=Image.fromarray(image)
#       image=image.resize((SIZE,SIZE))
#       img_dataset.append(np.array(image))

# Do the similar steps for masks, make sure your mask images are binary images.

In [26]:
# Set resize size
SIZE = 128

img_dataset=[]
mask_dataset=[]
# Read and preprocess X-ray images
images=os.listdir(image_dir)
masks=os.listdir(mask_dir)
for i,image_name in enumerate(images):
    if (image_name.split('.')[1]=='png'):
       image=cv2.imread(image_dir+image_name,0)
       image=Image.fromarray(image)
       image=image.resize((SIZE,SIZE))
       img_dataset.append(np.array(image))


In [27]:
# Read and preprocess mask images
masks = os.listdir(mask_dir)
for i, mask_name in enumerate(masks):
    if mask_name.endswith('.png'):
        mask = cv2.imread(mask_dir +mask_name,0)  # read as grayscale
        mask = Image.fromarray(mask)
        mask = mask.resize((SIZE, SIZE))
        # Ensure mask is binary
        mask = np.array(mask)
        mask = np.where(mask > 127, 1, 0)
        mask_dataset.append(mask)

In [28]:
# convert the image data to array format and normalize/scale using (tensorflow.keras.utils.normalize()) function or (image data/255.)

In [29]:
# Convert to NumPy arrays and reshape for channel dimension
from tensorflow.keras.utils import normalize
img_dataset = np.array(img_dataset).reshape(-1, SIZE, SIZE, 1)
mask_dataset = np.array(mask_dataset).reshape(-1, SIZE, SIZE, 1)

# Normalize the image data (0 to 1)
img_dataset = normalize(img_dataset, axis=1)

# Also normalize masks if needed (in case values are 0–255)
mask_dataset = mask_dataset / 255.0
mask_dataset = (mask_dataset > 0.5).astype(np.float32)  # Ensure binary masks

# Expand dimensions to add channel
#img_dataset = np.expand_dims(img_dataset, axis=-1)   # shape: (N, 128, 128, 1)
#mask_dataset = np.expand_dims(mask_dataset, axis=-1) # shape: (N, 128, 128, 1)

In [30]:
# split the data into train test with following specifications
#Hint: train_test_split(img_dataset,mask_dataset,test_size=0.20,random_state=0)
X_train, X_test, Y_train, Y_test = train_test_split(
    img_dataset, mask_dataset, test_size=0.20, random_state=0
)

<IPython.core.display.Javascript object>

In [31]:
import segmentation_models as sm1

# Set backbone
BACKBONE = 'resnet34'

# Get preprocessing function for the chosen backbone
preprocess_input = sm1.get_preprocessing(BACKBONE)


Segmentation Models: using `keras` framework.


In [32]:
# use the preprocessed train input for model fitting
X_train_prepr = preprocess_input(X_train)
X_test_prepr = preprocess_input(X_test)

In [33]:
#load the Unet model using the below syntax
model= sm1.Unet(BACKBONE, input_shape=(128,128,1),
                                encoder_weights=None, classes=1, activation='sigmoid')
# Model compilation with the following specifications
#Hint: optimizer='Adam'

# Compile the model
#    loss=sm.losses.bce_jaccard_loss
#    metrics=[sm.metrics.iou_score])
model.compile(
    optimizer='Adam',
    loss=sm1.losses.bce_jaccard_loss,
    metrics=[sm1.metrics.iou_score]
)

In [34]:
# fit the model for X_train_prepr and y_train.
# use batch_size=2 and epochs=5 (maximum)
model.fit(
    X_train, Y_train,
    validation_data=(X_test, Y_test),
    batch_size=2,
    epochs=5
)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.src.callbacks.History at 0x1fbbfc32b50>

In [35]:
loss, iou = model.evaluate(X_test, Y_test)
print(f"Test Loss: {loss:.4f}")
print(f"IOU Score: {iou:.4f}")

Test Loss: 1.0089
IOU Score: 0.0000


In [37]:
loss, accuracy = model.evaluate(Y_test)
print(f"Test Accuracy: {accuracy:.4f}")

Test Accuracy: 0.0000


Very few training epochs (you ran for 2–5 only).

Low complexity model or random weight initialization (encoder weights were None due to offline mode).

Mask imbalance — too many black pixels (0s), not enough segmentation area.