# Question-1
Write Python code to build a deep neural network using Keras and compute the number of parameters, the memory, and the FLOPs for the following model. Use ReLU activation functions in the hidden layers and a sigmoid activation function in the output layer.


Assume a CPU that performs 1 GFLOPS (1,000,000,000 floating-point operations per second) and compute the inference time of the deep neural network model.



<img src="2.png" width="700" height="500">


## Parameter calculation in a deep neural network

The number of trainable parameters in a fully connected neural network is the total number of weights plus the total number of biases. The total number of weights equals the sum, over each pair of adjacent layers, of the product of the two layer sizes. The total number of biases equals the number of hidden neurons plus the number of output neurons.
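
As a quick check of the rule above, the following sketch counts weights and biases from a list of layer sizes. The helper name `count_dense_params` is illustrative, and the example sizes `[5, 5, 5, 1]` are taken from the Keras model built for Question 1 in this notebook rather than from the assignment text.

```python
def count_dense_params(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the number of units per layer, input layer first.
    """
    # One weight per connection between adjacent layers.
    weights = sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))
    # One bias per neuron in every layer except the input layer.
    biases = sum(layer_sizes[1:])
    return weights, biases


weights, biases = count_dense_params([5, 5, 5, 1])
print(weights, biases, weights + biases)  # 55 11 66
```

The total of 66 agrees with the parameter count reported by `compute(model)` for Question 1.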

## Model size calculation

Model size (in bytes) = number of parameters × bytes per parameter

Model size (in KB) = model size (in bytes) / 1024
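
A minimal sketch of these two formulas, assuming 4 bytes per parameter (32-bit floats, the default dtype of Keras layers). The function name `model_size` is illustrative, and the count of 66 is the Question 1 parameter total.

```python
def model_size(num_params, bytes_per_param=4):
    """Return (size in bytes, size in KB) for a parameter count."""
    size_bytes = num_params * bytes_per_param
    return size_bytes, size_bytes / 1024


size_bytes, size_kb = model_size(66)
print(size_bytes, round(size_kb, 4))  # 264 0.2578
```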

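## FLOPs calculation in a deep neural network
FLOPs of a fully connected (FC) layer = 2 × (input size × output size) + (output size × activation FLOPs per output). The factor 2 counts one multiplication and one addition per weight.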

## Activation function FLOPs in TensorFlow

ReLU --> 1 FLOP

Sigmoid --> 1 FLOP

Tanh --> 1 FLOP

Softmax --> 6 FLOPs
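
The FC formula and the per-activation costs listed above can be combined as in the following sketch. The names `ACTIVATION_FLOPS` and `fc_flops` are illustrative, and both example layers are hypothetical sizes chosen only to exercise the formula.

```python
# FLOPs charged per output element, as listed above.
ACTIVATION_FLOPS = {"relu": 1, "sigmoid": 1, "tanh": 1, "softmax": 6}


def fc_flops(input_size, output_size, activation=None):
    """FLOPs of one fully connected layer, including its activation."""
    matmul = 2 * input_size * output_size  # one multiply and one add per weight
    act = output_size * ACTIVATION_FLOPS.get(activation, 0)
    return matmul + act


# Hypothetical layer: 4 inputs, 9 ReLU units.
print(fc_flops(4, 9, "relu"))     # 81
# Hypothetical layer: 6 inputs, 3 softmax units.
print(fc_flops(6, 3, "softmax"))  # 54
```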

## Inference time calculation

Inference time = FLOPs / FLOPS.

FLOPs (floating-point operations) measures the computational cost of the model.

FLOPS (floating-point operations per second) measures the hardware’s processing capability.
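
A short sketch of this ratio, assuming the 1 GFLOPS CPU from the question. The function name `inference_time_seconds` is illustrative, and 110 is the FLOP count that `compute(model)` reports for Question 1.

```python
def inference_time_seconds(model_flops, hardware_flops_per_second=1e9):
    """Estimate inference time as model FLOPs divided by hardware FLOPS."""
    return model_flops / hardware_flops_per_second


t = inference_time_seconds(110)
print(f"{t:.2e} s")  # 1.10e-07 s
```

This is an idealized estimate: it ignores memory access, framework overhead, and any hardware that does not sustain its peak rate.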



In [4]:
%pip install tensorflow==2.13.0


In [1]:
### Write  your code here
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv3D, MaxPooling3D, Flatten, Dense


# Two ReLU hidden layers and a single sigmoid output unit, as the question specifies.
model=Sequential([Dense(5,activation='relu',input_shape=(5,)),
                  Dense(5,activation='relu'),
                  Dense(1,activation='sigmoid')
                  ])


In [2]:
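def compute(model):
    """Return (total parameters, total FLOPs) for the Dense layers of `model`."""
    totparm=0
    totflops=0
    for layer in model.layers:
        layertype=layer.__class__.__name__
        if layertype =='Dense':
            inputunits=layer.input_shape[-1]
            outputunits=layer.output_shape[-1]
            # weights + biases
            parm = inputunits*outputunits + outputunits
            # one multiply and one add per weight; activation FLOPs are not counted here
            flops = 2*inputunits*outputunits
            # accumulate inside the branch so non-Dense layers cannot reuse stale values
            totparm+=parm
            totflops+=flops
    return totparm,totflops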

In [3]:
compute(model)

(66, 110)

# Question-2
Write Python code to build a deep neural network using Keras and compute the number of parameters, the memory, and the FLOPs for the following model. Use ReLU activation functions in the hidden layers and a softmax activation function in the output layer. Then write Python code to plot a bar graph of the Question 1 and Question 2 outputs and compare them.



<img src="1.png" width="700" height="500">

In [4]:
### Write  your code here
model=Sequential([Dense(9,activation='relu',input_shape=(4,)),
                  Dense(6,activation='relu'),
                  Dense(3,activation='softmax')
                  ])

In [5]:
def compute(model):
    totparam=0
    totflops=0
    # FLOPs charged per output element for each activation
    acti_flops_map={
        "relu": 1,
        "sigmoid": 1,
        "tanh": 1,
        "softmax": 6,
    }

    for layer in model.layers:
        layertype=layer.__class__.__name__
        acti=getattr(layer, "activation", None)
        acti_name=acti.__name__ if acti else None

        if layertype == "Dense":
            inputunits=layer.input_shape[-1]
            outputunits=layer.output_shape[-1]
            param=inputunits*outputunits+outputunits
            flops=2*outputunits*inputunits
            acti_flops=outputunits*acti_flops_map.get(acti_name,0)

            totparam+=param
            totflops+=flops+acti_flops
    return totparam,totflops

In [6]:
compute(model)

(126, 249)
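
The question also asks for a bar graph comparing the two models. One possible sketch, assuming `matplotlib` is installed, hard-codes the `(parameters, FLOPs)` tuples printed above for Question 1 and Question 2. The variable names and the grouped-bar layout are illustrative choices.

```python
import matplotlib.pyplot as plt

# (parameters, FLOPs) as returned by compute(model) for each question.
q1_params, q1_flops = 66, 110
q2_params, q2_flops = 126, 249

labels = ["Parameters", "FLOPs"]
x = range(len(labels))
width = 0.35

fig, ax = plt.subplots()
# Two bars per metric, placed side by side around each tick.
ax.bar([i - width / 2 for i in x], [q1_params, q1_flops], width, label="Question 1")
ax.bar([i + width / 2 for i in x], [q2_params, q2_flops], width, label="Question 2")
ax.set_xticks(list(x))
ax.set_xticklabels(labels)
ax.set_ylabel("Count")
ax.set_title("Question 1 vs Question 2")
ax.legend()
plt.show()
```

The Question 2 network has roughly 1.9 times the parameters and 2.3 times the FLOPs of the Question 1 network. Part of the FLOPs gap comes from the counting itself: the Question 2 `compute` adds activation costs, while the Question 1 version counts only the multiply-adds.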

## Output Dimensions Formula for 2D Convolution
<img src="10.png" width="600" height="400">
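
The figure is not reproduced in text here, so the sketch below assumes the standard rule, output = floor((n + 2p - k) / s) + 1, applied independently to height and width. The function name `conv_output_size` is illustrative.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output length along one spatial axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


# 28x28 input, 3x3 convolution, stride 1, no padding.
print(conv_output_size(28, 3))            # 26
# 2x2 max pooling with stride 2 follows the same rule.
print(conv_output_size(26, 2, stride=2))  # 13
print(conv_output_size(11, 2, stride=2))  # 5
```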

## Parameter calculation of 2DCNN


<img src="4.png" width="400" height="200">
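
As a sketch of the usual Conv2D count (assumed to match the figure): each filter has kernel height × kernel width × input channels weights plus one bias. The function name `conv2d_params` is illustrative, and the example layers are the two convolutions used in Question 3.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights (kernel_h * kernel_w * in_channels per filter) plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters


print(conv2d_params(3, 3, 1, 32))   # 320
print(conv2d_params(3, 3, 32, 64))  # 18496
```

Both values agree with the `Param #` column that `model.summary()` prints for the Question 3 model.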

## FLOPs calculation of 2DCNN
<img src="6.png" width="600" height="400">
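
The sketch below follows the counting convention used by the `compute` functions in this notebook, which may differ in detail from the figure (for example, bias additions are not counted): 2 FLOPs per kernel weight for every output element, plus the activation cost per output element. The function name `conv2d_flops` is illustrative.

```python
def conv2d_flops(kernel_h, kernel_w, in_channels, out_h, out_w, filters, act_flops=1):
    """FLOPs of a Conv2D layer: multiply-adds per output element, plus activation."""
    output_elements = out_h * out_w * filters
    conv = 2 * kernel_h * kernel_w * in_channels * output_elements
    return conv + output_elements * act_flops


# 3x3 convolution, 1 input channel, 26x26x32 output, ReLU (1 FLOP per element).
print(conv2d_flops(3, 3, 1, 26, 26, 32))  # 411008
```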



## FLOPs calculation for Pooling Layers
<img src="12.png" width="600" height="400">


# Question-3

Write Python code to build a 2D CNN model with the following specifications using Keras, and compute the number of parameters, the model size, and the FLOPs.

The model architecture consists of several layers designed for image classification tasks, such as recognizing digits from the MNIST dataset. The architecture begins with a 2D convolutional layer (Conv2D), which applies 32 filters of size 3x3 to the input image (28x28x1), followed by the ReLU activation function to introduce non-linearity. This is followed by a max-pooling layer (MaxPooling2D) with a pool size of 2x2, reducing the spatial dimensions of the feature maps while retaining important information. A second convolutional layer with 64 filters of size 3x3 is then applied, again using ReLU activation. Another max-pooling layer (2x2) follows to further downsample the feature maps. The output of the convolutional layers is then flattened into a one-dimensional vector using the Flatten layer, which is fed into the fully connected dense layers. The first dense layer has 64 neurons with ReLU activation, allowing the model to learn complex representations, while the final dense layer has 10 neurons with a softmax activation function, providing probabilities for each of the 10 possible digit classes.


In [7]:
### Write  your code here

from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten
from tensorflow.keras.utils import plot_model

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])

model.summary()

Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 26, 26, 32)        320       
                                                                 
 max_pooling2d (MaxPooling2  (None, 13, 13, 32)        0         
 D)                                                              
                                                                 
 conv2d_1 (Conv2D)           (None, 11, 11, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 5, 5, 64)          0         
 g2D)                                                            
                                                                 
 flatten (Flatten)           (None, 1600)              0         
                                                                 
 dense_6 (Dense)             (None, 64)               

In [8]:
def get_model_size(model):
    param_size = 4
    params = model.count_params()
    size = params * param_size
    return size

In [9]:
model_size = get_model_size(model)
print(f"Model size: {model_size / (1024 ** 2):.2f} MB")

Model size: 0.47 MB


In [10]:
def compute(model):
    totparam = 0
    totflops = 0

    acti_flops_map = {
        "relu": 1,
        "sigmoid": 1,
        "tanh": 1,
        "softmax": 6,
    }

    for layer in model.layers:
        layertype = layer.__class__.__name__
        acti_flops = 0
        acti = getattr(layer, "activation", None)
        acti_name = acti.__name__ if acti else None

        if layertype in ["Dense"]:
            inputunits = layer.input_shape[-1]
            outputunits = layer.output_shape[-1]
            param = inputunits * outputunits + outputunits
            flops = 2 * outputunits * inputunits
            acti_flops = outputunits * acti_flops_map.get(acti_name, 0)

        # Conv3D is not handled here: it needs the third kernel and output dimensions.
        elif layertype == "Conv2D":
            kernel_size = layer.kernel_size
            input_shape = layer.input_shape
            output_shape = layer.output_shape
            filters = layer.filters

            param = (filters * kernel_size[0] * kernel_size[1] *
                     input_shape[-1]) + filters

            output_elements = output_shape[1] * output_shape[2] * filters
            flops = 2 * kernel_size[0] * kernel_size[1] * input_shape[-1] * output_elements
            acti_flops = output_elements * acti_flops_map.get(acti_name, 0)

        else:
            param = 0
            flops = 0

        totparam += param
        totflops += flops + acti_flops

    return totparam, totflops


In [11]:
compute(model)

(121930, 5085500)

# Question-4
Write Python code to build a CNN using Keras and compute the number of parameters, the memory, and the FLOPs for the following model.





<img src="3.jpg" width="900" height="700">

In [12]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D((2, 2),strides=2),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D((2, 2),strides=2),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])
model.summary()

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_2 (Conv2D)           (None, 30, 30, 32)        896       
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 15, 15, 32)        0         
 g2D)                                                            
                                                                 
 conv2d_3 (Conv2D)           (None, 13, 13, 64)        18496     
                                                                 
 max_pooling2d_3 (MaxPoolin  (None, 6, 6, 64)          0         
 g2D)                                                            
                                                                 
 flatten_1 (Flatten)         (None, 2304)              0         
                                                                 
 dense_8 (Dense)             (None, 64)               

In [13]:
model_size = get_model_size(model)
print(f"Model size: {model_size / (1024 ** 2):.2f} MB")


Model size: 0.64 MB


In [14]:
compute(model)

(167562, 8121148)

## Output Shape of 3DCNN

<img src="13.png" width="400" height="300">
<img src="14.png" width="400" height="300">
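
The figures are not reproduced in text here, so this sketch assumes the usual rule, floor((n + 2p - k) / s) + 1, applied to each of the depth, height, and width axes. The helper names `out_size` and `out_shape_3d` are illustrative, and the example input is the 16-frame 64x64 clip from Question 5.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output length along one axis: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


def out_shape_3d(shape, kernel, stride=1):
    """Apply the same rule to the depth, height and width axes."""
    return tuple(out_size(n, kernel, stride) for n in shape)


shape = (16, 64, 64)                      # frames, height, width
shape = out_shape_3d(shape, 3)            # 3x3x3 convolution
print(shape)                              # (14, 62, 62)
shape = out_shape_3d(shape, 2, stride=2)  # 2x2x2 max pooling
print(shape)                              # (7, 31, 31)
```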






### 3DCNN parameters calculations
<img src="7.png" width="900" height="700">
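
A sketch of the usual Conv3D count (assumed to match the figure): the 2D rule extended with the kernel depth, so each filter has kernel depth × height × width × input channels weights plus one bias. The function name `conv3d_params` is illustrative.

```python
def conv3d_params(kernel_d, kernel_h, kernel_w, in_channels, filters):
    """Weights (kernel volume * in_channels per filter) plus one bias per filter."""
    return (kernel_d * kernel_h * kernel_w * in_channels + 1) * filters


print(conv3d_params(3, 3, 3, 1, 32))   # 896
print(conv3d_params(3, 3, 3, 32, 64))  # 55360
```

These match the `Param #` values that `model.summary()` prints for the two Conv3D layers in Question 5.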


### 3DCNN FLOPs calculations

<img src="8.png" width="900" height="700">



### 3DCNN FLOPs calculation for Pooling Layers
<img src="15.png" width="900" height="700">


# Question-5

You are tasked with designing a 3D Convolutional Neural Network (3D CNN) to classify video clips into one of five categories, such as walking, running, jumping, swimming, and cycling. Each video clip consists of 16 frames of size 64x64, and the data has a single channel (grayscale). The model should include two 3D convolutional layers followed by max-pooling layers, a flattening layer, and fully connected dense layers. Specifically, the architecture should satisfy the following requirements:

* The input layer should accept a shape of (16, 64, 64, 1), corresponding to the temporal, height, width, and channel dimensions.
* The first 3D convolutional layer should have 32 filters of size (3, 3, 3) and use ReLU activation.
* The first max-pooling layer should have a pool size of (2, 2, 2) to downsample the feature maps.
* The second 3D convolutional layer should have 64 filters of size (3, 3, 3) and use ReLU activation.
* The second max-pooling layer should again have a pool size of (2, 2, 2).
* The flattened output should connect to a dense layer with 128 neurons using ReLU activation, followed by the output layer with 5 neurons and a softmax activation.

Design and implement this 3D CNN architecture, compute the number of parameters for each layer, compute the model size, and compute the FLOPs.







In [15]:
### Write  your code here
model = Sequential([
    Conv3D(32, (3, 3, 3), activation='relu', input_shape=(16, 64, 64, 1)),
    MaxPooling3D((2, 2, 2)),
    Conv3D(64, (3, 3, 3), activation='relu'),
    MaxPooling3D((2, 2, 2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(5, activation='softmax')
])

model.summary()

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv3d (Conv3D)             (None, 14, 62, 62, 32)    896       
                                                                 
 max_pooling3d (MaxPooling3  (None, 7, 31, 31, 32)     0         
 D)                                                              
                                                                 
 conv3d_1 (Conv3D)           (None, 5, 29, 29, 64)     55360     
                                                                 
 max_pooling3d_1 (MaxPoolin  (None, 2, 14, 14, 64)     0         
 g3D)                                                            
                                                                 
 flatten_2 (Flatten)         (None, 25088)             0         
                                                                 
 dense_10 (Dense)            (None, 128)              

In [None]:
def compute(model):
    totparam = 0
    totflops = 0

    acti_flops_map = {
        "relu": 1,
        "sigmoid": 1,
        "tanh": 1,
        "softmax": 6,
    }

    for layer in model.layers:
        layertype = layer.__class__.__name__
        acti_flops = 0
        acti = getattr(layer, "activation", None)
        acti_name = acti.__name__ if acti else None

        if layertype == "Dense":
            inputunits = layer.input_shape[-1]
            outputunits = layer.output_shape[-1]
            param = inputunits * outputunits + outputunits
            flops = 2 * outputunits * inputunits
            acti_flops = outputunits * acti_flops_map.get(acti_name, 0)

        elif layertype in ["Conv2D", "Conv3D"]:
            kernel_size = layer.kernel_size
            input_shape = layer.input_shape
            output_shape = layer.output_shape
            filters = layer.filters

            if layertype == "Conv2D":
                param = (filters * kernel_size[0] * kernel_size[1] *
                         input_shape[-1]) + filters
            elif layertype == "Conv3D":
                param = (filters * kernel_size[0] * kernel_size[1] *
                         kernel_size[2] * input_shape[-1]) + filters

            output_elements = (
                output_shape[1] * output_shape[2] * output_shape[3] * filters
                if layertype == "Conv3D"
                else output_shape[1] * output_shape[2] * filters
            )

            flops = (
                2 * kernel_size[0] * kernel_size[1] *
                (kernel_size[2] if layertype == "Conv3D" else 1) *
                input_shape[-1] * output_elements
            )
            acti_flops = output_elements * acti_flops_map.get(acti_name, 0)

        else:
            param = 0
            flops = 0

        totparam += param
        totflops += flops + acti_flops

    return totparam, totflops


In [16]:
model_size = get_model_size(model)
print(f"Model size: {model_size / (1024 ** 2):.2f} MB")

Model size: 12.47 MB


In [17]:
compute(model)

(3268293, 566448606)
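
As a cross-check that does not depend on which `compute` definition is currently active in the kernel, the sketch below recomputes this model's totals by hand from the output shapes in the summary above. The helper names `conv3d_counts` and `dense_counts` are illustrative, the kernel is assumed cubic, and pooling and flatten layers are treated as costing 0 FLOPs, as in `compute`. Note that a formula that drops the third kernel and output dimension (treating the Conv3D layers as 2D) would instead give 3230853 parameters and 12306270 FLOPs, which does not agree with the 12.47 MB model size.

```python
def conv3d_counts(kernel, in_channels, out_dhw, filters, act_flops=1):
    """Return (parameters, FLOPs) for a Conv3D layer with a cubic kernel."""
    kernel_volume = kernel ** 3
    params = (kernel_volume * in_channels + 1) * filters
    d, h, w = out_dhw
    output_elements = d * h * w * filters
    flops = 2 * kernel_volume * in_channels * output_elements
    return params, flops + output_elements * act_flops


def dense_counts(inputs, units, act_flops):
    """Return (parameters, FLOPs) for a Dense layer."""
    return inputs * units + units, 2 * inputs * units + units * act_flops


layers = [
    conv3d_counts(3, 1, (14, 62, 62), 32),  # conv3d, ReLU
    conv3d_counts(3, 32, (5, 29, 29), 64),  # conv3d_1, ReLU
    dense_counts(25088, 128, act_flops=1),  # Dense(128), ReLU
    dense_counts(128, 5, act_flops=6),      # Dense(5), softmax
]
total_params = sum(p for p, _ in layers)
total_flops = sum(f for _, f in layers)
print(total_params, total_flops)                 # 3268293 566448606
print(f"{total_params * 4 / 1024 ** 2:.2f} MB")  # 12.47 MB
```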

# Question-6

A company is building a system to predict customer sentiment (positive or negative) based on a sequence of customer reviews.
Each review is represented as a feature vector of size 4, where each feature corresponds to a specific aspect of the review, such as tone, length, and keyword presence. To process this sequential data, the team decides to use a Recurrent Neural Network (RNN).

The input size n<sub>x</sub> is 4, meaning each input vector x<sup>t</sup> has 4 features.  
The hidden layer has 3 hidden units n<sub>a</sub>=3.  
The output size n<sub>y</sub> is 2, corresponding to the two possible sentiment classes (positive or negative).  
The sequence length T<sub>x</sub> is 5, meaning the RNN will process a sequence of 5 reviews at a time.  
Use a sigmoid activation function in the output layer and compute the number of parameters and the memory of the RNN model.




<img src="9.png" width="300" height="100">


Number of parameters of an RNN = g × [a(a+i) + a]

g --> number of gates (1 for a simple RNN, 4 for an LSTM)

a --> number of hidden units

i --> number of input units
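
A minimal sketch of this formula, assuming g is the number of gates (1 for a simple RNN). The function name `recurrent_params` is illustrative. The example uses the sizes from the question (a = 3, i = 4) plus the 2-unit dense output layer, with 4 bytes per parameter assumed for the memory estimate.

```python
def recurrent_params(hidden_units, input_units, gates=1):
    """g * [a(a + i) + a]: recurrent and input weights plus biases, per gate."""
    a, i = hidden_units, input_units
    return gates * (a * (a + i) + a)


rnn = recurrent_params(3, 4)  # simple RNN layer
dense = 3 * 2 + 2             # output layer: weights + biases
total = rnn + dense
print(rnn, dense, total)      # 24 8 32
print(total * 4, "bytes")     # 128 bytes
```

The parameter count does not depend on the sequence length T<sub>x</sub>, because the same weights are reused at every time step.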



In [20]:
from keras.layers import Dense, SimpleRNN

model = Sequential()
model.add(SimpleRNN(3, input_shape=(5,4), activation='sigmoid'))
model.add(Dense(units=2, activation='sigmoid'))

model.summary()
model

Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 simple_rnn_1 (SimpleRNN)    (None, 3)                 24        
                                                                 
 dense_13 (Dense)            (None, 2)                 8         
                                                                 
Total params: 32 (128.00 Byte)
Trainable params: 32 (128.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


<keras.src.engine.sequential.Sequential at 0x7bbabce8bd50>

In [21]:
def compute(model):
    totparam = 0
    totflops = 0

    acti_flops_map = {
        "relu": 1,
        "sigmoid": 1,
        "tanh": 1,
        "softmax": 6,
    }

    for layer in model.layers:
        layertype = layer.__class__.__name__
        acti_flops = 0
        acti = getattr(layer, "activation", None)
        acti_name = acti.__name__ if acti else None

        if layertype == "Dense":
            inputunits = layer.input_shape[-1]
            outputunits = layer.output_shape[-1]
            param = inputunits * outputunits + outputunits
            flops = 2 * outputunits * inputunits
            acti_flops = outputunits * acti_flops_map.get(acti_name, 0)

        elif layertype == "SimpleRNN":
            inputunits = layer.input_shape[-1]
            hiddenunits = layer.units
            param = hiddenunits * (hiddenunits + inputunits + 1)
            # FLOPs for a single time step; the weights are shared across the sequence
            flops = 2 * hiddenunits * (hiddenunits + inputunits)
            acti_flops = hiddenunits * acti_flops_map.get(acti_name, 0)

        else:
            param = 0
            flops = 0

        totparam += param
        totflops += flops + acti_flops

    return totparam, totflops


In [25]:
model_size = get_model_size(model)

In [26]:
compute(model)

(32, 59)

# Question 7

Write Python code to implement a single LSTM unit for the following diagram and compute the number of parameters of the model using Keras.
    
<img src="https://github.com/kmkarakaya/ML_tutorials/blob/master/images/LSTM_internal2.png?raw=true" width="500">


 Notice that we can infer the shapes of W, U and b from:
 * Input size ($h_{t-1}$ and $x_{t}$)
 * Output size ($h_{t}$)

 Since each gate's output must have the same size as the hidden state ($h$ × 1):

  * W params = ($h$ × $x$)
  * U params = ($h$ × $h$)
  * bias params = $h$

 * total params per gate = W params + U params + bias params
  
    = ($h$ × $x$) + ($h$ × $h$) + $h$

    = (($x$ + $h$) × $h$ + $h$)

* An LSTM layer contains 4 such functions (the forget, input, and output gates and the candidate cell state), all defined in the same way, so:

 ##   **LSTM parameter number = 4 × (($x$ + $h$) × $h$ + $h$)**
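
The closed form above can be checked with a few lines of plain Python. The function name `lstm_params` is illustrative; x = 4 and h = 3 are the input and hidden sizes used in the Keras cell for this question.

```python
def lstm_params(input_size, hidden_size):
    """4 * ((x + h) * h + h): four gate functions, each with W, U and b."""
    x, h = input_size, hidden_size
    return 4 * ((x + h) * h + h)


print(lstm_params(4, 3))  # 96
```

The result agrees with the 96 parameters that `model.summary()` reports for the `lstm` layer.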



In [27]:
### Write  your code here
from tensorflow.keras.layers import LSTM, Dense

input_size = 4
hidden_size = 3

model = Sequential([
    LSTM(hidden_size, input_shape=(None, input_size)),
    Dense(1, activation='sigmoid')
])

model.summary()

Model: "sequential_7"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 lstm (LSTM)                 (None, 3)                 96        
                                                                 
 dense_14 (Dense)            (None, 1)                 4         
                                                                 
Total params: 100 (400.00 Byte)
Trainable params: 100 (400.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


In [28]:
get_model_size(model)

400

In [30]:
def compute(model):
    totparam = 0
    totflops = 0

    acti_flops_map = {
        "relu": 1,
        "sigmoid": 1,
        "tanh": 1,
        "softmax": 6,
    }

    for layer in model.layers:
        layertype = layer.__class__.__name__
        acti_flops = 0
        acti = getattr(layer, "activation", None)
        acti_name = acti.__name__ if acti else None

        if layertype == "Dense":
            inputunits = layer.input_shape[-1]
            outputunits = layer.output_shape[-1]
            param = inputunits * outputunits + outputunits
            flops = 2 * outputunits * inputunits
            acti_flops = outputunits * acti_flops_map.get(acti_name, 0)

        elif layertype == "SimpleRNN":
            inputunits = layer.input_shape[-1]
            hiddenunits = layer.units
            param = hiddenunits * (hiddenunits + inputunits + 1)
            flops = 2 * hiddenunits * (hiddenunits + inputunits)
            acti_flops = hiddenunits * acti_flops_map.get(acti_name, 0)

        elif layertype == "LSTM":
            inputunits = layer.input_shape[-1]
            hiddenunits = layer.units
            param = 4 * (hiddenunits * (inputunits + hiddenunits + 1))
            flops = 4 * (2 * hiddenunits * (inputunits + hiddenunits))
            acti_flops = 4 * hiddenunits * acti_flops_map.get(acti_name, 0)

        else:
            param = 0
            flops = 0

        totparam += param
        totflops += flops + acti_flops

    return totparam, totflops

In [31]:
compute(model)

(100, 187)