In [114]:
import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense

In [115]:
text = '''
Top Deep Learning Interview Questions and Answers for 2026
Deep Learning Interview Questions and Answers
1. What is Deep Learning?
If you are going for a deep learning interview, you definitely know what exactly deep learning is. However, with this question the interviewee expects you to give an in-detail answer, with an example. Deep Learning involves taking large volumes of structured or unstructured data and using complex algorithms to train neural networks. It performs complex operations to extract hidden patterns and features (for instance, distinguishing the image of a cat from that of a dog).
Deep Learning
2. What is a Neural Network?
Neural Networks replicate the way humans learn, inspired by how the neurons in our brains fire, only much simpler.
Neural Network
The most common Neural Networks consist of three network layers:
An input layer
A hidden layer (this is the most important layer where feature extraction takes place, and adjustments are made to train faster and function better)
An output layer
Each sheet contains neurons called “nodes,” performing various operations. Neural Networks are used in deep learning algorithms like CNN, RNN, GAN, etc.
3. What Is a Multi-layer Perceptron(MLP)?
As in Neural Networks, MLPs have an input layer, a hidden layer, and an output layer. It has the same structure as a single layer perceptron with one or more hidden layers. A single layer perceptron can classify only linear separable classes with binary output (0,1), but MLP can classify nonlinear classes.
Except for the input layer, each node in the other layers uses a nonlinear activation function. This means the input layers, the data coming in, and the activation function is based upon all nodes and weights being added together, producing the output. MLP uses a supervised learning method called “backpropagation.” In backpropagation, the neural network calculates the error with the help of cost function. It propagates this error backward from where it came (adjusts the weights to train the model more accurately).
4. What Is Data Normalization, and Why Do We Need It?
The process of standardizing and reforming data is called “Data Normalization.” It’s a pre-processing step to eliminate data redundancy. Often, data comes in, and you get the same information in different formats. In these cases, you should rescale values to fit into a particular range, achieving better convergence.
Become an AI and Machine Learning Expert
With the Professional Certificate in AI and MLExplore ProgramBecome an AI and Machine Learning Expert
5. What is the Boltzmann Machine?
One of the most basic Deep Learning models is a Boltzmann Machine, resembling a simplified version of the Multi-Layer Perceptron. This model features a visible input layer and a hidden layer -- just a two-layer neural net that makes stochastic decisions as to whether a neuron should be on or off. Nodes are connected across layers, but no two nodes of the same layer are connected.
6. What Is the Role of Activation Functions in a Neural Network?
At the most basic level, an activation function decides whether a neuron should be fired or not. It accepts the weighted sum of the inputs and bias as input to any activation function. Step function, Sigmoid, ReLU, Tanh, and Softmax are examples of activation functions.
Role of Activation Functions in a Neural Network
7. What Is the Cost Function?
Also referred to as “loss” or “error,” cost function is a measure to evaluate how good your model’s performance is. It’s used to compute the error of the output layer during backpropagation. We push that error backward through the neural network and use that during the different training functions.
What is the Cost function?
8. What Is Gradient Descent?
Gradient Descent is an optimal algorithm to minimize the cost function or to minimize an error. The aim is to find the local-global minima of a function. This determines the direction the model should take to reduce the error.
Gradient Descent
9. What Do You Understand by Backpropagation?
This is one of the most frequently asked deep learning interview questions. Backpropagation is a technique to improve the performance of the network. It backpropagates the error and updates the weights to reduce the error.
What do you understand by Backpropogation?
10. What Is the Difference Between a Feedforward Neural Network and Recurrent Neural Network?
In this deep learning interview question, the interviewee expects you to give a detailed answer.
A Feedforward Neural Network signals travel in one direction from input to output. There are no feedback loops; the network considers only the current input. It cannot memorize previous inputs (e.g., CNN).
A Recurrent Neural Network’s signals travel in both directions, creating a looped network. It considers the current input with the previously received inputs for generating the output of a layer and can memorize past data due to its internal memory.
Recurrent Neural Network'''

In [116]:
tokenized = Tokenizer()
tokenized.fit_on_texts([text])

In [117]:
tokenized.word_index

{'the': 1,
 'a': 2,
 'and': 3,
 'is': 4,
 'to': 5,
 'of': 6,
 'layer': 7,
 'neural': 8,
 'in': 9,
 'learning': 10,
 'network': 11,
 'what': 12,
 'function': 13,
 'deep': 14,
 'an': 15,
 'it': 16,
 'input': 17,
 'you': 18,
 'this': 19,
 'error': 20,
 'are': 21,
 'with': 22,
 'data': 23,
 'output': 24,
 'activation': 25,
 'or': 26,
 'interview': 27,
 'for': 28,
 'networks': 29,
 'hidden': 30,
 'most': 31,
 'layers': 32,
 'as': 33,
 'cost': 34,
 'that': 35,
 '”': 36,
 'perceptron': 37,
 'one': 38,
 'backpropagation': 39,
 'should': 40,
 'machine': 41,
 'functions': 42,
 'questions': 43,
 'train': 44,
 'from': 45,
 'by': 46,
 'only': 47,
 'called': 48,
 'mlp': 49,
 'same': 50,
 'can': 51,
 'nodes': 52,
 'weights': 53,
 'model': 54,
 'do': 55,
 'ai': 56,
 'inputs': 57,
 'gradient': 58,
 'descent': 59,
 'recurrent': 60,
 'answers': 61,
 '1': 62,
 'question': 63,
 'interviewee': 64,
 'expects': 65,
 'give': 66,
 'answer': 67,
 'complex': 68,
 'algorithms': 69,
 'operations': 70,
 'features': 

In [118]:
input_sequences = []
for sentence in text.split('\n'):
    tokenized_sequence =  tokenized.texts_to_sequences([sentence])[0]
    
    for i in range(1, len(tokenized_sequence)):
        input_sequences.append(tokenized_sequence[:i+1])

In [119]:
input_sequences

[[115, 14],
 [115, 14, 10],
 [115, 14, 10, 27],
 [115, 14, 10, 27, 43],
 [115, 14, 10, 27, 43, 3],
 [115, 14, 10, 27, 43, 3, 61],
 [115, 14, 10, 27, 43, 3, 61, 28],
 [115, 14, 10, 27, 43, 3, 61, 28, 116],
 [14, 10],
 [14, 10, 27],
 [14, 10, 27, 43],
 [14, 10, 27, 43, 3],
 [14, 10, 27, 43, 3, 61],
 [62, 12],
 [62, 12, 4],
 [62, 12, 4, 14],
 [62, 12, 4, 14, 10],
 [117, 18],
 [117, 18, 21],
 [117, 18, 21, 118],
 [117, 18, 21, 118, 28],
 [117, 18, 21, 118, 28, 2],
 [117, 18, 21, 118, 28, 2, 14],
 [117, 18, 21, 118, 28, 2, 14, 10],
 [117, 18, 21, 118, 28, 2, 14, 10, 27],
 [117, 18, 21, 118, 28, 2, 14, 10, 27, 18],
 [117, 18, 21, 118, 28, 2, 14, 10, 27, 18, 119],
 [117, 18, 21, 118, 28, 2, 14, 10, 27, 18, 119, 120],
 [117, 18, 21, 118, 28, 2, 14, 10, 27, 18, 119, 120, 12],
 [117, 18, 21, 118, 28, 2, 14, 10, 27, 18, 119, 120, 12, 121],
 [117, 18, 21, 118, 28, 2, 14, 10, 27, 18, 119, 120, 12, 121, 14],
 [117, 18, 21, 118, 28, 2, 14, 10, 27, 18, 119, 120, 12, 121, 14, 10],
 [117, 18, 21, 118, 2

In [133]:
max([len(x) for x in input_sequences])

83

In [120]:
max_length = max([len(x) for x in input_sequences])


In [121]:
padded_sequence_text = pad_sequences(input_sequences, maxlen=max_length, padding='pre')
padded_sequence_text

array([[  0,   0,   0, ...,   0, 115,  14],
       [  0,   0,   0, ..., 115,  14,  10],
       [  0,   0,   0, ...,  14,  10,  27],
       ...,
       [  0,   0,   0, ..., 316, 317, 318],
       [  0,   0,   0, ...,   0,  60,   8],
       [  0,   0,   0, ...,  60,   8,  11]])

In [122]:
X = padded_sequence_text[:,:-1]
y = padded_sequence_text[:,-1]

In [123]:
X.shape, y.shape

((775, 82), (775,))

In [124]:
y = to_categorical(y, num_classes=len(tokenized.word_index)+1)
y.shape

(775, 319)

In [140]:
model = Sequential([
    Embedding(input_dim=319, output_dim=64, input_length=82),
    LSTM(180), 
    Dense(319, activation='softmax')
])

In [130]:
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, y, epochs=100, verbose=1)
model.summary()


Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

In [138]:
import time
import numpy as np
text = "Neural Networks replicate"

for i in range(10):
# tokenize
    token_text = tokenized.texts_to_sequences([text])[0]
# padding
    padded_token_text = pad_sequences([token_text], maxlen=82, padding='pre')
# predict
    pos = np.argmax(model.predict(padded_token_text)) + 1


    for word,index in tokenized.word_index.items():
        if index == pos:
            text = text + " " + word
            print(text)
            time.sleep(2)

Neural Networks replicate a
Neural Networks replicate a humans
Neural Networks replicate a humans inspired
Neural Networks replicate a humans inspired only
Neural Networks replicate a humans inspired only a
Neural Networks replicate a humans inspired only a there
Neural Networks replicate a humans inspired only a there with
Neural Networks replicate a humans inspired only a there with a
Neural Networks replicate a humans inspired only a there with a there
Neural Networks replicate a humans inspired only a there with a there with


In [None]:
Neural Networks replicate the way humans learn, inspired by how the neurons in our brains fire, only much