# Advanced Programme in Deep Learning (Foundations and Applications)
## A Program by IISc and TalentSprint
### Assignment : Natural Language Processing - II (Bidirectional GRU for Sentence Classification)

### Learning Objectives:

At the end of the experiment, you will be able to:

*  generate vector representations of the words in the data using GloVe embeddings
*  implement a multi-layer bidirectional GRU (Gated Recurrent Unit) for solving the sentence classification problem

### Setup Steps:

In [None]:
#@title Please enter your registration id to start: { run: "auto", display-mode: "form" }
Id = "2239822" #@param {type:"string"}

In [None]:
#@title Please enter your password (normally your phone number) to continue: { run: "auto", display-mode: "form" }
password = "9167668365" #@param {type:"string"}

In [None]:
#@title Run this cell to complete the setup for this Notebook
from IPython import get_ipython
import warnings
warnings.filterwarnings("ignore")

ipython = get_ipython()

notebook= "M3_AST_17_Bidirectional_GRU_for_Sentence_Classification_C" #name of the notebook

def setup():
    ipython.magic("sx wget -qq https://cdn.iiith.talentsprint.com/aiml/Experiment_related_data/glove.6B.zip")
    ipython.magic("sx unzip glove.6B.zip")
    ipython.magic("sx wget https://cdn.iisc.talentsprint.com/DLFA/Experiment_related_data/rt-polarity.zip")
    ipython.magic("sx unzip rt-polarity.zip")
    from IPython.display import HTML, display
    display(HTML('<script src="https://dashboard.talentsprint.com/aiml/record_ip.html?traineeId={0}&recordId={1}"></script>'.format(getId(),submission_id)))
    print("Setup completed successfully")
    return

def submit_notebook():
    ipython.magic("notebook -e "+ notebook + ".ipynb")

    import requests, json, base64, datetime

    url = "https://dashboard.talentsprint.com/xp/app/save_notebook_attempts"
    if not submission_id:
      data = {"id" : getId(), "notebook" : notebook, "mobile" : getPassword()}
      r = requests.post(url, data = data)
      r = json.loads(r.text)

      if r["status"] == "Success":
          return r["record_id"]
      elif "err" in r:
        print(r["err"])
        return None
      else:
        print ("Something is wrong, the notebook will not be submitted for grading")
        return None

    elif getAnswer1() and getAnswer2() and getComplexity() and getAdditional() and getConcepts() and getComments() and getMentorSupport():
      f = open(notebook + ".ipynb", "rb")
      file_hash = base64.b64encode(f.read())

      data = {"complexity" : Complexity, "additional" :Additional,
              "concepts" : Concepts, "record_id" : submission_id,
              "answer1" : Answer1, "answer2" : Answer2, "id" : Id, "file_hash" : file_hash,
              "notebook" : notebook,
              "feedback_experiments_input" : Comments,
              "feedback_mentor_support": Mentor_support}
      r = requests.post(url, data = data)
      r = json.loads(r.text)
      if "err" in r:
        print(r["err"])
        return None
      else:
        print("Your submission is successful.")
        print("Ref Id:", submission_id)
        print("Date of submission: ", r["date"])
        print("Time of submission: ", r["time"])
        print("View your submissions: https://dlfa-iisc.talentsprint.com/notebook_submissions")
        #print("For any queries/discrepancies, please connect with mentors through the chat icon in LMS dashboard.")
        return submission_id
    else:
      return submission_id


def getAdditional():
  try:
    if not Additional:
      raise NameError
    else:
      return Additional
  except NameError:
    print ("Please answer Additional Question")
    return None

def getComplexity():
  try:
    if not Complexity:
      raise NameError
    else:
      return Complexity
  except NameError:
    print ("Please answer Complexity Question")
    return None

def getConcepts():
  try:
    if not Concepts:
      raise NameError
    else:
      return Concepts
  except NameError:
    print ("Please answer Concepts Question")
    return None


# def getWalkthrough():
#   try:
#     if not Walkthrough:
#       raise NameError
#     else:
#       return Walkthrough
#   except NameError:
#     print ("Please answer Walkthrough Question")
#     return None

def getComments():
  try:
    if not Comments:
      raise NameError
    else:
      return Comments
  except NameError:
    print ("Please answer Comments Question")
    return None


def getMentorSupport():
  try:
    if not Mentor_support:
      raise NameError
    else:
      return Mentor_support
  except NameError:
    print ("Please answer Mentor support Question")
    return None

def getAnswer1():
  try:
    if not Answer1:
      raise NameError
    else:
      return Answer1
  except NameError:
    print ("Please answer Question 1")
    return None

def getAnswer2():
  try:
    if not Answer2:
      raise NameError
    else:
      return Answer2
  except NameError:
    print ("Please answer Question 2")
    return None


def getId():
  try:
    return Id if Id else None
  except NameError:
    return None

def getPassword():
  try:
    return password if password else None
  except NameError:
    return None

submission_id = None
### Setup
if getPassword() and getId():
  submission_id = submit_notebook()
  if submission_id:
    setup()
else:
  print ("Please complete Id and Password cells before running setup")



Setup completed successfully


### Dataset Description

The **sentence polarity dataset v1.0** contains two data files which are:
  * **rt-polarity.pos**: It contains 5331 positive examples
  * **rt-polarity.neg**: It contains 5331 negative examples

Each line in these two files corresponds to a single snippet (usually
containing roughly one single sentence) that includes the review of a movie.

**Note:** Here is the source [link](https://www.cs.cornell.edu/people/pabo/movie-review-data/) to the movie review dataset





### Introduction

The aim of this assignment is to study the use of a multi-layer bidirectional GRU (Gated
Recurrent Unit) for solving the sentence classification problem. You will study the effect of adding
layers of BiGRU units on the test set performance of the model. The dataset is the same
as the one used in the first assignment.

### Importing the libraries and packages

In [None]:
import pandas as pd
import numpy as np
from sklearn.utils import shuffle

import nltk
nltk.download('stopwords')
from nltk.corpus import stopwords
import string
from gensim.utils import simple_preprocess

import keras
from keras.preprocessing.text import Tokenizer
from keras.utils import pad_sequences
from sklearn.preprocessing import LabelEncoder

from keras.layers import Input, Embedding, Dense, Bidirectional, Dropout, GRU
from keras.models import Sequential   # the model

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.


### Loading the data

In [None]:
# Read the positive and negative files and split the sentences into a list
with open('rt-polarity.neg',"r") as data_neg:
  data_neg_set = data_neg.read().splitlines()

with open('rt-polarity.pos',"r") as data_pos:
  data_pos_set = data_pos.read().splitlines()

In [None]:
# Length of the positive and negative reviews
len(data_neg_set), len(data_pos_set)

(5331, 5331)

In [None]:
# Loading the negative reviews
data_neg_set = pd.DataFrame(data_neg_set, columns=["Review"])

# Loading the positive reviews
data_pos_set = pd.DataFrame(data_pos_set, columns=["Review"])

In [None]:
# Print the first five rows of the positive examples
data_pos_set.head()

Unnamed: 0,Review
0,the rock is destined to be the 21st century's ...
1,"the gorgeously elaborate continuation of "" the..."
2,effective but too-tepid biopic
3,if you sometimes like to go to the movies to h...
4,"emerges as something rare , an issue movie tha..."


In [None]:
# Print the first five rows of the negative examples
data_neg_set.head()

Unnamed: 0,Review
0,"simplistic , silly and tedious ."
1,"it's so laddish and juvenile , only teenage bo..."
2,exploitative and largely devoid of the depth o...
3,[garbus] discards the potential for pathologic...
4,a visually flashy but narratively opaque and e...


#### Giving the labels to the data

Let us give the labels as positive and negative for the sentences present in the two files.

In [None]:
data_neg_set['Polarity'] = 'Negative'
data_pos_set['Polarity'] = 'Positive'

Let us glance at a few of the negative reviews that we labeled in the previous step.

In [None]:
data_neg_set.head()

Unnamed: 0,Review,Polarity
0,"simplistic , silly and tedious .",Negative
1,"it's so laddish and juvenile , only teenage bo...",Negative
2,exploitative and largely devoid of the depth o...,Negative
3,[garbus] discards the potential for pathologic...,Negative
4,a visually flashy but narratively opaque and e...,Negative


#### Combining the positive and negative data

From here on we work with the combined data containing both the positive and the negative reviews, so let us concatenate the two dataframes.

In [None]:
dataframes = [data_neg_set, data_pos_set]
rt_polarity_data = pd.concat(dataframes)
rt_polarity_data

Unnamed: 0,Review,Polarity
0,"simplistic , silly and tedious .",Negative
1,"it's so laddish and juvenile , only teenage bo...",Negative
2,exploitative and largely devoid of the depth o...,Negative
3,[garbus] discards the potential for pathologic...,Negative
4,a visually flashy but narratively opaque and e...,Negative
...,...,...
5326,both exuberantly romantic and serenely melanch...,Positive
5327,mazel tov to a film about a family's joyous li...,Positive
5328,standing in the shadows of motown is the best ...,Positive
5329,it's nice to see piscopo again after all these...,Positive



The output above shows that, after concatenation, each dataframe keeps its original row labels, so the last row has index 5330 when it should be 10661.

Therefore, we reset the index so that it runs from 0 to 10661.

In [None]:
rt_polarity_data.reset_index(inplace=True, drop=True)

In [None]:
rt_polarity_data

Unnamed: 0,Review,Polarity
0,"simplistic , silly and tedious .",Negative
1,"it's so laddish and juvenile , only teenage bo...",Negative
2,exploitative and largely devoid of the depth o...,Negative
3,[garbus] discards the potential for pathologic...,Negative
4,a visually flashy but narratively opaque and e...,Negative
...,...,...
10657,both exuberantly romantic and serenely melanch...,Positive
10658,mazel tov to a film about a family's joyous li...,Positive
10659,standing in the shadows of motown is the best ...,Positive
10660,it's nice to see piscopo again after all these...,Positive


When you combine the negative and positive examples, it is a good idea to shuffle them so that both classes are spread throughout the data. Without shuffling, some mini-batches may contain examples from only one class (positive or negative), which is a scenario best avoided.
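
The effect can be seen on a small toy example. The sketch below uses made-up labels (64 negative followed by 64 positive) and a hypothetical helper, `count_mixed_batches`; it only illustrates the argument and is not part of the assignment pipeline.

```python
import random

def count_mixed_batches(labels, batch_size):
    """Count the mini-batches that contain examples from both classes."""
    mixed = 0
    for start in range(0, len(labels), batch_size):
        batch = labels[start:start + batch_size]
        if len(set(batch)) > 1:
            mixed += 1
    return mixed

# Toy labels laid out like the concatenated dataframe: all negatives, then all positives
labels = [0] * 64 + [1] * 64

unshuffled_mixed = count_mixed_batches(labels, batch_size=32)

# Shuffle a copy with a fixed seed so the run is repeatable
shuffled = labels[:]
random.Random(42).shuffle(shuffled)
shuffled_mixed = count_mixed_batches(shuffled, batch_size=32)

print("mixed batches without shuffling:", unshuffled_mixed)
print("mixed batches after shuffling:  ", shuffled_mixed)
```

Without shuffling, every batch of 32 in this toy layout is drawn from a single class; after shuffling, the classes are interleaved.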


In [None]:
rt_polarity_data = shuffle(rt_polarity_data)

In [None]:
rt_polarity_data

Unnamed: 0,Review,Polarity
3988,it's been 20 years since 48 hrs . made eddie m...,Negative
10061,"an admirable , sometimes exceptional film",Positive
7830,it would take a complete moron to foul up a sc...,Positive
9926,just watch bettany strut his stuff . you'll kn...,Positive
4142,"nelson's intentions are good , but the end res...",Negative
...,...,...
6393,"it provides the grand , intelligent entertainm...",Positive
6374,"no one goes unindicted here , which is probabl...",Positive
854,heavy with flabby rolls of typical toback mach...,Negative
7026,a compelling motion picture that illustrates a...,Positive


Let us check the value counts of negative and positive reviews.

In [None]:
rt_polarity_data['Polarity'].value_counts()

Negative    5331
Positive    5331
Name: Polarity, dtype: int64

Checking whether there are any null values present in the data.

In [None]:
rt_polarity_data.isnull().values.any()

False

### Label Encoding

In [None]:
# Converting the labels from categorical to numerical
le = LabelEncoder()
rt_polarity_data['Polarity'] = le.fit_transform(rt_polarity_data['Polarity'])
rt_polarity_data.head()

Unnamed: 0,Review,Polarity
3988,it's been 20 years since 48 hrs . made eddie m...,0
10061,"an admirable , sometimes exceptional film",1
7830,it would take a complete moron to foul up a sc...,1
9926,just watch bettany strut his stuff . you'll kn...,1
4142,"nelson's intentions are good , but the end res...",0


### Data Preprocessing


We can preprocess the text using the gensim package. Gensim provides the function **simple_preprocess**, which converts a document into a list of lowercase tokens and ignores tokens that are too short or too long (here, tokens longer than 30 characters, as set by `max_len=30`).

**Note:** Refer to the following [link](https://radimrehurek.com/gensim/utils.html#gensim.utils.simple_preprocess) for gensim `simple_preprocess` method
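
To make the preprocessing steps concrete, here is a rough pure-Python stand-in. It is not gensim's implementation: `rough_preprocess` and `TOY_STOP_WORDS` are hypothetical names introduced for illustration, the regular expression only approximates gensim's tokenizer, and `min_len=2` is assumed to match gensim's default.

```python
import re

# Hypothetical, tiny stop-word list used only for this illustration
TOY_STOP_WORDS = {"it", "been", "the", "a", "and", "is", "to"}

def rough_preprocess(text, min_len=2, max_len=30):
    """Lowercase the text, keep alphabetic tokens, and drop tokens
    that are shorter than min_len or longer than max_len."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return [t for t in tokens if min_len <= len(t) <= max_len]

# A shortened review from the dataset
review = "it's been 20 years since 48 hrs ."

tokens = rough_preprocess(review)
filtered = [t for t in tokens if t not in TOY_STOP_WORDS]

print(tokens)    # ['it', 'been', 'years', 'since', 'hrs']
print(filtered)  # ['years', 'since', 'hrs']
```

Numbers, punctuation and the one-letter token left over from "it's" are dropped by the tokenization step; the stop-word filter then removes "it" and "been".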

In [None]:
rt_polarity_data['Review'] = rt_polarity_data['Review'].apply(lambda x:simple_preprocess(x, max_len=30))

In [None]:
# Remove stop words
stop_words = set(stopwords.words('english'))

rt_polarity_data['Review'] = rt_polarity_data['Review'].apply(lambda x: [w for w in x if w not in stop_words])

In [None]:
rt_polarity_data.head()

Unnamed: 0,Review,Polarity
3988,"[years, since, hrs, made, eddie, murphy, movie...",0
10061,"[admirable, sometimes, exceptional, film]",1
7830,"[would, take, complete, moron, foul, screen, a...",1
9926,"[watch, bettany, strut, stuff, know, star, see...",1
4142,"[nelson, intentions, good, end, result, justic...",0


### Hyperparameters

In [None]:
# Hyperparameters
MAX_SENT_LEN = 30   # Number of words to consider from each review
MAX_VOCAB_SIZE = 20000  # Max vocabulary size
BATCH_SIZE = 32
N_EPOCHS = 15

### Tokenize and Pad sequences

A neural network accepts only numeric data, so we need to encode the reviews as integers. Here we use the Keras `Tokenizer`: its `fit_on_texts` method counts how often each word occurs in the corpus and builds a word index in which every unique word is assigned an integer.

The `texts_to_sequences` method then converts each review from a sequence of words into a sequence of integers (the most frequent word is assigned 1, and so on).

The reviews have different lengths, so we pad the shorter ones with 0 and truncate the longer ones to a common length (in this case `MAX_SENT_LEN`, which is 30) using `keras.utils.pad_sequences`.

* `post`: pad or truncate at the end of a sentence
* `pre`: pad or truncate at the beginning of a sentence

Each word is assigned an integer and that integer is placed in a list.


For example, take the sentence “How text to sequence and padding works” and suppose each word is assigned a number: how = 1, text = 2, to = 3, sequence = 4, and = 5, padding = 6, works = 7. After `texts_to_sequences` is called, the sentence becomes [1, 2, 3, 4, 5, 6, 7]. Now suppose the maximum sequence length is 10. After padding, the sentence becomes `pre` = [0, 0, 0, 1, 2, 3, 4, 5, 6, 7] or `post` = [1, 2, 3, 4, 5, 6, 7, 0, 0, 0].
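
The example sentence can be traced in a few lines of plain Python. This is only a sketch of the idea, not the Keras implementation: `build_word_index`, `texts_to_ids` and `pad` are hypothetical helpers, and the `pre` defaults in `pad` are assumed to mirror the Keras defaults.

```python
from collections import Counter

def build_word_index(texts):
    """Assign 1 to the most frequent word, 2 to the next, and so on (0 is kept for padding)."""
    counts = Counter(word for text in texts for word in text.lower().split())
    return {word: i for i, (word, _) in enumerate(counts.most_common(), start=1)}

def texts_to_ids(texts, word_index):
    """Replace every known word by its integer index."""
    return [[word_index[w] for w in text.lower().split() if w in word_index]
            for text in texts]

def pad(seq, maxlen, padding="pre", truncating="pre"):
    """Truncate seq to maxlen, then fill the remaining positions with 0."""
    if len(seq) > maxlen:
        seq = seq[-maxlen:] if truncating == "pre" else seq[:maxlen]
    zeros = [0] * (maxlen - len(seq))
    return zeros + seq if padding == "pre" else seq + zeros

texts = ["How text to sequence and padding works"]
word_index = build_word_index(texts)
ids = texts_to_ids(texts, word_index)[0]

print(ids)                              # [1, 2, 3, 4, 5, 6, 7]
print(pad(ids, 10, padding="pre"))      # [0, 0, 0, 1, 2, 3, 4, 5, 6, 7]
print(pad(ids, 10, padding="post"))     # [1, 2, 3, 4, 5, 6, 7, 0, 0, 0]
print(pad(ids, 5, truncating="post"))   # [1, 2, 3, 4, 5]
print(pad(ids, 5, truncating="pre"))    # [3, 4, 5, 6, 7]
```

The last two calls show truncation: with a maximum length of 5, `post` keeps the first five words and `pre` keeps the last five.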

In [None]:
tokenizer = Tokenizer(num_words=MAX_VOCAB_SIZE)
tokenizer.fit_on_texts([' '.join(seq[:MAX_SENT_LEN]) for seq in rt_polarity_data['Review']])

print("Number of words in vocabulary:", len(tokenizer.word_index))

Number of words in vocabulary: 18007


In [None]:
# Convert the sequence of words to a sequence of indices
X = tokenizer.texts_to_sequences([' '.join(seq[:MAX_SENT_LEN]) for seq in rt_polarity_data['Review']])
X = pad_sequences(X, maxlen=MAX_SENT_LEN, padding='post', truncating='post')

y = rt_polarity_data['Polarity']

### Splitting the data into train and test sets

In [None]:
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42, train_size=10000)

In [None]:
X_train.shape, X_test.shape, y_train.shape, y_test.shape

((10000, 30), (662, 30), (10000,), (662,))

### Load the GloVe word embeddings

**What is GloVe?**

GloVe stands for Global Vectors for Word Representation. It is an unsupervised learning algorithm, developed at Stanford, that generates word embeddings from the aggregated global word-word co-occurrence matrix of a corpus. Word embeddings are a form of word representation that bridges the human understanding of language and that of a machine: two similar words are represented by similar vectors that lie close together in the vector space. They are essential for solving most natural language processing problems. The resulting embeddings also show interesting linear substructures of the word vector space.

Thus, when using word embeddings, each word is represented as a real-valued vector in a predefined vector space. Each word is mapped to one vector, and the vector values are learned in a way that resembles training a neural network.
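
"Closely placed in a vector space" is usually measured with cosine similarity. The sketch below uses three made-up 3-dimensional vectors (not real GloVe values) purely to show the computation; `toy_vectors` and its entries are assumptions for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1 means same direction, 0 means orthogonal."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up vectors: "good" and "great" point in similar directions, "table" does not
toy_vectors = {
    "good":  [0.9, 0.8, 0.1],
    "great": [0.8, 0.9, 0.2],
    "table": [0.1, 0.2, 0.9],
}

sim_related = cosine_similarity(toy_vectors["good"], toy_vectors["great"])
sim_unrelated = cosine_similarity(toy_vectors["good"], toy_vectors["table"])

print("good vs great:", round(sim_related, 3))
print("good vs table:", round(sim_unrelated, 3))
```

With real GloVe vectors the same function can be applied to entries of the embedding dictionary once it has been loaded.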

Now, let us load the 300-dimensional GloVe embeddings.

In [None]:
embeddings_index = {}
# Load the 300-dimensional GloVe vectors: each line holds a word followed by its coefficients
with open('/content/glove.6B.300d.txt', encoding='utf-8') as f:
    for line in f:
        values = line.split()
        word = values[0]
        coefs = np.asarray(values[1:], dtype='float32')
        embeddings_index[word] = coefs

print('Found %s word vectors.' % len(embeddings_index))

Found 400000 word vectors.


In [None]:
# Adding 1 because index 0 is reserved (it is used for padding)
words_not_found = []
vocab_size = len(tokenizer.word_index) + 1
print('Loaded %s word vectors.' % len(embeddings_index))

embedding_dim = 300

# Create a weight matrix for words in the training data
embedding_matrix = np.zeros((vocab_size, embedding_dim))
for word, i in tokenizer.word_index.items():
    if i >= vocab_size:
        continue
    embedding_vector = embeddings_index.get(word)
    if (embedding_vector is not None) and len(embedding_vector) > 0:
        embedding_matrix[i] = embedding_vector
    else:
        words_not_found.append(word)

Loaded 400000 word vectors.


In [None]:
print(tokenizer.word_index)



In [None]:
print(len(tokenizer.word_index))

18007


### Define the Bi-directional GRU model



### LSTM vs GRU
<center>
<img src="https://cdn.iisc.talentsprint.com/DLFA/Experiment_related_data/GRU.png" width=700px, height=500/>
</center>
<br><br>

Simple RNNs have a very short memory because of vanishing gradients. More complicated cell architectures try to solve this short-memory problem. The best known is probably the Long Short-Term Memory (LSTM) cell.


It uses a gated cell architecture to update and forget information selectively in the network memory (cell and hidden states). Gated Recurrent Units (GRUs) have a slightly simpler architecture, with only one hidden state. GRUs are usually faster than LSTMs while often remaining competitive in performance for many applications.

### GRU - The subtle differences

* The **update gate** plays a role similar to the **forget gate** and **input gate** of an LSTM combined

* The **update gate** decides how much of the past information (from previous time steps) needs to be passed along to the future

* The **reset gate** decides how much of the past information to forget

* GRUs involve fewer tensor operations than LSTMs and are therefore usually a little faster to train
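
The roles of the two gates can be made concrete with a single GRU step for a scalar input and a scalar hidden state. This is a minimal sketch under stated assumptions: the parameter names in `p` are hypothetical, the reset gate is applied to the previous state before the recurrent weight, and the sign convention is the one in which an update gate close to 1 keeps the old state (references differ on this).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def gru_step(x, h_prev, p):
    """One GRU time step for a scalar input x and a scalar hidden state h_prev."""
    z = sigmoid(p["w_z"] * x + p["u_z"] * h_prev + p["b_z"])  # update gate
    r = sigmoid(p["w_r"] * x + p["u_r"] * h_prev + p["b_r"])  # reset gate
    # Candidate state: the reset gate scales how much of the past is used
    h_cand = math.tanh(p["w_h"] * x + p["u_h"] * (r * h_prev) + p["b_h"])
    # Assumed convention: z close to 1 passes the old state along unchanged
    return z * h_prev + (1.0 - z) * h_cand

base = {"w_z": 0.5, "u_z": 0.5, "b_z": 0.0,
        "w_r": 0.5, "u_r": 0.5, "b_r": 0.0,
        "w_h": 0.5, "u_h": 0.5, "b_h": 0.0}

# A large positive update-gate bias keeps the past; a large negative one overwrites it
h_keep = gru_step(1.0, 0.8, dict(base, b_z=20.0))
h_overwrite = gru_step(1.0, 0.8, dict(base, b_z=-20.0))

print("update gate ~1:", h_keep)
print("update gate ~0:", h_overwrite)
```

In the first call the new state stays at roughly the previous value of 0.8; in the second it is replaced by the candidate state.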

<center>
<img src="https://cdn.iisc.talentsprint.com/DLFA/Experiment_related_data/GRU_Subtle_difference.png" width=700px, height=500/>
</center>
<br><br>

### The need for Bi-directional GRUs

* A bi-directional GRU simply puts two independent GRUs together

* The input sequence is fed in forward order for one GRU, and reverse order for the other

* The outputs of the two networks are usually concatenated at each time step

* Preserving information from both past and future helps understand context better

<center>
<img src="https://cdn.iisc.talentsprint.com/DLFA/Experiment_related_data/Bi-GRU.jpg" width=700px, height=500/>
</center>


A bidirectional GRU consists of a forward layer and a backward layer. The input sequence is fed to the forward layer in the regular order, while the backward layer processes the input in reverse order, starting from the last word, proceeding to the next-to-last word, and so on up to the first word.

The hidden states of the two layers are then concatenated for each token, generating an intermediate representation sequence. Each intermediate representation therefore takes into account the information from the sequence both before and after the respective token. This means that at every time step the network has access to the complete document and can deduce the right label from that information.
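
The forward/backward mechanics can be sketched without any deep learning library. In the toy example below the recurrent "cell" is just a running sum, chosen so the result is easy to check by hand; `run_rnn` and `bidirectional` are hypothetical helpers, not Keras functions.

```python
def run_rnn(step, inputs, h0=0.0):
    """Apply a recurrent step function over the inputs and return the state at every position."""
    h = h0
    states = []
    for x in inputs:
        h = step(x, h)
        states.append(h)
    return states

def bidirectional(step_fwd, step_bwd, inputs):
    """Run one pass left to right and one right to left, then pair the states per token."""
    fwd = run_rnn(step_fwd, inputs)
    # Process the reversed sequence, then flip the states back to the original token order
    bwd = run_rnn(step_bwd, inputs[::-1])[::-1]
    return [(f, b) for f, b in zip(fwd, bwd)]

def running_sum(x, h):
    # Toy step function: the "state" is the sum of everything seen so far
    return h + x

states = bidirectional(running_sum, running_sum, [1, 2, 3])
print(states)  # [(1, 6), (3, 5), (6, 3)]
```

For the middle token, the forward state (3) summarizes the tokens up to and including it, while the backward state (5) summarizes the tokens from the end back to it, so the pair covers the whole sequence.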

In [None]:
# Build a sequential model by stacking neural net units
model = Sequential()
embedding_layer = Embedding(vocab_size,
                            embedding_dim,
                            weights = [embedding_matrix],
                            input_length = MAX_SENT_LEN,
                            trainable=False)
model.add(embedding_layer)
model.add(Bidirectional(GRU(128, return_sequences=True, dropout=0.50, name='first_gru_layer')))
model.add(Dropout(0.5))
model.add(Bidirectional(GRU(64, name='second_gru_layer')))
model.add(Dropout(0.5))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid', name='output_layer'))

In [None]:
print('Summary of the built model...')
model.summary()

Summary of the built model...
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, 30, 300)           5402400   
                                                                 
 bidirectional (Bidirection  (None, 30, 256)           330240    
 al)                                                             
                                                                 
 dropout (Dropout)           (None, 30, 256)           0         
                                                                 
 bidirectional_1 (Bidirecti  (None, 128)               123648    
 onal)                                                           
                                                                 
 dropout_1 (Dropout)         (None, 128)               0         
                                                                 
 dense (Dense)            
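
The parameter counts reported in the summary can be reproduced by hand. The sketch below assumes the GRU formulation with two bias vectors per gate block (consistent with the numbers above); `gru_params` and `bidirectional_gru_params` are hypothetical helper names.

```python
def gru_params(input_dim, units, bias_vectors=2):
    """Parameters of one GRU layer: 3 blocks (update gate, reset gate, candidate state),
    each with input weights, recurrent weights and bias terms."""
    return 3 * (units * input_dim + units * units + bias_vectors * units)

def bidirectional_gru_params(input_dim, units):
    """A bidirectional wrapper holds two independent GRUs of the same size."""
    return 2 * gru_params(input_dim, units)

# Embedding: (vocabulary size + 1) rows of 300-dimensional vectors
embedding_params = (18007 + 1) * 300

# First BiGRU reads the 300-dimensional embeddings with 128 units per direction
first_bigru = bidirectional_gru_params(300, 128)

# Second BiGRU reads the 256-dimensional concatenated output with 64 units per direction
second_bigru = bidirectional_gru_params(256, 64)

print(embedding_params, first_bigru, second_bigru)  # 5402400 330240 123648
```

The input size of the second layer is 256 because the first layer concatenates two 128-dimensional hidden states at every time step.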

### Compile and train the model

In [None]:
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])

In [None]:
model.fit(X_train, y_train,
          batch_size=BATCH_SIZE,
          epochs=N_EPOCHS,
          validation_data=(X_test, y_test))

Epoch 1/15
Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


<keras.src.callbacks.History at 0x79bc0217cfa0>

### Evaluate the model

In [None]:
print('Testing...')
model.evaluate(X_test, y_test)

Testing...


[0.6616038680076599, 0.7598187327384949]

In [None]:
# model predictions on the test data
preds = model.predict(X_test)



In [None]:
preds.shape

(662, 1)

In [None]:
# Get the text sequences for the preprocessed movie reviews
reviews_list_idx = tokenizer.texts_to_sequences([' '.join(seq[:MAX_SENT_LEN]) for seq in rt_polarity_data['Review']])

In [None]:
print(reviews_list_idx[1])

[1253, 158, 2471, 1]


In [None]:
# Function to get the predictions on the movie reviews using the GRU model
def add_score_predictions(data, reviews_list_idx):

  # Pad the sequences of the data
  reviews_list_idx = pad_sequences(reviews_list_idx, maxlen=MAX_SENT_LEN, padding='post', truncating='post')

  # Get the predictions from the GRU model and flatten them from shape (n, 1) to (n,)
  review_preds = model.predict(reviews_list_idx).ravel()

  # Add the predictions to the dataframe that was passed in (not to the global one)
  data['polarity score'] = review_preds

  # Threshold the scores at 0.5 to obtain the predicted sentiment
  data['predicted polarity'] = np.where(review_preds > 0.5, 'positive', 'negative')

  return data

In [None]:
# Call the above function to get the sentiment score and the predicted sentiment
data = add_score_predictions(rt_polarity_data, reviews_list_idx)



In [None]:
# Display the data
data[:20]

Unnamed: 0,Review,Polarity,polarity score,predicted polarity
3988,"[years, since, hrs, made, eddie, murphy, movie...",0,0.081872,negative
10061,"[admirable, sometimes, exceptional, film]",1,0.997505,positive
7830,"[would, take, complete, moron, foul, screen, a...",1,0.989449,positive
9926,"[watch, bettany, strut, stuff, know, star, see...",1,0.780068,positive
4142,"[nelson, intentions, good, end, result, justic...",0,0.044514,negative
5802,"[although, much, like, first, movie, based, ro...",1,0.916731,positive
9366,"[katz, documentary, much, panache, material, r...",1,0.997156,positive
4685,"[ambitious, guilt, suffused, melodrama, crippl...",0,0.016338,negative
5923,"[quietly, reflective, melancholy, new, zealand...",1,0.999251,positive
10395,"[beautifully, directed, convincingly, acted]",1,0.998049,positive
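
The thresholding step can be checked by hand on the first five rows of the table above. This is only an illustrative sketch: `to_label` is a hypothetical helper, and because these rows may include training examples the resulting figure is not a test-set estimate.

```python
# Polarity scores and true labels copied from the first five rows of the table
scores = [0.081872, 0.997505, 0.989449, 0.780068, 0.044514]
labels = [0, 1, 1, 1, 0]

def to_label(score, threshold=0.5):
    """Map a sigmoid output to a class: 1 (positive) above the threshold, else 0 (negative)."""
    return 1 if score > threshold else 0

predicted = [to_label(s) for s in scores]
accuracy = sum(p == y for p, y in zip(predicted, labels)) / len(labels)

print(predicted)  # [0, 1, 1, 1, 0]
print(accuracy)   # 1.0
```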


### Please answer the questions below to complete the experiment:




**Consider the following statements and answer Q1:**


A. The GRU controls the past information by having two gates: an update gate and reset gate, the update gate determines how much of the past knowledge needs to be passed along into the future and the reset gate is used to decide how much of the past information to forget.

B. GRUs handle the vanishing gradient problem in recurrent neural networks very efficiently.

C. GRUs cannot handle the vanishing gradient problem in recurrent neural networks very efficiently.

D. GRUs solve the problem of exploding gradients using gradient clipping in which a defined threshold value is set on the gradients, which means that even if a gradient increases beyond the predefined value during training, its value will still be limited to the set threshold.



In [None]:
#@title Q1. Which of the above statement(s) is/are True? { run: "auto", form-width: "500px", display-mode: "form" }
Answer1 = "A, B and D" #@param ["", "Only A and B", "A, B and D", "Only C", "only A", "only D"]

**Consider the following statements and answer Q2:**


A. Bidirectional GRU, or BiGRU, is a sequence processing model, consisting of two GRUs, one taking the input in a forward direction, and the other in a backwards direction.

B. Bidirectional GRUs are a type of bidirectional recurrent neural network with only forget gates.

C. GRUs use fewer trainable parameters and therefore use less memory and execute faster, and they generally perform better than LSTMs when dealing with smaller datasets.

D. For GRU bidirectional layer output, we take the sum of outputs of both forward and backward direction layers




In [None]:
#@title Q2. Which of the above statement(s) is/are True? { run: "auto", form-width: "500px", display-mode: "form" }
Answer2 = "A and C" #@param ["", "A and B","A and C","B and D","C and D", "only A", "only C"]

In [None]:
#@title How was the experiment? { run: "auto", form-width: "500px", display-mode: "form" }
Complexity = "Good and Challenging for me" #@param ["","Too Simple, I am wasting time", "Good, But Not Challenging for me", "Good and Challenging for me", "Was Tough, but I did it", "Too Difficult for me"]


In [None]:
#@title If it was too easy, what more would you have liked to be added? If it was very difficult, what would you have liked to have been removed? { run: "auto", display-mode: "form" }
Additional = "Nil" #@param {type:"string"}


In [None]:
#@title Can you identify the concepts from the lecture which this experiment covered? { run: "auto", vertical-output: true, display-mode: "form" }
Concepts = "Yes" #@param ["","Yes", "No"]


In [None]:
#@title  Text and image description/explanation and code comments within the experiment: { run: "auto", vertical-output: true, display-mode: "form" }
Comments = "Very Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [None]:
#@title Mentor Support: { run: "auto", vertical-output: true, display-mode: "form" }
Mentor_support = "Very Useful" #@param ["","Very Useful", "Somewhat Useful", "Not Useful", "Didn't use"]


In [None]:
#@title Run this cell to submit your notebook for grading { vertical-output: true }
try:
  if submission_id:
      return_id = submit_notebook()
      if return_id : submission_id = return_id
  else:
      print("Please complete the setup first.")
except NameError:
  print ("Please complete the setup first.")

Your submission is successful.
Ref Id: 1646
Date of submission:  30 Sep 2023
Time of submission:  15:49:41
View your submissions: https://dlfa-iisc.talentsprint.com/notebook_submissions
